I am working on a CUDA project for my university major, and I’m quite new to the whole scene, so please bear with me!
This forum’s been very helpful in determining a direction for my solution to the problem, but I still had a few questions regarding handling large arrays.
My problem entails parallelising a calculation in which each loop iteration depends on only a few rows of a 100000*100 array.
Furthermore, I have structured the array so that the memory accesses are localized. By this I mean:
Loop 1 reads rows 1:50 (for example)
Loop 2 reads rows 1:53
Loop 3 reads rows 3:55
and so on. Each subsequent loop's starting and ending rows are greater than or equal to those of the previous loop.
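To make the access pattern concrete, here is a rough sketch of what I have in mind, using plain global memory as a baseline to compare a texture version against. The names (`window_kernel`, `row_start`, `row_end`) are made up for this post, the row-sum is only a placeholder for my real calculation, the row ranges are the 0-based equivalents of the example above, and the array is shrunk to 64 rows so it runs quickly.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

constexpr int N_COLS = 100;  // row width of the array (100000*100 in the real problem)

// One thread per "loop": reads rows row_start[i]..row_end[i] (inclusive).
// The sum is a stand-in for the real per-loop calculation.
__global__ void window_kernel(const float* __restrict__ data,
                              const int* __restrict__ row_start,
                              const int* __restrict__ row_end,
                              float* out, int n_loops)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n_loops) return;

    float acc = 0.0f;
    for (int r = row_start[i]; r <= row_end[i]; ++r)
        for (int c = 0; c < N_COLS; ++c)
            acc += data[r * N_COLS + c];  // row-major layout
    out[i] = acc;
}

int main()
{
    const int n_rows = 64;   // toy size, not the real 100000
    const int n_loops = 3;

    std::vector<float> h_data(n_rows * N_COLS, 1.0f);
    std::vector<int> h_start = {0, 0, 2};     // non-decreasing start rows
    std::vector<int> h_end   = {49, 52, 54};  // non-decreasing end rows
    std::vector<float> h_out(n_loops, 0.0f);

    float *d_data = nullptr, *d_out = nullptr;
    int *d_start = nullptr, *d_end = nullptr;
    cudaMalloc(&d_data, h_data.size() * sizeof(float));
    cudaMalloc(&d_out, n_loops * sizeof(float));
    cudaMalloc(&d_start, n_loops * sizeof(int));
    cudaMalloc(&d_end, n_loops * sizeof(int));

    cudaMemcpy(d_data, h_data.data(), h_data.size() * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(d_start, h_start.data(), n_loops * sizeof(int), cudaMemcpyHostToDevice);
    cudaMemcpy(d_end, h_end.data(), n_loops * sizeof(int), cudaMemcpyHostToDevice);

    window_kernel<<<1, 32>>>(d_data, d_start, d_end, d_out, n_loops);
    cudaError_t err = cudaGetLastError();
    if (err != cudaSuccess) {
        printf("kernel launch failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // This copy also waits for the kernel to finish.
    cudaMemcpy(h_out.data(), d_out, n_loops * sizeof(float), cudaMemcpyDeviceToHost);
    for (int i = 0; i < n_loops; ++i)
        printf("loop %d: rows %d..%d -> %f\n", i, h_start[i], h_end[i], h_out[i]);

    cudaFree(d_data);
    cudaFree(d_out);
    cudaFree(d_start);
    cudaFree(d_end);
    return 0;
}
```

Neighbouring threads read heavily overlapping row ranges here, which is the locality I was hoping a cache could exploit.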
My initial plan was to keep this array in texture memory. Am I on the right track?
Thanks in advance