Hi,
I’m using CUDA to accelerate a medical physics algorithm and I have been told that using textures for read-only arrays would be a good way to optimize it.
In a first attempt using only global memory, I linearized the 3D array with x as the fastest-varying coordinate, thinking that larger block sizes in x would make neighbouring threads fetch values that are close together in memory. It did not work as expected, probably because there is A LOT of divergence in the kernel anyway…
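To make the layout concrete, here is a minimal sketch of the x-fastest linearization described above. The names `NX`, `NY`, `NZ`, `linearIndex` and `densityAt` are hypothetical, and storing the density as `float` is an assumption; only the 128x128x128 size comes from my actual problem.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical grid dimensions, matching the 128x128x128 density map.
constexpr std::size_t NX = 128, NY = 128, NZ = 128;

// Flat index with x varying fastest: voxels that are neighbours in x
// are neighbours in memory, while a step in y or z jumps further.
inline std::size_t linearIndex(std::size_t x, std::size_t y, std::size_t z) {
    assert(x < NX && y < NY && z < NZ);  // guard against out-of-range voxels
    return x + NX * (y + NY * z);
}

// Read one density value from the linearized array.
// Assumption: densities are stored as float.
inline float densityAt(const std::vector<float>& density,
                       std::size_t x, std::size_t y, std::size_t z) {
    return density[linearIndex(x, y, z)];
}
```

With this layout a step in x moves 1 element, a step in y moves 128 elements and a step in z moves 16384 elements, so a ray that does not travel mostly along x jumps far through memory between samples.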
Since the array does not need to change during the computation, I was told that texture memory would be a good bet. I tried putting it in a 1D texture, but it’s probably way too big for that… 128x128x128…
I would like to know what the best choice would be for me, given that my array is already linearized. Would it be better to put it in a 2D array and use the simple texture example as a reference, or to use a fancier approach with the pitched pointer, channel descriptor and extent stuff?
Obviously I’m more of a physicist than a programmer, so something simple would be great.
Thanks
edit: To be more precise, it’s a radiation raytracing algorithm that needs to read a (3D) density map along the way…