Hi all,
looking for your opinions on this:
I’m writing an image processing pipeline and I would like to use CUDA’s texturing features. The question is whether I should use cudaArrays or pitch linear texturing.
Running the simplePitchLinearTexture example on my 9800 GT gives me these results, which suggest cudaArrays are slightly faster:
[simplePitchLinearTexture.exe] starting...
Bandwidth (GB/s) for pitch linear: 4.06e+001; for array: 4.29e+001
Texture fetch rate (Mpix/s) for pitch linear: 5.07e+003; for array: 5.36e+003
[simplePitchLinearTexture.exe] test results...
PASSED
However, I cannot write to cudaArrays from kernels on my old hardware. Since I want later kernels to read their input through textures, each kernel's output would have to be copied into a cudaArray with cudaMemcpy2DToArray before the next kernel can use it. Modifying simplePitchLinearTexture to include the time of that copy changes the numbers significantly:
[simplePitchLinearTexture.exe] starting...
Bandwidth (GB/s) for pitch linear: 4.08e+001; for array: 2.18e+001
Texture fetch rate (Mpix/s) for pitch linear: 5.10e+003; for array: 2.72e+003
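It seems the copy really costs a lot: the array path drops from about 42.9 to 21.8 GB/s, which suggests the copy takes roughly as long as the kernel itself. I guess if the kernel were doing something more complex, the copy would account for a smaller fraction of the total time.
Am I missing something in my analysis? It seems like pitch linear texturing is the way to go.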
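For anyone who wants to check the copy cost on its own, here is a minimal sketch that times only the device-to-device cudaMemcpy2DToArray from a pitch linear buffer into a cudaArray. It is not taken from the SDK sample: the image size, the repetition count, and the helper names CHECK and gbPerSec are my own assumptions. It also reports bytes copied per second, which may not match how the sample counts bytes for its bandwidth figure.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper: abort with a message if a CUDA runtime call fails.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "%s failed: %s\n", #call,                 \
                    cudaGetErrorString(err_));                        \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

// Hypothetical helper: bytes moved in `ms` milliseconds, expressed in GB/s.
static double gbPerSec(double bytes, double ms)
{
    return bytes / (ms * 1.0e6);
}

int main()
{
    const size_t width = 2048, height = 2048;      // assumed image size in pixels
    const size_t rowBytes = width * sizeof(float); // row width in bytes
    const int reps = 20;                           // assumed repetition count

    // Pitch linear buffer standing in for the previous kernel's output.
    float* dPitch = NULL;
    size_t pitch = 0;
    CHECK(cudaMallocPitch((void**)&dPitch, &pitch, rowBytes, height));
    CHECK(cudaMemset2D(dPitch, pitch, 0, rowBytes, height));

    // cudaArray that the next kernel would read through a texture.
    cudaChannelFormatDesc desc = cudaCreateChannelDesc<float>();
    cudaArray* dArray = NULL;
    CHECK(cudaMallocArray(&dArray, &desc, width, height));

    cudaEvent_t start, stop;
    CHECK(cudaEventCreate(&start));
    CHECK(cudaEventCreate(&stop));

    // Warm-up copy so one-time setup cost is not part of the measurement.
    CHECK(cudaMemcpy2DToArray(dArray, 0, 0, dPitch, pitch, rowBytes, height,
                              cudaMemcpyDeviceToDevice));

    // Time `reps` back-to-back copies with CUDA events.
    CHECK(cudaEventRecord(start, 0));
    for (int i = 0; i < reps; ++i) {
        CHECK(cudaMemcpy2DToArray(dArray, 0, 0, dPitch, pitch, rowBytes, height,
                                  cudaMemcpyDeviceToDevice));
    }
    CHECK(cudaEventRecord(stop, 0));
    CHECK(cudaEventSynchronize(stop));

    float ms = 0.0f;
    CHECK(cudaEventElapsedTime(&ms, start, stop));
    const double msPerCopy = ms / reps;
    printf("copy: %.3f ms per image, %.2f GB/s (bytes copied only)\n",
           msPerCopy, gbPerSec((double)rowBytes * height, msPerCopy));

    CHECK(cudaEventDestroy(start));
    CHECK(cudaEventDestroy(stop));
    CHECK(cudaFreeArray(dArray));
    CHECK(cudaFree(dPitch));
    return 0;
}
```

Comparing the per-image copy time against the time of a real pipeline kernel on the same image size should show how much of each stage the copy would eat, and whether a heavier kernel hides it.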