As far as I know, texture memory is cached and global memory is not.
Which of the following makes more sense?
- load data into texture memory, perform operation, output is in global memory because texture memory is read only. Then copy from global memory into texture memory for the next step and so on …
- load data into texture memory once and then only operate on global memory.
- don’t use texture memory at all if I don’t need features like interpolation or reading uchars as normalized floats?
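To make the first option concrete, this is roughly the loop I have in mind. It is only a sketch: the kernel `step`, the helper `runWithTextureCopy`, and the "add 1 to every pixel" operation are placeholders I made up, and I am assuming a single-channel float image and at least one iteration.

```cuda
#include <cuda_runtime.h>

// Placeholder operation: read each pixel through the texture and write
// the result to ordinary global memory (textures are read-only here).
__global__ void step(cudaTextureObject_t tex, float* out, int w, int h)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x < w && y < h)
        out[y * w + x] = tex2D<float>(tex, x + 0.5f, y + 0.5f) + 1.0f;
}

// Hypothetical helper: runs `iterations` steps (assumed >= 1) on a w*h
// float image, copying the result back into the texture after each step.
void runWithTextureCopy(float* host, int w, int h, int iterations)
{
    size_t rowBytes = w * sizeof(float);

    // Texture storage: a CUDA array filled once from the host.
    cudaChannelFormatDesc desc = cudaCreateChannelDesc<float>();
    cudaArray_t arr;
    cudaMallocArray(&arr, &desc, w, h);
    cudaMemcpy2DToArray(arr, 0, 0, host, rowBytes, rowBytes, h,
                        cudaMemcpyHostToDevice);

    cudaResourceDesc res = {};
    res.resType = cudaResourceTypeArray;
    res.res.array.array = arr;

    cudaTextureDesc td = {};
    td.addressMode[0] = cudaAddressModeClamp;
    td.addressMode[1] = cudaAddressModeClamp;
    td.filterMode = cudaFilterModePoint;      // no interpolation
    td.readMode = cudaReadModeElementType;    // plain float reads
    td.normalizedCoords = 0;

    cudaTextureObject_t tex;
    cudaCreateTextureObject(&tex, &res, &td, nullptr);

    // Output buffer in global memory.
    float* out;
    cudaMalloc(&out, rowBytes * h);

    dim3 block(16, 16);
    dim3 grid((w + block.x - 1) / block.x, (h + block.y - 1) / block.y);

    for (int i = 0; i < iterations; ++i) {
        step<<<grid, block>>>(tex, out, w, h);
        // The extra pass in question: a device-to-device copy that puts
        // the result back into the texture for the next step.
        cudaMemcpy2DToArray(arr, 0, 0, out, rowBytes, rowBytes, h,
                            cudaMemcpyDeviceToDevice);
    }

    // Fetch the final result (this call also waits for the kernels).
    cudaMemcpy(host, out, rowBytes * h, cudaMemcpyDeviceToHost);

    cudaDestroyTextureObject(tex);
    cudaFreeArray(arr);
    cudaFree(out);
}
```

The `cudaMemcpy2DToArray` call inside the loop is the per-step copy I am unsure about: it touches every pixel once more on top of the read and the write that the kernel already does.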
I can’t estimate how much difference this will make in speed.
But if I understand this correctly, copying data from global memory to texture memory means I access every pixel twice: once for the copy, and once more to read the data back out of texture memory.
So would it be better to avoid the device-to-device copy?