How to choose the right memory type

I know that constant and texture memory are cached, and that texture memory works better with 2D arrays.

In which cases is it better to use texture, constant, or global memory? An example for each case would be appreciated.

Can global memory be as fast as texture or constant memory? If so, in which cases?


Try searching this forum. There are several interesting benchmarking results published.

In general, it depends on the situation. The best way to find out is to experiment.
Constant memory is best when all threads in a warp access the same variable.
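
To make the constant memory case concrete, here is a minimal sketch. The names `c_coeffs`, `evalPoly` and `runEvalPoly` are made up for illustration, and error checking is left out for brevity. Every thread evaluates the same polynomial, so at each step all threads of a warp read the same coefficient, which is the access pattern described above.

```cuda
#include <vector>
#include <cuda_runtime.h>

// Hypothetical coefficient table kept in constant memory.
__constant__ float c_coeffs[4];

// Each thread evaluates c3*x^3 + c2*x^2 + c1*x + c0 for its own x.
// All threads in a warp read the same c_coeffs[k] at the same time,
// so the value can be broadcast from the constant cache.
__global__ void evalPoly(const float* x, float* y, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        float v = x[i];
        y[i] = ((c_coeffs[3] * v + c_coeffs[2]) * v + c_coeffs[1]) * v + c_coeffs[0];
    }
}

// Host helper (illustrative): fills x[i] = i, runs the kernel and
// returns y[index].
float runEvalPoly(int n, int index)
{
    const float h_coeffs[4] = {1.0f, 2.0f, 3.0f, 4.0f};
    cudaMemcpyToSymbol(c_coeffs, h_coeffs, sizeof(h_coeffs));

    std::vector<float> h_x(n), h_y(n);
    for (int i = 0; i < n; ++i) h_x[i] = static_cast<float>(i);

    float *d_x = nullptr, *d_y = nullptr;
    cudaMalloc(&d_x, n * sizeof(float));
    cudaMalloc(&d_y, n * sizeof(float));
    cudaMemcpy(d_x, h_x.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    evalPoly<<<blocks, threads>>>(d_x, d_y, n);

    // This blocking copy also waits for the kernel to finish.
    cudaMemcpy(h_y.data(), d_y, n * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_x);
    cudaFree(d_y);
    return h_y[index];
}
```

If each thread indexed `c_coeffs` with a different, thread-dependent index instead, the reads within a warp would be serialized and the advantage would largely disappear.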

Constant memory is almost always fastest, although some people have gotten a speedup by copying data into shared memory at the start of the kernel.
This assumes, of course, that everything fits into constant memory (it’s a limited resource). If it doesn’t, try to put the part of the data required by each block in shared memory, reading it from global memory with coalesced reads at the beginning of the kernel.
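
A rough sketch of that shared memory staging pattern, assuming a block size of 256 (the names `TILE`, `reverseTiles` and `runReverseTiles` are hypothetical, and error checking is omitted): each block loads its tile from global memory with one coalesced read per thread, synchronizes, and only then reads the tile in a different order.

```cuda
#include <vector>
#include <cuda_runtime.h>

constexpr int TILE = 256;  // assumed block size, chosen for illustration

// Reverses the elements inside each tile of TILE consecutive values.
__global__ void reverseTiles(const int* in, int* out, int n)
{
    __shared__ int tile[TILE];
    int base = blockIdx.x * TILE;
    int i = base + threadIdx.x;

    // Coalesced load: consecutive threads read consecutive addresses.
    if (i < n) tile[threadIdx.x] = in[i];
    __syncthreads();  // the whole tile must be loaded before anyone reads it

    // Number of valid elements in this (possibly partial) tile.
    int count = min(TILE, n - base);

    // Reads from shared memory can now happen in any order cheaply;
    // the write to global memory is still coalesced.
    if (threadIdx.x < count)
        out[i] = tile[count - 1 - threadIdx.x];
}

// Host helper (illustrative): fills in[i] = i and returns the result.
std::vector<int> runReverseTiles(int n)
{
    std::vector<int> h_in(n), h_out(n);
    for (int i = 0; i < n; ++i) h_in[i] = i;

    int *d_in = nullptr, *d_out = nullptr;
    cudaMalloc(&d_in, n * sizeof(int));
    cudaMalloc(&d_out, n * sizeof(int));
    cudaMemcpy(d_in, h_in.data(), n * sizeof(int), cudaMemcpyHostToDevice);

    int blocks = (n + TILE - 1) / TILE;
    reverseTiles<<<blocks, TILE>>>(d_in, d_out, n);

    cudaMemcpy(h_out.data(), d_out, n * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    return h_out;
}
```

The reversal itself is only a stand-in: the point is that the scattered or reordered accesses hit shared memory, while global memory only ever sees coalesced traffic.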
If that isn’t possible, for example due to an unpredictable access pattern, use textures. Raw texture bandwidth is somewhat lower than global memory bandwidth, but texture reads go through a cache tuned for spatial locality, which makes them ideal for some access patterns. You also get bilinear filtering and edge extension/wrapping for free.
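
For the texture case, here is a small sketch using the texture object API. Note that this is an assumption about your setup: texture objects need a newer toolkit and compute capability 3.0 or later, while older code uses texture references instead. The names `sampleTex` and `runSampleTex` and the 4x4 test data are made up for illustration, and error checking is omitted. It shows the "free" bilinear filtering and edge clamping mentioned above.

```cuda
#include <vector>
#include <cuda_runtime.h>

// Each thread samples the texture at an arbitrary (x, y) position.
__global__ void sampleTex(cudaTextureObject_t tex, const float2* pts,
                          float* out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = tex2D<float>(tex, pts[i].x, pts[i].y);
}

// Host helper (illustrative): builds a 4x4 texture with T[y][x] = 10*x + y
// and samples it at two points.
std::vector<float> runSampleTex()
{
    const int w = 4, h = 4;
    std::vector<float> h_data(w * h);
    for (int y = 0; y < h; ++y)
        for (int x = 0; x < w; ++x)
            h_data[y * w + x] = 10.0f * x + y;

    // Textures with filtering need a cudaArray as backing storage.
    cudaChannelFormatDesc desc = cudaCreateChannelDesc<float>();
    cudaArray_t arr = nullptr;
    cudaMallocArray(&arr, &desc, w, h);
    cudaMemcpy2DToArray(arr, 0, 0, h_data.data(), w * sizeof(float),
                        w * sizeof(float), h, cudaMemcpyHostToDevice);

    cudaResourceDesc resDesc = {};
    resDesc.resType = cudaResourceTypeArray;
    resDesc.res.array.array = arr;

    cudaTextureDesc texDesc = {};
    texDesc.addressMode[0] = cudaAddressModeClamp;  // edge extension in x
    texDesc.addressMode[1] = cudaAddressModeClamp;  // edge extension in y
    texDesc.filterMode = cudaFilterModeLinear;      // bilinear filtering
    texDesc.readMode = cudaReadModeElementType;
    texDesc.normalizedCoords = 0;                   // coordinates in texels

    cudaTextureObject_t tex = 0;
    cudaCreateTextureObject(&tex, &resDesc, &texDesc, nullptr);

    // Texel centers sit at (i + 0.5, j + 0.5). The first point lies halfway
    // between texels (0,0) and (1,0); the second is far outside the right edge.
    std::vector<float2> h_pts = { make_float2(1.0f, 0.5f),
                                  make_float2(9.0f, 0.5f) };
    const int n = static_cast<int>(h_pts.size());

    float2* d_pts = nullptr;
    float* d_out = nullptr;
    cudaMalloc(&d_pts, n * sizeof(float2));
    cudaMalloc(&d_out, n * sizeof(float));
    cudaMemcpy(d_pts, h_pts.data(), n * sizeof(float2), cudaMemcpyHostToDevice);

    sampleTex<<<1, 32>>>(tex, d_pts, d_out, n);

    std::vector<float> h_out(n);
    cudaMemcpy(h_out.data(), d_out, n * sizeof(float), cudaMemcpyDeviceToHost);

    cudaDestroyTextureObject(tex);
    cudaFreeArray(arr);
    cudaFree(d_pts);
    cudaFree(d_out);
    return h_out;
}
```

The first sample should come out as a blend of the two neighboring texels, and the second should be clamped to the last texel of the row. Keep in mind that the filtering hardware interpolates with reduced precision, so compare results with a tolerance rather than exactly.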