The main input data for my kernel is a simple int* array that I bind to a texture. I then use tex1Dfetch to access the data.
Will it be beneficial to implement the texture reference not as a texture of ints:
texture<int, 1, cudaReadModeElementType> texInput;
but as a texture of, say, long ints:
texture<long int, 1, cudaReadModeElementType> texInput;
thereby halving the number of memory accesses (the actual ints would then be extracted from each long int with bit shifts or bit masks)?
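To make the extraction step concrete, here is a minimal sketch in plain C++ so it can be checked on the host. The helper names (pack_pair, unpack_lo, unpack_hi) are hypothetical, and I use fixed-width types on the assumption that the wide element is 64 bits, since long int is not 64 bits on every platform. It only shows the shifts and masks, not the texture fetch itself.

```cpp
#include <cstdint>

// Hypothetical helper: pack two 32-bit ints into one 64-bit word.
// The first int goes into the low 32 bits, the second into the high 32 bits.
constexpr std::uint64_t pack_pair(std::int32_t lo, std::int32_t hi) {
    return static_cast<std::uint64_t>(static_cast<std::uint32_t>(lo)) |
           (static_cast<std::uint64_t>(static_cast<std::uint32_t>(hi)) << 32);
}

// Hypothetical helper: recover the first int with a bit mask.
// The unsigned-to-signed cast relies on two's-complement wraparound.
constexpr std::int32_t unpack_lo(std::uint64_t w) {
    return static_cast<std::int32_t>(static_cast<std::uint32_t>(w & 0xFFFFFFFFu));
}

// Hypothetical helper: recover the second int with a bit shift.
constexpr std::int32_t unpack_hi(std::uint64_t w) {
    return static_cast<std::int32_t>(static_cast<std::uint32_t>(w >> 32));
}
```

In the kernel, each thread would fetch one wide element and call both unpack helpers on it, instead of fetching two separate ints.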
Also, is it possible to work with 128-bit ints (texture<long long, 1, cudaReadModeElementType> texInput)?
Thanks in advance.