I am using the driver API function cuTexRefSetAddress2D() to bind memory to a texture. I get correct results as long as the texture is bound for 32-bit accesses, but wrong results when it is bound for 16-bit or 8-bit accesses. The value read comes from the vicinity of the correct location, but is off by one or two bytes.
In a parallel setup, where I use the runtime API cudaBindTexture2D() for texture binding, I get correct results for all accesses (32-, 16- and 8-bit).
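For context, the driver API binding path for the 8-bit case looks roughly like the sketch below. This is not the code from the linked post: the helper name bindU8Texture2D and all of its parameters are hypothetical, and only the cu* calls and CU_* constants come from the driver API. It shows the two places where the element size enters, the array descriptor and cuTexRefSetFormat(), which I assume have to agree with each other.

```cuda
#include <cuda.h>
#include <stddef.h>

// Hypothetical helper (illustration only): binds a pitched, single-channel,
// 8-bit device buffer to the texture reference named `texName` in `module`.
// Assumes the driver API is initialized and a context is current.
CUresult bindU8Texture2D(CUmodule module, const char *texName,
                         CUdeviceptr devPtr, size_t width, size_t height,
                         size_t pitchBytes)
{
    CUtexref texRef;
    CUresult rc = cuModuleGetTexRef(&texRef, module, texName);
    if (rc != CUDA_SUCCESS) return rc;

    // Describe the element type and extent of the underlying memory.
    CUDA_ARRAY_DESCRIPTOR desc;
    desc.Width       = width;
    desc.Height      = height;
    desc.Format      = CU_AD_FORMAT_UNSIGNED_INT8;
    desc.NumChannels = 1;

    // Set the texture reference's own format to the same element type.
    rc = cuTexRefSetFormat(texRef, CU_AD_FORMAT_UNSIGNED_INT8, 1);
    if (rc != CUDA_SUCCESS) return rc;

    // Return raw integer values instead of normalized floats.
    rc = cuTexRefSetFlags(texRef, CU_TRSF_READ_AS_INTEGER);
    if (rc != CUDA_SUCCESS) return rc;

    // Point sampling, so no interpolation between neighbouring texels.
    rc = cuTexRefSetFilterMode(texRef, CU_TR_FILTER_MODE_POINT);
    if (rc != CUDA_SUCCESS) return rc;

    // Bind the pitched buffer; pitchBytes is the row stride in bytes.
    return cuTexRefSetAddress2D(texRef, &desc, devPtr, pitchBytes);
}
```

The 16-bit case would use CU_AD_FORMAT_UNSIGNED_INT16 in both places, and the 32-bit case CU_AD_FORMAT_UNSIGNED_INT32.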
Any pointers regarding this?
I am compiling for compute capability 3.0 with CUDA 5.5 and running on a GTX 650 Ti card, with Ubuntu on the host PC.
Sample code and further details are posted here: http://stackoverflow.com/questions/19315548/issues-with-cuda-texture-read-using-driver-api