I was looking at some PTX code today and noticed something odd. A texture reference declared as
texture<float, 1, cudaReadModeElementType> tex, when read with "float a = tex1Dfetch(tex, coord)", generates a tex.1d.v4.f32.s32 instruction in the PTX file for the read. This is a 4-component texture read, even though the texture holds a single float per element! The other 3 components returned by the read are never referenced. Is this normal behavior?
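For anyone who wants to reproduce this, here is a minimal sketch of the kind of kernel I mean. The kernel name read_kernel and the parameters out and n are made up for illustration; only the texture declaration and the tex1Dfetch call come from my actual code. It assumes a toolkit that still supports the legacy texture reference API, and it leaves out the host-side binding because the point is only to inspect the generated PTX.

```cuda
// Minimal reproduction sketch (hypothetical names: read_kernel, out, n).
// File-scope texture reference with a single float per element.
texture<float, 1, cudaReadModeElementType> tex;

__global__ void read_kernel(float *out, int n)
{
    // One thread per element; coord indexes linear memory bound to tex.
    int coord = blockIdx.x * blockDim.x + threadIdx.x;
    if (coord < n) {
        // Only one float is requested here.
        float a = tex1Dfetch(tex, coord);
        out[coord] = a;
    }
}
```

Compiling with something like "nvcc -ptx" on the file and searching the output for the tex instruction should show what the compiler emits for the fetch.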
Platform: CUDA 1.0 on AMD64 Gentoo (using the SUSE Linux Enterprise Desktop 10 binaries)
Edit: I get the same behavior under Windows XP with CUDA 1.0.