Problems using shared and texture memory: results differ from the same values read from device memory

I use the texture memory like this:

texture<int, 2, cudaReadModeElementType> texType;

texType.addressMode[0] = cudaAddressModeClamp;

texType.addressMode[1] = cudaAddressModeClamp;

texType.filterMode = cudaFilterModePoint;

texType.normalized = false;

cudaBindTextureToArray (texType, typeCuArray, channelDesc_V);

SiteType temp2 = (SiteType)tex2D(texType, index%widthTex, index/widthTex);

and then:

X[tdx] = temp2;

where X is shared memory:

extern __shared__ char shared_memory[];

int *K = (int *)shared_memory;

VecR *B = (VecR *)&K[ntd];

SiteType *X = (SiteType *)&B[ntd];

and the shared memory size is passed to the kernel as a launch configuration parameter, together with the number of blocks and the number of threads per block:

size_t sharedDim=(3*sizeof(float)+sizeof(int)+sizeof(SiteType))*n_threads;

because VecR is a structure of 3 floats. SiteType is an enum:

typedef enum {
    WATER_TYPE = 0,
    CHOLINE_TYPE = 1,
    PHOSPHATE_TYPE = 2,
    GLYCEROL_TYPE = 3,
    ESTER_TYPE = 4,
    TAIL_TYPE = 5,
    SOLUTE_TYPE = 99
} SiteType;
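The pointer arithmetic above carves one dynamic shared-memory buffer into three consecutive arrays of ntd elements each (K, then B, then X). A minimal host-side sketch of that layout follows, in plain C++ that also compiles as CUDA host code. It assumes that VecR is exactly three floats with no padding and that ntd equals the number of threads per block. The names SharedLayout and makeLayout are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cstddef>

// Enum as given in the post.
typedef enum { WATER_TYPE = 0, CHOLINE_TYPE = 1, PHOSPHATE_TYPE = 2,
               GLYCEROL_TYPE = 3, ESTER_TYPE = 4, TAIL_TYPE = 5,
               SOLUTE_TYPE = 99 } SiteType;

// Assumption: VecR is three floats, as described in the post.
struct VecR { float x, y, z; };
static_assert(sizeof(VecR) == 3 * sizeof(float), "VecR must have no padding");

// Hypothetical helper type: byte offsets of K, B and X inside the buffer.
struct SharedLayout {
    std::size_t offsetK;     // int      K[ntd]
    std::size_t offsetB;     // VecR     B[ntd]
    std::size_t offsetX;     // SiteType X[ntd]
    std::size_t totalBytes;  // size to request at kernel launch
};

// Mirrors K = buffer, B = &K[ntd], X = &B[ntd].
SharedLayout makeLayout(std::size_t ntd) {
    SharedLayout l;
    l.offsetK = 0;
    l.offsetB = l.offsetK + ntd * sizeof(int);
    l.offsetX = l.offsetB + ntd * sizeof(VecR);
    l.totalBytes = l.offsetX + ntd * sizeof(SiteType);
    // Each sub-array must start at an address suitable for its element type.
    assert(l.offsetB % alignof(VecR) == 0);
    assert(l.offsetX % alignof(SiteType) == 0);
    return l;
}
```

Under these assumptions, makeLayout(n_threads).totalBytes is the same quantity as sharedDim in the post, so the three arrays fit the requested buffer only if ntd inside the kernel is not larger than n_threads.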

and typeCuArray:

cudaArray *typeCuArray;

cudaMallocArray (&typeCuArray, &channelDesc_V, widthTex, widthTex);

cudaMemcpyToArray(typeCuArray, 0, 0, typeDev, sizeof(SiteType)*(nSites), cudaMemcpyDeviceToDevice);

[b]and typeDev is an array of nSites elements of type SiteType.

The results I get when using shared and texture memory this way differ from those of the version of the program that reads the values directly from typeDev in device memory, and they contain some suspicious values.[/b]

[b]Does anybody have any idea why using shared and texture memory like this is wrong?

Thank you in advance for your answers.[/b]

I also tried using only shared memory, without the texture memory (I loaded the data directly from typeDev in device memory into shared memory), and the results were correct. I guess the problem lies in the texture memory.

Any ideas on how to identify the problem, please?
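One way to start narrowing this down is to check the index arithmetic on the host. The sketch below mirrors the (index % widthTex, index / widthTex) coordinates from the tex2D call and reads from a plain row-major buffer, clamping out-of-range coordinates the way cudaAddressModeClamp does for unnormalized, point-sampled fetches. It assumes that element index of the linear source ends up at column index % widthTex, row index / widthTex of the array. That is the same assumption the kernel makes, so it is worth verifying against the actual contents of typeCuArray. The names TexCoord, texCoord and fetchClamped are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper type: 2D texel coordinates for a linear index.
struct TexCoord { int x; int y; };

// Same arithmetic as the tex2D call in the post.
TexCoord texCoord(int index, int widthTex) {
    return TexCoord{ index % widthTex, index / widthTex };
}

// Clamp v into the closed range [lo, hi].
int clampInt(int v, int lo, int hi) {
    return v < lo ? lo : (v > hi ? hi : v);
}

// Host stand-in for a clamped, point-sampled fetch from a row-major
// width x height buffer. This is not the CUDA API, only a reference model.
int fetchClamped(const std::vector<int>& texels, int width, int height,
                 int x, int y) {
    x = clampInt(x, 0, width - 1);
    y = clampInt(y, 0, height - 1);
    return texels[static_cast<std::size_t>(y) * width + x];
}
```

Comparing fetchClamped(hostCopy, widthTex, widthTex, c.x, c.y) with the value the kernel reads for the same index, where hostCopy is a hypothetical host copy of the array contents, would show whether the mismatch comes from the coordinates or from the data stored in the array. Note that when nSites is smaller than widthTex*widthTex, texels at linear positions nSites and above were never written by the copy.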
