I am trying to use texture fetches in my kernel (purely to benefit from the texture cache), but I have run into problems.
I am restricted to linear memory rather than a CUDA array, because my array is large (1 million elements) and the documentation states:
“For a one-dimensional texture reference bound to a CUDA array, the maximum width is 2^13;”
whereas
“For a one-dimensional texture reference bound to linear memory, the maximum width is 2^27;”
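As a quick sanity check on the two limits quoted above, here is a minimal host-side sketch. The constant and function names are made up for illustration, and it assumes both limits are counted in elements rather than bytes:

```cpp
#include <cstddef>

// Maximum 1D texture widths as quoted above (assumed to be in elements).
constexpr std::size_t kMaxWidthCudaArray = std::size_t{1} << 13;  // 2^13 = 8192
constexpr std::size_t kMaxWidthLinear    = std::size_t{1} << 27;  // 2^27 = 134217728

// Hypothetical helpers: does an array of n elements fit in each kind of 1D texture?
constexpr bool fitsCudaArray1D(std::size_t n) { return n <= kMaxWidthCudaArray; }
constexpr bool fitsLinear1D(std::size_t n)    { return n <= kMaxWidthLinear; }

// The 1,000,000-element array from this post exceeds the CUDA array limit
// but is well within the linear memory limit.
static_assert(!fitsCudaArray1D(1000000), "1M elements exceed 2^13");
static_assert(fitsLinear1D(1000000), "1M elements fit within 2^27");
```

Under that assumption, 8192 < 1,000,000 < 134,217,728, so linear memory is the only option of the two for an array of this size.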
Since I am new to this, I wrote a simple program to verify that I can fetch texture data, but even that does not work:
texture<float, 1, cudaReadModeElementType> tex1Ref;
texture<float, 1, cudaReadModeElementType> tex2Ref; // not used in this example but will have many texture references in the final program
__global__ void reflect(float* dTest){
    // Each thread copies one element from the texture into the output buffer
    dTest[threadIdx.x] = tex1Dfetch(tex1Ref, threadIdx.x);
}
…
float* dRed;
float* dTest; // device output buffer written by reflect()
vector<float> hRed(1000000);
float hTest[1000000];
// hRed is populated here…
cudaMalloc( (void**) &dRed, 1000000 * sizeof(float));
cudaMalloc( (void**) &dTest, 1000000 * sizeof(float));
cudaMemcpy(dRed, &hRed[0], 1000000 * sizeof(float), cudaMemcpyHostToDevice);
cudaBindTexture(NULL,tex1Ref, dRed);
reflect<<<1,100>>>(dTest);
cudaMemcpy(hTest, dTest, 1000000 * sizeof(float), cudaMemcpyDeviceToHost);
// Compare the values hTest and hRed…
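One thing to keep in mind for the comparison step: the launch `reflect<<<1,100>>>` indexes by `threadIdx.x` only, so the kernel writes just `dTest[0]` through `dTest[99]`. The rest of `dTest` is never initialized, so comparing all 1,000,000 elements would report mismatches even if the texture fetch works. A minimal sketch of a comparison limited to the written range follows; `countMismatches` is a hypothetical helper, not part of CUDA:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: counts the positions in [0, count) where the values
// copied back from the device differ from the original host data.
// Exact comparison is used on the assumption that cudaReadModeElementType
// returns the stored float unchanged.
std::size_t countMismatches(const float* result,
                            const std::vector<float>& expected,
                            std::size_t count) {
    std::size_t mismatches = 0;
    // Never read past the end of the expected data.
    for (std::size_t i = 0; i < count && i < expected.size(); ++i) {
        if (result[i] != expected[i]) {
            ++mismatches;
        }
    }
    return mismatches;
}
```

With the code in this post it would be called as `countMismatches(hTest, hRed, 100)`, where 100 matches the number of threads launched. A result of 0 would suggest the fetch is working for those elements.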
Is there something I am missing in the texture setup?
I am using a GTX2xx series GPU.
Regards