Basic CUDA texture issue

First, I’ll give a general description of my problem. In my application I have a 1D float texture of length n, called “depthBufferTex_d”. This texture is bound to a 1D cudaArray called “cu_array”. I also have a 1D float array in global memory of length n, called “depthBufferGlobal_d”. After every kernel call, I want to take the values stored in “depthBufferGlobal_d” and put them in “depthBufferTex_d”. Does anyone know how to do this?

So far I have tried two approaches, but have had no luck.

I have my texture declared at file scope as follows:

[codebox]texture<float, 1, cudaReadModeElementType> depthBufferTex_d;[/codebox]

Here is the first way I am trying to do this. I am using a “device to device” copy.

[codebox]__host__
void swapBuffers1( float* depthBufferGlobal_d, int width, int height, cudaArray* cu_array ){
    //First, unbind the texture
    cudaUnbindTexture( depthBufferTex_d );

    //Copy data from device to device
    cudaMemcpyToArray( cu_array , 0 , 0 , gData , width * height * sizeof(float) , cudaMemcpyDeviceToDevice );

    //Now, finally, rebind the texture
    cudaBindTextureToArray(depthBufferTex_d , cu_array );
}[/codebox]

The second way I am trying to do this has me copying data from the device back to the host, and then from the host back onto the device again! It still doesn’t work though :(

[codebox]__host__
void swapBuffers2( float* depthBufferGlobal_d, int width, int height, cudaArray* cu_array ){
    //First, unbind the texture
    cudaUnbindTexture( depthBufferTex_d );

    //Copy the buffer from global mem to the texture array:
    // First, copy back to the host
    // Next, copy from host to texture
    // Then, deallocate host memory

    //Allocate space on host
    float* depthBuffer_temp_h;
    depthBuffer_temp_h = (float*) malloc( sizeof(float) * width * height );

    //Copy from device->host
    cudaMemcpy( depthBuffer_temp_h , depthBufferGlobal_d , sizeof(float) * width * height , cudaMemcpyDeviceToHost );

    //Copy from host->device
    cudaMemcpyToArray( cu_array , 0 , 0 , depthBuffer_temp_h , sizeof(float)*width*height , cudaMemcpyHostToDevice );

    //Free host mem
    free( depthBuffer_temp_h );

    //Now, finally, rebind the texture
    cudaBindTextureToArray(depthBufferTex_d , cu_array );
}[/codebox]

Note that I am using CUDA 2.2 on Windows XP with a GTX 280 and 4GB of RAM.

Also, I want to say thanks to the people who post on the forum regularly, this place is a great source of CUDA knowledge.

~Colin

The device to device memcpy path is definitely the way to go, but I guess you meant to write

cudaMemcpyToArray( cu_array , 0 , 0 , depthBufferGlobal_d , width * height * sizeof(float) , cudaMemcpyDeviceToDevice );

instead of

cudaMemcpyToArray( cu_array , 0 , 0 , gData , width * height * sizeof(float) , cudaMemcpyDeviceToDevice );

N.

Thanks for your help Nico! I had thought that the device to device thing was correct from the beginning, but after it wouldn’t work I tried the device to host to device solution.

As for the “gData” variable, that was just a mistake from pasting my code in here and forgetting to rename variables. The code actually is:

[codebox]__host__
void swapBuffers1( float* depthBufferGlobal_d, int width, int height, cudaArray* cu_array ){
    //First, unbind the texture
    cudaUnbindTexture( depthBufferTex_d );

    //Copy data from device to device
    cudaMemcpyToArray( cu_array , 0 , 0 , depthBufferGlobal_d , width * height * sizeof(float) , cudaMemcpyDeviceToDevice );

    //Now, finally, rebind the texture
    cudaBindTextureToArray(depthBufferTex_d , cu_array );
}[/codebox]

in my real code. Unfortunately, it still doesn’t seem to work :(

I’m going to try to recreate my problem in a smaller scale example so I can post the full code so that someone can help me track down this problem.

Thanks,

Colin

Hi again,

Did you check for errors? Also, where’s the channel descriptor for the texture? I believe you’re supposed to provide it to the cudaBindTextureToArray call.

N.

Nico,

I checked for errors at the end of the function by using the code:

[codebox]cout << cudaGetLastError( ) << endl;[/codebox]

and I didn’t get any errors.
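One caveat on that check: printing `cudaGetLastError()` through `cout` shows only a numeric code, and a single check at the end of the function cannot say which call failed. Below is a minimal sketch of per-call checking. The names `checkCuda` and `copyToArrayChecked` are hypothetical (they are not from the original code), and `cudaDeviceSynchronize` assumes a toolkit newer than the CUDA 2.2 mentioned in this thread, where the equivalent call was `cudaThreadSynchronize`.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper (not from the thread): prints a readable message for a
// failed runtime call and returns false so the caller can stop early.
static bool checkCuda(cudaError_t status, const char* what)
{
    if (status != cudaSuccess) {
        std::fprintf(stderr, "%s failed: %s\n", what, cudaGetErrorString(status));
        return false;
    }
    return true;
}

// The copy step of swapBuffers1 with each call checked on its own.
// Assumes cu_array was allocated with room for width * height floats.
static bool copyToArrayChecked(cudaArray* cu_array,
                               const float* depthBufferGlobal_d,
                               int width, int height)
{
    // Kernel launches are asynchronous, so wait for the previous kernel and
    // surface any error it produced before blaming the copy.
    if (!checkCuda(cudaDeviceSynchronize(), "previous kernel")) {
        return false;
    }

    size_t bytes = (size_t)width * height * sizeof(float);
    return checkCuda(cudaMemcpyToArray(cu_array, 0, 0, depthBufferGlobal_d,
                                       bytes, cudaMemcpyDeviceToDevice),
                     "cudaMemcpyToArray");
}
```

Checking the return value of each call, rather than one `cudaGetLastError()` at the end, narrows a failure down to the unbind, the copy, or the rebind.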

As for the channel descriptor, I used one when I did the original texture binding. However, I tried adding the same descriptor to the call, and it still didn’t work. I also tried using a cudaBindTexture(…) call instead of a cudaBindTextureToArray(…) call and still had no luck.

Thanks again for your help Nico!

~Colin

I think it is incorrect to bind a 1D texture to a 2D array.
You should declare the texture as texture<float, 2, cudaReadModeElementType>,
bind it to a 2D array, and access it using tex2D. How the memory of a 2D array is organized is implementation dependent (it might be a zig-zag pattern for better spatial locality), whereas texture<float, 1> assumes a linear organization.
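A rough sketch of the 2D route, in case it helps. It uses the same legacy texture reference API as the code in this thread (that API was removed in CUDA 12, so this assumes an older toolkit). The names `depthTex2D`, `readBack2D`, `uploadDepth2D` and `roundTrip2D` are made up for illustration, and the array is allocated inside the helper only to keep the example self-contained.

```cuda
#include <vector>
#include <cuda_runtime.h>

// Legacy 2D texture reference (hypothetical name, file scope as in the thread).
texture<float, 2, cudaReadModeElementType> depthTex2D;

// Reads every texel back into linear memory so the upload can be verified.
__global__ void readBack2D(float* out, int width, int height)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x < width && y < height) {
        // Unnormalized coordinates; +0.5f addresses the texel centre.
        out[y * width + x] = tex2D(depthTex2D, x + 0.5f, y + 0.5f);
    }
}

// Allocates a width x height array, copies linear device memory into it
// row by row, and binds the 2D texture. Returns the first error seen.
cudaError_t uploadDepth2D(cudaArray** array_out, const float* src_d,
                          int width, int height)
{
    cudaChannelFormatDesc desc = cudaCreateChannelDesc<float>();
    cudaError_t err = cudaMallocArray(array_out, &desc, width, height);
    if (err != cudaSuccess) return err;

    size_t rowBytes = width * sizeof(float);  // source rows are tightly packed
    err = cudaMemcpy2DToArray(*array_out, 0, 0, src_d, rowBytes, rowBytes,
                              height, cudaMemcpyDeviceToDevice);
    if (err != cudaSuccess) return err;

    return cudaBindTextureToArray(depthTex2D, *array_out, desc);
}

// Small self-check: upload a ramp, read it back through the texture, compare.
bool roundTrip2D(int width, int height)
{
    std::vector<float> host(width * height);
    for (size_t i = 0; i < host.size(); ++i) host[i] = (float)i;

    size_t bytes = host.size() * sizeof(float);
    float* src_d = NULL;
    float* out_d = NULL;
    cudaArray* array = NULL;
    bool ok = cudaMalloc((void**)&src_d, bytes) == cudaSuccess
           && cudaMalloc((void**)&out_d, bytes) == cudaSuccess
           && cudaMemcpy(src_d, &host[0], bytes, cudaMemcpyHostToDevice) == cudaSuccess
           && uploadDepth2D(&array, src_d, width, height) == cudaSuccess;
    if (ok) {
        dim3 block(16, 16);
        dim3 grid((width + 15) / 16, (height + 15) / 16);
        readBack2D<<<grid, block>>>(out_d, width, height);

        std::vector<float> result(host.size());
        ok = cudaMemcpy(&result[0], out_d, bytes, cudaMemcpyDeviceToHost) == cudaSuccess
          && result == host;
        cudaUnbindTexture(depthTex2D);
    }
    if (array) cudaFreeArray(array);
    cudaFree(src_d);
    cudaFree(out_d);
    return ok;
}
```

In a per-frame loop the array would be allocated once and only the `cudaMemcpy2DToArray` call repeated after each kernel.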

If you don’t need filtering and normalized coordinates, use a 1D texture bound to linear memory (allocated by cudaMalloc). Bind it directly to your depthBufferGlobal_d; no copying is needed.
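Here is a minimal sketch of that linear-memory binding, again with the legacy texture reference API used in this thread (removed in CUDA 12, so an older toolkit is assumed). `copyViaTexture` and `roundTripLinear` are hypothetical names for illustration. As far as I recall, the programming guide for this hardware generation allows a far longer 1D texture on linear memory than on a CUDA array, but please check the limits for your device.

```cuda
#include <vector>
#include <cuda_runtime.h>

// 1D texture reference, declared at file scope as in the original post.
texture<float, 1, cudaReadModeElementType> depthBufferTex_d;

// Reads the buffer through the texture. Textures bound to linear memory are
// fetched with tex1Dfetch and an integer index (no filtering).
__global__ void copyViaTexture(float* out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        out[i] = tex1Dfetch(depthBufferTex_d, i);
    }
}

// Self-check: bind the texture straight to a cudaMalloc'd buffer, read it
// back through the texture, and compare with the data that was uploaded.
bool roundTripLinear(int n)
{
    std::vector<float> host(n);
    for (int i = 0; i < n; ++i) host[i] = (float)i;

    size_t bytes = n * sizeof(float);
    float* depthBufferGlobal_d = NULL;
    float* out_d = NULL;
    bool ok = cudaMalloc((void**)&depthBufferGlobal_d, bytes) == cudaSuccess
           && cudaMalloc((void**)&out_d, bytes) == cudaSuccess
           && cudaMemcpy(depthBufferGlobal_d, &host[0], bytes,
                         cudaMemcpyHostToDevice) == cudaSuccess;

    // Bind once; later kernels see whatever earlier kernels wrote to the
    // buffer, so no copy and no rebind is needed between launches.
    size_t offset = 0;
    ok = ok && cudaBindTexture(&offset, depthBufferTex_d,
                               depthBufferGlobal_d, bytes) == cudaSuccess;
    if (ok) {
        copyViaTexture<<<(n + 255) / 256, 256>>>(out_d, n);

        std::vector<float> result(n);
        ok = cudaMemcpy(&result[0], out_d, bytes,
                        cudaMemcpyDeviceToHost) == cudaSuccess
          && result == host;
        cudaUnbindTexture(depthBufferTex_d);
    }
    cudaFree(depthBufferGlobal_d);
    cudaFree(out_d);
    return ok;
}
```

One thing to keep in mind: the texture cache is not kept coherent with global memory writes made during the same kernel launch, so a kernel should not read through the texture the same locations it is writing. Writing in one launch and reading in the next is fine.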

I also suspect the 2D texture size is limited somehow, but I don’t see it in the reference right now…