problems with surf2DLayeredread/surf2DLayeredwrite

Hi,

I’m not able to get layered 2D surfaces to work. The only thing I could find was one sample in the CUDA SDK that reads from a surface, and not a single example on the web…

Some explanation of what I’m trying to do: each pixel represents a component of the velocity vector of an associated moving object. Since I only need the velocity data within my kernel (I only need the resulting position data for my GLSL vertex shader later on), I need some sort of read/write texture there, so I’m trying to use surfaces. If any of you have recommendations on how to do this differently, please don’t hesitate to tell me; I’m new to CUDA.

I’m initializing like this:

//global:
surface<void, cudaSurfaceType2DLayered> velocitySurface;

#define BUFFER_WIDTH		512
#define BUFFER_SIZE		( BUFFER_WIDTH * BUFFER_WIDTH )
#define VELOCITY_COMPONENTS	3

//....

// 512 x 512 layered array with 3 layers (one float layer per velocity component)
desc = cudaCreateChannelDesc<float>();
extent = make_cudaExtent( BUFFER_WIDTH, BUFFER_WIDTH, 3 );
handleCudaError( cudaMalloc3DArray( &( this->velocityArray ), &desc, extent, cudaArrayLayered | cudaArraySurfaceLoadStore ) );

// zero-initialize the array via a host staging buffer
float *initBuffer = new float[BUFFER_SIZE * VELOCITY_COMPONENTS];
std::fill( initBuffer, initBuffer + BUFFER_SIZE * VELOCITY_COMPONENTS, 0.0f );

params.dstArray = this->velocityArray;
params.srcPos = make_cudaPos( 0, 0, 0 );
params.dstPos = make_cudaPos( 0, 0, 0 );
params.extent = extent;
params.kind = cudaMemcpyHostToDevice;
params.srcPtr = make_cudaPitchedPtr( (void*)( initBuffer ), BUFFER_WIDTH * sizeof( float ), BUFFER_WIDTH, BUFFER_WIDTH );
handleCudaError( cudaMemcpy3D( &params ) );

handleCudaError( cudaBindSurfaceToArray( &velocitySurface, this->velocityArray, &desc ) );

delete [] initBuffer;
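For completeness, the variable declarations hidden behind the //.... above are essentially these (where exactly they are declared differs a bit in my real code):

cudaChannelFormatDesc desc;
cudaExtent extent;
cudaMemcpy3DParms params = { 0 };	// zero-initialized, then only the fields shown above are filled in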

In my kernel I have this code:

//...
int x = threadIdx.x + blockIdx.x * blockDim.x;
int y = threadIdx.y + blockIdx.y * blockDim.y;
int offset = x + y * blockDim.x * gridDim.x;

// read the velocity components from the three layers
float3 vel = make_float3(
	surf2DLayeredread<float>( velocitySurface, x, y, 0, cudaBoundaryModeClamp ),
	surf2DLayeredread<float>( velocitySurface, x, y, 1, cudaBoundaryModeClamp ),
	surf2DLayeredread<float>( velocitySurface, x, y, 2, cudaBoundaryModeClamp ) );

//modifying velocity vector
//....

// write the modified components back
surf2DLayeredwrite( vel.x, velocitySurface, x, y, 0 );
surf2DLayeredwrite( vel.y, velocitySurface, x, y, 1 );
surf2DLayeredwrite( vel.z, velocitySurface, x, y, 2 );

// integrate the positions
posX[offset] += dt * vel.x;
posY[offset] += dt * vel.y;
posZ[offset] += dt * vel.z;
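For context, the kernel signature and launch look essentially like this (kernel name, pointer names and block size are just placeholders here, not my exact code):

__global__ void updateKernel( float *posX, float *posY, float *posZ, float dt );

// launched over the full 512 x 512 buffer
dim3 block( 16, 16 );
dim3 grid( BUFFER_WIDTH / block.x, BUFFER_WIDTH / block.y );
updateKernel<<< grid, block >>>( devPosX, devPosY, devPosZ, dt );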

I’m using void instead of float as the type parameter because that’s how the simpleSurfaceWrite sample does it, and I could not get it to compile with float: the compiler then complains that it cannot find appropriate overloads of surf2DLayeredread and surf2DLayeredwrite. Why is that?
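For reference, this is the kind of declaration I tried first; with it, the surf2DLayeredread/write calls above are the lines that produce the "no appropriate overload" errors:

surface<float, cudaSurfaceType2DLayered> velocitySurface;	// float instead of void: the read/write calls no longer match any overload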

Okay, like this it compiles and runs. The problem I have with this code, however, is that the values read from the surface do not seem to be correct, since the resulting positions are wrong. Also, when I quit my application, a cudaFree call on a completely unrelated buffer fails with error #001E, “unknown error”. If I remove the surf2DLayeredread/write calls, both problems disappear, so there really seems to be something wrong with this code.
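In case it matters, handleCudaError is just a thin error-checking wrapper, roughly like this:

#include <cstdio>
#include <cstdlib>

void handleCudaError( cudaError_t err )
{
	// roughly what my real version does: print the error string and bail out
	if ( err != cudaSuccess )
	{
		fprintf( stderr, "CUDA error: %s\n", cudaGetErrorString( err ) );
		exit( EXIT_FAILURE );
	}
}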

Please help, thanks!

-iko