cudaBindTexture2D Problem? Can't find the reason for this behavior

Hi all,

sorry that I posted this message accidentally before I was finished!

I have a problem with cudaBindTexture2D. I am trying to use CUDA for image processing, and texture access on the image seems like a good way to do it, but unfortunately I am running into a problem: I don't get any error, yet it doesn't work :-(.

I allocate memory for the image on the device (320 x 240 * sizeof(char)); the image is grayscale. This is the code that allocates the memory on the GPU:

	// malloc memory on gpu
	CUDA_SAFE_CALL(cudaMalloc((void**) &d_data1, mem_size));
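
(For context: d_data1 and mem_size are declared elsewhere and not shown in the post; based on the 320 x 240 grayscale image described above, they presumably look roughly like this.)

	char* d_data1 = NULL;                       // one byte per grayscale pixel
	size_t mem_size = 320 * 240 * sizeof(char); // 320 x 240 image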

Now I'm trying to bind this memory to a texture. The (test) kernel seems to stop working after I bind the texture (it only sets all pixels to black == 0). The kernel works again when I comment out the texture binding.

Where is the error? Might the memory pitch in the call

// cudaBindTexture2D(size_t *offset, const struct textureReference *tex, const void *devPtr, const struct cudaChannelFormatDesc *desc, size_t width, size_t height, size_t pitch);

have something to do with it?

Thank you in advance!!!

#include <cutil_inline.h>
#include <cv.h>		// IplImage (OpenCV)

texture<uchar, 2, cudaReadModeElementType> tex;

__global__ void kernel(uchar* in, uchar* out, int width)
{
	unsigned int x = blockIdx.x*blockDim.x + threadIdx.x;
	unsigned int y = blockIdx.y*blockDim.y + threadIdx.y;

	out[y*width + x] = 0; // tex2D(tex, x, y) * 0.1;
}

extern "C" void
cudaStartKernel(char* in, char* out, IplImage* image_info)
{
	cudaChannelFormatDesc channelDesc;

	if (image_info->nChannels == 1) {
		channelDesc = cudaCreateChannelDesc<unsigned char>();
	} else {
		channelDesc = cudaCreateChannelDesc<uchar4>();
	}

	// set texture parameters
	tex.addressMode[0] = cudaAddressModeClamp;
	tex.addressMode[1] = cudaAddressModeClamp;
	tex.filterMode = cudaFilterModePoint;
	tex.normalized = false;	// access with unnormalized texture coordinates

	const textureReference* texRefPtr = NULL;
	cudaGetTextureReference(&texRefPtr, "tex");
	CUDA_SAFE_CALL(cudaBindTexture2D(0, texRefPtr, (void*)in, &channelDesc, image_info->width, image_info->height, 1));
	// cudaBindTexture2D(size_t *offset, const struct textureReference *tex, const void *devPtr, const struct cudaChannelFormatDesc *desc, size_t width, size_t height, size_t pitch);

	dim3 dimBlock(8, 8, 1);
	dim3 dimGrid(image_info->width / dimBlock.x, image_info->height / dimBlock.y, 1);

	kernel<<< dimGrid, dimBlock >>>((uchar*)in, (uchar*)out, image_info->width);

	CUDA_SAFE_CALL(cudaThreadSynchronize());
	CUDA_SAFE_CALL(cudaUnbindTexture(tex));
}

I now use cudaMallocPitch((void**) &d_data1, &pitch, input->image()->widthStep, input->image()->height) instead of cudaMalloc, and now it works.
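
For reference, here is a minimal sketch of that working setup (the helper name uploadAndBind and the host buffer h_data are my own assumptions, not from the original code). The key point is passing the pitch that cudaMallocPitch returns, in bytes, to cudaBindTexture2D instead of a hard-coded value:

#include <cuda_runtime.h>

texture<unsigned char, 2, cudaReadModeElementType> tex; // same declaration as above

// Allocate a pitched device buffer, copy the host image into it row by row,
// and bind it to the 2D texture with the pitch chosen by the runtime.
unsigned char* uploadAndBind(const unsigned char* h_data, int width, int height,
                             size_t* pitch_out)
{
	unsigned char* d_data = NULL;
	size_t pitch = 0;

	// Let the runtime pick a properly aligned row pitch (>= width bytes).
	cudaMallocPitch((void**)&d_data, &pitch, width * sizeof(unsigned char), height);

	// Copy the tightly packed host rows into the pitched device rows.
	cudaMemcpy2D(d_data, pitch, h_data, width * sizeof(unsigned char),
	             width * sizeof(unsigned char), height, cudaMemcpyHostToDevice);

	// Bind with the real pitch in bytes, not 1.
	cudaChannelFormatDesc desc = cudaCreateChannelDesc<unsigned char>();
	cudaBindTexture2D(0, tex, d_data, desc, width, height, pitch);

	*pitch_out = pitch;
	return d_data;
}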

Sorry if I'm off-topic, but I have to port an OpenCV algorithm (one that unwarps an omnidirectional camera image onto a cylinder) to CUDA, and looking at your code I see that you pass an IplImage pointer to the kernel. Is that possible?

EDIT: Sorry, I misunderstood your code; you just pass the image parameters to the kernel, not the IplImage itself.
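
(For anyone reading later, a minimal sketch of that pattern, under my own naming assumptions such as copyKernel and launchCopy: the IplImage stays on the host, its pixel data is copied to device memory, and only plain values and device pointers are passed to the kernel.)

#include <cv.h>			// IplImage (OpenCV)
#include <cuda_runtime.h>

// The kernel only ever sees device pointers and plain integers.
__global__ void copyKernel(const unsigned char* in, unsigned char* out, int step)
{
	unsigned int x = blockIdx.x * blockDim.x + threadIdx.x;
	unsigned int y = blockIdx.y * blockDim.y + threadIdx.y;
	out[y * step + x] = in[y * step + x];
}

// Host side: unpack the IplImage fields, upload the pixel data, launch.
void launchCopy(const IplImage* image)
{
	int width  = image->width;
	int height = image->height;
	int step   = image->widthStep;		// row stride in bytes (8-bit, 1 channel)
	size_t bytes = (size_t)step * height;

	unsigned char *d_in = NULL, *d_out = NULL;
	cudaMalloc((void**)&d_in,  bytes);
	cudaMalloc((void**)&d_out, bytes);
	cudaMemcpy(d_in, image->imageData, bytes, cudaMemcpyHostToDevice);

	dim3 block(8, 8, 1);
	dim3 grid(width / block.x, height / block.y, 1);
	copyKernel<<<grid, block>>>(d_in, d_out, step);
	cudaThreadSynchronize();

	cudaFree(d_in);
	cudaFree(d_out);
}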