1D Texture Experiment ; why is texture not bound?

after running the following code (which initializes a 1d array of size 16 floats to the threadid, then copies that bcak to host, then to a cuda array which is then bound to a texture, which is then called in a kernel to double the vlaues by thread), I get the following error message;

“cannot fetch from a texture that is not bound”

So even though the code contains the statement

cudaBindTextureToArray(tex_rho,cu_rho,channelDesc);

which is made after the new rho on the host ahas been copied to the cuda array, why is the texture tex_rho not bound to the cuda array cu_rho which should contain the initialized array?

#include <math.h>

#include <stdio.h>

#include <cutil.h>

#define NUMTHREADS 16

#define NUMBLOCKS 1

#define NTOTAL NUMBLOCKS*NUMTHREADS

__global__ void sumdensity(float* d_rho);

__global__ void kernel1(float* d_rho);

void allocateArray(void **devPtr,size_t size);

void freeArray(void *devPtr);

texture<float, 1, cudaReadModeElementType> tex_rho;

FILE* outfile;

int main(int argc, char** argv)

{

	int  i;

	//host variables

	float	h_rho1[NTOTAL];

	float	h_rho2[NTOTAL];

	int  size_rho;

	

	//device variables

	float*	d_rho1;

	float*	d_rho2;

	//CUDA arrays

	cudaArray*	cu_rho;

	cudaSetDevice(1);

	CUT_DEVICE_INIT(argc, argv);

	cudaChannelFormatDesc channelDesc = cudaCreateChannelDesc(64, 0, 0, 0, cudaChannelFormatKindFloat);	

    CUDA_SAFE_CALL( cudaMallocArray( &cu_rho, &channelDesc, NTOTAL, 1 )); 

    

	size_rho = NTOTAL*sizeof(float);

	allocateArray((void**)&d_rho1,size_rho);

	allocateArray((void**)&d_rho2,size_rho);

	//set up texture for new rho

	sumdensity<<<NUMBLOCKS,NUMTHREADS>>>(d_rho1);    //write new rho into d_rho

	cudaMemcpy(h_rho1,d_rho1,size_rho,cudaMemcpyDeviceToHost);	//get d_rho back from device into h_rho

	CUDA_SAFE_CALL( cudaMemcpyToArray( cu_rho, 0, 0, h_rho1, size_rho, cudaMemcpyHostToDevice));	//write h_rho into CUDA array cu_rho

	// set texture parameters?

    	

	printf("\nstart binding...");

	cudaBindTextureToArray(tex_rho,cu_rho,channelDesc);	//bind cu_rho to tex_rho, kernel1 should now refer to tex_rho, not rho

	//end set up of new rho as texture

	printf("\nend binding...");

	kernel1<<<NUMBLOCKS,NUMTHREADS>>>(d_rho2);

	CUT_CHECK_ERROR("kernel error");

	printf("\n\n%s\n", cudaGetErrorString(cudaGetLastError()));

	cudaMemcpy(h_rho2,d_rho2,size_rho,cudaMemcpyDeviceToHost);

	for(i=0; i<NTOTAL; i++) printf("\n%i\t%.5f",i,h_rho2[i]);

	freeArray(d_rho1);

	freeArray(d_rho2);

	cudaFreeArray(cu_rho);

	CUT_EXIT(argc, argv);

	return 0;

}

/////////////////////

__global__ void sumdensity(float* d_rho)

{

	int	i = blockDim.x * blockIdx.x + threadIdx.x;

	d_rho[i] = i;

}

///////////////////

__global__ void kernel1(float* d_rho)

{

	int	i = blockDim.x * blockIdx.x + threadIdx.x;

	

	float	rho = tex1D(tex_rho,i);

	rho = 2*rho;

	d_rho[i] = rho;

	//fprintf(outfile,"\n%i\t%.5f",i,rho);

}

///////////////////////////////

void allocateArray(void **devPtr,size_t size)

{

	cudaMalloc(devPtr,size);

}

//////////////////////////////

void freeArray(void *devPtr)

{

	cudaFree(devPtr);

}

//////////////////////////////

well, maybe put a CUDA_SAFE_CALL around your binding call ??

Any reason for using wrappers around cudaMalloc and cudaFree?

Also you’re not setting texture reference properties. Maybe that’s the problem.

Yeah, CudaMalloc and free might fail as fas as I know, so it is always smart.

Personally, whenever I left these kinds of checks out, I got bitten by it in the end, resulting in long debugging-sessions and in the end hitting myself in the head…

After adding a few GetLastError statements the trouble starts with cudaMallocArray which is reported to have an invalid channel descriptor, so the cuda array cu_rho isn’t declared fully.

I haven’t a clue what the channel descriptor does so I’m gonna read up on that now.

I’ve just tried to run the simpleTexture example that comes with Cuda 2.0 and that failed with the cudaMallocArray.

with
channelDesc = cudaCreateChannelDesc(32, 0, 0, 0, cudaChannelFormatKindFloat);

the error is “feature is not yet impemented”,

and with
channelDesc = cudaCreateChannelDesc(64, 0, 0, 0, cudaChannelFormatKindFloat);

the error is invalid channel descriptor.

I’ve had a look at the documentation and there is bot that much on channel descriptors, in the PG or reference.

My system is 64 bit SUSE enterprise 10.1

I’ve just tried to run the code supplied by MrAnderson42 in the thread http://forums.nvidia.com/index.php?showtopic=51690 which looked at tex1D, and guess what? I get the error message “feature is not yet implemented” for the channel descriptor!

What is going on/wrong? What do I need to get these very simple texture examples working?

Wrong/old driver?

that’s what I’m starting to think.

Yep. Looks like driver is wrong. Unfortunately I don’t have admin rights on the Tesla I use. So its either ask admin to change back from 2.0 to 1.1 or to install the latest driver for Linux64/SUSE E10.

See http://forums.nvidia.com/lofiversion/index.php?t68676.html