Problem with texture memory

Floow · February 29, 2012, 10:47am

Hi all,

I am trying to use texture memory in a simple example, but I am running into trouble.

Here is my code:

#include <stdio.h>
#include <stdlib.h>

// Texture reference: defined at file scope, in the same .cu file as the
// kernel that reads it and the host code that binds it.
texture<float, cudaTextureType1D, cudaReadModeElementType> tex;

/** Kernel function **/
__global__ void KernelTest(float* tab1_D, float* tab2_D)
{
	// Thread index
	int idx = threadIdx.x;

	// Global memory
	tab2_D[idx] = tab1_D[idx];

	// Reading texture memory
	//tab2_D[idx] = tex1Dfetch(tex, idx);
}

void test(){

	/** Initialisation **/
	int N = 128;
	float* tab1_H;
	float* tab1_D;
	float* tab2_H;
	float* tab2_D;
	cudaError_t erreur;

	tab1_H = (float*) malloc( N*sizeof(*(tab1_H)));
	if( tab1_H == NULL ){
		printf("ERROR: malloc of tab1_H failed\n");
		exit(1);
	}

	tab2_H = (float*) malloc( N*sizeof(*(tab2_H)));
	if( tab2_H == NULL ){
		printf("ERROR: malloc of tab2_H failed\n");
		exit(1);
	}

	erreur = cudaMalloc((void**)&(tab1_D), N*sizeof(*(tab1_D)));
	if( erreur != cudaSuccess ){
		printf("ERROR: cudaMalloc of tab1_D failed\n");
		exit(1);
	}

	erreur = cudaMalloc((void**)&(tab2_D), N*sizeof(*(tab2_D)));
	if( erreur != cudaSuccess ){
		printf("ERROR: cudaMalloc of tab2_D failed\n");
		exit(1);
	}

	for( int i=0; i<N; i++ )
		tab1_H[i] = (float)i;

	/** Sending the array to the device **/
	erreur = cudaMemcpy(tab1_D, tab1_H, N*sizeof(*(tab1_D)), cudaMemcpyHostToDevice);
	if( erreur != cudaSuccess ){
		printf( "ERROR: copy to tab1_D failed\n");
		printf( "Cause: %s\n",cudaGetErrorString(erreur) );
		exit(1);
	}

	/** Initialisation of the texture memory **/
	erreur = cudaBindTexture(NULL, tex, tab1_D, N*sizeof(float));
	if( erreur != cudaSuccess ){
		printf( "ERROR: cudaBindTexture failed in test\n");
		printf( "Cause: %s\n",cudaGetErrorString(erreur) );
		exit(1);
	}

	dim3 blockSize(N,1);
	KernelTest<<<1, blockSize>>>(tab1_D, tab2_D );

	// A failed launch is otherwise silent, so check it explicitly
	erreur = cudaGetLastError();
	if( erreur != cudaSuccess ){
		printf( "ERROR: launch of KernelTest failed\n");
		printf( "Cause: %s\n",cudaGetErrorString(erreur) );
		exit(1);
	}

	erreur = cudaMemcpy(tab2_H, tab2_D, N*sizeof(*(tab2_D)), cudaMemcpyDeviceToHost);
	if( erreur != cudaSuccess ){
		printf( "ERROR: copy from tab2_D failed\n");
		printf( "Cause: %s\n",cudaGetErrorString(erreur) );
		exit(1);
	}

	/** Final result **/
	for( int i=0; i<N; i++ ){
		printf("tab1_H[%d] = %f\n",i,tab1_H[i]) ;
		printf("tab2_H[%d] = %f\n",i,tab2_H[i]) ;
	}

	cudaUnbindTexture(tex);
	cudaFree(tab1_D);
	cudaFree(tab2_D);
	free(tab1_H);
	free(tab2_H);
}

int main(){
	test();
	return 0;
}

As a result, all the values of tab2_D are equal to zero.

EDIT: They were. Now it works. Magic??

Do you have any idea how to correct my code?

Thanks.

tera · February 29, 2012, 11:21am

I can’t see anything wrong with the code you give, but there are a few parts missing to make an executable program.
How is tex defined? How are you combining host.cu and device.cu into one file?

Floow · February 29, 2012, 1:01pm

I am sorry for the incomplete previous post. I have changed it. I created a single new file with only the relevant code and now it works. I don’t know why. It is still incomprehensible to me.

tera · February 29, 2012, 1:27pm

Texture declarations in different compilation units refer to different objects even if they have the same name.

Floow · March 1, 2012, 3:06pm

Ok and thanks. I will be careful with it.

In my first tests, reading the data from plain global memory (that is, memory that is only allocated with cudaMalloc) seems faster. I thought texture memory was the fastest option for read-only data. Is that always true?

To find the fastest way, I comment out either the global memory read or the texture read in the kernel and then measure the execution time by running “time ./MyProgramme”. Is that the correct way to do it?
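
Editor's note: timing the whole program with "time" also counts CUDA start-up, the allocations, the copies and the printf loop, which for a 128-element kernel are likely to dominate the measurement. A common alternative is to time only the kernel with CUDA events. The following is a minimal sketch under stated assumptions, not the poster's code: copyKernel and timeCopyKernelMs are hypothetical names, the copy kernel is a stand-in for whichever variant (global or texture) is being measured, and the array size and repeat count are arbitrary.

```cuda
#include <cuda_runtime.h>

// Hypothetical stand-in for the kernel under test: a plain global-memory copy.
__global__ void copyKernel(const float* in, float* out, int n)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < n)
        out[idx] = in[idx];
}

// Returns the average time of one launch in milliseconds, or -1.0f on error.
float timeCopyKernelMs(int n, int repeats)
{
    float *in = NULL, *out = NULL;
    if (cudaMalloc((void**)&in, n * sizeof(float)) != cudaSuccess)
        return -1.0f;
    if (cudaMalloc((void**)&out, n * sizeof(float)) != cudaSuccess) {
        cudaFree(in);
        return -1.0f;
    }
    cudaMemset(in, 0, n * sizeof(float));

    int threads = 128;
    int blocks = (n + threads - 1) / threads;

    // Warm-up launch so that one-time start-up costs are not measured.
    copyKernel<<<blocks, threads>>>(in, out, n);
    cudaDeviceSynchronize();

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    // Only the work recorded between the two events is timed.
    cudaEventRecord(start, 0);
    for (int r = 0; r < repeats; r++)
        copyKernel<<<blocks, threads>>>(in, out, n);
    cudaEventRecord(stop, 0);
    cudaEventSynchronize(stop);   // wait until the last launch has finished

    float totalMs = 0.0f;
    cudaEventElapsedTime(&totalMs, start, stop);
    bool ok = (cudaGetLastError() == cudaSuccess);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(in);
    cudaFree(out);
    return ok ? totalMs / repeats : -1.0f;
}
```

Calling timeCopyKernelMs(1 << 20, 100) once per kernel variant gives per-launch times that can be compared directly. With only 128 elements, any difference between the two read paths is probably too small to measure reliably, so a larger array is assumed here.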

pasoleatis · March 1, 2012, 5:21pm

The L1 and L2 caches are faster than the texture path, so it will depend on the problem. For some random access patterns, textures might give a speedup.

Floow · March 2, 2012, 2:38pm

I am running some tests to find the fastest way. Thanks, pasoleatis, for your remarks.

But in my tests, all the values read from tex are equal to zero. In fact, I am working on a big project and I need to use the same texture in two different files: the one where I call cudaBindTexture() (host.cu, for host functions) and the one where I read the texture (device.cu, for device functions). So I declared the texture in a file common.h as follows:

extern texture<float, cudaTextureType1D, cudaReadModeElementType> tex;

and I included common.h in both host.cu and device.cu.

I guess the problem comes from the compilation units that tera mentioned, but I don’t know how to solve it. Do you have any suggestions that may help me?

I hope that is clear enough; please ask if you need more explanation.

tera · March 2, 2012, 3:19pm

There is no way around it - they just have to be in the same compilation unit.

Floow · March 2, 2012, 3:54pm

I read a little on the internet about compilation units, but I don’t know exactly how they work. So, in other words, the texture declaration and the code that uses it have to be in the same file, at least.

Thanks again for your answer
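
Editor's note: including common.h in both host.cu and device.cu does not help, because each .cu file is compiled together with its includes as a separate unit and ends up with its own copy of tex. For the texture reference API used in this thread, tera's answer stands. As an assumption that goes beyond the thread: CUDA 5.0, released after this discussion, added texture objects (for devices of compute capability 3.0 and later), which are created at run time and passed to kernels as ordinary arguments, so the host code and the kernel no longer need to share a file-scope variable. Texture references have since been removed from recent toolkits. The sketch below illustrates that approach; makeLinearTexture, copyFromTexture and copyThroughTexture are hypothetical names, not part of the poster's project.

```cuda
#include <cstring>
#include <cuda_runtime.h>

// Kernel: the texture object arrives as a normal argument, so this kernel
// could live in a different .cu file from the host code that creates it.
__global__ void copyFromTexture(cudaTextureObject_t texObj, float* out, int n)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < n)
        out[idx] = tex1Dfetch<float>(texObj, idx);
}

// Hypothetical helper: wraps n floats of linear device memory in a texture object.
cudaError_t makeLinearTexture(cudaTextureObject_t* texObj, float* devPtr, int n)
{
    cudaResourceDesc resDesc;
    memset(&resDesc, 0, sizeof(resDesc));
    resDesc.resType = cudaResourceTypeLinear;
    resDesc.res.linear.devPtr = devPtr;
    resDesc.res.linear.desc = cudaCreateChannelDesc<float>();
    resDesc.res.linear.sizeInBytes = n * sizeof(float);

    cudaTextureDesc texDesc;
    memset(&texDesc, 0, sizeof(texDesc));
    texDesc.readMode = cudaReadModeElementType;

    return cudaCreateTextureObject(texObj, &resDesc, &texDesc, NULL);
}

// Copies the host array in to the host array out through a texture fetch.
// Returns true if every CUDA call succeeded.
bool copyThroughTexture(const float* in, float* out, int n)
{
    size_t bytes = n * sizeof(float);
    float *in_D = NULL, *out_D = NULL;
    cudaTextureObject_t texObj = 0;
    bool haveTex = false;

    bool ok = cudaMalloc((void**)&in_D, bytes) == cudaSuccess
           && cudaMalloc((void**)&out_D, bytes) == cudaSuccess
           && cudaMemcpy(in_D, in, bytes, cudaMemcpyHostToDevice) == cudaSuccess
           && (haveTex = (makeLinearTexture(&texObj, in_D, n) == cudaSuccess));

    if (ok) {
        int threads = 128;
        int blocks = (n + threads - 1) / threads;
        copyFromTexture<<<blocks, threads>>>(texObj, out_D, n);
        ok = cudaGetLastError() == cudaSuccess
          && cudaMemcpy(out, out_D, bytes, cudaMemcpyDeviceToHost) == cudaSuccess;
    }

    if (haveTex)
        cudaDestroyTextureObject(texObj);
    cudaFree(in_D);    // cudaFree(NULL) is a no-op
    cudaFree(out_D);
    return ok;
}
```

In a layout like the one described above, makeLinearTexture would sit with the host functions and copyFromTexture with the device functions, with only the cudaTextureObject_t handle passed between them. If the project has to stay on texture references, the layout tera describes (declaration, bind call and kernel in one .cu file) remains the way to go.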
