Texture fetches no faster than global memory: is the texture cache working?

Hi, I use a texture of 1K ints (4 KB) and expected the 8 KB texture cache to help, but I see no performance difference at all compared with using global memory. I access the texture completely at random. Why isn’t the cache helping? Thanks!

//main.cu:

texture< AttrType, 1, cudaReadModeElementType> tex;

..

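// Clamp out-of-range coordinates, use point sampling and unnormalized coordinates.
tex.addressMode[0] = cudaAddressModeClamp;

tex.addressMode[1] = cudaAddressModeClamp;

tex.filterMode = cudaFilterModePoint;

tex.normalized = 0;

// One 32-bit unsigned channel per texel.
cudaChannelFormatDesc desc = cudaCreateChannelDesc(32, 0, 0, 0, cudaChannelFormatKindUnsigned);

// Bind 1024 ints of linear device memory (d_) to the texture.
CUDA_SAFE_CALL(cudaBindTexture("tex", d_, &desc, sizeof(int) * 1024, 0));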

//kernel.cu:

extern texture< AttrType, 1, cudaReadModeElementType> tex;

...

res = texfetch(tex, offset);

Does your access pattern have any 2D locality? Do you use cudaMallocArray?

Mark

Thanks, no and no. My texture is 1D, and this is how I allocate the “d_” from my original post:

CUDA_SAFE_CALL(cudaMalloc((void**) &d_, sizeof…));

CUDA_SAFE_CALL(cudaMemcpy(d_, h_, sizeof…, cudaMemcpyHostToDevice));

My question is: since the texture is smaller than 8 KB, shouldn’t it fit entirely in the cache, no matter what type it is (1D/2D, linear/array)? Yet apparently my texture is not being cached.

Hi, I think you need cudaMallocArray for this. See this comment by John, which includes a code snippet for allocating and using textures:

http://forums.nvidia.com/index.php?showtop…ndpost&p=170507
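Since the link above is truncated, here is a minimal sketch of a texture backed by a CUDA array. It is not John's snippet from that post. It assumes the texture object API (`cudaCreateTextureObject`, `tex1D<int>`) from later CUDA releases rather than the `texture<>` reference and `texfetch` used in this thread, and the names `CHECK`, `gather`, and `runArrayTextureDemo` are made up for illustration. It only shows the mechanics and makes no claim about performance.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper: report a CUDA error and return false from the caller.
#define CHECK(call)                                                    \
    do {                                                               \
        cudaError_t e_ = (call);                                       \
        if (e_ != cudaSuccess) {                                       \
            std::fprintf(stderr, "%s failed: %s\n", #call,             \
                         cudaGetErrorString(e_));                      \
            return false;                                              \
        }                                                              \
    } while (0)

// Each thread fetches one element at a scattered index through the texture path.
__global__ void gather(cudaTextureObject_t tex, const int* idx, int* out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        // Unnormalized coordinates with point sampling; +0.5f targets the texel center.
        out[i] = tex1D<int>(tex, idx[i] + 0.5f);
    }
}

// Returns true if every fetched value matches the host reference.
bool runArrayTextureDemo()
{
    const int n = 1024;                        // 1K ints = 4 KB, as in the question
    std::vector<int> h(n), hidx(n), hout(n);
    for (int i = 0; i < n; ++i) {
        h[i] = 3 * i + 1;
        hidx[i] = (i * 389) % n;               // scattered permutation of 0..n-1
    }

    // Allocate a 1D CUDA array (height 0) and upload the data as a single row.
    cudaChannelFormatDesc desc = cudaCreateChannelDesc<int>();
    cudaArray_t arr = nullptr;
    CHECK(cudaMallocArray(&arr, &desc, n, 0));
    CHECK(cudaMemcpy2DToArray(arr, 0, 0, h.data(), n * sizeof(int),
                              n * sizeof(int), 1, cudaMemcpyHostToDevice));

    // Describe the resource and the sampling state, then create the texture object.
    cudaResourceDesc res = {};
    res.resType = cudaResourceTypeArray;
    res.res.array.array = arr;
    cudaTextureDesc td = {};
    td.addressMode[0] = cudaAddressModeClamp;
    td.filterMode = cudaFilterModePoint;
    td.readMode = cudaReadModeElementType;
    td.normalizedCoords = 0;
    cudaTextureObject_t tex = 0;
    CHECK(cudaCreateTextureObject(&tex, &res, &td, nullptr));

    int *dIdx = nullptr, *dOut = nullptr;
    CHECK(cudaMalloc(&dIdx, n * sizeof(int)));
    CHECK(cudaMalloc(&dOut, n * sizeof(int)));
    CHECK(cudaMemcpy(dIdx, hidx.data(), n * sizeof(int), cudaMemcpyHostToDevice));

    gather<<<(n + 255) / 256, 256>>>(tex, dIdx, dOut, n);
    CHECK(cudaGetLastError());
    CHECK(cudaMemcpy(hout.data(), dOut, n * sizeof(int), cudaMemcpyDeviceToHost));

    bool ok = true;
    for (int i = 0; i < n; ++i) ok = ok && (hout[i] == h[hidx[i]]);

    CHECK(cudaDestroyTextureObject(tex));
    CHECK(cudaFreeArray(arr));
    CHECK(cudaFree(dIdx));
    CHECK(cudaFree(dOut));
    return ok;
}

int main()
{
    std::printf("%s\n", runArrayTextureDemo() ? "match" : "mismatch or CUDA error");
    return 0;
}
```

To compare against plain global memory, the same `gather` kernel could read from a `const int*` instead of the texture and both versions could be timed with CUDA events. That comparison is left out here because the result depends on the hardware.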

Thanks, I found Mark Harris’s words a few posts below the link you gave:

“Note also that unlike CUDA Array texture references, linear memory texture references will not benefit as well from the texture cache as CUDA Arrays due to memory layout.”

So why would anyone use cudaMalloc instead of a CUDA array, then?

Also, cudaMallocArray(&cu_Array, &desc, size, 1); returns an error when size is 512K.

What is the size limit? Thanks!

(I thought it would automatically wrap for me.)
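Nothing wraps a too-wide 1D array automatically, but one common workaround (not something proposed in this thread) is to fold the flat index into a 2D CUDA array by hand. The sketch below assumes the texture object API from later CUDA releases and an arbitrary row width of 1024 texels. `kRowWidth`, `foldX`, `foldY`, `gatherFolded`, `runFoldedTextureDemo`, and `CHECK` are illustrative names, not library functions.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper: report a CUDA error and return false from the caller.
#define CHECK(call)                                                    \
    do {                                                               \
        cudaError_t e_ = (call);                                       \
        if (e_ != cudaSuccess) {                                       \
            std::fprintf(stderr, "%s failed: %s\n", #call,             \
                         cudaGetErrorString(e_));                      \
            return false;                                              \
        }                                                              \
    } while (0)

// Assumed layout: rows of 1024 texels (an arbitrary choice for this sketch).
constexpr int kRowWidth = 1024;

// Map a flat element index to (column, row) in the folded 2D layout.
__host__ __device__ inline int foldX(int i) { return i % kRowWidth; }
__host__ __device__ inline int foldY(int i) { return i / kRowWidth; }

// Each thread looks up one flat index through the 2D texture.
__global__ void gatherFolded(cudaTextureObject_t tex, const int* idx, int* out, int n)
{
    int t = blockIdx.x * blockDim.x + threadIdx.x;
    if (t < n) {
        int i = idx[t];
        out[t] = tex2D<int>(tex, foldX(i) + 0.5f, foldY(i) + 0.5f);
    }
}

// Returns true if all lookups into the 512K-element table match the host data.
bool runFoldedTextureDemo()
{
    const int total = 512 * 1024;              // 512K elements, as in the post
    const int height = total / kRowWidth;      // 512 rows of 1024 texels
    const int m = 4096;                        // number of sample lookups
    std::vector<int> h(total), hidx(m), hout(m);
    for (int i = 0; i < total; ++i) h[i] = 2 * i + 7;
    for (int t = 0; t < m; ++t) hidx[t] = (t * 7919) % total;   // scattered indices

    // Row-major host data maps directly onto the rows of the 2D array.
    cudaChannelFormatDesc desc = cudaCreateChannelDesc<int>();
    cudaArray_t arr = nullptr;
    CHECK(cudaMallocArray(&arr, &desc, kRowWidth, height));
    CHECK(cudaMemcpy2DToArray(arr, 0, 0, h.data(), kRowWidth * sizeof(int),
                              kRowWidth * sizeof(int), height, cudaMemcpyHostToDevice));

    cudaResourceDesc res = {};
    res.resType = cudaResourceTypeArray;
    res.res.array.array = arr;
    cudaTextureDesc td = {};
    td.addressMode[0] = cudaAddressModeClamp;
    td.addressMode[1] = cudaAddressModeClamp;
    td.filterMode = cudaFilterModePoint;
    td.readMode = cudaReadModeElementType;
    td.normalizedCoords = 0;
    cudaTextureObject_t tex = 0;
    CHECK(cudaCreateTextureObject(&tex, &res, &td, nullptr));

    int *dIdx = nullptr, *dOut = nullptr;
    CHECK(cudaMalloc(&dIdx, m * sizeof(int)));
    CHECK(cudaMalloc(&dOut, m * sizeof(int)));
    CHECK(cudaMemcpy(dIdx, hidx.data(), m * sizeof(int), cudaMemcpyHostToDevice));

    gatherFolded<<<(m + 255) / 256, 256>>>(tex, dIdx, dOut, m);
    CHECK(cudaGetLastError());
    CHECK(cudaMemcpy(hout.data(), dOut, m * sizeof(int), cudaMemcpyDeviceToHost));

    bool ok = true;
    for (int t = 0; t < m; ++t) ok = ok && (hout[t] == h[hidx[t]]);

    CHECK(cudaDestroyTextureObject(tex));
    CHECK(cudaFreeArray(arr));
    CHECK(cudaFree(dIdx));
    CHECK(cudaFree(dOut));
    return ok;
}

int main()
{
    std::printf("%s\n", runFoldedTextureDemo() ? "match" : "mismatch or CUDA error");
    return 0;
}
```

The row width is a free parameter here. It only has to stay within the device's 2D width limit and divide the data conveniently, otherwise the last row needs padding.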

nogradi wrote (Mar 30 2007, 11:59 PM):

“Hi, I think you need cudaMallocArray for this. See this comment by John, which includes a code snippet for allocating and using textures:

http://forums.nvidia.com/index.php?showtop…ndpost&p=170507”

See http://forums.nvidia.com/index.php?showtopic=30934

A 2D texture bound to a CUDA array has a maximum size of 64K x 32K elements.
A 1D texture bound to linear memory has a maximum size of 2^27 elements.

But apparently there are some bugs that prevent allocating very large memory blocks.
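The figures above describe the hardware discussed in this thread, and other devices may report different limits. A minimal sketch for querying them at runtime follows. It assumes a reasonably recent CUDA toolkit, in particular for `cudaDeviceGetTexture1DLinearMaxWidth`, which as far as I know is not available in older releases. `TextureLimits` and `queryTextureLimits` are hypothetical names.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical container for the limits this sketch reads back.
struct TextureLimits {
    int array1DWidth;       // max width of a 1D texture on a CUDA array
    int array2DWidth;       // max width of a 2D texture on a CUDA array
    int array2DHeight;      // max height of a 2D texture on a CUDA array
    size_t linear1DWidth;   // max elements of a 1D linear texture of 32-bit ints
};

// Fills `out` for the given device; returns false if any query fails.
bool queryTextureLimits(int device, TextureLimits* out)
{
    if (cudaDeviceGetAttribute(&out->array1DWidth,
                               cudaDevAttrMaxTexture1DWidth, device) != cudaSuccess)
        return false;
    if (cudaDeviceGetAttribute(&out->array2DWidth,
                               cudaDevAttrMaxTexture2DWidth, device) != cudaSuccess)
        return false;
    if (cudaDeviceGetAttribute(&out->array2DHeight,
                               cudaDevAttrMaxTexture2DHeight, device) != cudaSuccess)
        return false;

    // The linear-memory limit depends on the texel format, so pass the format in.
    cudaChannelFormatDesc desc = cudaCreateChannelDesc<int>();
    if (cudaDeviceGetTexture1DLinearMaxWidth(&out->linear1DWidth,
                                             &desc, device) != cudaSuccess)
        return false;
    return true;
}

int main()
{
    TextureLimits lim = {};
    if (!queryTextureLimits(0, &lim)) {
        std::fprintf(stderr, "could not query device 0\n");
        return 1;
    }
    std::printf("1D array: %d\n", lim.array1DWidth);
    std::printf("2D array: %d x %d\n", lim.array2DWidth, lim.array2DHeight);
    std::printf("1D linear (int texels): %zu\n", lim.linear1DWidth);
    return 0;
}
```

Checking the requested size against these values before calling cudaMallocArray separates a genuine size limit from an allocation bug.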

Thanks a lot, nogradi. Then my new post was a waste :)