Getting address of cudaArray

wumpus · September 13, 2007, 12:52pm

For an application I’m writing I need to generate an image on the GPU, then use it as a texture in the next kernel. For maximum speed it is recommended to use CUDA arrays for this.

Given the current API, it seems that I need to allocate a CUDA array and then use cudaMemcpyToArray to copy the image into it, even though the image is already in GPU memory.

Why isn’t it possible to get the address of an array directly so that I can write to it in a kernel? It seems this shouldn’t be a problem unless someone writes and reads from a texture in the same kernel, and would save a redundant copy in this case.

MisterAnderson42 · September 13, 2007, 1:12pm

Because cudaArrays store data in a special layout that makes them fast with the 2D texture cache. The documentation doesn’t state what this layout is, so you can’t address it directly. Even if we knew how the data is stored, you wouldn’t be able to coalesce writes to it anyway.

cudaMemcpyToArray is fast (70 GB/s), so is it really that big a bottleneck for you?

wumpus · September 13, 2007, 1:26pm

Indeed, the copying is not a bottleneck, as it is very fast. The extra memory used might be a problem, but I don’t think so.

I was just wondering why, and this seems to be a good reason. I do hope NVIDIA will make hardware details like the texture format public in the future.

MisterAnderson42 · September 13, 2007, 3:06pm

The 2D cache is probably some kind of trade secret. If I had to guess, I’d say they are using a Hilbert curve or some other space-filling curve to store the data, since my own research into multi-dimensional data locality turned up these techniques as the best.
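To make the guess concrete, here is a minimal C++ sketch of the textbook Hilbert-curve mapping from a texel coordinate to a position along the curve. It is purely illustrative: nothing here is known about the real cudaArray layout, and the name hilbert_index and the power-of-two side length n are assumptions made for the example.

```cpp
#include <cstdint>
#include <utility>

// Illustrative only: the actual cudaArray layout is undocumented.
// Maps (x, y) in an n x n grid (n a power of two, x and y < n) to its
// distance along a Hilbert curve, using the standard iterative algorithm.
// hilbert_index is a hypothetical helper name, not a CUDA API.
inline std::uint32_t hilbert_index(std::uint32_t n, std::uint32_t x, std::uint32_t y) {
    std::uint32_t d = 0;
    for (std::uint32_t s = n / 2; s > 0; s /= 2) {
        const std::uint32_t rx = (x & s) ? 1u : 0u;  // which half in x at this level
        const std::uint32_t ry = (y & s) ? 1u : 0u;  // which half in y at this level
        d += s * s * ((3u * rx) ^ ry);               // quadrants already passed
        if (ry == 0) {                               // rotate/flip the quadrant
            if (rx == 1) {
                x = n - 1 - x;
                y = n - 1 - y;
            }
            std::swap(x, y);
        }
    }
    return d;
}
```

Neighbouring positions along the curve are always neighbouring texels, which is where the locality comes from. The price is one loop iteration per bit of the coordinate for every address computation.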

wumpus · September 17, 2007, 10:37am

Well, some other graphics hardware (R300) uses tile-based texturing. Each tile is 16x16 or 32x32, whatever is easy to load, and the tiles are laid out in normal left-to-right, top-to-bottom order. Hilbert and other space-filling curves are very nice and elegant but, as far as I know, less practical for hardware implementation, since computing where a given (x, y) is located is quite an expensive operation.
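As a rough sketch of why the tiled layout is cheap to address, the offset of a texel can be computed with a few divisions and remainders, which reduce to shifts and masks when the tile size is a power of two. The 16x16 tile size, the row-major order inside a tile, and the name tiled_offset are assumptions for illustration, not a description of any particular chip.

```cpp
#include <cstddef>
#include <cstdint>

// Assumed tile edge length; 16 is one of the sizes mentioned above.
constexpr std::uint32_t kTile = 16;

// Offset, in texels, of (x, y) in an image stored as kTile x kTile tiles.
// Tiles are laid out left-to-right, top-to-bottom; texels inside a tile
// are assumed to be row-major. width must be a multiple of kTile.
// tiled_offset is a hypothetical helper name.
constexpr std::size_t tiled_offset(std::uint32_t x, std::uint32_t y, std::uint32_t width) {
    const std::uint32_t tiles_per_row = width / kTile;
    const std::uint32_t tile_x = x / kTile;  // tile column
    const std::uint32_t tile_y = y / kTile;  // tile row
    const std::uint32_t in_x = x % kTile;    // position inside the tile
    const std::uint32_t in_y = y % kTile;
    const std::size_t tile_index =
        static_cast<std::size_t>(tile_y) * tiles_per_row + tile_x;
    return tile_index * (kTile * kTile) + in_y * kTile + in_x;
}
```

For a 64-texel-wide image, for example, tiled_offset(16, 0, 64) lands at the start of the second tile. There is no per-bit loop, so the cost does not grow with the image size.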

I don’t think secrecy really is the reason, as patents provide sufficient protection.
