Why is there no cudaBindTexture3D? It would be nice to have this ...

CapJo · December 8, 2009, 2:47pm

I’m working with voxel volumes, and it would be nice to also have a cudaBindTexture3D.

A cudaBindTexture2D call exists, but the corresponding cudaBindTexture3D call is missing.

An alternative would be CUDA arrays, but you can’t write to them.

Would it be possible to add cudaBindTexture3D to CUDA or make CUDA arrays also writable?

I already considered writing a routine myself to write to CUDA arrays, but for that I would need
to know the indexing scheme of CUDA arrays, and it might be a lot of work.

If you are asking why I need this: I have a segmentation algorithm for voxel
volumes that accesses groups of elements (a 19-voxel 3D neighborhood) without a predictable access pattern,
so shared memory is not a good solution, but caching would improve the performance.

The new Fermi architecture would be well suited to this algorithm, but we have to wait some time for it, and
it would be nice to have a solution for the current architecture.

At the moment I’m using normal memory accesses, which aren’t cached at all.

CapJo · December 8, 2009, 3:11pm

One possibility would be to use a 1D texture and calculate the indices myself. In that case I couldn’t use the texture unit’s automatic boundary handling (clamp mode), but it would probably still be better than no caching at all.

I have to make sure that every row of the volume starts at a 64-byte-aligned address. cudaMalloc3D would do that.
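The index calculation with manual clamping described above could look roughly like the following. This is only a sketch in plain C++ so that it can be checked on the host; the helper names `clampCoord` and `voxelIndex` are hypothetical, and `rowPitchElems` is assumed to be the padded row length in elements (for example the pitch in bytes returned by cudaMalloc3D divided by the element size). The same arithmetic could be used inside a kernel to compute the index passed to a 1D texture fetch.

```cpp
#include <cstddef>

// Clamp an integer coordinate to [0, size - 1]; this stands in for the
// clamp addressing mode that the texture unit would otherwise provide.
inline int clampCoord(int v, int size) {
    if (v < 0) return 0;
    if (v >= size) return size - 1;
    return v;
}

// Hypothetical helper: element index of voxel (x, y, z) in a volume whose
// rows are padded to rowPitchElems elements (rowPitchElems >= width).
// Rows are stored one after another, and slices are height rows apart.
inline std::size_t voxelIndex(int x, int y, int z,
                              int width, int height, int depth,
                              std::size_t rowPitchElems) {
    const std::size_t cx = static_cast<std::size_t>(clampCoord(x, width));
    const std::size_t cy = static_cast<std::size_t>(clampCoord(y, height));
    const std::size_t cz = static_cast<std::size_t>(clampCoord(z, depth));
    return (cz * static_cast<std::size_t>(height) + cy) * rowPitchElems + cx;
}
```

Out-of-range coordinates are mapped to the nearest border voxel, so a neighborhood lookup at the edge of the volume never reads outside the allocation.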

Simon_Green · December 9, 2009, 11:07am

Current hardware can’t bind linear memory as a 3D texture, hence no cudaBindTexture3D. As you mention, the only real solution possible today is to bind the memory as a 1D texture and do the addressing and interpolation yourself.

CapJo · December 9, 2009, 3:01pm

I have implemented the 1D tex access and achieved a performance gain of about 25 % without any other modification.

flare5009 · December 10, 2009, 9:47am

Hello.

I have just started with CUDA, and I am also trying to use 3D textures, in my case for volume rendering.

There is an example from NVIDIA that uses 3D textures (volumeRender uses tex3D). What they do is allocate 3D memory and then bind it with cudaBindTextureToArray(…).

It would be interesting to know why that works, and whether they get proper interpolation in 3D.

Simon_Green · December 10, 2009, 11:47am

Yes, to be clear, CUDA does have full support for 3D textures with interpolation, it’s just that they must be stored in memory in the special CUDA 3D array format.

We don’t support 3D textures bound to regular linear memory.

flare5009 · December 10, 2009, 1:11pm

Thank you for the quick reply!

Since I am opposed to the ‘hey, let’s try if it works’ approach, I am here asking again; please forgive me. I have not found an answer in the programming manual to how textures can be modified (or maybe you can direct me to the correct manual, to the sections I should read again, or to an existing thread?).

What I am trying to do is build a 3D volume and render it, then modify the volume and re-render (so 3D arrays are out of the question, since those cannot be modified, right?). To avoid transferring lots of data from host to device, I had in mind to handle the modification on the device.

How about the following solution:

1. Allocate linear memory
1.1) Copy memory from host to allocated memory
2. Bind that memory to 2D texture
3. Do tex2D(…)
4. Unbind
5. Modify linear memory
6. Goto 2.

Is unbinding needed? Does the texture cache need to be flushed for a correct result? Or is the texture cache flushed when a different thread starts to execute?

Another option, using 3D textures and a 3D array, would be to copy from linear memory to the 3D array (device to device). That would require double the space, but it should be possible?
So it would be something like this:

0. Allocate and initialise etc.
1. Bind 3D-Array to 3D texture and render
2. Modify linear memory
3. Update 3D-Array with memcpy device to device from linear memory
4. Unbind (?!)
5. Goto 1.

Thanks!

CapJo · December 10, 2009, 4:27pm

Why do you want to use tex2D when your actual data is 3D? I would recommend the same solution I used: tex1Dfetch on linear memory, calculating the index yourself.

The second solution will also work, and it has the advantage of better cache locality for your data.

Which of these solutions is faster depends on your algorithm, that is, on how often you need to modify your data and how much time you save with 3D textures. The second one will only be faster if the time saved compensates for the time spent copying between linear memory and the CUDA array.

There is probably only one way to find out: you have to implement both solutions and try them.

flare5009 · December 11, 2009, 7:03pm

My bad for the inaccurate writing. I want to do real 3D interpolation at a point (x,y,z); that is, the value at point (x,y,z) depends on the eight surrounding voxels (the corners of the enclosing cell).

What I meant by writing 2D tex is that I would store the data as XY planes, and then, when calculating the value at point (x,y,z), I would do (get the level INT_FLOOR(z) texture at point x,y):

v1 = tex2D( texture_reference[ INT_FLOOR(z)], x, y ) and

v2 = tex2D( texture_reference[ INT_FLOOR(z) + 1], x, y ).

then final value would be:

v = (INT_FLOOR(z) + 1 - z) * v1 + (z - INT_FLOOR(z)) * v2.
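The blend between the two planes can be written as a small helper. This is only a sketch in plain C++; `blendSlices` is a hypothetical name, and `v1` and `v2` are assumed to be the values already fetched from the planes at floor(z) and floor(z) + 1. The weights are chosen so that a z lying exactly on a plane returns that plane’s value unchanged.

```cpp
#include <cmath>

// Linear blend between the sample from slice floor(z) (v1) and the sample
// from slice floor(z) + 1 (v2). The fractional part of z is the weight of
// the upper slice, and the two weights always sum to 1.
inline float blendSlices(float v1, float v2, float z) {
    const float frac = z - std::floor(z);
    return (1.0f - frac) * v1 + frac * v2;
}
```

For example, with v1 = 10 and v2 = 20, z = 3.25 gives 12.5, a quarter of the way from the lower plane to the upper one.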

I am not sure I understand your proposal of using 1D textures correctly. What came to my mind is that you suggest doing something like index = INT_FLOOR(x) + INT_FLOOR(y)*Xsize + INT_FLOOR(z)*Xsize*Ysize and then taking the corresponding value between index and index + 1.

Then, to get real 3D interpolation, one still needs to do the same for y_index + 1 and z_index + 1 (a total of 4x tex1D calls) and then do ‘manual’ interpolation three times. This might certainly still be the faster way.

I am afraid you are correct that the only way to figure out what is fastest is to write a test program for all the cases…

Still, I would like to know whether I need to unbind the texture before modifications. An ‘it seems to work’ solution is not what I would like to rely on…
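A manual trilinear lookup on a flat array could be sketched as below, in plain C++ so it can be checked on the host. The names `Volume`, `mix` and `sampleTrilinear` are hypothetical. As far as I know, textures bound to linear memory return unfiltered texels, so this sketch assumes that all eight corner values are read individually and blended by hand. In a kernel, each `at()` call would correspond to one fetch at the computed index.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical host-side stand-in for a volume stored in linear memory.
struct Volume {
    int nx, ny, nz;
    std::vector<float> data;  // x varies fastest, then y, then z

    static int clampInt(int v, int size) {
        return v < 0 ? 0 : (v >= size ? size - 1 : v);
    }
    // Read one voxel with clamp-to-edge behaviour.
    float at(int x, int y, int z) const {
        x = clampInt(x, nx);
        y = clampInt(y, ny);
        z = clampInt(z, nz);
        const std::size_t idx =
            (static_cast<std::size_t>(z) * ny + y) * nx + x;
        return data[idx];
    }
};

// Linear blend: t = 0 returns a, t = 1 returns b.
inline float mix(float a, float b, float t) { return a + t * (b - a); }

// Trilinear interpolation from the eight voxels surrounding (x, y, z),
// with coordinates given in voxel units.
inline float sampleTrilinear(const Volume& v, float x, float y, float z) {
    const float fx = std::floor(x), fy = std::floor(y), fz = std::floor(z);
    const int x0 = static_cast<int>(fx);
    const int y0 = static_cast<int>(fy);
    const int z0 = static_cast<int>(fz);
    const float tx = x - fx, ty = y - fy, tz = z - fz;

    // Blend along x on the four edges of the cell.
    const float c00 = mix(v.at(x0, y0,     z0),     v.at(x0 + 1, y0,     z0),     tx);
    const float c10 = mix(v.at(x0, y0 + 1, z0),     v.at(x0 + 1, y0 + 1, z0),     tx);
    const float c01 = mix(v.at(x0, y0,     z0 + 1), v.at(x0 + 1, y0,     z0 + 1), tx);
    const float c11 = mix(v.at(x0, y0 + 1, z0 + 1), v.at(x0 + 1, y0 + 1, z0 + 1), tx);

    // Then along y, then along z.
    return mix(mix(c00, c10, ty), mix(c01, c11, ty), tz);
}
```

This costs eight reads and seven blends per sample, which is the price of doing the filtering that tex3D on a CUDA array would otherwise do in hardware.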

Cheers and thanks!

CapJo · December 12, 2009, 8:59pm

At the moment I would remark only one thing … the number of textures is limited to 128 … so you can only have 128 XY-Planes … if it’s enough it might be a good solution.

So I have to go … to be in time.
