OpenGL & CUDA

kaplox · November 14, 2008, 9:18am

Hello,

First of all, thank you a lot for all the things I’ve learnt in these forums.
I want to draw an image using OpenGL and CUDA. The image is computed by a CUDA kernel and stored in a 1D float array, where each element holds a value that determines the colour of the corresponding pixel. Once the image is computed, I need to draw it on the screen, also using the GPU. Do you know the best way to get it onto the screen? I don’t know whether I need texture memory, vertex buffer objects, something else, or just OpenGL primitives like glVertex, so I’m confused and hope you can help. If you need more details, just ask. Thank you very much.

Greetings

_Big_Mac · November 14, 2008, 3:38pm

The simpleGL example in the SDK does what you want. The function you’re probably most interested in is this:

void runCuda( GLuint vbo)
{
	// map OpenGL buffer object for writing from CUDA
	float4 *dptr;
	CUDA_SAFE_CALL(cudaGLMapBufferObject( (void**)&dptr, vbo));

	// execute the kernel
	dim3 block(8, 8, 1);
	dim3 grid(mesh_width / block.x, mesh_height / block.y, 1);
	kernel<<< grid, block>>>(dptr, mesh_width, mesh_height, anim);

	// unmap buffer object
	CUDA_SAFE_CALL(cudaGLUnmapBufferObject( vbo));
}

But go through the whole code to get the hang of it.

As you can see, we map the VBO to a float4 pointer, and the kernel can then treat the data behind this pointer as a normal 1D array of float4. That gives access to vertex positions (XYZW) or whatever other data the buffer object represents (for example, pixel colours in RGBA).

The void display() function shows how to display such a VBO afterwards.

One caveat: the kernel supplied there has uncoalesced writes!

//naive addressing in the pos array
pos[y*width+x] = make_float4(u, w, v, 1.0f);

You’ll probably want to find a smarter way to address writes to memory if you’re concerned about performance and optimization.
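To make the coalescing remark concrete, here is a small host-side sketch (plain C++, no GPU needed) that reproduces the index arithmetic of `pos[y*width+x]` and counts how often consecutive threads land on non-consecutive elements. It assumes that threads in a 2D block are numbered x-fastest and that, on the hardware of that time, accesses were grouped per half-warp of 16 threads. The function names and the 256-wide mesh used in the note below are made up for illustration and are not part of the SDK sample.

```cpp
#include <cassert>

// Element index written by one thread of a simpleGL-style kernel:
// pos[y * width + x], with x and y derived from block and thread indices.
int elementIndex(int blockX, int blockY, int threadX, int threadY,
                 int blockDimX, int blockDimY, int width) {
    int x = blockX * blockDimX + threadX;
    int y = blockY * blockDimY + threadY;
    return y * width + x;
}

// Counts the jumps (non-consecutive element indices) among the first
// groupSize threads of block (0,0). Threads are numbered x-fastest,
// which is how a 2D block is linearised.
int countJumps(int blockDimX, int blockDimY, int width, int groupSize) {
    assert(blockDimX > 0 && blockDimY > 0);
    assert(groupSize <= blockDimX * blockDimY);
    int jumps = 0;
    int prev = elementIndex(0, 0, 0, 0, blockDimX, blockDimY, width);
    for (int t = 1; t < groupSize; ++t) {
        int cur = elementIndex(0, 0, t % blockDimX, t / blockDimX,
                               blockDimX, blockDimY, width);
        if (cur != prev + 1) ++jumps;  // this thread does not continue the run
        prev = cur;
    }
    return jumps;
}
```

With a mesh width of 256, `countJumps(8, 8, 256, 16)` finds one jump, because thread 8 starts the next row, while `countJumps(16, 4, 256, 16)` finds none. So a block that is 16 threads wide is one simple candidate for a friendlier write pattern, though it is worth measuring on the actual card.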

Also, from what I’ve read here, OpenGL interoperability in CUDA is currently quite slow. It works, but it’s slower than with DirectX.

You’ll most likely want to stay away from using primitives if you’re aiming for performance. They are simple to use but inefficient. If you wanted to use them, you could in principle copy data back to the CPU, extract primitives’ coordinates from it and call a loop of glRectf or whatnot but it’s about the slowest way to do it :)

You don’t have to use texture memory. Texture memory has nothing to do directly with displaying things, it’s only a way of addressing read-only arrays with some neat features like free interpolation and a bit of caching etc. Of course there’s nothing stopping you from reading data from textures and using it to write into a VBO.

kaplox · November 17, 2008, 12:45pm

Hi again!

Thanks a lot for your response Big Mac.
In fact, I think it’s just what I want. One more question: once I’ve computed the image and unmapped the VBO, can I access the VBO as an array? To know the value of each pixel I would need to read it, with a for loop for instance, and then set the colour appropriately at each position of the grid. Or is there a function that takes two arrays (one with the vertices of the image grid and the other with the colour of each pixel) and draws them on the screen? Sorry if my question is unclear.

Thanks a lot!
Greetings

_Big_Mac · November 17, 2008, 6:32pm

AFAIK you can bind any type of buffer object for CUDA. It could be a Pixel Buffer Object instead of a Vertex Buffer Object. I don’t know much about OpenGL but if I understand the concept of PBOs correctly, they store data of pixels as they appear on screen (or as they enter the framebuffer) so in a sense, modifying pixel buffers would let you directly “paint” pixels on screen. I might get this wrong so you better read about PBOs to confirm.
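For the original question (one float per pixel that decides its colour), the per-pixel work before anything goes into a pixel buffer could look like the sketch below. It is written as plain host C++ so it can be tried without a GPU. In a CUDA version the body of the loop would presumably become the kernel, writing into the mapped buffer. The linear greyscale mapping, the `Rgba8` struct and both function names are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// One 8-bit RGBA pixel, the layout GL_RGBA / GL_UNSIGNED_BYTE expects.
struct Rgba8 { std::uint8_t r, g, b, a; };

// Maps a value to a grey level. Values outside [lo, hi] are clamped.
// Assumption: a linear greyscale ramp is an acceptable colour map.
Rgba8 valueToColour(float v, float lo, float hi) {
    assert(hi > lo);
    float t = (v - lo) / (hi - lo);
    if (t < 0.0f) t = 0.0f;
    if (t > 1.0f) t = 1.0f;
    std::uint8_t g = static_cast<std::uint8_t>(t * 255.0f + 0.5f);
    return Rgba8{g, g, g, 255};
}

// Converts the whole 1D value array into pixels, one pixel per value.
std::vector<Rgba8> colourise(const std::vector<float>& values,
                             float lo, float hi) {
    std::vector<Rgba8> pixels;
    pixels.reserve(values.size());
    for (float v : values) pixels.push_back(valueToColour(v, lo, hi));
    return pixels;
}
```

Four bytes per pixel keeps the buffer at the same size as the original float array, which matters if the buffer is shared between CUDA and OpenGL.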

kaplox · November 18, 2008, 7:56am

Ok, thank you very much. I’m going to read these papers and choose the best option…

Bye!

kaplox · November 18, 2008, 9:58am

Hi again. Now I’m thinking of using texture memory to read the data I use to create the image, to try to increase performance. But I have a problem. Here are some parts of my code:

[codebox]
// Global variable for allocating 1D array
texture<float, 1> textdata;

…

// In main(), I bind texture with data previously allocated with cudaMalloc
cudaBindTexture(0, textdata, data, sizedata * sizeof(float));

…

// In kernel to access data, I use
tex1D(textdata,index);
[/codebox]

But I only get valid data when index = 0. Every other index returns 0 instead of the real value. Do you know what the problem is?

Thank you.

kaplox · November 18, 2008, 11:43am

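Yeah! I found the problem! The function to use is tex1Dfetch(textdata, index) instead of tex1D(textdata, index). As far as I can tell, tex1D is meant for textures bound to CUDA arrays, while a texture bound to linear memory with cudaBindTexture has to be read with tex1Dfetch.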
Thanks anyway!

kaplox · November 18, 2008, 3:19pm

Oops! I forgot to ask one last question: do you know the size of texture memory?
My card has 512MB of memory, but I’m not sure whether that is all global memory or is shared between different memory spaces…

thanks!

_Big_Mac · November 18, 2008, 6:39pm

Texture memory and global memory are physically the same thing and both sit in GPU’s VRAM. Think of it this way: you can allocate an ordinary array in the VRAM and this becomes global memory. You can then tell the GPU to treat this array as a texture and from now on it will use a different method (and hardware unit) to access it. Keep in mind, optimized data access is very important for performance so this can make a huge difference.

JohnW · January 16, 2009, 5:43am

Hi,

Equally in awe of some of the comments on these forums. A great resource.

Did you work out how to set the color for individual pixels in a VBO/PBO?

Thanks,

John

kaplox · January 16, 2009, 8:33am

Hi, I’m not using PBOs yet, but I will soon. To get access to the buffer object, you can use something like this:

// map the PBO so the CPU can write its pixel data
// (unpack target with write-only access, since we set pixels rather than read them back)
glBindBufferARB(GL_PIXEL_UNPACK_BUFFER_ARB, pbo_id);
GLubyte* ptr = (GLubyte*) glMapBufferARB(GL_PIXEL_UNPACK_BUFFER_ARB, GL_WRITE_ONLY_ARB);
if(ptr) {
	// You can set a color for each position of the buffer
	processPixels(ptr, ...);
	glUnmapBufferARB(GL_PIXEL_UNPACK_BUFFER_ARB);
}
// back to conventional pixel operation
glBindBufferARB(GL_PIXEL_UNPACK_BUFFER_ARB, 0);

Anyway, if somebody knows more about this, I would be very glad to hear from you.

Greetings!

JohnW · January 16, 2009, 9:40pm

Actually I’m still on VBOs too. I guess it really is an OpenGL question rather than a CUDA one.

My code is based on the simpleGL example using a VBO with four elements x, y, z, color.

I can draw the vertices like this:

[codebox]glBindBuffer(GL_ARRAY_BUFFER, vbo);
glVertexPointer(3, GL_FLOAT, 4*sizeof(float), 0); //extract the x, y, z coords
glEnableClientState(GL_VERTEX_ARRAY);
glDrawArrays(GL_POINTS, 0, mesh_width * mesh_height);
glDisableClientState(GL_VERTEX_ARRAY);[/codebox]

But then I don’t know how to use the fourth value as color (or colored texture).

I’ve tried to extend the vbo to six elements (after defining float6 with xyz and rgb elements):

[codebox]glBindBuffer(GL_ARRAY_BUFFER, vbo);
glVertexPointer(3, GL_FLOAT, 6*sizeof(float), 0); // x, y, z at the start of each vertex
glColorPointer(3, GL_FLOAT, 6*sizeof(float), (const GLvoid*)(3*sizeof(float))); // r, g, b after the three coords
glEnableClientState(GL_VERTEX_ARRAY);
glEnableClientState(GL_COLOR_ARRAY);
glDrawArrays(GL_POINTS, 0, mesh_width * mesh_height);
glDisableClientState(GL_VERTEX_ARRAY);
glDisableClientState(GL_COLOR_ARRAY);[/codebox]

but it crashes with a memory overrun halfway through the first pass over the data.

Any ideas? I’m really hating OpenGL.
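One possible cause of an overrun like this, which cannot be confirmed from the post alone, is that the buffer is still allocated for four floats per vertex (as simpleGL does) while the kernel and the draw call now step through six. The sketch below derives the stride, the colour offset and the buffer size from a single struct so the three cannot drift apart. `VertexXYZRGB` and `bufferBytes` are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Interleaved layout: three position floats followed by three colour floats.
struct VertexXYZRGB {
    float x, y, z;
    float r, g, b;
};

// The layout only works if the struct has no padding.
static_assert(sizeof(VertexXYZRGB) == 6 * sizeof(float),
              "unexpected padding in VertexXYZRGB");

// Stride for glVertexPointer / glColorPointer, in bytes.
const std::size_t kStride = sizeof(VertexXYZRGB);

// Byte offset of the colour inside each vertex, for glColorPointer.
const std::size_t kColourOffset = offsetof(VertexXYZRGB, r);

// Size to pass to glBufferData when the buffer is created.
std::size_t bufferBytes(std::size_t meshWidth, std::size_t meshHeight) {
    assert(meshWidth > 0 && meshHeight > 0);
    return meshWidth * meshHeight * sizeof(VertexXYZRGB);
}
```

For a 256 x 256 mesh this gives 1,572,864 bytes, against 1,048,576 bytes for the four-float layout, so a buffer created with the old size would run out about two thirds of the way through.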

JohnW · January 16, 2009, 11:35pm

I’ve got it doing something close to what I would like by following this tutorial:
http://www.evl.uic.edu/aej/525/lecture06.html
However, it has doubled the size of my VBO, and I’m far more concerned with performance than looks.

Is there a more efficient way?

Thanks,
John
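One way to keep the VBO at 16 bytes per vertex, offered as a sketch and not as what the tutorial or the SDK does, is to store an 8-bit RGBA colour in the space of the fourth float and point OpenGL at it with `glColorPointer(4, GL_UNSIGNED_BYTE, ...)`. The struct and helper below are hypothetical. The OpenGL calls appear only in a comment because they need a live context.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Same size as a float4: three position floats plus four colour bytes.
struct PackedVertex {
    float x, y, z;
    std::uint8_t rgba[4];
};

static_assert(sizeof(PackedVertex) == 16,
              "PackedVertex must match the size of a float4");

// Builds a grey vertex from an intensity in [0, 1].
PackedVertex makeGreyVertex(float x, float y, float z, float intensity) {
    assert(intensity >= 0.0f && intensity <= 1.0f);
    std::uint8_t g = static_cast<std::uint8_t>(intensity * 255.0f + 0.5f);
    PackedVertex v;
    v.x = x; v.y = y; v.z = z;
    v.rgba[0] = g; v.rgba[1] = g; v.rgba[2] = g; v.rgba[3] = 255;
    return v;
}

// Drawing would then use, with the VBO bound to GL_ARRAY_BUFFER:
//   glVertexPointer(3, GL_FLOAT, sizeof(PackedVertex), 0);
//   glColorPointer(4, GL_UNSIGNED_BYTE, sizeof(PackedVertex),
//                  (const GLvoid*)offsetof(PackedVertex, rgba));
```

The buffer size and stride stay exactly as in the four-float simpleGL layout. The kernel only has to write the colour bytes where it previously wrote the fourth float.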
