I’m currently developing a CUDA kernel to perform ray tracing over voxel data. The kernel works fine, and so far I have it writing the image into a simple array, which is then copied back to the host (CPU) and displayed with glDrawPixels().
Obviously all this memory copying ruins performance, so I want to generate the image on the GPU and leave it in graphics memory to be displayed later.
Chapter 8 of the CUDA by Example book includes a basic example of this, which I tried to copy. However, when I run the code I get a segmentation fault and no useful error messages.
According to the CUDA documentation, the OpenGL interoperability code is now deprecated, and the documentation doesn’t describe a situation where this might happen.
Has anyone had this problem before and can share a solution? Or is there a better solution to linking CUDA and OpenGL code together?
I have googled for a solution but couldn’t find one, especially as it’s just this one function causing problems. I also can’t find the newer API for CUDA interoperability with OpenGL.
Nvidia 9800 GTX+ GPU
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2010 NVIDIA Corporation
Built on Wed_Nov__3_16:16:57_PDT_2010
Cuda compilation tools, release 3.2, V0.2.1221
build command: nvcc -c cuda_code.cu -arch sm_11
opengl : 2.1
OS : Ubuntu Linux 64bit
pixels = new unsigned char[FLAGS_screenw*FLAGS_screenh]; // for copying GPU generated frame into
memset(pixels, 0, FLAGS_screenw*FLAGS_screenh);

cudaDeviceProp prop;
HANDLE_ERROR(cudaGetDevice(&dev));
LOG("ID of current CUDA device: %d", dev);
memset(&prop, 0, sizeof(cudaDeviceProp));
prop.major = 1;
prop.minor = 1;
HANDLE_ERROR(cudaChooseDevice(&dev, &prop));
HANDLE_ERROR(cudaGLSetGLDevice(dev)); // <---- from book
LOG("ID of CUDA device closest to revision %d.%d: %d", 1,1,dev);
HANDLE_ERROR(cudaSetDevice(dev));

LOG("Allocating memory in GPU");
HANDLE_ERROR(cudaMalloc((void**)&dev_world, sizeof(VOXEL) * kVOXELLENGTH*kVOXELHEIGHT*kVOXELWIDTH));
HANDLE_ERROR(cudaMalloc((void**)&dev_screen, sizeof(unsigned char) * FLAGS_screenw*FLAGS_screenh));

LOG("Allocating GL buffers");
glGenBuffers(1, &pixel_buffer);
LOGGLERROR(); // Macro for giving opengl errors
glBindBuffer(GL_PIXEL_UNPACK_BUFFER, pixel_buffer);
LOGGLERROR();
glBufferData(GL_PIXEL_UNPACK_BUFFER, sizeof(unsigned char) * FLAGS_screenw * FLAGS_screenh, NULL, GL_DYNAMIC_DRAW);
LOGGLERROR();

if (resource == NULL) LOG("NULL");
HANDLE_ERROR(cudaGraphicsGLRegisterBuffer(&resource, pixel_buffer, cudaGraphicsMapFlagsNone));
LOG("CUDA Init complete");
The output I get is:
[cuda_code.cu:234] ID of current CUDA device: 0
[cuda_code.cu:240] ID of CUDA device closest to revision 1.1: 0
[cuda_code.cu:242] Allocating memory in GPU
[cuda_code.cu:247] Allocating GL buffers
As you can see, the cudaGraphicsGLRegisterBuffer() function fails and causes a segmentation fault, but I can’t seem to get any error codes out of it.
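Since a segmentation fault kills the process before any error code can be returned, and the last line logged is "Allocating GL buffers", one way to confirm exactly which call crashes would be to print and flush a trace line immediately before each call. The following is only a minimal sketch of that idea, not part of my actual code: format_trace and TRACE_CALL are hypothetical names introduced here, and the "[file:line] message" layout is simply chosen to match the LOG output above.

```cpp
#include <stdio.h>
#include <string>

// Hypothetical helper (not in the original code): builds a trace line in the
// same "[file:line] message" style as the LOG output shown above.
std::string format_trace(const char* file, int line, const char* call) {
    char buf[512];
    snprintf(buf, sizeof(buf), "[%s:%d] calling %s", file, line, call);
    return std::string(buf);
}

// Hypothetical macro: prints the text of the call to stderr and flushes
// before executing it, so the message is visible even if the call itself
// crashes the process.
#define TRACE_CALL(call)                                              \
    do {                                                              \
        fprintf(stderr, "%s\n",                                       \
                format_trace(__FILE__, __LINE__, #call).c_str());     \
        fflush(stderr);                                               \
        call;                                                         \
    } while (0)
```

Wrapping each statement after the "Allocating GL buffers" message, for example TRACE_CALL(glGenBuffers(1, &pixel_buffer)); and the cudaGraphicsGLRegisterBuffer() line, should make the last trace line printed name the call that crashed.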
Thank you in advance for your help.