OpenACC and OpenGL pointers


I’m currently working on a project involving fluid simulation computed using OpenACC and visualized with OpenGL (Flowing fluid is displayed in 2D space as density). All calculations are performed on data arrays allocated on device memory using the acc_malloc runtime function to minimize host to device data transfers.

To render the fluid i’m using a pixel buffer object (PBO) mapped to a pointer. This pointer is sent to a a OpenACC kernel region where data from one of the acc_malloc data arrays are copied to the pixel buffer object which effectively results in a device to device copy operation.

All this worked fine when using PGI OpenACC compiler 12.8, but when changing to version 13.3 i get the error “call to cuMemHostRegister returned error 999: Unknown” when trying to pass the OpenGL PBO pointer to the OpenACC region. When i use a host allocated pointer instead it works fine, but this means i have to perform a device to host copy and then a host to device copy instead of just a device to device copy.

Any advice/explanation would be greatly appreciated!

The OpenGL code:

	float4 *output; 

	if (output) {
        // get results from fluid simulation
    	renderFluid(&config, output); // runs the densitytocolor kernel

The kernel code:

void densityToColor(float4 *output, const float *density, int N) {
#pragma acc kernels deviceptr(density) copyout(output[0:(N+2)*(N+2)])
	#pragma acc loop independent
	for (int j=1; j<=N; j++) {
		#pragma acc loop independent
		for (int i=1; i<=N; i++) {
			// Just copy data from density to output right now
			// Might do some color manipulation later
			float densityValue = density[IX(i,j)];
			float4 f4ptr;
			f4ptr.x = densityValue; f4ptr.y = densityValue; f4ptr.z = densityValue; f4ptr.w = densityValue;
			output[IX(i,j)] = f4ptr;

Here is also some of the debug output when using PGI_ACC_DEBUG=1:

pgi_uacc_begin( compute region, file=/home/mmikalsen/pgicode/fluid2dacc/fluid2d.c, function=densityToColor, lines=49:64, startline=50, endline=64, devid=0, threadid=1 )
pgi_uacc_begin( file=/home/mmikalsen/pgicode/fluid2dacc/fluid2d.c, function=densityToColor, lines=49:64, startline=50, endline=64, devid=1, threadid=1 ) dindex=1
pgi_uacc_dataon( devid=1, threadid=1 )
pgi_uacc_dataon( devid=1, threadid=1 ) dindex=1
NO map for host:0x2aaaae989e00
pgi_uacc_alloc(size=1065024,devid=1,threadid=1) returns 0x600500000
map    dev:0x600500000 host:0x2aaaae989e00 size:1065024 offset:0  data[dev:0x600500000 host:0x2aaaae989e00 size:1065024] (line:50 name:output)
alloc done with devptr at 0x600500000
MemHostRegister( 0x2aaaae989e00, 1065024, 0 )
call to cuMemHostRegister returned error 999: Unknown
Starting simulation
make: *** [run] Error 1

Hi mmikalsen,

In the 13.1 compilers, the run time started using pinned memory by default with goal of improving asynchronous data copies and the overall performance of data copies. However, I suspect that given OpenGL memory is mapped separately from other CUDA memory, this change is interfering with your usage.

We’re looking at adding a flag and/or environment variable which will revert to the old behaviour of using 1MB buffers instead of pinned memory to transfer data. Hopefully this change will allow to continue to use the OpenGL mapped memory.

Would you be able to share your code with us? OpenGL memory mapped data isn’t something that we directly support and therefore don’t test. If possible, I’d like to add your code to our testing system so that this wont occur again.


Thanks for the reply Mat.

The code generates a fluid flow that looks like smoke being emitted from the center of the simulation grid. It’s based on GPU Gems 38. - Fast Fluid Dynamics Simulation on the GPU.

The code can be found in the following github repository:

Thanks mmikalsen,

I’m short on time today but will take a look later in the week.

  • Mat