Error: Internal profiling error 4182:999


I’m trying to profile an app that uses CUDA and OpenGL in multiple contexts. When it runs under the profiler, the profiler crashes as soon as anything happens in one of the OpenGL contexts, with the following error:

Error: Internal profiling error 4182:999

cuda-memcheck finds no problems.

Any idea how I can proceed from here?

CUDA 9.2, driver 396.26 on Ubuntu 16.04, kernel 4.15.0-33


Hi, konstantin_a

Can you check whether the same problem reproduces with an SDK sample such as 2_Graphics/simpleGL or 5_Simulations/fluidsGL?

If not, can you share the sample you used?



The sample (fluidsGL) works. Unfortunately it’s not easy to share my code here because it’s a big proprietary application. Another difference is that my app uses CUDA from multiple threads, which may have something to do with the crash. I’ll update the thread if I find anything else interesting.

It would be nice to know what the error numbers from the profiler actually mean.

Actually, the 3_Imaging/simpleCUDA2GL sample does not work. Here is the output:

[/usr/local/cuda/samples/3_Imaging/simpleCUDA2GL] [Mon Sep 10]
[13:17:14 :) ]$ nvprof ./simpleCUDA2GL 
./simpleCUDA2GL Starting...

(Interactive OpenGL Demo)
==30550== NVPROF is profiling process 30550, command: ./simpleCUDA2GL
GPU Device 0: "GeForce GTX 1070" with compute capability 6.1

Shader compilation error: Fragment info
0(4) : warning C7533: global variable gl_Color is deprecated after version 120

Shader compilation error: Fragment info
0(5) : warning C7533: global variable gl_TexCoord is deprecated after version 120
0(6) : warning C7533: global variable gl_FragColor is deprecated after version 120

	(right click mouse button for Menu)
	[esc] - Quit

==30550== Error: Internal profiling error 4182:999.
======== Error: CUDA profiling error.
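
In case it helps anyone comparing runs: a minimal sketch of a script that pulls the two numbers out of nvprof output like the above. The function name and the field names `first` and `second` are placeholders of mine; nothing in this thread says what the numbers actually mean, so the script only extracts them and does not interpret them.

```python
import re

# Matches lines such as "==30550== Error: Internal profiling error 4182:999."
# The "==PID==" prefix and the trailing period are both optional.
# "first" and "second" are placeholder names: their meaning is not known here.
PATTERN = re.compile(
    r"^(?:==(?P<pid>\d+)== )?Error: Internal profiling error "
    r"(?P<first>\d+):(?P<second>\d+)\.?$"
)


def find_internal_errors(log_text):
    """Return a list of (pid or None, first, second) tuples found in log_text."""
    hits = []
    for line in log_text.splitlines():
        match = PATTERN.match(line.strip())
        if match:
            pid = int(match["pid"]) if match["pid"] else None
            hits.append((pid, int(match["first"]), int(match["second"])))
    return hits


if __name__ == "__main__":
    sample = (
        "==30550== Error: Internal profiling error 4182:999.\n"
        "======== Error: CUDA profiling error.\n"
    )
    for pid, first, second in find_internal_errors(sample):
        print(f"pid={pid} first={first} second={second}")
```

Saving nvprof's output to a file and passing its contents to `find_internal_errors` makes it easy to check whether different samples or machines fail with the same pair of numbers.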

I ran into the same problem when trying 3_Imaging/simpleCUDA2GL.

======== Error: CUDA profiling error.
==2668== Warning: Unified Memory Profiling is not supported on the current configuration because a pair of devices without peer-to-peer support is detected on this multi-GPU setup. When peer mappings are not available, system falls back to using zero-copy memory. It can cause kernels, which access unified memory, to run slower. More details can be found at:
==2668== Error: Internal profiling error 4182:999.

P.S. I am running the profiler on Windows 10, on a Surface Book 1.

Do you have any more information about this problem now?


I’m pretty convinced that it is a bug in the profiling library. I haven’t found a workaround. Maybe CUDA 10 fixes it? I haven’t tried.

Thanks for your reply.

My PC does not seem to support CUDA 10.
When I was using CUDA 8, the profiler worked fine.