Memory leak in OpenCL

Hi,

I am facing a memory leak when calling an OpenCL kernel many times, around 10^8. I posted more details in the CUDA forum: https://devtalk.nvidia.com/default/topic/529018/cuda-programming-and-performance/memory-leak-in-opencl-under-linux-when-the-number-of-kernels-calls-is-huge/. Here I would like to know the meaning of this error message, which I always see in dmesg whenever the memory leak happens:

NVRM: Xid (0000:02:00): 31, Ch 00000001, engmask 00000101, intr 10000000
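
To count how often this message shows up during a long run, the line can be split into its fields. The following is only a minimal Python sketch: the pattern is inferred from the single line quoted above, the function name `parse_xid` and the field names are made up for illustration, and treating the `Ch`, `engmask` and `intr` values as hexadecimal is an assumption. Other driver versions may format the message differently.

```python
import re

# Pattern inferred from the one dmesg line quoted in this post; this is an
# assumption, not a documented format.
XID_RE = re.compile(
    r"NVRM: Xid \((?P<pci>[0-9a-fA-F:.]+)\): (?P<xid>\d+), "
    r"Ch (?P<channel>[0-9a-fA-F]+), "
    r"engmask (?P<engmask>[0-9a-fA-F]+), "
    r"intr (?P<intr>[0-9a-fA-F]+)"
)


def parse_xid(line):
    """Return the fields of an NVRM Xid line as a dict, or None if no match."""
    m = XID_RE.search(line)
    if m is None:
        return None
    return {
        "pci": m.group("pci"),                   # PCI address of the GPU
        "xid": int(m.group("xid")),              # Xid error number (decimal)
        "channel": int(m.group("channel"), 16),  # assumed to be hexadecimal
        "engmask": int(m.group("engmask"), 16),  # assumed to be hexadecimal
        "intr": int(m.group("intr"), 16),        # assumed to be hexadecimal
    }


line = "NVRM: Xid (0000:02:00): 31, Ch 00000001, engmask 00000101, intr 10000000"
fields = parse_xid(line)
print(fields)
```

Feeding each line of the dmesg output through `parse_xid` and counting the non-None results gives a rough measure of how often the error occurs in a given run.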

Thanks in advance

nvidia-bug-report.log.gz.bmp (70.8 KB)

I decided to follow one of the suggestions in this forum thread, http://www.nvnews.net/vbulletin/showthread.php?t=177732&page=7, and it seems to be working. After disabling the thermal monitor in nvidia-settings, the simulation ran past the point where the memory leak used to occur without any problems. I will run longer simulations and post the results here.

After running a few tests with MSI enabled and the thermal monitor disabled in nvidia-settings, I can say that these changes do not solve the memory leak problem, but they clearly provide a tremendous improvement. The memory leak no longer occurs on every run, and when it does occur it takes much longer to appear than before.

It seems that there is something strange in the way the NVIDIA driver handles interrupts.