Error: Internal profiling error 3938:999

Hello again.
I have:
GTX 960
cuda_11.2.0_460.89_win10
windows 10 64bit, 20H2, 19042.804

I have a simple Monte Carlo implementation that runs with arguments for the iteration counts (the first two args) and the number of threads per block (the third arg).

I’m trying to collect analysis metrics with this command:

nvprof --analysis-metrics -o test.nvvp .\test.exe 512 65536 32

but I’m getting this error:

==5908== NVPROF is profiling process 5908, command: .\test.exe 512 65536 32
==5908== Some kernel(s) will be replayed on device 0 in order to collect all events/metrics.
==5908== Replaying kernel “GPGPUKernel(unsigned int, unsigned int, unsigned int*, curandStateXORWOW*, float)” (1 of 53).
==5908== Replaying kernel “GPGPUKernel(unsigned int, unsigned int, unsigned int*, curandStateXORWOW*, float)” (2 of 53)…
elapsed_cycles_sm
active_cycles
inst_issued1
inst_issued2
l2_subp0_total_read_sector_queries
l2_subp1_total_read_sector_queries
4 internal events
==5908== Error: Internal profiling error 3938:999.
Time: 0.00======== Error: CUDA profiling error.

What does this mean, and how can I resolve it?
Thanks to all…


Did you ever make any progress with this? I’m seeing the same thing and I’ve narrowed it down to a cudaMemcpy() but have no idea what’s causing it. (Yes, the memcpy is in bounds and the error happens even if I only copy a single float.)

A lot of time has passed and I do not remember very well.
I believe I solved it in the NVIDIA Nsight options, where I changed WDDM TDR Enabled to False.
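
For anyone who wants to check the TDR state without opening Nsight, here is a rough Python sketch. It assumes the Nsight option maps to the Windows registry values TdrLevel and TdrDelay under HKLM\SYSTEM\CurrentControlSet\Control\GraphicsDrivers, which is where Microsoft documents the timeout detection and recovery settings; I have not verified that Nsight writes exactly these values. The function names describe_tdr and read_tdr_values are made up for this example. A plausible reason the setting matters is that kernel replay during metric collection keeps the display GPU busy longer than the default 2 second watchdog allows, but that is a guess, not something confirmed in this thread.

```python
import sys

# Registry location documented by Microsoft for TDR settings (assumed to be
# what the Nsight "WDDM TDR Enabled" option controls).
TDR_KEY = r"SYSTEM\CurrentControlSet\Control\GraphicsDrivers"
DEFAULT_TDR_LEVEL = 3  # TdrLevelRecover: Windows default when the value is absent
DEFAULT_TDR_DELAY = 2  # seconds: Windows default when the value is absent


def describe_tdr(level, delay):
    """Summarize TDR settings; None means the registry value is absent."""
    level = DEFAULT_TDR_LEVEL if level is None else level
    delay = DEFAULT_TDR_DELAY if delay is None else delay
    if level == 0:  # TdrLevelOff: timeout detection disabled
        return "TDR disabled"
    return f"TDR enabled, timeout {delay} s"


def read_tdr_values():
    """Read (TdrLevel, TdrDelay) from the registry. Windows only."""
    import winreg  # imported here so the module still loads on other systems

    values = []
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, TDR_KEY) as key:
        for name in ("TdrLevel", "TdrDelay"):
            try:
                values.append(winreg.QueryValueEx(key, name)[0])
            except FileNotFoundError:
                values.append(None)  # value not set, the default applies
    return tuple(values)


if __name__ == "__main__":
    if sys.platform == "win32":
        print(describe_tdr(*read_tdr_values()))
    else:
        print("TDR is a Windows (WDDM) feature; nothing to check here.")
```

The script only reads the registry. Changing these values needs administrator rights and a reboot, and turning TDR off means a hung kernel can freeze the display, so raising the delay may be a safer choice than disabling it.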