This is my first time on the forums, so I apologize if I’m not following any norms on here. I have two questions:
The first and most important question is where I can find information about an error code I am getting from nvprof. I have a Tesla P100-PCIE on an Ubuntu 18.04.3 server running CUDA 10.2 with driver version 440.33.01. I am running the command “nvprof --metrics all --events all ./exe”, where ./exe is one of the basic benchmarking programs I am running. For every program, nvprof returns the error “internal profiling error 4055:7”. I can find no information on what this error means or how to remedy it. I saw a very old forum post that says I need to run nvprof with sudo privileges, but that did not help. Any information helps.
My second question is about the kernel timings output by nvprof in the simplest call, “nvprof ./exe”. I want to find the runtime of my program independent of any timing differences due to other programs running in the background. I see nvprof divides the timings into GPU Activities and API Calls. The timings within each category add up to 100% of that category's total. To get the timing I am looking for, do I simply add up all of the timings reported by nvprof? Or would that overstate the actual runtime of the program, because operations that ran in parallel would be counted as if they ran serially? Again, anything helps!
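To make the overlap concern in the second question concrete, here is a minimal Python sketch. The (start, end) intervals are made-up numbers, not nvprof output, and the function names are hypothetical. It only compares a plain sum of durations with the time actually covered once overlapping intervals are merged. As far as I understand, the default nvprof summary gives only per-row totals, so per-operation start times would have to come from something like “nvprof --print-gpu-trace ./exe”.

```python
def summed_time(intervals):
    """Add up every duration as if the operations ran one after another."""
    return sum(end - start for start, end in intervals)


def covered_time(intervals):
    """Time covered by at least one interval, counting overlaps only once."""
    total = 0.0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end:
            # Gap before this interval: close the previous merged span.
            if cur_end is not None:
                total += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlaps the current span: extend it instead of adding again.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start
    return total


# Hypothetical (start_ms, end_ms) pairs, e.g. a kernel overlapping a copy.
activities = [(0.0, 4.0), (1.0, 3.0), (5.0, 7.0)]

print("sum of durations:", summed_time(activities))
print("covered time:", covered_time(activities))
```

If the two numbers differ, a plain sum counts concurrent work twice. If nothing overlaps, the two are equal, but neither includes idle gaps between operations, so neither is the full wall-clock runtime.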