[RESOLVED] Profiling error 4168:999

When attempting to run either “kernel profile - Instruction Execution” or “kernel compute” analysis of my kernel, I get the following messages:

nvprof log: C:\Users\alexa\nvvp_workspace.metadata.plugins\com.nvidia.viper\launch\12\nvprof_19020.log
nvprof log: C:\Users\alexa\nvvp_workspace.metadata.plugins\com.nvidia.viper\launch\12\nvprof_10800.log
======== Error: CUDA profiling error.
==19020== Error: Internal profiling error 4168:999.

I am not certain what to do with this.

Occasionally when running “kernel compute” I get the basic graph of Function Unit Utilization.

I am running a GTX 1050 Ti with CUDA 9.0 installed.

I have some additional information: another kernel in the same process does not have the crash, but it does not do much computation, only memory movement.

For anyone who comes across this problem: I needed to increase the GPU timeout. To keep the computer usable, the OS (Windows 10) was killing my kernel when it ran for too long.

I fixed this by:

  1. Running regedit from the search bar.
  2. Navigating to "Computer\HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\GraphicsDrivers"
  3. Creating a new DWORD registry value called "TdrDelay" and setting it to 30 (the value is in seconds).
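
For anyone who would rather script the steps above than click through regedit, here is a minimal sketch that builds the text of a .reg file for the same change. The function name `tdr_reg_file` and the output file name in the note below are made up for illustration; the key path and value name are the ones from the steps. It assumes TdrDelay is stored as a DWORD in seconds, which .reg files write as eight hex digits.

```python
# Sketch only: generates .reg text, it does not touch the registry itself.
# Key path and value name come from the steps above; everything else is illustrative.
GRAPHICS_DRIVERS_KEY = (
    r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\GraphicsDrivers"
)


def tdr_reg_file(delay_seconds: int) -> str:
    """Return the text of a .reg file that sets TdrDelay to delay_seconds."""
    if not 0 <= delay_seconds <= 0xFFFFFFFF:
        raise ValueError("TdrDelay must fit in an unsigned 32-bit DWORD")
    # .reg files store DWORD values as 8 hexadecimal digits (30 -> 0000001e).
    return (
        "Windows Registry Editor Version 5.00\n"
        "\n"
        f"[{GRAPHICS_DRIVERS_KEY}]\n"
        f'"TdrDelay"=dword:{delay_seconds:08x}\n'
    )


if __name__ == "__main__":
    # Print the file contents for a 30-second timeout.
    print(tdr_reg_file(30), end="")
```

Redirecting the output to a file such as tdr.reg (hypothetical name) and importing it with administrator rights should be equivalent to steps 1 to 3. A reboot is usually needed before the new timeout applies.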

Hi, rennietherocket

Has your issue been resolved now?

Yes, marked resolved.

I am having the same issue on Ubuntu 16.04. nvprof --version:

nvprof: NVIDIA (R) Cuda command line profiler
Copyright (c) 2012 - 2017 NVIDIA Corporation
Release version 9.0.176 (21)

Trying to profile my program gives the following output:

/usr/local/cuda/bin/nvprof -f -o EncodingBenchmark.nvprof ./EncodingBenchmark -d 0
==3160== NVPROF is profiling process 3160, command: ./EncodingBenchmark -d 0
Using device Quadro P4000
==3160== Error: Internal profiling error 4168:999.
======== Error: CUDA profiling error.

Just noticed it fails only with Pascal GPUs but works fine with Kepler.

Do you know what the timeout is for both GPUs? That was what resolved my issue.

I read your comment and tried several options for the timeout, even 10s, but that did not change anything sadly.

Could this be some kind of driver / profiler version incompatibility? I am running driver 384.90.

Are you sure you are not overrunning the generous 10-second timeout? Do you have a way to reduce the amount of time spent in the kernel? Does your code run to completion outside of the profiler? If so, how long does it take?

I am asking because the error I got was related to the timeout on the GPU, so I am assuming that this error code means the GPU timed out.
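
A rough way to answer the runtime question is to time the program outside the profiler. The sketch below is an assumption-laden helper, not an NVIDIA tool: `time_command` is a made-up name, and it measures wall-clock time for the whole process, which is only an upper bound on how long any single kernel runs.

```python
import subprocess
import sys
import time


def time_command(argv, timeout_seconds=None):
    """Run argv (without the profiler) and return (exit_code, elapsed_seconds)."""
    start = time.perf_counter()
    completed = subprocess.run(argv, timeout=timeout_seconds)
    elapsed = time.perf_counter() - start
    return completed.returncode, elapsed


if __name__ == "__main__":
    if len(sys.argv) < 2:
        sys.exit("usage: python time_run.py <program> [args...]")
    # Example invocation (file name is hypothetical):
    #   python time_run.py ./EncodingBenchmark -d 0
    code, elapsed = time_command(sys.argv[1:])
    print(f"exit code {code}, wall-clock time {elapsed:.2f} s")
```

If the whole run finishes well inside the timeout, a watchdog kill outside the profiler is unlikely. Keep in mind that profiling adds overhead, so a kernel can take longer under the profiler than this number suggests.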

It's possible, I don’t know.

Driver 384.98 shows the same behavior. I also tried 384.81, which is bundled with the CUDA 9.0 toolkit, with the same result. Is there any way to enable extended logging to understand what causes this 4168 error?

Hi, rbundulis

Have you tried the other SDK samples?

If the others work OK, can you provide us with your sample?

Hi,

I get the same error (Profiling error 4168:999).

Card: GTX1060 on Windows 10 (driver 388.19) with 9.0/9.1, and on Linux (drivers 387.34, 384.98) with 9.0/9.1.

Profiling my application with 8.0 works.

Hi, bernhardh

Can you try the driver that ships with the toolkit?

Maybe it's a driver/toolkit mismatch issue.

I simply installed the CUDA toolkit (including the driver). Is this sufficient to have them matching?

Nice to hear I am not the only one.

I actually tried CUDA 9.0 toolkit with all the drivers (with the one bundled with the toolkit and the latest one) - both showed the same error.

Btw, if I use the API calls cudaProfilerInitialize / cudaProfilerStart / cudaProfilerStop, I get a valid output file, so it seems to be an issue with the profiling tool, not the API as such (just my guess).

Let’s summarize here.

So you are using a GTX 1050 Ti and a GTX 1060 with CUDA 9.0.176.
Any SDK sample gives this error on both Windows and Linux, right?

Win10, CUDA 9.1.85, driver 388.19: Visual Profiler works when running the samples (tried simpleGL, transpose, freeImageInterop).

Win10, CUDA 9.1.85, driver 388.19: when trying to profile my custom application (with Visual Profiler), I get “Internal profiling error 4168:999”.

EDIT:
Win10, CUDA 9.1.85, driver 388.19: when trying to profile my custom application (with nvprof from the command line, no additional options), I get “Internal profiling error 4168:999”.