Trying to debug the matrixMul example and hit a CUDBG_ERROR_INTERNAL(0xa) error

Environment: WSL2 (Windows 10)
CUDA: 12.0
NVIDIA Driver: 528.02
GPU: GeForce RTX 3080 Laptop

I’m trying to test out the Nsight VSCode extension to update my work environment. I can build and run the matrixMul example without issue. However, when I try to launch the CUDA C++ debugger, I get the following error:

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/".
[Detaching after fork from child process 2091]
[New Thread 0x7fffef75e000 (LWP 2094)]
Error: Failed to suspend device for CUDA device 0, error=CUDBG_ERROR_INTERNAL(0xa).

I have tried a clean reinstall of WSL2 and CUDA, but I cannot seem to get this to work. Any help would be massively appreciated!

Thanks for the report. I’d like to get a bit more information about your environment. You mention running VS Code. Are you running VS Code as a Windows application and using the remote extension to interact with CUDA GDB in WSL? A description of the full system setup would help us debug this. Also, have you tried running CUDA GDB natively within WSL? That will help us determine whether this is a VS Code issue or a CUDA GDB issue.
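As a rough sketch of what a native run inside the WSL shell could look like: the helper below is hypothetical (the name run_native_cuda_gdb is just for illustration) and assumes cuda-gdb is on your PATH and the sample binary has already been built. It relies only on the standard GDB options --batch and -ex, which cuda-gdb inherits from GDB.

```shell
# Hypothetical helper: run a binary under cuda-gdb without VS Code involved.
# Assumes cuda-gdb is on PATH; the binary path is passed as the first argument.
run_native_cuda_gdb() {
    if ! command -v cuda-gdb >/dev/null 2>&1; then
        echo "cuda-gdb not found on PATH" >&2
        return 127
    fi
    # --batch exits once the commands finish; each -ex is one debugger command.
    cuda-gdb --batch \
        -ex "break main" \
        -ex "run" \
        -ex "continue" \
        "$1"
}
```

Calling run_native_cuda_gdb ./matrixMul from the sample's build directory should reproduce the session without the extension. If the same error shows up there, the problem is more likely on the CUDA GDB side than in VS Code.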

Hi @steveu,

I’m running VS Code with the remote extension, and I installed CUDA 12.0 as per here. I have realised the issue is unlikely to be with the extension and is more likely with CUDA-GDB itself.

In a clean installation of Ubuntu, I install CUDA 12.0 using the following:

sudo apt-key del 7fa2af80
sudo dpkg -i cuda-keyring_1.0-1_all.deb
sudo apt-get update
sudo apt-get install cuda
export PATH=/usr/local/cuda-12.0/bin${PATH:+:${PATH}}
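For anyone following the same steps, a quick sanity check after the export is to confirm that the toolkit binaries actually resolve from the new PATH. This is only a sketch: check_cuda_tools is a made-up helper name, and the list of tools it checks (nvcc and cuda-gdb) is my own choice rather than part of the official instructions.

```shell
# Hypothetical helper: report where each CUDA tool resolves from, if anywhere.
check_cuda_tools() {
    missing=0
    for tool in nvcc cuda-gdb; do
        if path=$(command -v "$tool" 2>/dev/null); then
            echo "$tool: $path"
        else
            echo "$tool: missing"
            missing=1
        fi
    done
    # Non-zero return status means at least one tool was not found.
    return "$missing"
}
```

Running check_cuda_tools in the same shell as the export should list both tools under /usr/local/cuda-12.0/bin if the install and PATH change took effect.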

Following on from this, I obtain the cuda-samples folder using git clone. The deviceQuery and bandwidthTest samples run without issue, and the resulting tests PASS, as below.

Finally, I navigate to the matrixMul folder and build the sample with make before running the following in CUDA-GDB:

file matrixMul
break main

Which returns the following:

Then, after inputting continue, the following is returned:

[Matrix Multiply Using CUDA] - Starting...
[Detaching after fork from child process 2006]
[New Thread 0x7fffef75e000 (LWP 2009)]
Error: Failed to suspend device for CUDA device 0, error=CUDBG_ERROR_INTERNAL(0xa).

A full screenshot of the CUDA-GDB output is shown below:

The same error arises when running the debugger through Nsight.

Thanks in advance for any help.