My GPU is an RTX 5060, and I’m trying to debug the following demo program on Ubuntu 24.04 under WSL:
#include <cstdio>
#include <cuda_runtime.h>

__global__ void testKernel(int *data) {
    int idx = threadIdx.x;
    data[idx] += 1;
}

int main() {
    cudaSetDevice(0);
    const int N = 4;
    int h_data[N] = {1, 2, 3, 4};
    int *d_data = nullptr;
    cudaMalloc(&d_data, N * sizeof(int));
    cudaMemcpy(d_data, h_data, N * sizeof(int), cudaMemcpyHostToDevice);
    // Launch the kernel
    testKernel<<<1, N>>>(d_data);
    cudaDeviceSynchronize();
    // Copy the results back to the host
    cudaMemcpy(h_data, d_data, N * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_data);
    printf("Results: ");
    for (int i = 0; i < N; i++)
        printf("%d ", h_data[i]);
    printf("\n");
    return 0;
}
When I start cuda-gdb, set a breakpoint on testKernel, and run the program, cuda-gdb aborts with the internal error below:
(cuda-gdb) break testKernel
Breakpoint 1 at 0x8ec4: file /home/sameta/my-flash-attention-minimal/test.cu, line 5.
(cuda-gdb) run
Starting program: /home/sameta/my-flash-attention-minimal/cuda_test
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7ffff5bff000 (LWP 1773)]
[New Thread 0x7ffff49e0000 (LWP 1775)]
[Detaching after fork from child process 1776]
[New Thread 0x7fffefbff000 (LWP 1785)]
[Thread 0x7fffefbff000 (LWP 1785) exited]
[New Thread 0x7fffefbff000 (LWP 1786)]
[New Thread 0x7fffee87e000 (LWP 1787)]
[Detaching after vfork from child process 1793]
cuda-gdb/14/gdb/cuda/cuda-state.c:274: internal-error: create_module: Assertion `context' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
----- Backtrace -----
0x54234b ???
0x9932e4 ???
0x993508 ???
0xb3cac1 ???
0x637ee0 ???
0x603ec4 ???
0x6044be ???
0x4afe0f ???
0x7be492 ???
0x94b0bd ???
0x7736b2 ???
0x786a34 ???
0xb3d77c ???
0xb3d895 ???
0x7d1e56 ???
0x7d38a4 ???
0x44ae64 ???
0x7f8322bb91c9 __libc_start_call_main
../sysdeps/nptl/libc_start_call_main.h:58
0x7f8322bb928a __libc_start_main_impl
../csu/libc-start.c:360
0x45e7fd ???
0xffffffffffffffff ???
---------------------
cuda-gdb/14/gdb/cuda/cuda-state.c:274: internal-error: create_module: Assertion `context' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
nvidia-smi output:
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 575.57.04              Driver Version: 576.52         CUDA Version: 12.9     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 5060        On  |   00000000:01:00.0  On |                  N/A |
|  0%   36C    P8             12W /  145W |     773MiB /   8151MiB |      3%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+
I build the executable with “nvcc -g -G test.cu -o cuda_test”.
What I have tried so far:
- Different toolkit versions. I originally had toolkit 13.0 with a 580+ driver, and the result was the same. I also tried toolkits 12.8 and 12.5 with the 580+ driver.
- The versions I am running now, which I chose based on this post: Cuda-gdb/13/gdb/cuda/cuda-state.c:250: internal-error: create_module: Assertion `context’ failed - CUDA Developer Tools / CUDA-GDB - NVIDIA Developer Forums
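
One further check that might help narrow this down is to run a variant of the demo that tests every CUDA runtime return code outside the debugger. The demo above ignores return codes, so a failure in context creation or in the kernel launch would pass silently. The following is only a minimal sketch under assumptions: `CUDA_CHECK` and `incrementOnDevice` are hypothetical names introduced here for illustration, not part of the CUDA runtime or of the original test.cu.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// CUDA_CHECK is a hypothetical helper macro (not part of the CUDA runtime).
// On failure it prints the failing call and its error string to stderr,
// then returns the error code from the enclosing function.
#define CUDA_CHECK(call)                                              \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "%s:%d: %s failed: %s\n", __FILE__,       \
                    __LINE__, #call, cudaGetErrorString(err_));       \
            return err_;                                              \
        }                                                             \
    } while (0)

__global__ void testKernel(int *data) {
    int idx = threadIdx.x;
    data[idx] += 1;
}

// Hypothetical wrapper: increments n ints on the device using one block,
// so n must not exceed the per-block thread limit.
// Early returns skip cudaFree; acceptable for a short diagnostic sketch.
cudaError_t incrementOnDevice(int *h_data, int n) {
    int *d_data = nullptr;
    CUDA_CHECK(cudaSetDevice(0));
    CUDA_CHECK(cudaMalloc(&d_data, n * sizeof(int)));
    CUDA_CHECK(cudaMemcpy(d_data, h_data, n * sizeof(int),
                          cudaMemcpyHostToDevice));
    testKernel<<<1, n>>>(d_data);
    CUDA_CHECK(cudaGetLastError());       // reports launch errors
    CUDA_CHECK(cudaDeviceSynchronize());  // reports execution errors
    CUDA_CHECK(cudaMemcpy(h_data, d_data, n * sizeof(int),
                          cudaMemcpyDeviceToHost));
    CUDA_CHECK(cudaFree(d_data));
    return cudaSuccess;
}

int main() {
    int h_data[4] = {1, 2, 3, 4};
    if (incrementOnDevice(h_data, 4) != cudaSuccess)
        return 1;
    printf("Results: ");
    for (int i = 0; i < 4; i++)
        printf("%d ", h_data[i]);
    printf("\n");
    return 0;
}
```

If this is built with the same “nvcc -g -G” flags and run without cuda-gdb, a non-zero exit status with a message on stderr would suggest a runtime or driver problem rather than one specific to the debugger.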
