Does cuda-gdb not work on Hopper? (tested on CUTLASS example 48)

(base) zyhuang@sdzx-h100-1:~/cutlass/build/examples/48_hopper_warp_specialized_gemm$ cuda-gdb ./48_hopper_warp_specialized_gemm
NVIDIA (R) cuda-gdb 12.4
Portions Copyright (C) 2007-2023 NVIDIA Corporation
Based on GNU gdb 13.1
Copyright (C) 2023 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This CUDA-GDB was configured as "x86_64-pc-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://forums.developer.nvidia.com/c/developer-tools/cuda-developer-tools/cuda-gdb>.
Find the CUDA-GDB manual and other documentation resources online at:
    <https://docs.nvidia.com/cuda/cuda-gdb/index.html>.
--Type <RET> for more, q to quit, c to continue without paging--

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./48_hopper_warp_specialized_gemm...
(cuda-gdb) run
Starting program: /home/zyhuang/cutlass/build/examples/48_hopper_warp_specialized_gemm/48_hopper_warp_specialized_gemm 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7ffff5baa000 (LWP 774620)]
[New Thread 0x7ffff48cc000 (LWP 774621)]
[Detaching after fork from child process 774622]
Enter sm90_mma_tma_gmma_ss_warpspecialized
[New Thread 0x7ffde0f9c000 (LWP 774954)]
warning: Cuda API error detected: cudaLaunchKernelExC returned (0x1)

[ ERROR: CUDA Runtime ] /home/zyhuang/cutlass/include/cutlass/cluster_launch.hpp:176: invalid argument
warning: Cuda API error detected: cudaGetLastError returned (0x1)

Got cutlass error: Error Internal at: 439
[Thread 0x7ffff48cc000 (LWP 774621) exited]
[Thread 0x7ffde0f9c000 (LWP 774954) exited]
[Thread 0x7ffff5baa000 (LWP 774620) exited]
[Inferior 1 (process 774616) exited with code 01]

Hi @202476410arsmart,
Does your application work without cuda-gdb? The log you posted suggests that there might be an issue with the cudaLaunchKernelExC call itself.


Good idea, let me try that and get back to you tomorrow.

Well, the code is correct: it is taken unchanged from CUTLASS, and it runs fine without cuda-gdb when built without -g -G. But compiling with -g -G is very slow, and it prints many alarming warnings like:

setmaxnreg would be eliminated… wmma something will be serialized…

I think this is an nvcc problem?


Thank you for the reply! We are investigating the issue.


Hi, @202476410arsmart

I can reproduce the error. But note that the debug build of the sample also fails when run directly, without cuda-gdb.

local-veraj@ipp2-0051:~/cutlass/examples/examples/48_hopper_warp_specialized_gemm$ /usr/local/cuda-12.6/bin/cuda-gdb ./48_hopper_warp_specialized_gemm
NVIDIA (R) cuda-gdb 12.6
Portions Copyright (C) 2007-2024 NVIDIA Corporation
Based on GNU gdb 13.2
Copyright (C) 2023 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This CUDA-GDB was configured as "x86_64-pc-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://forums.developer.nvidia.com/c/developer-tools/cuda-developer-tools/cuda-gdb>.
Find the CUDA-GDB manual and other documentation resources online at:
    <https://docs.nvidia.com/cuda/cuda-gdb/index.html>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./48_hopper_warp_specialized_gemm...
(cuda-gdb) r
Starting program: /localhome/local-veraj/cutlass/examples/examples/48_hopper_warp_specialized_gemm/48_hopper_warp_specialized_gemm
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7ffff43ff000 (LWP 4065)]
[New Thread 0x7ffff2fff000 (LWP 4066)]
[Detaching after fork from child process 4067]
[New Thread 0x7ffff1383000 (LWP 4077)]
warning: Cuda API error detected: cudaLaunchKernelExC returned (0x2)

warning: Cuda API error detected: cudaGetLastError returned (0x2)

Got cutlass error: Error Internal at: 415
[Thread 0x7ffff2fff000 (LWP 4066) exited]
[Thread 0x7ffff1383000 (LWP 4077) exited]
[Thread 0x7ffff43ff000 (LWP 4065) exited]
[Inferior 1 (process 4061) exited with code 01]
(cuda-gdb) exit

local-veraj@ipp2-0051:~/cutlass/examples/examples/48_hopper_warp_specialized_gemm$ ./48_hopper_warp_specialized_gemm
Got cutlass error: Error Internal at: 415

So this is not a cuda-gdb issue.
It is related to CUTLASS. Could you please check with the CUTLASS team directly on GitHub?
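
One detail worth noting when comparing the two logs: the original report shows cudaLaunchKernelExC returning 0x1, while the reproduction shows 0x2. The sketch below decodes these values, assuming the standard cudaError_t numbering from the CUDA runtime headers (driver_types.h), where 1 is cudaErrorInvalidValue ("invalid argument", matching the message in the first log) and 2 is cudaErrorMemoryAllocation ("out of memory"). The helper name describe_cuda_error is hypothetical and used only for illustration; it is not part of CUDA or CUTLASS. In a real CUDA program, the runtime's own cudaGetErrorName and cudaGetErrorString would be the usual way to do this.

```cpp
#include <string>

// Hypothetical helper (not a CUDA or CUTLASS API): maps the raw numeric
// codes that cuda-gdb prints in "returned (0x..)" warnings to the
// corresponding cudaError_t enumerator names. Only the codes seen in this
// thread are covered; everything else falls through to "unknown".
std::string describe_cuda_error(int code) {
    switch (code) {
        case 0:
            return "cudaSuccess";
        case 1:
            // Reported as "invalid argument" in the first log.
            return "cudaErrorInvalidValue";
        case 2:
            // Reported by the runtime as "out of memory".
            return "cudaErrorMemoryAllocation";
        default:
            return "unknown";
    }
}
```

For example, describe_cuda_error(0x1) gives "cudaErrorInvalidValue" and describe_cuda_error(0x2) gives "cudaErrorMemoryAllocation". The two runs therefore do not necessarily fail for the same reason, which may be worth mentioning when reporting the problem to the CUTLASS team.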


Thanks! OK, I will do that right away.

The -g issue is fixed, but the -G issue is not fixed yet. I am still tracking their progress!
