(base) zyhuang@sdzx-h100-1:~/cutlass/build/examples/48_hopper_warp_specialized_gemm$ cuda-gdb ./48_hopper_warp_specialized_gemm
NVIDIA (R) cuda-gdb 12.4
Portions Copyright (C) 2007-2023 NVIDIA Corporation
Based on GNU gdb 13.1
Copyright (C) 2023 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This CUDA-GDB was configured as "x86_64-pc-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://forums.developer.nvidia.com/c/developer-tools/cuda-developer-tools/cuda-gdb>.
Find the CUDA-GDB manual and other documentation resources online at:
<https://docs.nvidia.com/cuda/cuda-gdb/index.html>.
--Type <RET> for more, q to quit, c to continue without paging--
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./48_hopper_warp_specialized_gemm...
(cuda-gdb) run
Starting program: /home/zyhuang/cutlass/build/examples/48_hopper_warp_specialized_gemm/48_hopper_warp_specialized_gemm
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7ffff5baa000 (LWP 774620)]
[New Thread 0x7ffff48cc000 (LWP 774621)]
[Detaching after fork from child process 774622]
Enter sm90_mma_tma_gmma_ss_warpspecialized
[New Thread 0x7ffde0f9c000 (LWP 774954)]
warning: Cuda API error detected: cudaLaunchKernelExC returned (0x1)
[ ERROR: CUDA Runtime ] /home/zyhuang/cutlass/include/cutlass/cluster_launch.hpp:176: invalid argument
warning: Cuda API error detected: cudaGetLastError returned (0x1)
Got cutlass error: Error Internal at: 439
[Thread 0x7ffff48cc000 (LWP 774621) exited]
[Thread 0x7ffde0f9c000 (LWP 774954) exited]
[Thread 0x7ffff5baa000 (LWP 774620) exited]
[Inferior 1 (process 774616) exited with code 01]
Hi @202476410arsmart,
Does your application work without cuda-gdb? The log you posted suggests that there might be an issue with the cudaLaunchKernelExC call.
Good idea. Let me try that and report back to you tomorrow.
Well, the code is correct (it is taken exactly from cutlass and runs fine without cuda-gdb and without -g -G). But compiling with -g -G is very slow and prints many scary-looking messages like:
setmaxnreg would be eliminated… wmma sth will be serialized…
I think this is nvcc's problem?
Thank you for the reply! We are investigating the issue.
I can reproduce the error. But note that the debug build of the sample also fails when run directly, without cuda-gdb.
local-veraj@ipp2-0051:~/cutlass/examples/examples/48_hopper_warp_specialized_gemm$ /usr/local/cuda-12.6/bin/cuda-gdb ./48_hopper_warp_specialized_gemm
NVIDIA (R) cuda-gdb 12.6
Portions Copyright (C) 2007-2024 NVIDIA Corporation
Based on GNU gdb 13.2
Copyright (C) 2023 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This CUDA-GDB was configured as "x86_64-pc-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://forums.developer.nvidia.com/c/developer-tools/cuda-developer-tools/cuda-gdb>.
Find the CUDA-GDB manual and other documentation resources online at:
<https://docs.nvidia.com/cuda/cuda-gdb/index.html>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./48_hopper_warp_specialized_gemm...
(cuda-gdb) r
Starting program: /localhome/local-veraj/cutlass/examples/examples/48_hopper_warp_specialized_gemm/48_hopper_warp_specialized_gemm
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7ffff43ff000 (LWP 4065)]
[New Thread 0x7ffff2fff000 (LWP 4066)]
[Detaching after fork from child process 4067]
[New Thread 0x7ffff1383000 (LWP 4077)]
warning: Cuda API error detected: cudaLaunchKernelExC returned (0x2)
warning: Cuda API error detected: cudaGetLastError returned (0x2)
Got cutlass error: Error Internal at: 415
[Thread 0x7ffff2fff000 (LWP 4066) exited]
[Thread 0x7ffff1383000 (LWP 4077) exited]
[Thread 0x7ffff43ff000 (LWP 4065) exited]
[Inferior 1 (process 4061) exited with code 01]
(cuda-gdb) exit
local-veraj@ipp2-0051:~/cutlass/examples/examples/48_hopper_warp_specialized_gemm$ ./48_hopper_warp_specialized_gemm
Got cutlass error: Error Internal at: 415
So this is not a cuda-gdb issue; it is related to cutlass. Could you please check with the cutlass team directly on GitHub?
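A side note for anyone comparing the two logs in this thread: cudaLaunchKernelExC returned 0x1 in the first run and 0x2 in the second. The small C++ sketch below maps those numeric codes to the names they have in the CUDA Runtime API's cudaError enum, as far as I know. The helper describe_cuda_code is hypothetical (it is not part of CUDA or cutlass), and it only lists the codes seen here; in a real program you would call cudaGetErrorString instead.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper for illustration only; not part of CUDA or cutlass.
// The names and values are taken from the CUDA Runtime API's cudaError enum,
// and only the codes that appear in this thread are listed.
std::string describe_cuda_code(int code) {
    switch (code) {
        case 0:
            return "cudaSuccess";
        case 1:
            // First log: "returned (0x1)", reported as "invalid argument".
            return "cudaErrorInvalidValue (invalid argument)";
        case 2:
            // Second log: "returned (0x2)".
            return "cudaErrorMemoryAllocation (out of memory)";
        default:
            return "code not listed here; use cudaGetErrorString in a real program";
    }
}
```

So the two runs report different launch errors, which may be worth mentioning when filing the issue with the cutlass team.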
Thanks! OK! I will do that at once.
The -g case is fixed, but the -G case is not fixed yet. I am still tracking their progress!