cuda-gdb hangs indefinitely after printing "[Switching focus to CUDA kernel 0, grid 1..."

I am trying to debug a Rust project with cuda-gdb; the project calls CUDA functions through FFI. I am not sure whether cuda-gdb supports such programs. However, cuda-gdb reports no errors and breakpoints can be set: breakpoints on the CPU side are hit as expected, but a breakpoint on the GPU side makes the program hang indefinitely.

Rust project: sppark
OS: Ubuntu 20.04.6 LTS
CUDA toolkit:

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Fri_Sep__8_19:17:24_PDT_2023
Cuda compilation tools, release 12.3, V12.3.52
Build cuda_12.3.r12.3/compiler.33281558_0

Driver version: 545.23.06
Graphics card: RTX 3080
Rustc version: rustc 1.75.0-nightly (97c81e1b5 2023-10-07)

Reproduction steps for the issue:

  1. Modify the poc/msm-cuda/build.rs file.
     Replace the code on lines 102 and 103:

     nvcc.flag("-arch=sm_80");
     nvcc.flag("-gencode").flag("arch=compute_70,code=sm_70");

     with the following code:

     nvcc.flag("-arch=compute_86");
     nvcc.flag("-code=sm_86");
     nvcc.flag("-G");
     nvcc.flag("-g");
  2. cd poc/msm-cuda
  3. cargo test --features bls12_377 --no-run
     This creates an executable file with a name starting with ‘msm-’ in the ‘target/debug/deps’ directory.
  4. cuda-gdb target/debug/deps/msm-XXXXXX
  5. set args --test msm_correctness
  6. b breakdown
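The command-line part of these steps can be sketched as a small shell helper. This is only a sketch under assumptions: the function names find_msm_test_binary and debug_msm are hypothetical, the helper picks the newest executable named msm-* (cargo also writes non-executable msm-*.d files into the same directory), and the final `run` command is an addition that the steps above do not list. It relies on the standard gdb options -ex and --args, which cuda-gdb should accept because it is built on gdb.

```shell
# Hypothetical helper: print the newest executable named msm-* in a directory.
# Defaults to target/debug/deps, where `cargo test --no-run` places test binaries.
find_msm_test_binary() {
    dir=${1:-target/debug/deps}
    newest=""
    for f in "$dir"/msm-*; do
        # Skip dependency files (msm-*.d) and anything else that is not executable.
        [ -f "$f" ] && [ -x "$f" ] || continue
        if [ -z "$newest" ] || [ "$f" -nt "$newest" ]; then
            newest=$f
        fi
    done
    # Succeeds only when a binary was found.
    [ -n "$newest" ] && printf '%s\n' "$newest"
}

# Hypothetical helper: start cuda-gdb on that binary with the breakpoint
# and arguments from the steps above. The `run` command is an assumption.
debug_msm() {
    bin=$(find_msm_test_binary "$1") || {
        echo "no msm-* test binary found" >&2
        return 1
    }
    cuda-gdb -ex 'break breakdown' -ex 'run' --args "$bin" --test msm_correctness
}
```

After building with `cargo test --features bls12_377 --no-run` inside poc/msm-cuda, calling `debug_msm` with no argument would start the session against target/debug/deps.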

Output of cuda-gdb:


Hi @SparkHu
Thank you for the report! To help us identify the issue, could you re-run the debugging scenario with additional logging enabled?

  • Set the NVLOG_CONFIG_FILE environment variable to point to the nvlog.config file (attached), e.g.: NVLOG_CONFIG_FILE=${HOME}/nvlog.config
    nvlog.config (539 Bytes)

  • Run the debugging session.

  • This should create the /tmp/debugger.log file. Could you share it with us?
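As a rough illustration of the logging steps above, the environment variable can be set from the shell before starting the debugger. The function name use_nvlog_config is hypothetical, and the default path simply mirrors the ${HOME}/nvlog.config example; the sketch only checks that the file exists and exports the variable, it does not inspect the config contents.

```shell
# Hypothetical helper: export NVLOG_CONFIG_FILE after checking the file exists.
# The default path mirrors the ${HOME}/nvlog.config example above.
use_nvlog_config() {
    config=${1:-"$HOME/nvlog.config"}
    if [ ! -f "$config" ]; then
        echo "nvlog config not found: $config" >&2
        return 1
    fi
    NVLOG_CONFIG_FILE=$config
    export NVLOG_CONFIG_FILE
}

# Possible usage (not executed here):
#   use_nvlog_config "$HOME/nvlog.config" && cuda-gdb target/debug/deps/msm-XXXXXX
#   ls -l /tmp/debugger.log
```

Exporting the variable in the same shell that launches cuda-gdb ensures the debugger process inherits it.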

Thank you for your prompt response. After cuda-gdb had hung for at least 5 minutes, I forcefully terminated it.
debugger.log (653.4 KB)

Thank you!
We will need some time to analyze. I will reply back shortly.

Hello!

We have managed to reproduce this issue locally and the work to address the issue has been scheduled. I will update this post as soon as the fixed cuda-gdb version is available.

Hi @SparkHu,
The reported issue should be resolved in the latest CUDA 12.4 release: CUDA Toolkit 12.4 Downloads | NVIDIA Developer
