Unknown storage specifier in cuda-gdb

When I attempt to print all the element values of an array in cuda-gdb (the array length is specified as 12 during compile-time, and each element has the type uint32_t), cuda-gdb shows an error message: “Unknown storage specifier (read) 0x10000”.
The ‘p &array’ prompts as follows:
(@register _ZN9LargeUintIjLj12EE1EE (*)[12]) 0x7fffe7fffb58
Whether in the cuda-gdb manual or in the prompts from the ‘help’ command, I haven’t seen any option to add additional parameters for setting a storage specifier. How should I address this issue?

There is a sub-forum dedicated to cuda-gdb and you may receive better / faster answers there.

1 Like

Hi @SparkHu
Could you please share a few more details about the issue:

  • Output of nvidia-smi command
  • Output of cuda-gdb --version command
  • CUDA kernel source code

@AKravets Thanks for your response.
Output of nvidia-smi:

Sun Jul 23 16:02:54 2023       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 530.30.02              Driver Version: 530.30.02    CUDA Version: 12.1     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                  Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf            Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 3080         On | 00000000:01:00.0 Off |                  N/A |
| 30%   40C    P8               36W / 320W|      1MiB / 10240MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   1  NVIDIA GeForce RTX 3080         On | 00000000:25:00.0 Off |                  N/A |
| 30%   31C    P8               23W / 320W|      1MiB / 10240MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   2  NVIDIA GeForce RTX 3080         On | 00000000:41:00.0 Off |                  N/A |
| 30%   30C    P8               22W / 320W|      1MiB / 10240MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   3  NVIDIA GeForce RTX 3080         On | 00000000:61:00.0 Off |                  N/A |
| 30%   28C    P8               26W / 320W|      1MiB / 10240MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   4  NVIDIA GeForce RTX 3080         On | 00000000:81:00.0 Off |                  N/A |
| 30%   29C    P8               25W / 320W|      1MiB / 10240MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   5  NVIDIA GeForce RTX 3080         On | 00000000:A1:00.0 Off |                  N/A |
| 30%   29C    P8               20W / 320W|      1MiB / 10240MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   6  NVIDIA GeForce RTX 3080         On | 00000000:C1:00.0 Off |                  N/A |
| 30%   28C    P8               29W / 320W|      1MiB / 10240MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   7  NVIDIA GeForce RTX 3080         On | 00000000:E1:00.0 Off |                  N/A |
| 30%   27C    P8               17W / 320W|      1MiB / 10240MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                                         
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+

Output of cuda-gdb --version:

NVIDIA (R) CUDA Debugger

CUDA Toolkit 12.1 release

Portions Copyright (C) 2007-2023 NVIDIA Corporation

**GNU gdb (GDB) 12.1**

Copyright (C) 2022 Free Software Foundation, Inc.

License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>

This is free software: you are free to change and redistribute it.

There is NO WARRANTY, to the extent permitted by law.

As there are several source code files involved, I have packaged the source code
src.zip (4.1 KB)
and uploaded it.The source code implements the Montgomery modular reduction algorithms on a finite field. Different implementation approaches are used on the host and device sides. The entry kernel function is in the ‘kernel’ function in ‘main.cu’.
Besides not being able to view variable values in cuda-gdb, the results of the debug and release versions are also inconsistent. The release version produces correct results, but the debug version outputs incorrect results.
The command to compile the release version is:
nvcc -o test main.cu
The output of release version:

e98b9564, a92043ac, b25e5075, 70d69a83, 2f4a1a59, 1f8ade1c, 8c1d97e5, 343b588d, 108ce2db, d4df2d9b, f276f5d6, 1795837
e98b9564, a92043ac, b25e5075, 70d69a83, 2f4a1a59, 1f8ade1c, 8c1d97e5, 343b588d, 108ce2db, d4df2d9b, f276f5d6, 1795837

The command to compile the debug version is:
nvcc -o test -G main.cu
The output of debug version:

e98b9564, a92043ac, b25e5075, 70d69a83, 2f4a1a59, 1f8ade1c, 8c1d97e5, 343b588d, 108ce2db, d4df2d9b, f276f5d6, 1795837
54331d22, 6410f330, 9badc234, 28c1d693, 8acd7b6b, c1f71e54, c66c6c90, d3b5a2ef, e87388f6, 9854398c, 34839e1, c812a3

The two lines of numbers in the output should be exactly the same.
Could you help me take a look at these two issues? Thanks a lot.

In cuda-gdb, when I enter the function d_mont_reduce and try to print the value of data , an error of “Unknown storage specifier” occurred. The output is as follows:

(cuda-gdb) n
65	        for (int i = 0; i < N; i += 2) {
(cuda-gdb) l
60	    }
61	
62	    static __device__ void d_mont_reduce(LU &data, const uint32_t inv, const LU &modulo) {
63	        constexpr uint32_t N = GetLargeUintParam<LU>::element_num;
64	        LU odd;
65	        for (int i = 0; i < N; i += 2) {
66	            mul_by_1(data, odd, inv, modulo, i==0);
67	            mul_by_1(odd, data, inv, modulo, false);
68	        }
69	        asm volatile("add.cc.u32 %0, %0, %1;" : "+r"(data[0]) : "r"(odd[1]));
(cuda-gdb) p/x data
$1 = (@register LU & @register) <error reading variable: Unknown storage specifier (read) 0x10000>
(cuda-gdb) p/x data.value
Unknown storage specifier (read) 0x10000

Hi @SparkHu
Thank you very much for providing the details. We were able to reproduce the issue, so please expect it to be fixed in one of the upcoming CUDA releases.

I will update this topic when the CUDA version with this issue fixed is released.

Hi @AKravets
Which issue are you referring to? or both?

@SparkHu
We managed to reproduce the Unknown storage specifier issue.

For the inconsistency between the release and debug builds of the same CUDA kernel you might get a faster reply if you ask in the sub-forum dedicated to CUDA programming: CUDA Programming and Performance - NVIDIA Developer Forums