Hi guys,
I need your help on virtual machine setup with A100 and vCS. I have MIG enabled and have a VM configured like this:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.91.03 Driver Version: 460.91.03 CUDA Version: 11.2 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 GRID A100-3-20C On | 00000000:02:02.0 Off | On |
| N/A N/A P0 N/A / N/A | 1808MiB / 20475MiB | N/A Default |
| | | Enabled |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| MIG devices: |
+------------------+----------------------+-----------+-----------------------+
| GPU GI CI MIG | Memory-Usage | Vol| Shared |
| ID ID Dev | BAR1-Usage | SM Unc| CE ENC DEC OFA JPG|
| | | ECC| |
|==================+======================+===========+=======================|
| 0 0 0 0 | 1808MiB / 20475MiB | 42 0 | 3 0 2 0 0 |
| | 0MiB / 4096MiB | | |
+------------------+----------------------+-----------+-----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
I am able to run cuda example without any problem.
However when I try to use cuda-gdb. It throws error msg like:
NVIDIA (R) CUDA Debugger
11.2 release
Portions Copyright (C) 2007-2020 NVIDIA Corporation
GNU gdb (GDB) 8.3.1
Copyright (C) 2019 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-pc-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./simplePrintf...
(cuda-gdb) r
Starting program: /home/qizhuang/NVIDIA_CUDA-11.2_Samples/0_Simple/simplePrintf/simplePrintf
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[Detaching after fork from child process 4451]
fatal: One or more CUDA devices cannot be used for debugging. Please consult the list of supported CUDA devices for more details. (error code = CUDBG_ERROR_INVALID_DEVICE(0xb)
(cuda-gdb)
I didn’t see any documentation on this. Could somebody help me to solve this problem? Thanks!