Error message "No device could be created" when running Isaac Sim on a multi-GPU server

Hello all,

I’m hoping someone can help me with an error I’m encountering when trying to run Isaac Sim on a multi-GPU server. On startup I get the following output:

[Info] [carb] Logging to file: /home/xxx/.nvidia-omniverse/logs/Kit/Isaac-Sim/2022.2/kit_20230311_102322.log
2023-03-11 10:23:22 [29ms] [Warning] [omni.ext.plugin] [ext: omni.drivesim.sensors.nv.lidar] Extensions config 'extension.toml' doesn't exist '/home/xxx/.local/share/ov/pkg/isaac_sim-2022.2.0/exts/omni.drivesim.sensors.nv.lidar' or '/home/xxx/.local/share/ov/pkg/isaac_sim-2022.2.0/exts/omni.drivesim.sensors.nv.lidar/config'
2023-03-11 10:23:22 [29ms] [Warning] [omni.ext.plugin] [ext: omni.drivesim.sensors.nv.radar] Extensions config 'extension.toml' doesn't exist '/home/xxx/.local/share/ov/pkg/isaac_sim-2022.2.0/exts/omni.drivesim.sensors.nv.radar' or '/home/xxx/.local/share/ov/pkg/isaac_sim-2022.2.0/exts/omni.drivesim.sensors.nv.radar/config'
[0.309s] [ext: omni.stats-0.0.0] startup
[0.360s] [ext: omni.rtx.shadercache-1.0.0] startup
[0.378s] [ext: omni.assets.plugins-0.0.0] startup
[0.380s] [ext: omni.gpu_foundation-0.0.0] startup
2023-03-11 10:23:22 [354ms] [Warning] [carb] FrameworkImpl::setDefaultPlugin(client: omni.gpu_foundation_factory.plugin, desc : [carb::graphics::Graphics v2.11], plugin : carb.graphics-vulkan.plugin) failed. Plugin selection is locked, because the interface was previously acquired by: 
[0.389s] [ext: carb.windowing.plugins-1.0.0] startup
[0.400s] [ext: omni.kit.renderer.init-0.0.0] startup

|---------------------------------------------------------------------------------------------|
| Driver Version: 0             | Graphics API: Vulkan
|=============================================================================================|
| GPU | Name                             | Active | LDA | GPU Memory | Vendor-ID | LUID       |
|     |                                  |        |     |            | Device-ID | UUID       |
|=============================================================================================|
| OS: Linux 5abaebe11826, Version: 5.4.0-132-generic
| XServer Vendor: The X.Org Foundation, XServer Version: 12008000 (1.20.8.0)
| Processor: AMD EPYC 7542 32-Core Processor                 | Cores: Unknown | Logical: 128
|---------------------------------------------------------------------------------------------|
| Total Memory (MB): 515833 | Free Memory: 265317
| Total Page/Swap (MB): 65535 | Free Page/Swap: 65535
|---------------------------------------------------------------------------------------------|
2023-03-11 10:23:22 [438ms] [Error] [gpu.foundation.plugin] No device could be created. Some known system issues:
- The driver is not installed properly and requires a clean re-install.
- Your GPUs do not support RayTracing: DXR or Vulkan ray_tracing, or hardware is excluded due to performance.
- The driver cannot enumerate any GPU: driver, display or a docker issue. For Vulkan, test it with Vulkaninfo tool from Vulkan SDK, instead of nvidia-smi.
- For Ubuntu, it requires server-xorg-core 1.20.7+ and a display to work without --no-window.
- For Linux dockers, the setup is not complete. Install the latest driver, xServer and NVIDIA container runtime.


|---------------------------------------------------------------------------------------------|
| Driver Version: 0             | Graphics API: Vulkan
|=============================================================================================|
| GPU | Name                             | Active | LDA | GPU Memory | Vendor-ID | LUID       |
|     |                                  |        |     |            | Device-ID | UUID       |
|=============================================================================================|
| OS: Linux 5abaebe11826, Version: 5.4.0-132-generic
| XServer Vendor: The X.Org Foundation, XServer Version: 12008000 (1.20.8.0)
| Processor: AMD EPYC 7542 32-Core Processor                 | Cores: Unknown | Logical: 128
|---------------------------------------------------------------------------------------------|
| Total Memory (MB): 515833 | Free Memory: 265283
| Total Page/Swap (MB): 65535 | Free Page/Swap: 65535
|---------------------------------------------------------------------------------------------|
2023-03-11 10:23:22 [458ms] [Error] [gpu.foundation.plugin] No device could be created. Some known system issues:
- The driver is not installed properly and requires a clean re-install.
- Your GPUs do not support RayTracing: DXR or Vulkan ray_tracing, or hardware is excluded due to performance.
- The driver cannot enumerate any GPU: driver, display or a docker issue. For Vulkan, test it with Vulkaninfo tool from Vulkan SDK, instead of nvidia-smi.
- For Ubuntu, it requires server-xorg-core 1.20.7+ and a display to work without --no-window.
- For Linux dockers, the setup is not complete. Install the latest driver, xServer and NVIDIA container runtime.

2023-03-11 10:23:22 [458ms] [Error] [omni.gpu_foundation_factory.plugin] Failed to create GPU foundation devices for compatibilityMode!

However, I believe I have installed the necessary drivers correctly, and the output from nvidia-smi indicates that all eight of my NVIDIA A40 GPUs are working properly. Here’s the output from nvidia-smi:

(base) xxx@5abae:~$ nvidia-smi
Sat Mar 11 10:47:30 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 515.65.01    Driver Version: 515.65.01    CUDA Version: 11.7     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA A40          On   | 00000000:01:00.0 Off |                    0 |
|  0%   44C    P0   121W / 300W |   2515MiB / 46068MiB |     30%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  NVIDIA A40          On   | 00000000:25:00.0 Off |                    0 |
|  0%   54C    P0    85W / 300W |   3198MiB / 46068MiB |      1%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   2  NVIDIA A40          On   | 00000000:41:00.0 Off |                    0 |
|  0%   45C    P0   108W / 300W |   2375MiB / 46068MiB |     19%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   3  NVIDIA A40          On   | 00000000:61:00.0 Off |                    0 |
|  0%   44C    P0    84W / 300W |   3846MiB / 46068MiB |      4%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   4  NVIDIA A40          On   | 00000000:81:00.0 Off |                    0 |
|  0%   58C    P0   193W / 300W |   4080MiB / 46068MiB |     48%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   5  NVIDIA A40          On   | 00000000:A1:00.0 Off |                    0 |
|  0%   56C    P0   156W / 300W |   4080MiB / 46068MiB |     55%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   6  NVIDIA A40          On   | 00000000:C1:00.0 Off |                    0 |
|  0%   59C    P0   153W / 300W |   4080MiB / 46068MiB |     53%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   7  NVIDIA A40          On   | 00000000:E1:00.0 Off |                    0 |
|  0%   60C    P0   180W / 300W |   4080MiB / 46068MiB |     54%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A    709398      C   python                           2513MiB |
|    1   N/A  N/A    711881      C   python                           2779MiB |
|    1   N/A  N/A    827543      G   python                            416MiB |
|    2   N/A  N/A    728400      C   python                           2373MiB |
|    3   N/A  N/A    711881      G   python                            118MiB |
|    3   N/A  N/A    728400      G   python                             63MiB |
|    3   N/A  N/A    827543      C   python                           3661MiB |
|    4   N/A  N/A    827539      C   python                           3661MiB |
|    4   N/A  N/A    827541      G   python                            416MiB |
|    5   N/A  N/A    827540      C   python                           3661MiB |
|    5   N/A  N/A    827542      G   python                            416MiB |
|    6   N/A  N/A    827540      G   python                            416MiB |
|    6   N/A  N/A    827541      C   python                           3661MiB |
|    7   N/A  N/A    827539      G   python                            416MiB |
|    7   N/A  N/A    827542      C   python                           3661MiB |
+-----------------------------------------------------------------------------+

I’m also having trouble connecting to the Omniverse localhost. Could that be causing this startup failure?
Could someone please help me troubleshoot this issue? I’d be happy to provide any additional information that might help.

Thank you in advance for your help!

Hi @Rey19, can you confirm which version of Isaac Sim is installed on your computer?

You can find the latest driver requirements here: 1. Isaac Sim Requirements — Omniverse Robotics documentation

Also, you can review the Linux troubleshooting document here: Linux Troubleshooting — Omniverse Robotics documentation

Hi. Are you running natively from the OV Launcher, with a display connected to one of the GPUs?

Try installing the latest 525.89.02 drivers. Uninstall the current driver and remove the folders below before reinstalling:

/etc/vulkan
/usr/share/vulkan
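
Before removing anything, it may help to see what is currently in those folders and whether Vulkan can enumerate a GPU at all. The following is only a read-only sketch, not an official tool. It assumes the driver manifests live in the usual `icd.d` subfolders of the two folders above and that `vulkaninfo` (from the Vulkan SDK or your distribution's Vulkan tools package) is on the PATH. The function names are made up for illustration, and the `deviceName` filter assumes the usual vulkaninfo text output.

```python
import glob
import os
import shutil
import subprocess

# Assumed default folders in which the Vulkan loader looks for driver manifests.
ICD_DIRS = ["/etc/vulkan/icd.d", "/usr/share/vulkan/icd.d"]


def find_icd_manifests(dirs=ICD_DIRS):
    """Return a sorted list of the *.json manifests found in the given folders."""
    found = []
    for folder in dirs:
        found.extend(glob.glob(os.path.join(folder, "*.json")))
    return sorted(found)


def vulkan_device_names(timeout=60):
    """Run vulkaninfo and return its deviceName lines, or None if it cannot run."""
    exe = shutil.which("vulkaninfo")
    if exe is None:
        return None
    try:
        result = subprocess.run(
            [exe], capture_output=True, text=True, timeout=timeout
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    return [
        line.strip()
        for line in result.stdout.splitlines()
        if "deviceName" in line
    ]


if __name__ == "__main__":
    manifests = find_icd_manifests()
    print("ICD manifests:", manifests if manifests else "none found")
    names = vulkan_device_names()
    if names is None:
        print("vulkaninfo is not available or did not finish")
    elif not names:
        print("vulkaninfo ran but reported no devices")
    else:
        print("\n".join(names))
```

If no manifest is listed, or vulkaninfo reports no devices while nvidia-smi still shows the GPUs, that would be consistent with the "driver cannot enumerate any GPU" case in the error text. Run it inside the same session (for example the VNC session) that you use to launch Isaac Sim.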

Are you able to install and run Create and Code from the Omniverse Launcher?

Thank you for reaching out. I currently access the system through vncserver/vncviewer. Create and Code fail with the same error when I try to run them. Previously, however, I was able to run Create, Code, and Isaac Sim from the Omniverse Launcher over VNC on another server.

Since I am now using a shared remote server, reinstalling the driver may not be convenient. Aside from reinstalling the driver, do you have any other suggestions for troubleshooting the issue? Please let me know. Thank you.

I installed the latest version, isaac_sim-2022.2.0.
I think the computer (Ubuntu 20.04, 8 NVIDIA A40 GPUs) satisfies all the requirements.

xxx@yyy:~$ lscpu
Architecture:                    x86_64
CPU op-mode(s):                  32-bit, 64-bit
Byte Order:                      Little Endian
Address sizes:                   43 bits physical, 48 bits virtual
CPU(s):                          128
On-line CPU(s) list:             0-127
Thread(s) per core:              2
Core(s) per socket:              32
Socket(s):                       2
NUMA node(s):                    2
Vendor ID:                       AuthenticAMD
CPU family:                      23
Model:                           49
Model name:                      AMD EPYC 7542 32-Core Processor
Stepping:                        0
Frequency boost:                 enabled
CPU MHz:                         1511.050
CPU max MHz:                     2900.0000
CPU min MHz:                     1500.0000
BogoMIPS:                        5800.00
Virtualization:                  AMD-V
L1d cache:                       2 MiB
L1i cache:                       2 MiB
L2 cache:                        32 MiB
L3 cache:                        256 MiB
NUMA node0 CPU(s):               0-31,64-95
NUMA node1 CPU(s):               32-63,96-127
Vulnerability Itlb multihit:     Not affected
Vulnerability L1tf:              Not affected
Vulnerability Mds:               Not affected
Vulnerability Meltdown:          Not affected
Vulnerability Mmio stale data:   Not affected
Vulnerability Retbleed:          Vulnerable
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl and secc
                                 omp
Vulnerability Spectre v1:        Mitigation; usercopy/swapgs barriers and __user pointer sanitiza
                                 tion
Vulnerability Spectre v2:        Mitigation; Retpolines, IBPB conditional, IBRS_FW, STIBP conditi
                                 onal, RSB filling, PBRSB-eIBRS Not affected
Vulnerability Srbds:             Not affected
Vulnerability Tsx async abort:   Not affected
Flags:                           fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pa
                                 t pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt
                                 pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid e
                                 xtd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_
                                 1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_lega
                                 cy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch os
                                 vw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perf
                                 ctr_llc mwaitx cpb cat_l3 cdp_l3 hw_pstate sme ssbd mba sev ibrs
                                  ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 cqm rdt_a rdsee
                                 d adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves
                                  cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local clzero irperf
                                  xsaveerptr wbnoinvd arat npt lbrv svm_lock nrip_save tsc_scale
                                 vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avi
                                 c v_vmsave_vmload vgif umip rdpid overflow_recov succor smca
xxx@yyy:~$ cat /proc/cpuinfo | grep name | cut -f2 -d: | uniq -c
    128  AMD EPYC 7542 32-Core Processor
xxx@yyy:~$ free -g
              total        used        free      shared  buff/cache   available
Mem:            503         201         277           4          24         294
Swap:            63          11          52
xxx@yyy:~$ nvidia-smi
Fri Mar 17 02:42:28 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 515.65.01    Driver Version: 515.65.01    CUDA Version: 11.7     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA A40          On   | 00000000:01:00.0 Off |                    0 |
|  0%   66C    P0   261W / 300W |   4439MiB / 46068MiB |     82%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  NVIDIA A40          On   | 00000000:25:00.0 Off |                    0 |
|  0%   28C    P8    29W / 300W |      0MiB / 46068MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   2  NVIDIA A40          On   | 00000000:41:00.0 Off |                    0 |
|  0%   27C    P8    29W / 300W |      0MiB / 46068MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   3  NVIDIA A40          On   | 00000000:61:00.0 Off |                    0 |
|  0%   26C    P8    29W / 300W |    503MiB / 46068MiB |      1%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   4  NVIDIA A40          On   | 00000000:81:00.0 Off |                    0 |
|  0%   64C    P0   242W / 300W |   4943MiB / 46068MiB |     92%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   5  NVIDIA A40          On   | 00000000:A1:00.0 Off |                    0 |
|  0%   61C    P0   236W / 300W |   4943MiB / 46068MiB |     88%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   6  NVIDIA A40          On   | 00000000:C1:00.0 Off |                    0 |
|  0%   63C    P0   249W / 300W |   4943MiB / 46068MiB |     83%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   7  NVIDIA A40          On   | 00000000:E1:00.0 Off |                    0 |
|  0%   64C    P0   241W / 300W |   4943MiB / 46068MiB |     82%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A   3532963      C   python                           4435MiB |
|    3   N/A  N/A   3532963      G   python                            503MiB |
|    4   N/A  N/A   3532959      C   python                           4435MiB |
|    4   N/A  N/A   3532961      G   python                            503MiB |
|    5   N/A  N/A   3532960      C   python                           4435MiB |
|    5   N/A  N/A   3532962      G   python                            503MiB |
|    6   N/A  N/A   3532960      G   python                            503MiB |
|    6   N/A  N/A   3532961      C   python                           4435MiB |
|    7   N/A  N/A   3532959      G   python                            503MiB |
|    7   N/A  N/A   3532962      C   python                           4435MiB |
+-----------------------------------------------------------------------------+

Hi. I believe the best way to troubleshoot this issue is to update the drivers.

If possible, try to update to the latest Isaac Sim 2022.2.1 release as well.
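
To confirm what the server is running before and after an update, a small check like the one below might be useful. It is only a sketch: it assumes `nvidia-smi` is on the PATH, uses the 525.89.02 version suggested earlier in this thread as the reference (check the requirements page for the current value), and the helper names are hypothetical.

```python
import shutil
import subprocess

# Reference version taken from the suggestion earlier in this thread.
RECOMMENDED = "525.89.02"


def parse_version(text):
    """Turn a dotted version such as '515.65.01' into a comparable tuple."""
    return tuple(int(part) for part in text.strip().split("."))


def is_older(installed, recommended=RECOMMENDED):
    """Return True if the installed driver is older than the recommended one."""
    return parse_version(installed) < parse_version(recommended)


def installed_driver_versions(timeout=30):
    """Ask nvidia-smi for the driver version of every GPU; [] if it cannot run."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return []
    try:
        result = subprocess.run(
            [exe, "--query-gpu=driver_version", "--format=csv,noheader"],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired):
        return []
    if result.returncode != 0:
        return []
    # One line per GPU; collapse duplicates into a sorted list.
    return sorted({ln.strip() for ln in result.stdout.splitlines() if ln.strip()})


if __name__ == "__main__":
    versions = installed_driver_versions()
    if not versions:
        print("Could not read a driver version from nvidia-smi")
    for version in versions:
        status = "older than" if is_older(version) else "at least"
        print(f"Driver {version} is {status} {RECOMMENDED}")
```

With the 515.65.01 driver shown in the nvidia-smi output above, `is_older("515.65.01")` is true, so the installed driver is older than the suggested one.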