WSL2 and Isaac Gym problem

Hi! I’m having a problem running Isaac Gym. I have an NVIDIA 2070 and Windows 11 (so there is no problem running graphical applications), but when I start an example in Python I get:

*** Warning: failed to preload CUDA lib
*** Warning: failed to preload PhysX libs
Importing module ‘gym_38’ (/home/enne/isaacgym/python/isaacgym/_bindings/linux-x86_64/gym_38.so)
Setting GYM_USD_PLUG_INFO_PATH to /home/enne/isaacgym/python/isaacgym/_bindings/linux-x86_64/usd/plugInfo.json
WARNING: Forcing CPU pipeline.
Not connected to PVD
/buildAgent/work/f3416cf82e3cf1ba/source/physx/src/gpu/PxPhysXGpuModuleLoader.cpp (147) : internal error : libcuda.so!

[Warning] [carb.gym.plugin] Failed to create a PhysX CUDA Context Manager. Falling back to CPU.
Physics Engine: PhysX
Physics Device: cpu
GPU Pipeline: disabled
No GPU devices found.
[Error] [carb.gym.plugin] Failed to create Nvf device in createNvfGraphics. Please make sure Vulkan is correctly installed.
*** Failed to create sim

If I run nvidia-smi, my graphics card is reported correctly, with Driver Version: 510.10 and CUDA Version: 11.6.

Is there an incompatibility, or am I missing some steps?

I am having the same problem with almost exactly the same error message. However, at the end I get a segfault instead:

(rlgpu) dtch1997@DESKTOP-AR4R24K:~/isaacgym/python/examples$ python joint_monkey.py
*** Warning: failed to preload CUDA lib
*** Warning: failed to preload PhysX libs
Importing module ‘gym_37’ (/home/dtch1997/isaacgym/python/isaacgym/_bindings/linux-x86_64/gym_37.so)
Setting GYM_USD_PLUG_INFO_PATH to /home/dtch1997/isaacgym/python/isaacgym/_bindings/linux-x86_64/usd/plugInfo.json
WARNING: Forcing CPU pipeline.
Not connected to PVD
/buildAgent/work/f3416cf82e3cf1ba/source/physx/src/gpu/PxPhysXGpuModuleLoader.cpp (147) : internal error : libcuda.so!
[Warning] [carb.gym.plugin] Failed to create a PhysX CUDA Context Manager. Falling back to CPU.
Physics Engine: PhysX
Physics Device: cpu
GPU Pipeline: disabled
Segmentation fault

I am getting the same error:

python3 joint_monkey.py
Importing module ‘gym_37’ (/home/kaykay/Downloads/isaacgym/python/isaacgym/_bindings/linux-x86_64/gym_37.so)
Setting GYM_USD_PLUG_INFO_PATH to /home/kaykay/Downloads/isaacgym/python/isaacgym/_bindings/linux-x86_64/usd/plugInfo.json
WARNING: Forcing CPU pipeline.
[Error] [carb.gym.plugin] Sim CUDA device 0 can’t be set, the total number of available devices is -1
Not connected to PVD
/buildAgent/work/f3416cf82e3cf1ba/source/cudamanager/src/CudaContextManager.cpp (404) : warning : cuInit failed

[Warning] [carb.gym.plugin] Failed to create a valid PhysX CUDA Context Manager. Falling back to CPU.
Physics Engine: PhysX
Physics Device: cpu
GPU Pipeline: disabled
[Error] [carb.gym.plugin] Gym cuda error: no CUDA-capable device is detected: ../../../source/plugins/carb/gym/impl/Gym/GymCuda.h: 110
[Error] [carb.gym.plugin] Failed to create primary CUDA context
[Warning] [carb.gym.plugin] Failed to create primary CUDA context on graphics device
No GPU devices found.
[Error] [carb.gym.plugin] Failed to create Nvf device in createNvfGraphics. Please make sure Vulkan is correctly installed.
*** Failed to create sim

nvidia-smi gives this:

nvidia-smi
Sun Oct 24 17:54:21 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.00       Driver Version: 510.06       CUDA Version: 11.6     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce …    On   | 00000000:01:00.0 Off |                  N/A |
| N/A   54C    P8     4W /  N/A |    220MiB /  6144MiB |     N/A      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

Hi there! We currently only support Ubuntu 18.04 or 20.04 as mentioned in our docs.

Hello @kellyg, thanks for the update. I also tried running Ubuntu in a virtual machine instead, and I still hit the issue reported here: Failed to acquire interface - #7 by kaykay

Can you please help me out with this? Let me know if I should make a new thread instead.

I was hoping this would not be a problem with WSL2, since it is basically a virtual machine with GPU passthrough and a GUI (Ubuntu 20.04 in my case). As far as I understand, there is some Vulkan problem in WSL2, and maybe that is the cause of this issue.

We have not tested Isaac Gym in a virtual machine or WSL2, so I can’t say for sure what issues may arise. Vulkan is generally required for rendering. If your use case doesn’t require rendering, you could try running one of the examples in headless mode and see if that works for you. From the error messages posted in this thread, it mostly looks like the PhysX backend was not able to find the correct CUDA binaries, or it couldn’t find any available GPU devices, so it’s possible that these are not being mapped correctly in the virtual environments.
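
To narrow down which of those two cases applies, here is a minimal sketch (not an official diagnostic) that asks the CUDA driver library directly whether it can be loaded and initialized, independent of Isaac Gym. It assumes the driver library is named libcuda.so.1 or libcuda.so; the function name probe_cuda_driver is made up for this example. cuInit and cuDeviceGetCount are CUDA driver API calls.

```python
import ctypes


def probe_cuda_driver(lib_name="libcuda.so.1"):
    """Return (status, device_count); device_count is None on any failure."""
    try:
        lib = ctypes.CDLL(lib_name)
    except OSError as exc:
        # The dynamic loader could not find or open the library.
        return ("load failed: %s" % exc, None)
    rc = lib.cuInit(0)
    if rc != 0:
        return ("cuInit failed with CUDA error code %d" % rc, None)
    count = ctypes.c_int(0)
    rc = lib.cuDeviceGetCount(ctypes.byref(count))
    if rc != 0:
        return ("cuDeviceGetCount failed with CUDA error code %d" % rc, None)
    return ("ok", count.value)


if __name__ == "__main__":
    # Try both the versioned and the unversioned name.
    for name in ("libcuda.so.1", "libcuda.so"):
        print(name, "->", probe_cuda_driver(name))
```

A "load failed" result points at a library path problem, while a cuInit failure suggests the library was found but no usable device is exposed to the environment.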

I am getting the same error. I’m using Ubuntu 18.04 on an ec2 g4dn.2xlarge instance.

$ python3.8 joint_monkey.py
Importing module 'gym_38' (/home/ubuntu/.local/lib/python3.8/site-packages/isaacgym/_bindings/linux-x86_64/gym_38.so)
Setting GYM_USD_PLUG_INFO_PATH to /home/ubuntu/.local/lib/python3.8/site-packages/isaacgym/_bindings/linux-x86_64/usd/plugInfo.json
WARNING: Forcing CPU pipeline.
[Error] [carb.gym.plugin] Sim CUDA device 0 can't be set, the total number of available devices is -1
Not connected to PVD
/buildAgent/work/45f70df4210b2e3e/source/cudamanager/src/CudaContextManager.cpp (404) : warning : cuInit failed

[Warning] [carb.gym.plugin] Failed to create a valid PhysX CUDA Context Manager. Falling back to CPU.
Physics Engine: PhysX
Physics Device: cpu
GPU Pipeline: disabled
[Error] [carb.gym.plugin] Gym cuda error: no CUDA-capable device is detected: ../../../source/plugins/carb/gym/impl/Gym/GymCuda.h: 110
[Error] [carb.gym.plugin] Failed to create primary CUDA context
[Warning] [carb.gym.plugin] Failed to create primary CUDA context on graphics device
$ nvidia-smi
Wed Apr  6 15:58:12 2022       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.103.01   Driver Version: 470.103.01   CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla T4            Off  | 00000000:00:1E.0 Off |                    0 |
| N/A   49C    P0    28W /  70W |      0MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

torch has access to cuda, so I’m not sure what’s going wrong

$ python3.8
Python 3.8.12 (default, Oct 12 2021, 13:49:34) 
[GCC 7.5.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch.cuda.is_available()
True

I’m using SSH with X11 forwarding.

Same here, any progress on this problem?

Another thread tracking this issue: Virtual environment installations can't run joint_monkey.py · Issue #27 · NVIDIA-Omniverse/IsaacGymEnvs · GitHub

Edit: I’ve tested on both docker and nvidia-docker. Neither works.

I also encountered a similar problem on Ubuntu 20.04.1 LTS. In the end, I found that the key issue was as follows:

The directory “/usr/lib/x86_64-linux-gnu” was missing “libcuda.so”. I uninstalled the original NVIDIA driver and then reinstalled it, and the problem was solved.
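
A quick way to check for the same condition is to list which libcuda.so files exist in the usual places. This is only a sketch: the two directories are the ones mentioned in this thread, and other distributions may install the driver elsewhere. The name find_libcuda is made up for this example.

```python
import glob
import os

# Directories mentioned in this thread; adjust for your own system.
CANDIDATE_DIRS = ["/usr/lib/x86_64-linux-gnu", "/usr/lib/wsl/lib"]


def find_libcuda(dirs=CANDIDATE_DIRS):
    """Map each directory to the sorted libcuda.so* file names found in it."""
    found = {}
    for directory in dirs:
        pattern = os.path.join(directory, "libcuda.so*")
        found[directory] = sorted(os.path.basename(p) for p in glob.glob(pattern))
    return found


if __name__ == "__main__":
    for directory, names in find_libcuda().items():
        print(directory, "->", names if names else "no libcuda.so* found")
```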

One successful example is reported here: Isaac Gym on Windows Subsystem for Linux (WSL) - Robotics - Isaac / Isaac Gym - NVIDIA Developer Forums
But I am still struggling to let Isaac Gym access the GPU on Windows.
The problem seems to be that isaacgym overrides the CUDA backend of PyTorch. If you run

import isaacgym
import torch
torch.cuda.is_available()

You will get False in WSL2.
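
To compare the two import orders without one affecting the other, each check can run in its own interpreter. The sketch below assumes torch (and, for the second call, isaacgym) is installed in the current environment; cuda_visible is a made-up helper name, and an import failure is reported as an "error: ..." string rather than raised.

```python
import subprocess
import sys


def cuda_visible(preamble=""):
    """Run `preamble` in a fresh interpreter, then report torch.cuda.is_available()."""
    code = preamble + "\nimport torch\nprint(torch.cuda.is_available())\n"
    result = subprocess.run(
        [sys.executable, "-c", code],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    if result.returncode != 0:
        lines = result.stderr.strip().splitlines()
        return "error: " + (lines[-1] if lines else "unknown")
    lines = result.stdout.strip().splitlines()
    # Isaac Gym prints its own messages on import, so only the last line is the answer.
    return lines[-1] if lines else "error: no output"


if __name__ == "__main__":
    print("torch only:          ", cuda_visible())
    print("isaacgym, then torch:", cuda_visible("import isaacgym"))
```

If the first line prints True and the second prints False, that matches the behaviour described above.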

same issue here

Same problem here on Ubuntu 20.04.4.
Has anyone solved this?

Hi All!

I made some progress on a couple of issues.

1) The “internal error: libcuda.so!” issue.
Fix: You need to add the directory that contains libcuda.so to LD_LIBRARY_PATH. In my case:

which libcuda.so
/usr/lib/wsl/lib/libcuda.so

Add via:

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/lib/wsl/lib/

Note that this will enable the GPU for PhysX, but will not enable the GPU pipeline for joint_monkey.py. See the additional step below.

2) “WARNING: Forcing CPU pipeline.”
If you look at joint_monkey.py, line 73, you can see that the GPU pipeline is forced to False. I forced it to True and this enabled the GPU pipeline. (You can see in the output that the sim object is created with the GPU pipeline.)
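
For fix 1), the same thing can be done from Python when launching an example, as a rough sketch. The dynamic loader normally reads LD_LIBRARY_PATH when a process starts, so the variable is set for a child process here rather than changed inside the running interpreter. The helper name env_with_library_dir is made up for this example, and /usr/lib/wsl/lib is just the path from my machine.

```python
import os


def env_with_library_dir(directory, base_env=None):
    """Return a copy of the environment with `directory` appended to LD_LIBRARY_PATH."""
    env = dict(os.environ if base_env is None else base_env)
    current = env.get("LD_LIBRARY_PATH", "")
    # Normalize entries so "/usr/lib/wsl/lib/" and "/usr/lib/wsl/lib" count as the same.
    parts = [os.path.normpath(p) for p in current.split(":") if p]
    directory = os.path.normpath(directory)
    if directory not in parts:
        parts.append(directory)
    env["LD_LIBRARY_PATH"] = ":".join(parts)
    return env


if __name__ == "__main__":
    env = env_with_library_dir("/usr/lib/wsl/lib/")
    print(env["LD_LIBRARY_PATH"])
    # To launch an example with this environment:
    #   import subprocess, sys
    #   subprocess.run([sys.executable, "joint_monkey.py"], env=env)
```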

Remaining issues:

1) “WARNING: Forcing CPU pipeline.”
You can see that there is still one “WARNING: Forcing CPU pipeline.” right at the beginning, before the sim object is created. As far as I could tell, this is not coming from the Python code of the example or of the isaacgym Python module. I guess it is coming from one of the compiled libraries.

2) Errors galore
The simulation shows graphically, as it did without the above changes, and it runs for about the same time before the segfault. However, there are a lot more errors reported in the console.
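
On remaining issue 1), one way to check where the warning text lives is to search the install tree for it, in both Python sources and compiled files. This is a plain file search and a sketch only; find_string is a made-up helper name, and the root path in the usage line is a placeholder for wherever isaacgym is unpacked.

```python
import os


def find_string(root, needle):
    """Yield (path, line_number) for hits in .py files and (path, None) for other files."""
    raw = needle.encode("utf-8")
    for dirpath, _dirs, files in os.walk(root):
        for name in sorted(files):
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as fh:
                    data = fh.read()
            except OSError:
                continue  # unreadable file, skip it
            if raw not in data:
                continue
            if name.endswith(".py"):
                for number, line in enumerate(data.splitlines(), start=1):
                    if raw in line:
                        yield (path, number)
            else:
                # Binary or other file: report the file only.
                yield (path, None)


if __name__ == "__main__":
    # Placeholder path: point this at your own isaacgym directory.
    for hit in find_string(os.path.expanduser("~/isaacgym"), "Forcing CPU pipeline"):
        print(hit)
```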

Regarding the source of the segmentation fault, I ran gym via gdb with and without GPU pipeline enabled. In both cases, the source of the segmentation fault seems to be Vulkan - specifically the lavapipe software render library. Therefore I am guessing that it has something to do with the GUI.

From what I could see online, GPU acceleration for Vulkan might only have been made available recently and seems to need a re-install/compile. I am planning to try that next.
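
To see which Vulkan drivers the loader can pick up, the ICD manifest files can be listed, as in the sketch below. It assumes the common Linux manifest directories /usr/share/vulkan/icd.d and /etc/vulkan/icd.d, and it assumes that a library name containing "lvp" indicates Mesa's lavapipe; both are assumptions about a typical install, and the function names are made up for this example.

```python
import glob
import json
import os

# Common manifest locations on Linux; other locations are possible.
ICD_DIRS = ["/usr/share/vulkan/icd.d", "/etc/vulkan/icd.d"]


def list_vulkan_icds(dirs=ICD_DIRS):
    """Return (manifest_path, library_path) for each readable ICD manifest."""
    entries = []
    for directory in dirs:
        for manifest in sorted(glob.glob(os.path.join(directory, "*.json"))):
            try:
                with open(manifest) as fh:
                    library = json.load(fh)["ICD"]["library_path"]
            except (OSError, ValueError, KeyError, TypeError):
                continue  # unreadable or unexpected manifest, skip it
            entries.append((manifest, library))
    return entries


def looks_like_lavapipe(library_path):
    """Heuristic only: lavapipe's library is usually named libvulkan_lvp.so."""
    return "lvp" in os.path.basename(library_path)


if __name__ == "__main__":
    for manifest, library in list_vulkan_icds():
        note = " (software renderer?)" if looks_like_lavapipe(library) else ""
        print(manifest, "->", library + note)
```

If the only entry listed is the software renderer, that would be consistent with the segfault coming from lavapipe.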

Did you succeed with this problem?

No, but here is some information.

  1. My next plan for WSL2 was to rebuild the Mesa driver (open source Vulkan/OpenGL implementation) locally with support for Microsoft’s DirectX wrapper for WSL2. See info here:
    D3D12 GPU Video acceleration in the Windows Subsystem for Linux now available! - Windows Command Line (microsoft.com)

Unfortunately, this descended into dependency hell and I haven’t made the time to follow up.

  2. Since Hyper-V can implement GPU partitioning, I was going to try that route. Note that it seems that this is not “true” SR-IOV in the case of graphics and again relies on DirectX. All the same, Microsoft have provided the necessary library to pick up the exposed device in the Linux guest.
    See here:
    brokeDude2901/dxgkrnl_ubuntu: Microsoft GPU-P (dxgkrnl) on Hyper-V Ubuntu VM (github.com)

  3. I had the opportunity to speak with an NVIDIA employee recently, and their advice was that getting it running under Windows is still challenging and to go the dual-boot route. I did that and got it running under 22.04, but noticed the following:

  • PhysX pipeline works without any particular bother (as long as the necessary libraries are on the path).
  • Selecting GPU pipeline results in segfault even under pure linux!

Happy hacking.

Thank you for your suggestion!
I am having the exact same issue as described in the comments above. Same errors.

I can (almost) identically reproduce the outcome you’ve described after modifying the ~/.bashrc file.

What this modification does:

  • Fixes the issue of the physics engine running on the CPU

I still have this issue:

  • ‘WARNING: Forcing CPU pipeline.’

In contrast to your example, I unfortunately still get
‘GPU Pipeline: disabled’…

I really hope this is solved at some point. I love the idea of not needing dual-boot!