Isaac Gym + 3090 issues

gstate · November 9, 2020, 2:58am

The problem you’re hitting is that the default PyTorch version available through Anaconda doesn’t support GeForce Ampere cards out of the box. There’s a thread on the topic on the PyTorch side: “The support for 3080 or 3090”, issue #45021 in the pytorch/pytorch repository on GitHub.

The ideal solution will be to wait until cuda-toolkit 11.1 is made available on the Anaconda side, and use that in conjunction with an updated PyTorch release, but there are two ways to deal with it in the meantime:

1. Use the docker container installation instructions

The Dockerfile we provide uses the latest PyTorch image from ngc.nvidia.com (the PyTorch entry in the NVIDIA NGC catalog).

This image has the right configuration to allow headless training in Docker, but it doesn’t support anything that requires rendering, as Vulkan is not supported in the container.

Alternatively you can try…

2. Use the pre-release PyTorch 1.8 nightly build, along with hacks for CUDA 11.1 support:

This is a hacky solution. It works with rendering, but it is far from pretty and could be challenging to set up. It also might not work, depending on that day’s build of PyTorch. Try this at your own risk.

First, download and install the latest CUDA Toolkit: https://developer.nvidia.com/cuda-toolkit - depending on how you do the install, this may require you to install an updated graphics driver.
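Before going further, it can help to confirm that the local toolkit actually provides the NVRTC library used in the symlink step later on. The following is a minimal sketch, assuming the toolkit lives under /usr/local/cuda with the targets/x86_64-linux/lib layout; list_local_nvrtc is a hypothetical helper name, not part of CUDA or Isaac Gym.

```shell
#!/usr/bin/env bash
# Sketch only: list_local_nvrtc is a hypothetical helper name.
# Optional argument: CUDA install root (assumed default: /usr/local/cuda).
list_local_nvrtc() {
    local root="${1:-/usr/local/cuda}"
    local libdir="$root/targets/x86_64-linux/lib"
    local f found=1

    # Print every NVRTC library found; an unmatched glob is skipped.
    for f in "$libdir"/libnvrtc.so*; do
        [ -e "$f" ] || continue
        echo "$f"
        found=0
    done

    # Exit status is 0 if at least one library was found, 1 otherwise.
    return "$found"
}
```

Running list_local_nvrtc with no arguments should list a libnvrtc.so.11.1 entry if the 11.1 toolkit is installed in the assumed location. If it prints nothing, the toolkit is probably installed somewhere else and the path needs adjusting.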

Set up the rlgpu conda environment as described in the Isaac Gym documentation.

Next, switch over to the rlgpu conda environment, and install the latest pytorch nightly build:

conda activate rlgpu
conda install pytorch torchvision cudatoolkit=11 -c pytorch-nightly

This still won’t work, since the version of the NVRTC runtime shipped in the Anaconda version of the CUDA toolkit is 11.0, not the 11.1 required to support the 3080 and 3090. But since you have installed the CUDA toolkit locally, you can work around this with a manual symlink. First, move the old library out of the way:

cd ~/anaconda3/envs/rlgpu/lib
mkdir oldcuda
mv *nvrtc* oldcuda

With that done, and while still in that lib directory, create a symlink under the old 11.0 name that points to your locally installed CUDA 11.1 NVRTC:

ln -s /usr/local/cuda/targets/x86_64-linux/lib/libnvrtc.so.11.1 libnvrtc.so.11.0
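If you would rather not run these steps by hand, the move and the symlink can be wrapped in one function that checks its inputs first. This is only a sketch: relink_nvrtc is a hypothetical helper name, and both paths are passed in by you rather than detected.

```shell
#!/usr/bin/env bash
# Sketch only: relink_nvrtc is a hypothetical helper, not part of Isaac Gym.
# Usage: relink_nvrtc <conda env lib dir> <path to local libnvrtc.so.11.1>
relink_nvrtc() {
    local env_lib="$1" src="$2"

    # Refuse to touch anything unless both inputs really exist.
    if [ ! -e "$src" ]; then
        echo "missing source library: $src" >&2
        return 1
    fi
    if [ ! -d "$env_lib" ]; then
        echo "missing env lib dir: $env_lib" >&2
        return 1
    fi

    # A second run would overwrite the backup, so stop if it already exists.
    if [ -e "$env_lib/oldcuda" ]; then
        echo "backup dir already exists: $env_lib/oldcuda" >&2
        return 1
    fi

    # Move the bundled NVRTC files aside (the mkdir/mv step).
    mkdir "$env_lib/oldcuda"
    local f
    for f in "$env_lib"/*nvrtc*; do
        [ -e "$f" ] || [ -L "$f" ] || continue
        mv "$f" "$env_lib/oldcuda/"
    done

    # Expose the 11.1 library under the 11.0 file name.
    ln -s "$src" "$env_lib/libnvrtc.so.11.0"
}
```

With the paths from this post, the call would be relink_nvrtc ~/anaconda3/envs/rlgpu/lib /usr/local/cuda/targets/x86_64-linux/lib/libnvrtc.so.11.1. To undo it, delete the symlink and move the files in oldcuda back into lib.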

At this point, if everything works correctly, you should be able to run the RL examples.
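A quick smoke test before launching a full training run is to ask PyTorch what it sees. The sketch below assumes the rlgpu environment is active so that python resolves to its interpreter; check_torch_cuda is a hypothetical helper name. It does not exercise NVRTC specifically, so the RL examples remain the real test.

```shell
#!/usr/bin/env bash
# Sketch only: check_torch_cuda is a hypothetical helper name.
# Optional argument: the Python interpreter to use (defaults to "python").
check_torch_cuda() {
    local py="${1:-python}"
    "$py" -c '
import torch
print("torch:", torch.__version__)
print("built against CUDA:", torch.version.cuda)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("device 0:", torch.cuda.get_device_name(0))
    # A tiny computation on the GPU as a basic sanity check.
    x = torch.ones(4, device="cuda")
    print("sum on GPU:", (x * 2).sum().item())
'
}
```

Run it as check_torch_cuda after conda activate rlgpu. An error at the GPU computation step, or "CUDA available: False", suggests the installed build still doesn’t work with the card.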

Take care,
-Gav