CUDA 11 RC on host PC with Isaac?

I am wondering whether I should downgrade CUDA from version 11 to version 10 if I want to install Isaac on a host PC and a Jetson.

The Docker implementation of the early preview "Omniverse Robotics Experience: Isaac Sim 2020"
seems to cause a segfault on a laptop with a Quadro M1200 and CUDA 11 RC.
Any insights?

Hi, I believe the segfault is caused by using a non-RTX card.
Could you provide more info, such as logs, the graphics driver version, and the Ubuntu version you are using?

 nvidia-smi
Wed Jul  1 13:48:03 2020       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.36.06    Driver Version: 450.36.06    CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Quadro M1000M       On   | 00000000:01:00.0 Off |                  N/A |
| N/A   33C    P8    N/A /  N/A |    282MiB /  2002MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1270      G   /usr/lib/xorg/Xorg                 39MiB |
|    0   N/A  N/A      1726      G   /usr/bin/gnome-shell               51MiB |
|    0   N/A  N/A      2648      G   /usr/lib/xorg/Xorg                117MiB |
|    0   N/A  N/A      2819      G   /usr/bin/gnome-shell               68MiB |
+-----------------------------------------------------------------------------+
 5.3.0-62-generic #56~18.04.1-Ubuntu SMP Wed Jun 24 16:17:03 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
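
For anyone collecting the same details, here is a minimal sketch that pulls the driver and CUDA versions out of the nvidia-smi header shown above. The function name `parse_nvidia_smi_header` is made up for illustration. Note that the "CUDA Version" field in nvidia-smi reports the highest CUDA version the driver supports, not necessarily the toolkit that is installed.

```python
import re

def parse_nvidia_smi_header(text):
    """Return (driver_version, cuda_version) found in nvidia-smi output, or None."""
    match = re.search(
        r"Driver Version:\s*([\d.]+)\s+CUDA Version:\s*([\d.]+)", text
    )
    if match is None:
        return None
    return match.group(1), match.group(2)

# Header line copied from the output above.
header = "| NVIDIA-SMI 450.36.06    Driver Version: 450.36.06    CUDA Version: 11.0     |"
print(parse_nvidia_smi_header(header))  # ('450.36.06', '11.0')
```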

Thanks. That error shows that the Quadro M1000M card is not supported.

Are you able to try running Isaac Sim on another machine with an RTX card and CUDA 11 RC?

Could you clarify whether CUDA 11 is supported for the Isaac SDK, or whether version 10 is needed to avoid discrepancies?
At the moment I do not have a machine with an RTX card available to test. When I have a chance, I will try to do so.

The current version for 2020.1 Isaac Sim and Isaac SDK supports CUDA 10 only.

We have installation issues as well. CUDA 11 came with graphics driver 450, so we followed the advice to also install CUDA 10, but this seems to have caused more problems.

We would appreciate it if NVIDIA could be more specific: is CUDA 10.0 needed on the host, or could it be 10.2? Please note that following your setup instructions to use the current SDKManager to initialize the Xavier will cause CUDA 10.2 to be installed on the Xavier.

If CUDA 10.0 is needed on the host, does that mean the graphics driver on the host needs to be rolled back to an earlier version?

We were successful in running simulations but got the undefined symbol error when deploying on the robot, so we’re very confused about which specific CUDA version is needed on the host and on the Xavier, and how to change the CUDA version on the Xavier if the SDKManager only really allows 10.2.

10.2 falls under the CUDA 10 category and will work.
However, you may need to remove the installed CUDA 11 and its driver first, e.g.
sudo apt remove --purge 'cuda*' 'nvidia*'
Then, after a reboot, reinstalling from the runfile will help set up the drivers and CUDA 10.
Moreover, this may need to be done with the graphical interface disabled, e.g. after executing
sudo init 1

Andrey, thanks for the very helpful suggestions. Using your commands we cleaned up all traces of the previous CUDA and NVIDIA drivers, and as you said it required the intermediate step of using the Nouveau driver to allow complete removal of the previous NVIDIA drivers. That part worked. We then did a fresh install of the NVIDIA driver 440 that is recommended in the Isaac 2020 setup instructions, using the method described there. Now nvidia-smi reports:

NVIDIA-SMI 440.95.01    Driver Version: 440.95.01    CUDA Version: 10.2 

And another helpful person had mentioned running the python checker:

python3 engine/build/scripts/version_checker.py

That one reports everything ok:

---------------------------------------------------------------
|Package             |Recommended Version |Current Version     |
---------------------------------------------------------------
|OS                  |Ubuntu 18.04.2 LTS  |Ubuntu 18.04.4 LTS  |
|Bazel               |2.2.0               |2.2.0               |
|GPU_Driver          |>=418               |440.95.01           |
|Cuda                |10.x.x              |10.2.89             |
|Cudnn               |7.6.x.x             |7.6.5.32            |
|TensorFlow          |1.15.0              |1.15.0              |
|pycapnp             |>=0.6.3             |0.6.4               |
|librosa             |>=0.6.3             |0.7.2               |
|SoundFile           |>=0.10.2            |0.10.3.post1        |
|Python2             |2.7.x               |2.7.17              |
|Python3             |3.6.x               |3.6.9               |
---------------------------------------------------------------
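
As a rough illustration of how the "Recommended Version" column above can be read, here is a small sketch that checks a current version against a `>=N` bound or an `N.x.x` wildcard pattern. This is not the logic of version_checker.py, which I have not inspected; the helper names `parse_numbers` and `satisfies` are made up, and the OS row is left out because it does not follow either pattern.

```python
def parse_numbers(version):
    """Split '10.2.89' into [10, 2, 89]; a non-numeric part (e.g. 'post1') ends the parse."""
    numbers = []
    for part in version.split("."):
        if not part.isdigit():
            break
        numbers.append(int(part))
    return numbers

def satisfies(recommended, current):
    """Check `current` against a '>=N' lower bound or an 'N.x.x' wildcard pattern."""
    if recommended.startswith(">="):
        # Lists of ints compare element by element, so [440, 95, 1] >= [418].
        return parse_numbers(current) >= parse_numbers(recommended[2:])
    # Compare component by component; 'x' accepts any value in that position.
    for want, have in zip(recommended.split("."), current.split(".")):
        if want != "x" and want != have:
            return False
    return True

print(satisfies(">=418", "440.95.01"))  # True
print(satisfies("10.x.x", "10.2.89"))   # True
print(satisfies("10.x.x", "11.0"))      # False
```

Under this reading, CUDA 10.2.89 matches `10.x.x`, while a CUDA 11 install would not.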

And yet, the dreaded error persists, even if we try something like the warehouse demo that we previously ran successfully on this same computer:

2020-07-08 21:09:01.076 ERROR engine/alice/backend/modules.cpp@250: packages/perception/libperception_module.so: /home/lx/.cache/bazel/_bazel_lx/e500e4c0a09ca97c409f3c08708046e5/execroot/com_nvidia_isaac/bazel-out/k8-opt/bin/apps/navsim/navsim_navigate.runfiles/com_nvidia_isaac//packages/perception/libperception_module.so: undefined symbol: IsaacGatherComponentInfo
2020-07-08 21:09:01.076 PANIC engine/alice/backend/modules.cpp@252: Could not load all required modules for application
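
One way to check whether the library on disk actually exports that symbol is to list its dynamic symbols with `nm` from binutils. The sketch below is only a diagnostic aid and says nothing about the cause of the error; the helper names `defined_symbols` and `library_defines` are made up, and the code avoids `capture_output` so that it also runs on the Python 3.6 listed in the table above.

```python
import subprocess

def defined_symbols(nm_output):
    """Collect symbol names from `nm -D --defined-only` output (last column of each line)."""
    names = set()
    for line in nm_output.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            # Drop a version suffix such as '@@GLIBC_2.2.5' if nm prints one.
            names.add(fields[-1].split("@")[0])
    return names

def library_defines(path, symbol):
    """Run nm on a shared library and report whether it defines `symbol`."""
    result = subprocess.run(
        ["nm", "-D", "--defined-only", path],
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return symbol in defined_symbols(result.stdout)
```

For example, `library_defines(path_to_so, "IsaacGatherComponentInfo")` could be called with the libperception_module.so path from the log line. A `False` result would mean the file being loaded really lacks the symbol.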

Do you run the app on the host PC or on the robot?
In the former case, did you install the dependencies on the host PC with the command below?

bob@desktop:~/isaac$ engine/build/scripts/install_dependencies.sh

In the latter case, did you install dependencies to the robot with the command below?

bob@desktop:~/isaac$ ./engine/build/scripts/install_dependencies_jetson.sh -h <jetson_ip> -u <jetson_username>

May I know what command you execute when you get the errors? What are the steps to reproduce the error?

Thank you so much for your follow-up. In my case, the error now occurs on both the host and the target. It worked previously on the host. Yes, I did run both dependency scripts. I have now noticed that SDKManager downloaded Jetpack 4.4 to the target Xavier AGX, and another comment on this forum mentioned it should really be Jetpack 4.3, so I’m reflashing the target now with 4.3. Then, to my surprise, when I selected Jetpack 4.3 for the target, the SDKManager also started re-downloading some components for the host, some of which failed:

* 14:21:00 ERROR : VisionWorks on Host : E: Version '1.6.0.500n*' for 'libvisionworks' was not found
* 14:21:00 INFO : VisionWorks on Host : E: Version '1.6.0.500n*' for 'libvisionworks-dev' was not found
* 14:21:00 INFO : VisionWorks on Host : E: Version '1.6.0.500n*' for 'libvisionworks-samples' was not found

In response to your questions: When I still had Jetpack 4.4 on the target, then the attempt to run the realsense camera example was successful but the attempt to run carter or follow_me was not successful. I’m almost done with reflashing the target with 4.3 so if that still fails I’ll report back with the exact commands.

But I’m perplexed about why SDKManager fails to install some needed components for the host, as shown above.

The CUDA version installed by Jetpack needs to match the one on the host PC.
VisionWorks is a matter of component selection during the SDKManager install.
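
A quick way to compare the two sides is sketched below. It assumes a CUDA 10.x toolkit, which writes a line such as "CUDA Version 10.2.89" to /usr/local/cuda/version.txt on both the host and the Jetson. The helper names are made up, and comparing major and minor only is my assumption about what "match" means here.

```python
import re

def cuda_release(version_text):
    """Extract (major, minor) from text such as 'CUDA Version 10.2.89'."""
    match = re.search(r"(\d+)\.(\d+)", version_text)
    if match is None:
        raise ValueError("no version number found in %r" % version_text)
    return int(match.group(1)), int(match.group(2))

def same_cuda_release(host_text, jetson_text):
    """True when both version strings name the same major.minor CUDA release."""
    return cuda_release(host_text) == cuda_release(jetson_text)

# Illustrative strings; paste the contents of version.txt from each machine instead.
print(same_cuda_release("CUDA Version 10.2.89", "CUDA Version 10.2.89"))  # True
print(same_cuda_release("CUDA Version 11.0", "CUDA Version 10.2.89"))     # False
```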

Thank you Andrey. Clearly I’m missing something. In the currently available SDKManager, if the target is a Xavier AGX, you can select Jetpack 4.4 or Jetpack 4.3.

If you select 4.4, the installation completes successfully on host and target, but then the Isaac software doesn’t work.

If you select 4.3, then the target completes successfully, but the host Visionworks installation fails due to version 1.6.0.500n* not found as shown in the error messages above.

Extracting the specific commands from the SDKManager terminal and re-running them in a normal terminal fails the same way.

Going to jetpack archive and using the SDKManager for Jetpack 4.3 results in different errors… first it wants to update itself, but if that is not allowed then it only supports Jetpack 4.2 even though it was installed from the Jetpack 4.3 archive.

Once again, thanks for your help, Andrey. It is working now, with the really odd combination of selecting Jetpack 4.3 for the target so that Isaac is happy running on the target, and then re-running SDKManager for the host installation with Jetpack 4.4 selected as the target and “skip” chosen for the target flashing so that the host installation can complete. The robot is now moving with the joystick in the follow_me sample, and also autonomously towards the April Tag.

Wish we knew how to do a cleaner installation. By the way, regarding your question, we found that this example works on a computer with an RTX card as well as on a computer with a GTX 1660, so not all examples really need an RTX card.