Fail to load scene on RTX 6000 Ada

Hello, I am trying to run a Docker image I created previously on a new machine. However, for a reason I can’t figure out, Isaac Sim (2022.2.1) gets stuck after the “Ray tracing shader compilation finished after…/RTX ready” messages.

Here is the hardware info:

Where it gets stuck:

  • The same docker image runs fine on my other computer.
  • Isaac Sim works just fine on the host computer

Any help would be appreciated as I can’t find any errors in the terminal output.

Edit
Switching to driver version 535.104.05 allows the Docker image to finish launching. However, the local version of Isaac Sim running on the host machine now produces the following crash when trying to load a USD file.

Crash detected in pid 15116 thread 15116
Crash metadata:
  CarbSdkVersion = '129.11+129.tc565.0f371ae5'
  DumpId = 'fd53853f-3f5d-484f-d65b04a9-dfbd6336'
  ProductName = 'OmniverseKit'
  RetryCount = '0'
  StartupTime = '1693482796'
  UploadSuccessful = '0'
  UptimeSeconds = '33'
  Version = '104.2+release.295.529af2e4.tc'
  appName = 'Isaac-Sim'
  appState = 'started'
  appVersion = '2022.2.1'
  autoloadExts = ''
  buildBranch = 'release'
  buildCi = 'tc'
  buildHash = '529af2e4'
  buildId = '14001715'
  buildMajor = '104'
  buildMinor = '2'
  buildMr = '0'
  buildNumber = '295'
  buildPatch = '0'
  buildVersion = '104.2.0'
  environmentName = 'default'
  experience = 'Isaac Sim'
  kitRendererDriverVersion = '535.104'
  lastCommand = 'SetLightingMenuModeCommand(lighting_mode=,usd_context_name=)'
  lastCommands = 'SetLightingMenuModeCommand(lighting_mode=,usd_context_name=)'
  memoryStats = '(avail/total) RAM: 235.431/251.522GB, Swap: 2/2GB, VM: 1.71799e+10/1.71799e+10GB'
  portableMode = '0'
  stageUrl = 'omniverse://localhost/Projects/AIPipeline/test.usd'
  systemInfo = '
|---------------------------------------------------------------------------------------------|
| Driver Version: 535.104.05    | Graphics API: Vulkan
|=============================================================================================|
| GPU | Name                             | Active | LDA | GPU Memory | Vendor-ID | LUID       |
|     |                                  |        |     |            | Device-ID | UUID       |
|---------------------------------------------------------------------------------------------|
| 0   | NVIDIA RTX 6000 Ada Generation   | Yes: 0 |     | 49386   MB | 10de      | 0          |
|     |                                  |        |     |            | 26b1      | efe78990.. |
|---------------------------------------------------------------------------------------------|
| 1   | NVIDIA RTX 6000 Ada Generation   | Yes: 1 |     | 49386   MB | 10de      | 0          |
|     |                                  |        |     |            | 26b1      | 30fd862c.. |
|=============================================================================================|
| OS: Linux WMUC1071639, Version: 6.2.0-31-generic
| XServer Vendor: The X.Org Foundation, XServer Version: 12101004 (1.21.1.4)
| Processor: AMD Ryzen Threadripper PRO 5975WX 32-Cores      | Cores: Unknown | Logical: 64
|---------------------------------------------------------------------------------------------|
| Total Memory (MB): 257558 | Free Memory: 248751
| Total Page/Swap (MB): 2047 | Free Page/Swap: 2047
|---------------------------------------------------------------------------------------------|
'
  telemetrySessionId = '10848344578753085841'
  userId = 'xeHK5zvTO7iShzLCtpTT1Ok-7tZqt-cYjFCtScdxWtk'

Thread 15116 backtrace follows:
000: libc.so.6!__sigaction+0x50 (libc_sigaction.c:?)
001: libc.so.6!pthread_kill+0x12c (./nptl/pthread_kill.c:44)
002: libc.so.6!raise+0x16 (./signal/../sysdeps/posix/raise.c:27)
003: libc.so.6!abort+0xd3 (./stdlib/abort.c:81 (discriminator 21))
004: 280!+0x102905
005: 280!+0x100e66
006: 280!+0x100ea1
007: 280!+0x100d03
008: 280!+0xf9be2
009: 280!+0xf593f
010: 280!+0xfab47
011: 280!+0xd1690
012: 280!+0xd4110
013: 280!+0xb55c8
014: 280!+0x92cdd
015: 280!+0x6e9cd
016: libcarb.graphics-vulkan.plugin.so!std::_Hashtable<void*, void*, std::allocator<void*>, std::__detail::_Identity, std::equal_to<void*>, std::hash<void*>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<false, true, true> >::_M_insert_unique_node(unsigned long, unsigned long, std::__detail::_Hash_node<void*, false>*)+0x7187 (??:?)
017: libcarb.graphics-vulkan.plugin.so!std::_Hashtable<void*, void*, std::allocator<void*>, std::__detail::_Identity, std::equal_to<void*>, std::hash<void*>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<false, true, true> >::_M_insert_unique_node(unsigned long, unsigned long, std::__detail::_Hash_node<void*, false>*)+0x640e (??:?)
018: librtx.postprocessing.plugin.so!_init+0x1de80 (??:0)
019: librtx.postprocessing.plugin.so!_init+0x2027a (??:0)
020: librtx.postprocessing.plugin.so!_init+0x207c3 (??:0)
021: libgpu.foundation.plugin.so!std::vector<unsigned char, std::allocator<unsigned char> >::_M_default_append(unsigned long)+0xdb39 (??:?)
022: libcarb.scenerenderer-rtx.plugin.so!void std::vector<int, std::allocator<int> >::emplace_back<int>(int&&)+0x142be (??:?)
023: libcarb.scenerenderer-rtx.plugin.so!std::unordered_map<unsigned long, unsigned int, std::hash<unsigned long>, std::equal_to<unsigned long>, std::allocator<std::pair<unsigned long const, unsigned int> > >::~unordered_map()+0x18e (??:?)
024: libcarb.tasking.plugin.so!std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (carb::tasking::ThreadPool::*)(), carb::tasking::ThreadPool*> > >::~_State_impl()+0x279d (??:?)
025: libcarb.tasking.plugin.so!std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (carb::tasking::ThreadPool::*)(), carb::tasking::ThreadPool*> > >::~_State_impl()+0xbb0b (??:?)
026: libcarb.tasking.plugin.so!std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (carb::tasking::ThreadPool::*)(), carb::tasking::ThreadPool*> > >::~_State_impl()+0xbe0e (??:?)
027: libcarb.scenerenderer-rtx.plugin.so!void std::vector<int, std::allocator<int> >::emplace_back<int>(int&&)+0x1b463 (??:?)
028: libcarb.scenerenderer-rtx.plugin.so!void std::vector<int, std::allocator<int> >::emplace_back<int>(int&&)+0x1bb63 (??:?)
029: libcarb.scenerenderer-rtx.plugin.so!void std::vector<int, std::allocator<int> >::emplace_back<int>(int&&)+0x4a7db (??:?)
030: libcarb.scenerenderer-rtx.plugin.so!void std::vector<int, std::allocator<int> >::emplace_back<int>(int&&)+0x4e21b (??:?)
031: libcarb.scenerenderer-rtx.plugin.so!std::_Hashtable<unsigned int, unsigned int, std::allocator<unsigned int>, std::__detail::_Identity, std::equal_to<unsigned int>, std::hash<unsigned int>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<false, true, true> >::_M_insert_unique_node(unsigned long, unsigned long, std::__detail::_Hash_node<unsigned int, false>*)+0x1a917 (??:?)
032: librtx.hydra.so!carb::progress::ScopedLoadingEvent* std::__uninitialized_copy<false>::__uninit_copy<std::move_iterator<carb::progress::ScopedLoadingEvent*>, carb::progress::ScopedLoadingEvent*>(std::move_iterator<carb::progress::ScopedLoadingEvent*>, std::move_iterator<carb::progress::ScopedLoadingEvent*>, carb::progress::ScopedLoadingEvent*)+0x78e4 (??:?)
033: librtx.hydra.so!carb::progress::ScopedLoadingEvent* std::__uninitialized_copy<false>::__uninit_copy<std::move_iterator<carb::progress::ScopedLoadingEvent*>, carb::progress::ScopedLoadingEvent*>(std::move_iterator<carb::progress::ScopedLoadingEvent*>, std::move_iterator<carb::progress::ScopedLoadingEvent*>, carb::progress::ScopedLoadingEvent*)+0xae01 (??:?)
034: libomni.usd.so!std::_Function_handler<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> (), std::__future_base::_Task_setter<std::unique_ptr<std::__future_base::_Result<void>, std::__future_base::_Result_base::_Deleter>, std::thread::_Invoker<std::tuple<omni::usd::UsdContext::Impl::saveLayers(std::string const&, std::vector<std::string, std::allocator<std::string> > const&, bool)::{lambda()#1}> >, void> >::_M_invoke(std::_Any_data const&)+0x5e7a (??:?)
035: libomni.usd.so!std::_Function_handler<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> (), std::__future_base::_Task_setter<std::unique_ptr<std::__future_base::_Result<void>, std::__future_base::_Result_base::_Deleter>, std::thread::_Invoker<std::tuple<omni::usd::UsdContext::Impl::saveLayers(std::string const&, std::vector<std::string, std::allocator<std::string> > const&, bool)::{lambda()#1}> >, void> >::_M_invoke(std::_Any_data const&)+0xbfe1 (??:?)
036: libomni.usd.so!std::_Function_handler<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> (), std::__future_base::_Task_setter<std::unique_ptr<std::__future_base::_Result<void>, std::__future_base::_Result_base::_Deleter>, std::thread::_Invoker<std::tuple<omni::usd::UsdContext::Impl::saveLayers(std::string const&, std::vector<std::string, std::allocator<std::string> > const&, bool)::{lambda()#1}> >, void> >::_M_invoke(std::_Any_data const&)+0xc9c6 (??:?)
037: libomni.usd.so!std::_Function_handler<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> (), std::__future_base::_Task_setter<std::unique_ptr<std::__future_base::_Result<void>, std::__future_base::_Result_base::_Deleter>, std::thread::_Invoker<std::tuple<omni::usd::UsdContext::Impl::saveLayers(std::string const&, std::vector<std::string, std::allocator<std::string> > const&, bool)::{lambda()#1}> >, void> >::_M_invoke(std::_Any_data const&)+0xcf41 (??:?)
038: libomni.usd.so!std::_Rb_tree<std::string, std::string, std::_Identity<std::string>, std::less<std::string>, std::allocator<std::string> >::_M_erase(std::_Rb_tree_node<std::string>*)+0x4f9f (??:?)
039: libcarb.events.plugin.so!std::_Hashtable<unsigned long, unsigned long, std::allocator<unsigned long>, std::__detail::_Identity, std::equal_to<unsigned long>, std::hash<unsigned long>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<false, true, true> >::_M_insert_unique_node(unsigned long, unsigned long, std::__detail::_Hash_node<unsigned long, false>*)+0x188c (??:?)
040: libcarb.events.plugin.so!std::_Hashtable<unsigned long, unsigned long, std::allocator<unsigned long>, std::__detail::_Identity, std::equal_to<unsigned long>, std::hash<unsigned long>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<false, true, true> >::_M_insert_unique_node(unsigned long, unsigned long, std::__detail::_Hash_node<unsigned long, false>*)+0x2026 (??:?)
041: libomni.kit.loop-default.plugin.so!std::thread::_State_impl<std::thread::_Invoker<std::tuple<omni::kit::RunLoopThread::run()::{lambda()#1}> > >::~_State_impl()+0x8ac (??:?)
042: libomni.kit.loop-default.plugin.so!std::thread::_State_impl<std::thread::_Invoker<std::tuple<omni::kit::RunLoopThread::run()::{lambda()#1}> > >::~_State_impl()+0xe8e3 (??:?)
043: libomni.kit.app.plugin.so!_init+0x28c7 (??:0)
044: libomni.kit.app.plugin.so!carbOnPluginPostShutdown+0x907 (??:?)
045: kit!_init+0x635 (??:0)
046: libc.so.6!__libc_init_first+0x90 (./csu/../sysdeps/nptl/libc_start_call_main.h:58)
047: libc.so.6!__libc_start_main+0x80 (./csu/../csu/libc-start.c:128)
048: kit!_init+0x9cb (??:0)

Hi @anthony.yaghi - Can you provide the complete log file and USD file that you are trying to load?

Please also share your setup information, such as:

  • OS version
  • vGPU
  • VM
  • Container

Also, this post might be helpful: Isaac Sim Crashes when I try to open a saved USD

I tried 5 different driver versions. Most of the 525.x versions fix the problems with the local installation of Isaac Sim, but I still can’t get the Docker image to start: it always gets stuck at “RTX ready” with no errors. I will post the full logs here on Monday anyway.

  • Ubuntu 20.04
  • GPUs: 2 x RTX 6000 Ada
  • Nvidia drivers 535.104.05
  • IsaacSim 2022.2.1

To reproduce the error, first open Isaac Sim, then try to save the default scene (or any scene, really). The file browser pops up and I can choose where to save the scene, but once I click the save button the app crashes. Here are the full logs from ~/.nvidia-omniverse/logs/Kit/Isaac-Sim/

kit_20230904_151112.log (1.4 MB)
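Since a log of this size is hard to read by eye, a small filter can help pull out the lines worth looking at. The following is only a sketch: it assumes Kit log lines carry a bracketed level token such as [Error], [Warning] or [Fatal], and the helper names find_problem_lines and newest_log are made up for illustration.

```python
import re
from pathlib import Path

# Assumption: each Kit log line contains a bracketed level token,
# e.g. "... [Error] [some.source] message". Only that token is relied on.
LEVEL_RE = re.compile(r"\[(Error|Warning|Fatal)\]")


def find_problem_lines(lines, levels=("Error", "Fatal")):
    """Return (line_number, text) pairs whose level token is in `levels`."""
    hits = []
    for number, line in enumerate(lines, start=1):
        match = LEVEL_RE.search(line)
        if match and match.group(1) in levels:
            hits.append((number, line.rstrip("\n")))
    return hits


def newest_log(log_dir):
    """Return the most recently modified *.log file under log_dir, or None."""
    logs = sorted(
        Path(log_dir).expanduser().rglob("*.log"),
        key=lambda path: path.stat().st_mtime,
    )
    return logs[-1] if logs else None
```

For example, newest_log("~/.nvidia-omniverse/logs/Kit/Isaac-Sim/") would pick the latest log, and passing its lines to find_problem_lines(..., levels=("Error", "Warning")) widens the filter when no errors show up, as in the hang case.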

Switching to driver version 525.85.05 fixes all the issues with the local installation of Isaac Sim, but now the Docker container gets stuck. Here is the full log from within the container after it got stuck while launching the app.

kit_20230904_143830_docker.log (847.4 KB)
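Because the behaviour here changes with the driver version, it may help to record the driver that each run actually sees, both on the host and inside the container. This is a rough sketch, not an official tool: it relies on nvidia-smi being on the PATH, and the function names are illustrative.

```python
import subprocess


def parse_driver_version(text):
    """Turn a string like '535.104.05' into (535, 104, 5) for numeric comparison."""
    return tuple(int(part) for part in text.strip().split("."))


def query_driver_version():
    """Ask nvidia-smi for the driver version reported for the first GPU."""
    output = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    return parse_driver_version(output.splitlines()[0])


def driver_branch(version):
    """Return the major branch number, e.g. 525 or 535."""
    return version[0]
```

Running query_driver_version() in both environments and logging the result next to each test makes it easier to say which branch (525.x or 535.x) a given hang or crash belongs to.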

Hi. The crash when saving a scene will be fixed in our next release.

The Docker log shows a Python app running. Can you share the commands used to run Docker and then to run this Python app?
Please note that we recommend running Docker for headless apps only, and for Python code you need to set headless: true in the config.

Thank you for the feedback. I am using a script to start IsaacSim inside the docker container.

Here is the docker file:

FROM nvcr.io/nvidia/isaac-sim:2022.2.1

RUN apt-get update && apt-get install -y apt-transport-https
RUN apt-get install -y nano
RUN apt-get install -y wget && rm -rf /var/lib/apt/lists/*
# Install mini conda
RUN wget \
    https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh \
    && mkdir /root/.conda \
    && bash Miniconda3-latest-Linux-x86_64.sh -b \
    && rm -f Miniconda3-latest-Linux-x86_64.sh 

ENV PATH="/root/miniconda3/bin:${PATH}"
ARG PATH="/root/miniconda3/bin:${PATH}"

# Create isaac-sim environment
RUN conda env create -f environment.yml

# Make RUN commands use the new environment:
SHELL ["conda", "run", "-n", "isaac-sim", "/bin/bash", "-c"]
RUN pip install tqdm
# Install requirements
COPY ./docker/requirements.txt .
RUN pip install -r requirements.txt


WORKDIR /app/replicator_worker
COPY . /app/replicator_worker

RUN chmod a+x /app/replicator_worker/entry.sh
RUN chmod +x /app/replicator_worker/docker/docker.celery.sh
ENV PYTHONPATH "${PYTHONPATH}:/app/replicator_worker"

And this is the entry.sh script that is actually used to launch isaac sim:

#!/bin/bash
source /isaac-sim/setup_conda_env.sh
nvidia-smi | grep 'python' | awk '{ print $5 }' | xargs -n1 kill -9
rm -r /etc/vulkan/icd.d

trap 'kill -TERM $PID' TERM INT

python3 run_replicator.py "$1" &
PID=$!
sleep 5
wait $PID

The configs passed to the SimulationApp contain

"headless": True
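For reference, this is roughly how such a config reaches SimulationApp. It is a minimal sketch rather than the actual run_replicator.py: build_launch_config and launch are hypothetical helper names, and the import only works inside an Isaac Sim Python environment.

```python
def build_launch_config(extra=None):
    """Return a SimulationApp config dict that always forces headless mode."""
    config = dict(extra or {})
    # Forced on for container runs, overriding any caller-supplied value.
    config["headless"] = True
    return config


def launch(extra=None):
    """Start Isaac Sim headless and return the SimulationApp instance."""
    # Imported lazily: omni.isaac.kit is not available outside Isaac Sim.
    from omni.isaac.kit import SimulationApp

    return SimulationApp(build_launch_config(extra))
```

A script would then call simulation_app = launch() before importing any other omni modules, and simulation_app.close() when it is done.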

Hi. Thank you for your sample script, but I’m afraid I could not run it because some files were missing.

Please give this a try again with our latest Isaac Sim 2023.1.0 version.