Unable to run Isaac SIM on AWS EC2 instance

I get the error below when I run the Isaac Sim container.

2021-10-10 17:55:44 [76,076ms] [Fatal] [carb.crashreporter-breakpad.plugin] libomni.livestream-websocket.plugin.so!std::ctype::do_widen(char) const
2021-10-10 17:55:44 [76,081ms] [Fatal] [carb.crashreporter-breakpad.plugin] libstdc++.so.6!std::error_code::default_error_condition() const
2021-10-10 17:55:44 [76,085ms] [Fatal] [carb.crashreporter-breakpad.plugin] libpthread.so.0!start_thread
2021-10-10 17:55:44 [76,090ms] [Fatal] [carb.crashreporter-breakpad.plugin] libc.so.6!clone
Aborted (core dumped)

I followed the steps below:

  1. sudo docker login nvcr.io
  2. sudo docker run --gpus all -e "ACCEPT_EULA=Y" --rm --network=host -v ~/docker/isaac-sim/documents:/root/Documents:rw -v ~/docker/isaac-sim/cache:/root/.cache/ov:rw -v ~/docker/isaac-sim/logs:/root/.nvidia-omniverse/logs:rw -v ~/docker/isaac-sim/data:/root/.local/share/ov/data:rw nvcr.io/nvidia/isaac-sim:2021.1.1

Nucleus is also installed using “Nucleus Cloud Installation” from the page Native Workstation Deployment — Omniverse Robotics documentation

Attaching the logs as well.
nvidia-bug-report.log.gz (485.6 KB)
Thank you in advance,
Sagar Surendran

Hi. Please check the RAM usage before the crash. There is a known bug that causes a crash under high memory usage. It is fixed in the next version.
Please share the full Isaac Sim log. If the issue is not related to RAM, there could be other clues further up the log before the crash.
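
RAM usage before the crash can be captured by sampling memory while the container starts. The following is only a minimal sketch, assuming a Linux host whose /proc/meminfo has a MemAvailable field; the function name and its optional file argument are illustrative and not part of Isaac Sim.

```shell
#!/bin/sh
# Print the percentage of system memory in use, read from a meminfo-style
# file (defaults to /proc/meminfo). MemAvailable needs Linux 3.14 or newer.
mem_used_percent() {
  awk '/^MemTotal:/ {total=$2} /^MemAvailable:/ {avail=$2}
       END { if (total > 0) printf "%d\n", (total - avail) * 100 / total }' \
    "${1:-/proc/meminfo}"
}

# One timestamped sample. To watch the trend, repeat it in a second
# terminal while Isaac Sim is starting, for example every few seconds.
echo "$(date '+%F %T') memory used: $(mem_used_percent)%"
```

If the percentage climbs towards 100 just before the abort, the crash is likely the memory bug; if it stays low, the cause is probably elsewhere in the log.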

Thank you for the response.

I do not think memory is the issue. Below are the memory details.

*-firmware
description: BIOS
vendor: Amazon EC2
physical id: 0
version: 1.0
date: 10/16/2017
size: 64KiB
capabilities: pci edd acpi virtualmachine
*-cache:0
description: L1 cache
physical id: 5
slot: L1-Cache
size: 1536KiB
capacity: 1536KiB
capabilities: synchronous internal write-back instruction
configuration: level=1
*-cache:1
description: L2 cache
physical id: 6
slot: L2-Cache
size: 24MiB
capacity: 24MiB
capabilities: synchronous internal varies unified
configuration: level=2
*-cache:2
description: L3 cache
physical id: 7
slot: L3-Cache
size: 35MiB
capacity: 35MiB
capabilities: synchronous internal varies unified
configuration: level=3
*-memory
description: System Memory
physical id: 8
slot: System board or motherboard
size: 64GiB
*-bank
description: DIMM DDR4 Static column Pseudo-static Synchronous Window DRAM 2933 MHz (0.3 ns)
physical id: 0
size: 64GiB
width: 64 bits
clock: 2933MHz (0.3ns)

Attaching the Isaac Sim logs. docker_logs.zip (42.1 KB)

Thank you,
Sagar Surendran

Thanks for the logs. They look normal. Can you try running with Kit Remote mode instead and see if it crashes too?
Use the --entrypoint ./runheadless.kitremote.sh flag.
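
Combined with the docker run command from the first post, Kit Remote mode would look roughly like the sketch below. The image tag and volume mounts are copied from that post and the entrypoint from this reply; the ISAAC_SIM_DIR and ISAAC_EXECUTE variables and the function name are hypothetical helpers. By default it only prints the command so it can be reviewed first.

```shell
#!/bin/sh
# Host directory holding the mounted folders (hypothetical variable; the
# default matches the paths used in the original command).
ISAAC_SIM_DIR="${ISAAC_SIM_DIR:-$HOME/docker/isaac-sim}"

isaac_kitremote() {
  # Same options as the original command, plus the Kit Remote entrypoint.
  set -- docker run --gpus all -e "ACCEPT_EULA=Y" --rm --network=host \
    --entrypoint ./runheadless.kitremote.sh \
    -v "$ISAAC_SIM_DIR/documents:/root/Documents:rw" \
    -v "$ISAAC_SIM_DIR/cache:/root/.cache/ov:rw" \
    -v "$ISAAC_SIM_DIR/logs:/root/.nvidia-omniverse/logs:rw" \
    -v "$ISAAC_SIM_DIR/data:/root/.local/share/ov/data:rw" \
    nvcr.io/nvidia/isaac-sim:2021.1.1
  if [ "${ISAAC_EXECUTE:-0}" = "1" ]; then
    sudo "$@"      # actually start the container
  else
    echo "$*"      # dry run: print the command only
  fi
}

isaac_kitremote
```

Setting ISAAC_EXECUTE=1 before running the script starts the container instead of printing the command.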

Thank you for your response. It worked. But I got another segmentation fault while working on it.

I was trying to load Dofbot sample application.

It did not work when I followed the steps in the link, DofBot Sample Application — Omniverse Robotics documentation. That is, the DofBot did not load.

The steps I followed were:
Isaac Examples → Controlling → Manipulation → DofBot Picking.

Since that was not working, I went into the examples, and selected DofBot.usd (Isaac/Robots/Dofbot/dofbot.usd), and it crashed. (1st log)

I tried to run the docker run command again, with the --entrypoint flag, and it crashed again and again. (logs 2 and 3).

Can you please suggest what I should do?

docker_logs_2.zip (1.6 MB)

Below is the crash log for the 1st case.

2021-10-11 20:15:29 [3,176,261ms] [Fatal] [carb.crashreporter-breakpad.plugin] libcarb.tasking.plugin.so!std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (carb::tasking::Scheduler::)(unsigned int, int, carb::cpp20::latch), carb::tasking::Scheduler*, unsigned int, int, carb::cpp20::latch*> > >::_M_run()
2021-10-11 20:15:29 [3,176,269ms] [Fatal] [carb.crashreporter-breakpad.plugin] libomni.usd.so!std::_Function_base::_Base_manageromni::usd::OmniUsdMessenger::Impl::Impl(omni::usd::UsdContext*):{lambda(carb::events::IEvent*)#1}::_M_manager(std::_Any_data&, std::_Function_base::_Base_manageromni::usd::OmniUsdMessenger::Impl::Impl(omni::usd::UsdContext*):{lambda(carb::events::IEvent*)#1} const&, std::_Manager_operation)
2021-10-11 20:15:29 [3,176,278ms] [Fatal] [carb.crashreporter-breakpad.plugin] libpthread.so.0!funlockfile
2021-10-11 20:15:29 [3,176,285ms] [Fatal] [carb.crashreporter-breakpad.plugin] librtx.neuraylib.plugin.so!std::basic_string<char, std::char_traits, std::allocator > std::operator+<char, std::char_traits, std::allocator >(std::basic_string<char, std::char_traits, std::allocator >&&, char const*)
2021-10-11 20:15:29 [3,176,294ms] [Fatal] [carb.crashreporter-breakpad.plugin] librtx.neuraylib.plugin.so!std::__detail::_Compiler<std::regex_traits >::_M_assertion()
2021-10-11 20:15:29 [3,176,300ms] [Fatal] [carb.crashreporter-breakpad.plugin] librtx.neuraylib.plugin.so!std::__detail::_Compiler<std::regex_traits >::_M_assertion()
2021-10-11 20:15:29 [3,176,307ms] [Fatal] [carb.crashreporter-breakpad.plugin] libomni.usd.so!std::basic_string<char, std::char_traits, std::allocator > std::operator+<char, std::char_traits, std::allocator >(std::basic_string<char, std::char_traits, std::allocator >&&, char const*)
2021-10-11 20:15:29 [3,176,313ms] [Fatal] [carb.crashreporter-breakpad.plugin] libomni.usd.so!std::basic_string<char, std::char_traits, std::allocator > std::operator+<char, std::char_traits, std::allocator >(std::basic_string<char, std::char_traits, std::allocator >&&, char const*)
2021-10-11 20:15:29 [3,176,320ms] [Fatal] [carb.crashreporter-breakpad.plugin] libcarb.tasking.plugin.so!std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (carb::tasking::ThreadPool::)(), carb::tasking::ThreadPool> > >::~_State_impl()
2021-10-11 20:15:29 [3,176,327ms] [Fatal] [carb.crashreporter-breakpad.plugin] libcarb.tasking.plugin.so!std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (carb::tasking::ThreadPool::)(), carb::tasking::ThreadPool> > >::~_State_impl()
2021-10-11 20:15:29 [3,176,333ms] [Fatal] [carb.crashreporter-breakpad.plugin] libcarb.tasking.plugin.so!std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (carb::tasking::ThreadPool::)(), carb::tasking::ThreadPool> > >::~_State_impl()
2021-10-11 20:15:29 [3,176,340ms] [Fatal] [carb.crashreporter-breakpad.plugin] libcarb.tasking.plugin.so!make_fcontext
Segmentation fault (core dumped)

Regards,
Sagar Surendran

From the logs, it looks like you do not have a Nucleus server with the Isaac folder. For the examples to work, you need at least a localhost Nucleus installed and the Isaac folder with assets on it. Please see Native Workstation Deployment — Omniverse Robotics documentation.

OK. On this AWS EC2 cloud instance, I am following the cloud deployment steps. I did install Nucleus initially, but do I have to run the Nucleus cloud deployment steps every time I need to run the Isaac Sim container?

Below is the directory structure I created when I first installed it.

ll nucleus_installer/
total 52
drwxrwxr-x 12 ubuntu ubuntu 4096 Sep 7 18:16 ./
drwxr-xr-x 12 ubuntu ubuntu 4096 Oct 11 19:21 ../
drwxrwxr-x 17 ubuntu ubuntu 4096 Sep 7 18:16 Auth/
drwxrwxr-x 12 ubuntu ubuntu 4096 Sep 7 18:16 'Discovery Service'/
drwxrwxr-x 5 ubuntu ubuntu 4096 Sep 7 18:16 Nucleus/
drwxrwxr-x 12 ubuntu ubuntu 4096 Sep 7 18:16 'Search Service'/
drwxrwxr-x 12 ubuntu ubuntu 4096 Sep 7 18:16 Snapshot/
drwxrwxr-x 13 ubuntu ubuntu 4096 Sep 7 18:16 'System Monitor'/
drwxrwxr-x 10 ubuntu ubuntu 4096 Sep 7 18:16 'Tagging Service'/
drwxrwxr-x 15 ubuntu ubuntu 4096 Sep 7 18:16 Thumbnail/
drwxrwxr-x 12 ubuntu ubuntu 4096 Sep 7 18:16 Web/
-rw-r--r-- 1 ubuntu ubuntu 1498 Apr 9 2021 launcher.toml
drwxrwxr-x 5 ubuntu ubuntu 4096 Sep 7 18:16 setup/

Regards,
Sagar Surendran

You can run setup/nucleus-setup -i each time the instance is started. If Nucleus is still not running, try running setup/System\ Monitor/omni-system-monitor.

Thank you Sheikh, it is working fine now. Have a great rest of the day,

Sagar Surendran

Awesome! Have a great day!

Is there a way to check whether Nucleus is running, similar to running nvidia-smi to check the status of CUDA?

Yes. Run the command below to check:

ps aux | grep omni-nucleus

You can restart the services by running these commands:

setup/nucleus-setup -u
setup/nucleus-setup -i
setup/System\ Monitor/omni-system-monitor
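
The check can also be wrapped in a small script, for example to run each time the instance starts. This is only a sketch, assuming the Nucleus processes contain omni-nucleus in their command line as in the ps command above; the function name is illustrative.

```shell
#!/bin/sh
# Succeeds if the process listing on stdin contains an omni-nucleus process.
# The grep -v drops any grep command that appears in the listing itself.
nucleus_in_listing() {
  grep -v 'grep' | grep 'omni-nucleus' > /dev/null
}

if ps aux | nucleus_in_listing; then
  echo "Nucleus is running"
else
  echo "Nucleus is not running; start it with: setup/nucleus-setup -i"
fi
```

The script only reports the status; the restart commands above still have to be run from the installer directory.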

Ok, thank you

Hi Sheikh,

I find it difficult to work over the WebSocket connection for long periods; I am still getting core dumps. I went idle for some time, and when I came back I was unable to continue working on the Dofbot. Is this an instability issue with running the container? Do you recommend using only the standalone version?

Please find the logs attached. kit_20211012_235226.log (15.8 MB)

Regards,
Sagar Surendran

Yes, the Launcher version is recommended instead of the container if you have a workstation with a monitor attached.

I do not have a workstation with a monitor. However, does the Launcher version work fine in virtual machines with NVIDIA A30/A40 GPUs?

You can run the Launcher version if you have a monitor connected to the host and GPU passthrough to the VM. Without a monitor, it is recommended to run Isaac Sim headless.