Isaac SDK 2021.1 Jetbot Jupyter Notebook Inference Sim Example crashes

Hello.

I am following the “Running Inference in Simulation” Jetbot sample application in Jupyter Notebook with Isaac SDK 2021.1 + Isaac Sim on an Ubuntu 18.04 workstation. I keep running into the following error when attempting to use the inference model:

"TRT ERROR: ../rtSafe/cuda/caskUtils.cpp (98) - Assertion Error in trtSmToCask: 0 (Unsupported SM.)"

Here are a few more console lines from when I start the Isaac SDK app until the error:

2021-07-09 14:05:29.295 WARN  external/com_nvidia_isaac_engine/engine/alice/backend/codelet_canister.cpp@229: Codelet '_engine_tcp_udp_asio_context/isaac.alice.AsioContext' was not added to scheduler because no tick method is specified.
2021-07-09 14:05:29.295 INFO  packages/sight/WebsightServer.cpp@247: Sight webserver is loaded
2021-07-09 14:05:29.295 INFO  packages/sight/WebsightServer.cpp@248: Please open Chrome Browser and navigate to http://<ip address>:3000
2021-07-09 14:05:29.295 WARN  external/com_nvidia_isaac_engine/engine/alice/backend/codelet_canister.cpp@229: Codelet 'websight/isaac.sight.AliceSight' was not added to scheduler because no tick method is specified.
2021-07-09 14:05:29.295 WARN  external/com_nvidia_isaac_engine/engine/alice/components/Codelet.cpp@53: Function deprecated. Set tick_period to the desired tick parameter
2021-07-09 14:05:29.359 WARN  external/com_nvidia_isaac_engine/engine/alice/backend/codelet_canister.cpp@229: Codelet '_check_operating_system/isaac.alice.CheckOperatingSystem' was not added to scheduler because no tick method is specified.
2021-07-09 14:05:29.359 WARN  external/com_nvidia_isaac_engine/engine/alice/components/Codelet.cpp@53: Function deprecated. Set tick_period to the desired tick parameter
2021-07-09 14:05:29.359 WARN  external/com_nvidia_isaac_engine/engine/alice/components/Codelet.cpp@53: Function deprecated. Set tick_period to the desired tick parameter
2021-07-09 14:05:29.359 WARN  packages/ml/TensorRTInference.cpp@463: Could not read from .plan (path is set to 'external/jetbot_ball_detection_resnet_model/jetbot_ball_detection_resnet18.plan'). Falling back to building the .plan from frozen model external/jetbot_ball_detection_resnet_model/jetbot_ball_detection_resnet18.etlt. Note: this process may take up to several minutes.
2021-07-09 14:05:32.248 INFO  packages/sight/WebsightServer.cpp@117: Server connected / 1
2021-07-09 14:05:39.289 INFO  external/com_nvidia_isaac_engine/engine/alice/backend/allocator_backend.cpp@57: Optimized memory CPU allocator.
2021-07-09 14:05:39.289 INFO  external/com_nvidia_isaac_engine/engine/alice/backend/allocator_backend.cpp@66: Optimized memory CUDA allocator.
[I 14:06:31.611 NotebookApp] Saving file at /jetbot_notebook.ipynb
2021-07-09 14:07:10.991 WARN  packages/ml/TensorRTInference.cpp@174: TRT WARNING: Half2 support requested on hardware without native FP16 support, performance will be negatively affected.
2021-07-09 14:09:28.756 ERROR packages/ml/TensorRTInference.cpp@171: TRT ERROR: ../rtSafe/cuda/caskUtils.cpp (98) - Assertion Error in trtSmToCask: 0 (Unsupported SM.)
2021-07-09 14:09:28.756 PANIC packages/ml/TensorRTInference.cpp@258: Failed to build TensorRT engine from external/jetbot_ball_detection_resnet_model/jetbot_ball_detection_resnet18.etlt.
====================================================================================================
|                            Isaac application terminated unexpectedly                             |
====================================================================================================

Running nvidia-smi on the workstation, I can see that the card is recognized:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.42.01    Driver Version: 470.42.01    CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA RTX A6000    On   | 00000000:01:00.0  On |                  Off |
| 61%   84C    P2   203W / 300W |   4190MiB / 48662MiB |     75%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

Running nvcc --version, I can see that the correct CUDA toolkit version is linked:

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Wed_Oct_23_19:24:38_PDT_2019
Cuda compilation tools, release 10.2, V10.2.89

Running the python3 .../isaac/engine/engine/build/scripts/version_checker.py script reports that all dependencies are fulfilled (I think the recommended TRT v7.1.5 is a typo, since Isaac SDK uses TRT v7.1.3 and there is no TRT v7.1.5):

---------------------------------------------------------------
|Package             |Recommended Version |Current Version     |
---------------------------------------------------------------
|OS                  |Ubuntu 18.04.2 LTS  |Ubuntu 18.04.5 LTS  |
|Bazel               |3.1.0               |3.1.0               |
|GPU_Driver          |>=440               |470.42.01           |
|Cuda                |10.2.x              |10.2.89             |
|Cudnn               |8.0.3.x             |8.0.3.33            |
|TensorRt            |7.1.5               |7.1.3               |
|TensorFlow          |1.15.0              |1.15.0              |
|pycapnp             |>=0.6.3             |0.6.4               |
|librosa             |>=0.6.3             |0.8.0               |
|SoundFile           |>=0.10.2            |0.10.3.post1        |
|Python2             |2.7.x               |2.7.17              |
|Python3             |3.6.x               |3.6.9               |
---------------------------------------------------------------

Am I missing something here?

It’s odd to me that TRT says I do not have hardware with native FP16 support even though the driver recognizes the RTX A6000 card. I can run the previous example, “Remote control Jetbot using Virtual Gamepad”, perfectly fine, but running the perception model keeps giving me that error.

I also tried to run this in the Isaac SDK Docker container and I’m running into the same problem, but I noticed the container didn’t have TRT when I ran the version checker.

Does anyone have any idea what could be going on? Or can anyone recommend which TRT version to use? I suspect that TRT is the issue. Thanks in advance!

I am facing a similar issue after running the following command (to run the pick and place example application) in an Isaac SDK docker container:

bazel run //apps/samples/pick_and_place -- --arm ur10

The error I receive is shown below:

2021-08-03 21:59:23.754 WARN  external/com_nvidia_isaac_engine/engine/alice/backend/codelet_canister.cpp@229: Codelet '_engine_tcp_udp_asio_context/isaac.alice.AsioContext' was not added to scheduler because no tick method is specified.
2021-08-03 21:59:23.754 INFO  packages/sight/WebsightServer.cpp@247: Sight webserver is loaded
2021-08-03 21:59:23.754 INFO  packages/sight/WebsightServer.cpp@248: Please open Chrome Browser and navigate to http://<ip address>:3000
2021-08-03 21:59:23.754 WARN  external/com_nvidia_isaac_engine/engine/alice/backend/codelet_canister.cpp@229: Codelet 'websight/isaac.sight.AliceSight' was not added to scheduler because no tick method is specified.
2021-08-03 21:59:23.755 WARN  external/com_nvidia_isaac_engine/engine/alice/backend/codelet_canister.cpp@229: Codelet 'controller.kinematic_tree/KinematicTree' was not added to scheduler because no tick method is specified.
2021-08-03 21:59:23.755 DEBUG external/com_nvidia_isaac_engine/engine/alice/backend/event_manager.cpp@40: Stopping node 'pose_initializer' because it reached status 'SUCCESS'
2021-08-03 21:59:23.755 WARN  external/com_nvidia_isaac_engine/engine/alice/components/Codelet.cpp@53: Function deprecated. Set tick_period to the desired tick parameter
2021-08-03 21:59:23.755 WARN  external/com_nvidia_isaac_engine/engine/alice/components/Codelet.cpp@53: Function deprecated. Set tick_period to the desired tick parameter
2021-08-03 21:59:23.787 WARN  external/com_nvidia_isaac_engine/engine/alice/backend/codelet_canister.cpp@229: Codelet '_check_operating_system/isaac.alice.CheckOperatingSystem' was not added to scheduler because no tick method is specified.
2021-08-03 21:59:23.787 WARN  external/com_nvidia_isaac_engine/engine/alice/components/Codelet.cpp@53: Function deprecated. Set tick_period to the desired tick parameter
2021-08-03 21:59:23.787 WARN  external/com_nvidia_isaac_engine/engine/alice/components/Codelet.cpp@53: Function deprecated. Set tick_period to the desired tick parameter
2021-08-03 21:59:23.788 WARN  packages/ml/TensorRTInference.cpp@463: Could not read from .plan (path is set to 'external/sortbot_pose_estimation_models/resnet18_detector_kltSmall.plan'). Falling back to building the .plan from frozen model external/sortbot_pose_estimation_models/resnet18_detector_kltSmall.etlt. Note: this process may take up to several minutes.
2021-08-03 21:59:33.749 INFO  external/com_nvidia_isaac_engine/engine/alice/backend/allocator_backend.cpp@57: Optimized memory CPU allocator.
2021-08-03 21:59:33.749 INFO  external/com_nvidia_isaac_engine/engine/alice/backend/allocator_backend.cpp@66: Optimized memory CUDA allocator.
2021-08-03 22:02:24.683 WARN  packages/ml/TensorRTInference.cpp@174: TRT WARNING: Half2 support requested on hardware without native FP16 support, performance will be negatively affected.
2021-08-03 22:04:51.812 ERROR packages/ml/TensorRTInference.cpp@171: TRT ERROR: ../rtSafe/cuda/caskUtils.cpp (98) - Assertion Error in trtSmToCask: 0 (Unsupported SM.)
2021-08-03 22:04:51.812 PANIC packages/ml/TensorRTInference.cpp@258: Failed to build TensorRT engine from external/sortbot_pose_estimation_models/resnet18_detector_kltSmall.etlt.

Running python3 .../isaac/engine/engine/build/scripts/version_checker.py shows:

---------------------------------------------------------------
|Package             |Recommended Version |Current Version     |
---------------------------------------------------------------
|OS                  |Ubuntu 18.04.2 LTS  |Ubuntu 18.04.5 LTS  |
|Bazel               |3.1.0               |3.1.0               |
|GPU_Driver          |>=440               |470.57.02           |
|Cuda                |10.2.x              |10.2.89             |
|Cudnn               |8.0.3.x             |8.0.3.33            |
|TensorRt            |7.1.5               |7.1.3.4             |
|TensorFlow          |1.15.0              |1.15.0              |
|pycapnp             |>=0.6.3             |0.6.4               |
|librosa             |>=0.6.3             |0.8.0               |
|SoundFile           |>=0.10.2            |0.10.3.post1        |
|Python2             |2.7.x               |2.7.17              |
|Python3             |3.6.x               |3.6.9               |
---------------------------------------------------------------

Update: I changed the command a little bit to invoke another section of the code and then the example worked fine. Here is the updated command:

bazel run //apps/samples/pick_and_place -- --arm ur10 --groundtruth

I think the previous command failed in the following section of the code (I believe the code inside the second else is what causes the problem):

else:
        # Pose estimation
        if args.arm == 'franka':
            perception_interface = create_block_pose_estimation(app, use_refinement=args.refinement)
            app.connect(driver_out, "depth", perception_interface, "depth")
        else:
            app.load(
                'packages/object_pose_estimation/apps/pose_cnn_decoder' \
                '/detection_pose_estimation_cnn_inference.subgraph.json',
                prefix='detection_pose_estimation')
            perception_interface = app.nodes['detection_pose_estimation.interface']['Subgraph']
            app.load('apps/samples/pick_and_place/smallKLT_detection_pose_estimation.config.json')

        app.connect(driver_out, 'color', perception_interface, "color")
        app.connect(driver_out, 'color_intrinsics', perception_interface, "intrinsics")

        app.connect(perception_interface, 'output_poses',
                    app.nodes['pick_task.perceive_object']['WaitUntilDetection'], 'detections')
        app.connect(perception_interface, 'output_poses',
                    app.nodes['pick_task.detections_to_pose_tree']['DetectionsToPoseTree'],
                    'detections')

I found out my issue. This will apply to anyone attempting to build an ETLT model on an RTX 30xx or A6000 card as a result of running some of the Isaac SDK examples.

It appears CUDA 10.2 does not support the SM architecture of the RTX 30xx or A6000 cards (Ampere, compute capability 8.6), at least not yet. I got this from reading the following issues:

I tried to install CUDA 11+, which supports the SM architecture, but quickly realized that since Isaac SDK is built on CUDA 10.2, these example programs wouldn’t run using CUDA 11.
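
To make the mismatch concrete, here is a minimal sketch of the check that effectively fails. The table and the function name toolkit_supports are my own illustration, not part of Isaac SDK or TensorRT. The entries are from memory of the CUDA release notes (as far as I know, compute capability 8.6 needs toolkit 11.1 or newer), so treat them as assumptions and verify against the official documentation.

```python
# Minimum CUDA toolkit release that can target each compute capability.
# Hand-written table for illustration only; verify against the CUDA docs.
MIN_TOOLKIT_FOR_CAPABILITY = {
    (7, 5): (10, 0),  # Turing, e.g. RTX 2080
    (8, 0): (11, 0),  # Ampere, e.g. A100
    (8, 6): (11, 1),  # Ampere, e.g. RTX 30xx and RTX A6000
}


def toolkit_supports(toolkit, capability):
    """Return True if the toolkit (major, minor) can target the capability."""
    required = MIN_TOOLKIT_FOR_CAPABILITY.get(capability)
    if required is None:
        raise KeyError(f"capability {capability} is not in the table")
    # Tuples compare element by element, so (10, 2) < (11, 1).
    return toolkit >= required
```

With the toolkit from the nvcc output above, toolkit_supports((10, 2), (8, 6)) is False for the A6000, while toolkit_supports((10, 2), (7, 5)) is True for an RTX 2080.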

My solution was to install an RTX 2080 in my workstation alongside my RTX A6000 and export the following environment variable:

export CUDA_VISIBLE_DEVICES=1

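where 1 is the index of the RTX 2080 as CUDA enumerates the devices (by default this ordering can differ from the one nvidia-smi shows, unless CUDA_DEVICE_ORDER=PCI_BUS_ID is also set). This lets Isaac SDK run on the 2080, and it built the ETLT model properly.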
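
For the Jupyter Notebook samples, the same thing can probably be done from the first notebook cell instead of the shell. This is a sketch under the assumption that the Isaac application is created inside the notebook kernel process and that nothing has initialized CUDA yet; pin_cuda_device is a hypothetical helper name, not an Isaac SDK function.

```python
import os


def pin_cuda_device(index, env=os.environ):
    """Restrict CUDA to a single GPU. Call before anything initializes CUDA."""
    # Make CUDA number the devices in PCI bus order, matching nvidia-smi.
    env["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
    # Expose only the chosen device to this process and its children.
    env["CUDA_VISIBLE_DEVICES"] = str(index)
    return env["CUDA_VISIBLE_DEVICES"]
```

Calling pin_cuda_device(1) at the top of the notebook would then correspond to the export above. If the variables are set after CUDA has already been initialized in the process, they have no effect, so a kernel restart may be needed.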

Hopefully a patch will come in soon so we can run Isaac SDK on the Quadro or newer 30xx series cards.

Thanks. Your answer made me realize that my issue probably cannot be solved either, since my machine has an RTX 3080.