AGX Orin 64GB: torch.compile fails with "cannot pickle '_thread.RLock'" on GR00T N1.6

Hi all,

I'm getting started with GR00T N1.6 on a Jetson AGX Orin 64GB. I tried running benchmark_inference in PyTorch (eager) mode with the command below:

python scripts/deployment/benchmark_inference.py \
  --model_path weights/GR00T-N1.6-3B \
  --dataset_path demo_data/gr1.PickNPlace \
  --embodiment_tag gr1 \
  --num_iterations 100 \
  --warmup 10 \
  --use_trajectory \
  --skip_compile

Everything works fine. But when I try torch.compile mode by removing the --skip_compile option from the command above, I get an error from this block of code:

# PyTorch mode with torch.compile
policy.model.action_head.model.forward = torch.compile(
    policy.model.action_head.model.forward, mode="max-autotune"
)
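For context, mode="max-autotune" asks Inductor to benchmark Triton kernel variants, and that autotuning is farmed out to a pool of compile workers. As an isolation step it might be worth trying a stock mode that does less autotuning; this is just a sketch I haven't run yet:

# Isolation step (untried): "reduce-overhead" is a stock torch.compile
# mode that skips max-autotune's exhaustive Triton autotuning.
policy.model.action_head.model.forward = torch.compile(
    policy.model.action_head.model.forward, mode="reduce-overhead"
)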

The error log from the max-autotune run looks like this:

Traceback (most recent call last):
  File "/mnt/modelopt/project/gr00ttn1.6_tcp/scripts/deployment/benchmark_inference.py", line 578, in <module>
    main()
  File "/mnt/modelopt/project/gr00ttn1.6_tcp/scripts/deployment/benchmark_inference.py", line 484, in main
    times_components = benchmark_components(
  File "/mnt/modelopt/project/gr00ttn1.6_tcp/scripts/deployment/benchmark_inference.py", line 214, in benchmark_components
    _ = policy.model.action_head.get_action(backbone_outputs, action_inputs)
  File "/mnt/modelopt/project/gr00ttn1.6_tcp/.venv/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
    return func(*args, **kwargs)
  File "/mnt/modelopt/project/gr00ttn1.6/gr00t/model/gr00t_n1d6/gr00t_n1d6.py", line 384, in get_action
    return self.get_action_with_features(
  File "/mnt/modelopt/project/gr00ttn1.6_tcp/.venv/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
    return func(*args, **kwargs)
  File "/mnt/modelopt/project/gr00ttn1.6/gr00t/model/gr00t_n1d6/gr00t_n1d6.py", line 339, in get_action_with_features
    model_output = self.model(
  File "/mnt/modelopt/project/gr00ttn1.6_tcp/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1773, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/mnt/modelopt/project/gr00ttn1.6_tcp/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1784, in _call_impl
    return forward_call(*args, **kwargs)
  File "/mnt/modelopt/project/gr00ttn1.6_tcp/.venv/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py", line 816, in compile_wrapper
    raise e.remove_dynamo_frames() from None  # see TORCHDYNAMO_VERBOSE=1
  File "/mnt/modelopt/project/gr00ttn1.6_tcp/.venv/lib/python3.10/site-packages/torch/_inductor/compile_fx.py", line 952, in _compile_fx_inner
    raise InductorError(e, currentframe()).with_traceback(
  File "/mnt/modelopt/project/gr00ttn1.6_tcp/.venv/lib/python3.10/site-packages/torch/_inductor/compile_fx.py", line 936, in _compile_fx_inner
    mb_compiled_graph = fx_codegen_and_compile(
  File "/mnt/modelopt/project/gr00ttn1.6_tcp/.venv/lib/python3.10/site-packages/torch/_inductor/compile_fx.py", line 1616, in fx_codegen_and_compile
    return scheme.codegen_and_compile(gm, example_inputs, inputs_to_check, graph_kwargs)
  File "/mnt/modelopt/project/gr00ttn1.6_tcp/.venv/lib/python3.10/site-packages/torch/_inductor/compile_fx.py", line 1479, in codegen_and_compile
    compiled_module = graph.compile_to_module()
  File "/mnt/modelopt/project/gr00ttn1.6_tcp/.venv/lib/python3.10/site-packages/torch/_inductor/graph.py", line 2310, in compile_to_module
    return self._compile_to_module()
  File "/mnt/modelopt/project/gr00ttn1.6_tcp/.venv/lib/python3.10/site-packages/torch/_inductor/graph.py", line 2320, in _compile_to_module
    mod = self._compile_to_module_lines(wrapper_code)
  File "/mnt/modelopt/project/gr00ttn1.6_tcp/.venv/lib/python3.10/site-packages/torch/_inductor/graph.py", line 2388, in _compile_to_module_lines
    mod = PyCodeCache.load_by_key_path(
  File "/mnt/modelopt/project/gr00ttn1.6_tcp/.venv/lib/python3.10/site-packages/torch/_inductor/codecache.py", line 3360, in load_by_key_path
    mod = _reload_python_module(key, path, set_sys_modules=in_toplevel)
  File "/mnt/modelopt/project/gr00ttn1.6_tcp/.venv/lib/python3.10/site-packages/torch/_inductor/runtime/compile_tasks.py", line 31, in _reload_python_module
    exec(code, mod.__dict__, mod.__dict__)
  File "/workspace/tmp/torchinductor_modelopt/oe/coevhwlsmqou4psjptsf34qsaaw7nake56l6iu6qjvtt2bxflqgo.py", line 1254, in <module>
    async_compile.wait(globals())
  File "/mnt/modelopt/project/gr00ttn1.6_tcp/.venv/lib/python3.10/site-packages/torch/_inductor/async_compile.py", line 573, in wait
    self._wait_futures(scope)
  File "/mnt/modelopt/project/gr00ttn1.6_tcp/.venv/lib/python3.10/site-packages/torch/_inductor/async_compile.py", line 593, in _wait_futures
    kernel = result.result()
  File "/mnt/modelopt/project/gr00ttn1.6_tcp/.venv/lib/python3.10/site-packages/torch/_inductor/codecache.py", line 4095, in result
    return self.result_fn()
  File "/mnt/modelopt/project/gr00ttn1.6_tcp/.venv/lib/python3.10/site-packages/torch/_inductor/async_compile.py", line 452, in get_result
    kernel, elapsed_us = task.result()
  File "/usr/lib/python3.10/concurrent/futures/_base.py", line 451, in result
    return self.__get_result()
  File "/usr/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
    raise self._exception
torch._inductor.exc.InductorError: TypeError: cannot pickle '_thread.RLock' object

Set TORCHDYNAMO_VERBOSE=1 for the internal stack trace (please do this especially if you're reporting a bug to PyTorch). For even more developer context, set TORCH_LOGS="+dynamo"
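As noted above, max-autotune routes kernel builds through Inductor's async_compile worker pool, and the traceback shows the crash happens while pickling one of those compile tasks out to a worker. I haven't verified that it avoids the error, but a sketch of forcing in-process compilation (compile_threads is a stock torch._inductor.config option; TORCHINDUCTOR_COMPILE_THREADS=1 is the env-var equivalent) would be:

import torch
import torch._inductor.config as inductor_config

# Compile kernels in the main process instead of pickling compile
# tasks out to a subprocess pool; the pickle step is what raises
# "cannot pickle '_thread.RLock' object".
inductor_config.compile_threads = 1

policy.model.action_head.model.forward = torch.compile(
    policy.model.action_head.model.forward, mode="max-autotune"
)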

I tried updating the block above to:

# PyTorch mode with torch.compile
policy.model = torch.compile(
    policy.model, mode="max-autotune"
)

This version runs, but the results are identical to PyTorch eager mode, with no performance improvement at all. My target is the NVIDIA-reported numbers below, and my eager numbers already match the report. (Reading the rows, the first three timings appear to be per-stage latencies, the fourth the end-to-end latency, and the last column the resulting control frequency, i.e. 1000 ms / 300 ms ≈ 3.3 Hz.)

Orin   PyTorch Eager   6 ms   93 ms   202 ms   300 ms   3.3 Hz
Orin   torch.compile   6 ms   93 ms   101 ms   199 ms   5.0 Hz

In practice I get exactly the eager numbers, so I think this change is the wrong approach. Has anyone run into the same problem?
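My guess is that compiling the whole policy.model hits graph breaks (dict-shaped inputs, data-dependent control flow), so Dynamo silently falls back to eager for most of the model. That would explain both the unchanged timings and why this variant doesn't crash: the autotuned kernels that failed to pickle may simply never get built. A quick way to check, using stock PyTorch logging (a sketch, not something from the GR00T scripts):

import torch
import torch._logging

# Surface graph breaks and recompiles so it's visible whether Dynamo
# actually captured the model or quietly fell back to eager execution.
torch._logging.set_logs(graph_breaks=True, recompiles=True)

policy.model = torch.compile(policy.model, mode="max-autotune")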

These are my dependencies:

absl-py==2.4.0
accelerate==1.2.1
aiosignal==1.4.0
albucore==0.0.17
albumentations==1.4.18
annotated-types==0.7.0
antlr4-python3-runtime==4.9.3
asttokens==3.0.1
astunparse==1.6.3
attrs==25.4.0
av==12.3.0
blessings==1.7
certifi==2026.2.25
charset-normalizer==3.4.4
click==8.3.1
cloudpickle==3.1.2
comm==0.2.3
contourpy==1.3.2
cramjam==2.11.0
cycler==0.12.1
debugpy==1.8.20
decorator==5.2.1
decord==0.6.0
diffusers==0.35.0
distro==1.9.0
dm-tree==0.1.8
docker-pycreds==0.4.0
docstring-parser==0.17.0
einops==0.8.2
eval-type-backport==0.3.1
exceptiongroup==1.3.1
executing==2.2.1
farama-notifications==0.0.4
fastparquet==2024.11.0
filelock==3.25.0
flash-attn==2.8.2
flatbuffers==25.12.19
fonttools==4.61.1
frozenlist==1.8.0
fsspec==2026.2.0
gast==0.7.0
gitdb==4.0.12
gitpython==3.1.46
google-pasta==0.2.0
-e file:///mnt/modelopt/project/gr00ttn1.6
grpcio==1.78.0
gymnasium==1.0.0
h5py==3.12.1
hf-xet==1.3.2
huggingface-hub==0.36.2
hydra-core==1.3.2
idna==3.11
imageio==2.34.2
importlib-metadata==8.7.1
iniconfig==2.3.0
iopath==0.1.9
ipykernel==7.2.0
ipython==8.38.0
jedi==0.19.2
jetson-stats==4.3.2
jinja2==3.1.6
jsonschema==4.26.0
jsonschema-specifications==2025.9.1
jupyter-client==8.8.0
jupyter-core==5.9.1
keras==3.12.1
kiwisolver==1.4.9
kornia==0.7.4
kornia-rs==0.1.10
lazy-loader==0.4
libclang==18.1.1
llvmlite==0.46.0
lmdb==1.8.1
mako==1.3.10
markdown==3.10.2
markdown-it-py==4.0.0
markupsafe==3.0.3
matplotlib==3.10.0
matplotlib-inline==0.2.1
mdurl==0.1.2
ml-dtypes==0.4.1
mpmath==1.3.0
msgpack==1.1.2
namex==0.1.0
nest-asyncio==1.6.0
networkx==3.4.2
ninja==1.13.0
numba==0.64.0
numpy==1.26.4
numpydantic==1.6.7
-e file:///home/modelopt/workspace/project/Model-Optimizer
nvtx==0.2.14
omegaconf==2.3.0
onnx==1.18.0
opencv-python==4.11.0.86
opencv-python-headless==4.11.0.86
opt-einsum==3.4.0
optree==0.19.0
packaging==26.0
pandas==2.2.3
parso==0.8.6
peft==0.17.0
pettingzoo==1.25.0
pexpect==4.9.0
pillow==12.1.1
pip==26.0.1
platformdirs==4.9.2
pluggy==1.6.0
portalocker==3.2.0
prompt-toolkit==3.0.52
protobuf==4.25.1
psutil==7.2.2
ptyprocess==0.7.0
pulp==3.3.0
pure-eval==0.2.3
pyarrow==14.0.1
pydantic==2.10.6
pydantic-core==2.27.2
pygments==2.19.2
pyparsing==3.3.2
pytest==9.0.2
python-dateutil==2.9.0.post0
pytools==2025.2.5
pytorch3d @ git+https://github.com/facebookresearch/pytorch3d.git@33824be3cbc87a7dd1db0f6a9a9de9ac81b2d0ba
pytz==2026.1.post1
pyyaml==6.0.2
pyzmq==27.1.0
ray==2.40.0
referencing==0.37.0
regex==2026.2.28
requests==2.32.3
rich==14.3.3
rpds-py==0.30.0
safetensors==0.7.0
scikit-image==0.25.2
scipy==1.15.3
sentry-sdk==2.54.0
setproctitle==1.3.7
setuptools==82.0.0
shtab==1.8.0
siphash24==1.8
six==1.17.0
smbus2==0.6.0
smmap==5.0.2
stack-data==0.6.3
sympy==1.14.0
tensorboard==2.18.0
tensorboard-data-server==0.7.2
tensorflow==2.18.0
tensorflow-io-gcs-filesystem==0.37.1
termcolor==3.3.0
tianshou==0.5.1
tifffile==2025.5.10
timm==1.0.14
tokenizers==0.21.4
tomli==2.4.0
torch==2.8.0
torchvision==0.23.0
tornado==6.5.4
tqdm==4.67.1
traitlets==5.14.3
transformers==4.51.0
triton==3.6.0
typeguard==4.4.2
typing-extensions==4.15.0
tyro==0.9.17
tzdata==2025.3
urllib3==2.6.3
wandb==0.18.0
wcwidth==0.6.0
werkzeug==3.1.6
wheel==0.46.3
wrapt==2.1.1
zipp==3.23.0

Thank you!

Are you running scripts/activate_orin.sh after each Orin login or SSH session, and before running standalone_inference_script.py?

https://github.com/NVIDIA/Isaac-GR00T/blob/main/scripts/activate_orin.sh
# among other things, this script sets these variables:
    export TRITON_PTXAS_PATH=/usr/local/cuda/bin/ptxas
    export CUDA_HOME=/usr/local/cuda
    export CUDA_PATH=/usr/local/cuda
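
A quick sanity check that those variables are actually visible to the Python process that runs torch.compile (just a sketch):

import os
import shutil

# Verify the Triton/CUDA variables exported by activate_orin.sh made it
# into this process, and that ptxas is resolvable; Triton needs ptxas
# to assemble kernels for Orin (sm_87).
for var in ("TRITON_PTXAS_PATH", "CUDA_HOME", "CUDA_PATH"):
    print(f"{var} = {os.environ.get(var)}")
print("ptxas on PATH:", shutil.which("ptxas"))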

This is a native Jetson Orin setup, not Docker:

bash scripts/deployment/orin/install_deps.sh
source .venv/bin/activate
source scripts/activate_orin.sh


I just built the Docker image with

./build.sh --profile=orin

started the container with

docker run -it --rm --net=host --gpus all --runtime nvidia \
    --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 \
    -v /home/scott/.cache:/root/.cache \
    -e HF_TOKEN \
    gr00t-thor:latest

and the following ran to completion without error:

python scripts/deployment/standalone_inference_script.py \
    --model-path nvidia/GR00T-N1.6-3B \
    --dataset-path demo_data/gr1.PickNPlace \
    --embodiment-tag GR1 \
    --traj-ids 0 \
    --inference-mode pytorch \
    --denoising-steps 4

Hi,

Did you set up the environment with the link below?

Thanks.

Thank you so much. I tried using Docker and it worked with no errors!