PyTorch for Jetson

Hi @dusty_nv, I really appreciate the Docker images you release, but I found a problem and solved it.

PROBLEM:
I am using the l4t-pytorch 35.2.1 container. When I export a model from torch to ONNX, functions such as torch.onnx.symbolic_helper._unsqueeze_helper raise an AttributeError: g has no attribute opset.

PROBLEM_LOCATION
I checked the torch docs and the GitHub repo, and PyTorch has changed some code, shown below
(I use _unsqueeze_helper as the example).

  • before 1.13 symbolic_helper.py
def _unsqueeze_helper(g, input, axes_i):
    if _is_constant(axes_i[0]):
        if GLOBALS.export_onnx_opset_version >= 13:
            axes = g.op("Constant", value_t=torch.tensor(axes_i, dtype=torch.long))
            return g.op("Unsqueeze", input, axes)
        return g.op("Unsqueeze", input, axes_i=axes_i)
    # Tensor type
    if GLOBALS.export_onnx_opset_version < 13:
        raise ValueError(
            f"Opset version must be >= 13 for Unsqueeze with dynamic axes. {input.node().sourceRange()}"
        )
    return g.op("Unsqueeze", input, axes_i[0])
  • after 1.13 symbolic_helper.py
@_beartype.beartype
def _unsqueeze_helper(g: jit_utils.GraphContext, input, axes_i):
    if _is_constant(axes_i[0]):
        if g.opset >= 13:
            axes = g.op("Constant", value_t=torch.tensor(axes_i, dtype=torch.long))
            return g.op("Unsqueeze", input, axes)
        return g.op("Unsqueeze", input, axes_i=axes_i)
    # Tensor type
    if g.opset < 13:
        raise errors.SymbolicValueError(
            "Opset version must be >= 13 for Unsqueeze with dynamic axes.", input
        )
    return g.op("Unsqueeze", input, axes_i[0])
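The failure mode can be illustrated without PyTorch installed: the post-1.13 helper reads g.opset, so passing an old-style raw graph object raises AttributeError, while a wrapper object that carries the opset works. A minimal sketch, where RawGraph and GraphContext are simplified stand-ins I made up for this illustration, not the real torch classes:

```python
# Simplified stand-ins for the objects involved; NOT the real
# torch.onnx classes, just sketches of their interfaces.

class RawGraph:
    """Mimics the old-style object passed to symbolic helpers:
    it can create ops but carries no opset information."""
    def op(self, name, *args, **kwargs):
        return (name, args, kwargs)  # placeholder for a graph node

class GraphContext:
    """Mimics the post-1.13 wrapper: same op() interface, plus the
    export opset version as an attribute."""
    def __init__(self, graph, opset):
        self.graph = graph
        self.opset = opset
    def op(self, name, *args, **kwargs):
        return self.graph.op(name, *args, **kwargs)

def unsqueeze_helper(g, input, axes_i):
    """Simplified version of the post-1.13 helper: it reads g.opset."""
    if g.opset >= 13:
        return g.op("Unsqueeze", input, axes_i)
    return g.op("Unsqueeze", input, axes_i=axes_i)

raw = RawGraph()
try:
    unsqueeze_helper(raw, "x", [0])   # old-style call path
except AttributeError as e:
    print(e)                          # no attribute 'opset'

ctx = GraphContext(raw, opset=13)
print(unsqueeze_helper(ctx, "x", [0]))  # works once the graph is wrapped
```

This is exactly the mismatch in the container: the helper expects a GraphContext, but the exporter still hands it the raw graph.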

PROBLEM_SOLVE_METHOD
When a symbolic method is used, torch dispatches it through the _run_symbolic_method function in torch/onnx/utils.py, and PyTorch has modified this function as well. The diff is shown below.

  • _run_symbolic_method before 1.12.0
def _run_symbolic_method(g, op_name, symbolic_fn, args):
    r"""
    This trampoline function gets invoked for every symbolic method
    call from C++.
    """
    try:
        return symbolic_fn(g, *args)
    except TypeError as e:
        # Handle the specific case where we didn't successfully dispatch
        # to symbolic_fn.  Otherwise, the backtrace will have the clues
        # you need.
        e.args = ("{} (occurred when translating {})".format(e.args[0], op_name),)
        raise
  • _run_symbolic_method after 2.0.0
@_beartype.beartype
def _run_symbolic_method(g, op_name, symbolic_fn, args):
    r"""
    This trampoline function gets invoked for every symbolic method
    call from C++.
    """
    try:
        graph_context = jit_utils.GraphContext(
            graph=g,
            block=g.block(),
            opset=GLOBALS.export_onnx_opset_version,
            original_node=None,  # type: ignore[arg-type]
            params_dict=_params_dict,
            env={},
        )
        return symbolic_fn(graph_context, *args)
    except TypeError as e:
        # Handle the specific case where we didn't successfully dispatch
        # to symbolic_fn.  Otherwise, the backtrace will have the clues
        # you need.
        e.args = (f"{e.args[0]} (occurred when translating {op_name})",)
        raise

The solution is very simple… just add

graph_context = jit_utils.GraphContext(
    graph=g,
    block=g.block(),
    opset=GLOBALS.export_onnx_opset_version,
    original_node=None,  # type: ignore[arg-type]
    params_dict=_params_dict,
    env={},
)

to the _run_symbolic_method function in your utils.py, and pass graph_context (instead of g) to symbolic_fn. The model then exports correctly.
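The shape of the fix can be sketched in plain Python: build a context that carries the opset, then dispatch the symbolic function with it instead of the raw graph. The classes below are simplified stand-ins for torch's jit_utils.GraphContext and graph objects, not the real implementation:

```python
# Sketch of the patched trampoline, with simplified stand-in classes
# (NOT the real torch internals) to show the dispatch pattern.

EXPORT_ONNX_OPSET_VERSION = 13  # stands in for GLOBALS.export_onnx_opset_version

class GraphContext:
    """Carries the graph plus the export opset, like jit_utils.GraphContext."""
    def __init__(self, graph, opset, params_dict=None, env=None):
        self.graph = graph
        self.opset = opset
        self.params_dict = params_dict or {}
        self.env = env or {}
    def op(self, name, *args, **kwargs):
        return self.graph.op(name, *args, **kwargs)

class RawGraph:
    def op(self, name, *args, **kwargs):
        return (name, args, kwargs)  # placeholder for a graph node

def run_symbolic_method(g, op_name, symbolic_fn, args):
    """Patched trampoline: wrap g in a context, then dispatch."""
    graph_context = GraphContext(graph=g, opset=EXPORT_ONNX_OPSET_VERSION)
    try:
        return symbolic_fn(graph_context, *args)  # pass the context, not g
    except TypeError as e:
        e.args = (f"{e.args[0]} (occurred when translating {op_name})",)
        raise

# A symbolic function that needs g.opset, like the new _unsqueeze_helper:
def sym_fn(g, x):
    assert g.opset >= 13
    return g.op("Unsqueeze", x)

print(run_symbolic_method(RawGraph(), "unsqueeze", sym_fn, ("x",)))
# ('Unsqueeze', ('x',), {})
```

Because the wrapper exposes the same op() interface, symbolic functions written for either calling convention keep working once the context is passed in.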

TEST_CODE
I tested this change on a Jetson AGX Orin and it works well.

Thanks @1106310035 - it would seem the issue you encountered wasn’t specific to Jetson/aarch64, but rather a bug in the PyTorch JIT/ONNX exporter that they fixed (these happen). Probably the easiest solution for others is to just use PyTorch 2.0/2.1 and, if needed, rebuild the pytorch container for it or use one of my updated images (unless you have a requirement to use PyTorch 1.13 and need to backport these patches to that version).

I didn’t express my point clearly. I believe we should rebuild this image and correct the faulty code. In the 35.2.1 Docker image with PyTorch 2.0, jit_utils.GraphContext is not being used, but it is used in the GitHub repository for PyTorch 2.0. I think the PyTorch 2.0 build in the 35.2.1 image is still following the old code from PyTorch 1.12.0, so jit_utils.GraphContext should be added.

It’s possible the wheel was built before those changes were merged. In that case, use the PyTorch 2.1 wheel instead, along with the containers from here.

Sure, Thanks.
I will try that.

Pre-built PyTorch wheels for JetPack 6 have been posted:

PyTorch v2.2.0

PyTorch v2.1.0

This topic has also been moved to the Jetson Announcements forum for visibility beyond the original Jetson Nano.


The latest PyTorch 2.3 wheels for JetPack 6 have been posted, along with wheels for torchvision and torchaudio:

The default version of CUDA in JetPack 6 is CUDA 12.2, so use those wheels unless you have upgraded to CUDA 12.4. These were built with jetson-containers.