Torch2trt on AGX Orin flashed as Nano

Hello everyone!
I’m facing an issue.
I’m working on a Jetson AGX Orin Developer Kit.
I want to optimize a model created with PyTorch using the torch2trt and trt_pose modules. With the board configured as an AGX Orin 32GB, my code works fine.
So I re-flashed the board emulating an Orin Nano 8GB. When I re-ran my code I got the error line below looping repeatedly, and then the machine froze:

[TRT] [E] 3: [builderConfig.cpp::canRunOnDLA::493] Error Code 3: API Usage Error (Parameter check failed at: optimizer/api/builderConfig.cpp::canRunOnDLA::493, condition: dlaEngineCount > 0

I don’t know if you can help me, but I would at least like to understand what the problem is…
Thank you very much!

Hi,

Which JetPack version do you use?
Since the Orin Nano 4GB/8GB variants were added in TensorRT 8.5, please try it with the latest JetPack 5.1.
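
The error in your log comes from a DLA capability check: an Orin Nano configuration exposes no DLA cores, so dlaEngineCount is 0. As a quick sanity check (just a sketch using the standard TensorRT Python API, not a fix), you can confirm how many DLA cores TensorRT sees:

import tensorrt as trt  # TensorRT Python bindings shipped with JetPack

logger = trt.Logger(trt.Logger.INFO)
builder = trt.Builder(logger)
# Expected to print 0 on an Orin Nano profile, which is what the
# canRunOnDLA check in the log is complaining about.
print("DLA cores visible to TensorRT:", builder.num_DLA_cores)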

Thanks.

Thanks for the reply!
My setup is as follows:

Package: nvidia-jetpack
Version: 5.1-b147
Architecture: arm64
Maintainer: NVIDIA Corporation
Installed-Size: 194
Depends: nvidia-jetpack-runtime (= 5.1-b147), nvidia-jetpack-dev (= 5.1-b147)
Homepage: http://developer.nvidia.com/jetson
Priority: standard
Section: metapackages
Filename: pool/main/n/nvidia-jetpack/nvidia-jetpack_5.1-b147_arm64.deb
Size: 29306
SHA256: 750acd147aa354a2dff225245149c8ac6a3802234157f2185c5d1b6fa9b9d2d9
SHA1: 8363c940eadd7300de57a70e2cd99dd321781b1c
MD5sum: 3da9b145351144eb1588e07f04e1e3d3
Description: NVIDIA Jetpack Meta Package
Description-md5: ad1462289bdbc54909ae109d1d32c0a8

Hi @AastaLLL !
I re-flashed the machine, again as an emulated Orin Nano 8GB, and after re-installing all the dependencies that issue no longer occurred…
But since I did nothing different from before, I don’t know whether there was a problem with the hardware. The only thing is that to install PyTorch I downloaded the wheel from “PyTorch for JetPack” here
And I had to install torchvision==0.10.0 because of some dependency errors. Do you think I did everything right?

By the way, when re-running my code to optimize the NN model, the machine seems frozen (it has been about 30 minutes now). The same thing also happens on the Jetson Nano Developer Kit, and the way I fixed it there was:

sudo swapoff -a
sudo swapon -a
sudo sysctl vm.swappiness=100

So I did the same on the emulated Orin Nano 8GB, but as I said, it freezes…
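
(As a side note, a trivial sketch of how the swap settings could be double-checked before launching the optimization, just by reading procfs:)

# verify that the swappiness change and the swap devices are in place
with open("/proc/sys/vm/swappiness") as f:
    print("vm.swappiness:", f.read().strip())
with open("/proc/swaps") as f:
    print(f.read())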

Hi,

We don’t have PyTorch for JetPack 5.1 yet, but it will be available soon.
Some users have tried the prebuilt wheel for JetPack 5.0.2 and report that it works.
You can use it as a temporary workaround.

For the freeze, do you have any idea how much memory it takes when running on the Nano devkit?
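
If the desktop becomes unresponsive, one option (just a rough sketch, not an official tool) is to start a small logger from an SSH session or a second terminal before launching the conversion, so the peak usage survives a reboot:

# memlog.py - sample /proc/meminfo periodically and append the fields we
# care about to a file that can be inspected after a reboot
import time

with open("memlog.txt", "a") as log:
    while True:
        with open("/proc/meminfo") as f:
            lines = f.read().splitlines()
        wanted = [line for line in lines if line.startswith(("MemAvailable", "SwapFree"))]
        log.write(time.strftime("%H:%M:%S ") + " | ".join(wanted) + "\n")
        log.flush()
        time.sleep(5)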

Thanks.

Hi!
Thanks for the support as always!
Regarding the JetPack version, do you suggest re-flashing the AGX as an Orin Nano 8GB again, or is that not needed?
And should I then install the JetPack components with version 5.0.2?
What I’m asking is: what are the correct steps starting from my current machine setup?

As for the freeze, I’m not able to tell how much memory it takes when running on the Nano devkit… When executing the optimization the Nano also freezes, for roughly 10-20 minutes, and I’m not able to check the memory usage.
As an update, I left my AGX configured as an Orin Nano 8GB turned on all night and it’s still frozen; it has been in that state for 15 hours now. Do you suggest forcing a shutdown and retrying with JetPack 5.0.2?

Just to give more detail (maybe it can help): while using the torch2trt module, the function that is freezing the machine is the one below. After calling:

torch2trt.torch2trt(model, [data], fp16_mode=True, max_workspace_size=1<<25)

the freeze is caused by the function below, in particular at the instruction in the # BUILD ENGINE section (“engine = builder.build_engine(network, config)”), as shown in the listing that follows (a lighter variant of my call is sketched right after it):

def torch2trt(module,
              inputs,
              input_names=None,
              output_names=None,
              log_level=trt.Logger.ERROR,
              fp16_mode=False,
              max_workspace_size=1<<25,
              strict_type_constraints=False,
              keep_network=True,
              int8_mode=False,
              int8_calib_dataset=None,
              int8_calib_algorithm=DEFAULT_CALIBRATION_ALGORITHM,
              use_onnx=False,
              default_device_type=trt.DeviceType.GPU,
              dla_core=0,
              gpu_fallback=True,
              device_types={},
              min_shapes=None,
              max_shapes=None,
              opt_shapes=None,
              onnx_opset=None,
              max_batch_size=None,
              **kwargs):

    # capture arguments to provide to context
    kwargs.update(locals())
    kwargs.pop('kwargs')
        
    # handle inputs as dataset of list of tensors
    if issubclass(inputs.__class__, Dataset):
        dataset = inputs
        if len(dataset) == 0:
            raise ValueError('Dataset must have at least one element to use for inference.')
        inputs = dataset[0]
    else:
        dataset = ListDataset()
        dataset.insert(inputs)
        inputs = dataset[0]

    outputs = module(*inputs)
    input_flattener = Flattener.from_value(inputs)
    output_flattener = Flattener.from_value(outputs)

    # infer default parameters from dataset

    if min_shapes == None:
        min_shapes_flat = [tuple(t) for t in dataset.min_shapes(flat=True)]
    else:
        min_shapes_flat = input_flattener.flatten(min_shapes)

    if max_shapes == None:
        max_shapes_flat = [tuple(t) for t in dataset.max_shapes(flat=True)]
    else:
        max_shapes_flat = input_flattener.flatten(max_shapes)
    
    if opt_shapes == None:
        opt_shapes_flat = [tuple(t) for t in dataset.median_numel_shapes(flat=True)]
    else:
        opt_shapes_flat = input_flattener.flatten(opt_shapes)

    # handle legacy max_batch_size
    if max_batch_size is not None:
        min_shapes_flat = [(1,) + s[1:] for s in min_shapes_flat]
        max_shapes_flat = [(max_batch_size,) + s[1:] for s in max_shapes_flat]

    dynamic_axes_flat = infer_dynamic_axes(min_shapes_flat, max_shapes_flat)
    
    if default_device_type == trt.DeviceType.DLA:
        for value in dynamic_axes_flat:
            if len(value) > 0:
                raise ValueError('Dataset cannot have multiple shapes when using DLA')

    logger = trt.Logger(log_level)
    builder = trt.Builder(logger)
    config = builder.create_builder_config()

    if input_names is None:
        input_names = default_input_names(input_flattener.size)
    if output_names is None:
        output_names = default_output_names(output_flattener.size)

    if use_onnx:
        import onnx_graphsurgeon as gs
        import onnx
        
        module_flat = Flatten(module, input_flattener, output_flattener)
        inputs_flat = input_flattener.flatten(inputs)

        f = io.BytesIO()
        torch.onnx.export(
            module_flat, 
            inputs_flat, 
            f, 
            input_names=input_names, 
            output_names=output_names,
            dynamic_axes={
                name: {int(axis): 'axis_%d' % axis for axis in dynamic_axes_flat[index]}
                for index, name in enumerate(input_names)
            },
            opset_version=onnx_opset
        )
        f.seek(0)
        
        onnx_graph = gs.import_onnx(onnx.load(f))
        onnx_graph.fold_constants().cleanup()


        f = io.BytesIO()
        onnx.save(gs.export_onnx(onnx_graph), f)
        f.seek(0)

        onnx_bytes = f.read()
        network = builder.create_network(1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
        parser = trt.OnnxParser(network, logger)
        parser.parse(onnx_bytes)

    else:
        network = builder.create_network(1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
        with ConversionContext(network, torch2trt_kwargs=kwargs, builder_config=config, logger=logger) as ctx:
            
            inputs_flat = input_flattener.flatten(inputs)

            ctx.add_inputs(inputs_flat, input_names, dynamic_axes=dynamic_axes_flat)

            outputs = module(*inputs)

            outputs_flat = output_flattener.flatten(outputs)
            ctx.mark_outputs(outputs_flat, output_names)

    # set max workspace size
    config.max_workspace_size = max_workspace_size

    if fp16_mode:
        config.set_flag(trt.BuilderFlag.FP16)

    config.default_device_type = default_device_type
    if gpu_fallback:
        config.set_flag(trt.BuilderFlag.GPU_FALLBACK)
    config.DLA_core = dla_core
    
    if strict_type_constraints:
        config.set_flag(trt.BuilderFlag.STRICT_TYPES)

    if int8_mode:

        # default to use input tensors for calibration
        if int8_calib_dataset is None:
            int8_calib_dataset = dataset

        config.set_flag(trt.BuilderFlag.INT8)

        #Making sure not to run calibration with QAT mode on
        if not 'qat_mode' in kwargs:
            calibrator = DatasetCalibrator(
                int8_calib_dataset, algorithm=int8_calib_algorithm
            )
            config.int8_calibrator = calibrator

    # OPTIMIZATION PROFILE
    profile = builder.create_optimization_profile()
    for index, name in enumerate(input_names):
        profile.set_shape(
            name,
            min_shapes_flat[index],
            opt_shapes_flat[index],
            max_shapes_flat[index]
        )
    config.add_optimization_profile(profile)

    if int8_mode:
        config.set_calibration_profile(profile)

    # BUILD ENGINE

    engine = builder.build_engine(network, config)

    module_trt = TRTModule(engine, input_names, output_names, input_flattener=input_flattener, output_flattener=output_flattener)

    if keep_network:
        module_trt.network = network

    return module_trt
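
As mentioned above, a lighter variant of my call that I am considering (just a sketch, I have not validated it on this configuration and the workspace value is a guess) would shrink the workspace and pin everything to the GPU:

import tensorrt as trt
import torch2trt

# Hypothetical variant of my original call: smaller workspace and explicit
# GPU-only placement, to rule out DLA queries and reduce the peak memory
# needed while building the engine. "model" and "data" are the same objects
# used in the original call.
model_trt = torch2trt.torch2trt(
    model, [data],
    fp16_mode=True,
    max_workspace_size=1 << 24,               # 16 MB instead of 32 MB (guess)
    default_device_type=trt.DeviceType.GPU,   # default value, made explicit
    gpu_fallback=True,
)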

Thank you very much!!

Hi,

Since the Orin Nano variants were added in TensorRT 8.5, it’s recommended to stay on JetPack 5.1.

Could you share how you installed the PyTorch package?
If you are not following the doc below, please give it a try.
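
As a quick sanity check (just a sketch), you can also print which packages are actually picked up at runtime:

# environment sanity check: print the versions that are actually imported
import torch
import torchvision
import tensorrt

print("torch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())
print("torchvision:", torchvision.__version__)
print("TensorRT:", tensorrt.__version__)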

If no luck with the above installation (JP5.1+our prebuilt PyTorch), could you share your model and the script for us to test?

For the freeze issue, could you check whether the app can be terminated with Ctrl+C?
Or force-killed via its PID?

Thanks.

Hello again!

Starting from the second question: as for the freeze, I cannot stop the process with Ctrl+C or by killing it via its PID, since no operation can be performed at all: keyboard and mouse input are no longer accepted. The hardware is completely frozen. I’m sorry…

As for the first question, I used exactly that documentation to install PyTorch, and the PyTorch wheel is the one mentioned in my first comment. I will attach the files for the model optimization, but the model itself exceeds the file size limit (186 MB) even when zipped… How can I share it with you?
nnHardwareOptimization.py (3.4 KB)
object_pose.json (4.2 KB)

Hi,

Is it possible to SSH into the system and terminate the app?

We just released the PyTorch wheel built with JetPack 5.1 last weekend.
Would you mind testing whether the same issue still occurs?

Thanks.

Hi!
As soon as I collect that data and run those tests, I will get back to you!
Thank you very much!

Is this still an issue that needs support? Is there any result you can share? Thanks

Hello!
I haven’t tried it yet; I have prioritized other work. If needed, you can close the topic, and as soon as I run further tests and any issue occurs, I will post again.
Thanks!

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.