TensorRT conversion process killed on Orin Nano


We are trying to convert a model to TensorRT.
But it fails with the error below:

We set the options like this:
=> torch2trt options: dla=True, max_workspace_size=1GB
But they don’t seem to be applied.

The conversion process is killed, but we still get an output.
This output works, and inference time is half of what it was before conversion.


Which tool do you use for the conversion?
Is it trtexec or another tool?


The error is raised when converting a PyTorch model to TensorRT using torch2trt (GitHub - NVIDIA-AI-IOT/torch2trt: An easy to use PyTorch to TensorRT converter) in Python code.

Did I understand your question correctly?

More environment info:
device : Jetson Orin Nano 4GB
test PyTorch model : DenseNet-121
JetPack version : 5.1.1


It looks like you are trying to convert the model into a DLA engine.
But Orin Nano doesn’t have DLA hardware.

Please set dla=False and try again.

There is no DLA error message after changing the option to dla=False, thanks.
But the process is still killed, and I think the cause is running out of memory.

When I test on a Jetson Nano 4GB (CLI mode, 3.2 GB available RAM), the operation works well.
But on a Jetson Orin Nano 4GB (CLI mode, 2.2 GB available RAM), the process is killed.

Can I solve this problem with a torch2trt option (max_workspace_size, etc.)?
The test PyTorch model is DenseNet-121.


Yes, you can try a smaller batch size and workspace value.

Could you also try adding some swap memory?
This will help if compiling takes some host memory.
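In case it helps, adding swap on a Jetson uses the standard Linux tools. A minimal sketch, assuming a 4 GB size and the path /swapfile (both are just examples; adjust to your free disk space):

```shell
# Create a 4 GB swap file (size is an example; pick what your disk allows)
sudo fallocate -l 4G /swapfile
sudo chmod 600 /swapfile    # restrict permissions, required by swapon
sudo mkswap /swapfile       # format the file as swap space
sudo swapon /swapfile       # enable it immediately
free -h                     # verify the new swap shows up

# Optional: make it persistent across reboots
echo '/swapfile none swap sw 0 0' | sudo tee -a /etc/fstab
```

Note that swap only relieves host-memory pressure during engine building; it doesn’t add to the GPU-accessible memory budget.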


Thank you.
I tried the two methods you recommended, but swap memory didn’t solve the problem, probably for the reasons below.

Changing the batch size did work (from 8 to 4), thanks.
But the problem with the smaller batch size is that inference has to run twice,
which takes about 1.4× as long.
I want to keep my batch size to optimize my system.

My question is:
Can you recommend another method to get more free RAM?
(Stopping certain services, or other methods, etc.; the current available RAM is 2.2 GB in the idle state, without the GUI.)
About 200 MB more seems to be enough to convert the PyTorch model to TensorRT and run inference while keeping my batch size.
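For reference, this is the kind of thing I mean: checking what is using memory and stopping services that aren’t needed in headless mode. The disable commands below are only examples (service names vary by JetPack release), not services I have confirmed are safe to stop:

```shell
# Show current memory usage and the biggest consumers
free -h
ps aux --sort=-rss | head -n 10

# List the services that are currently running
systemctl list-units --type=service --state=running

# Example only: stop and disable services not needed when headless
# (verify each service before disabling it on your board)
sudo systemctl disable --now nvargus-daemon.service   # camera daemon
sudo systemctl disable --now bluetooth.service

# Check how much RAM was freed
free -h
```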

Thank you for your support.


Loading cuDNN can take up to 600 MB of memory or more.
TensorRT has a function called setTacticSources that allows the user to deploy without calling cuDNN.

This function does not seem to be exposed in torch2trt.
Is it possible to use the TensorRT API directly?
This requires converting the model into ONNX format first.



Following your advice, I solved it in the following way.

  1. PyTorch model to ONNX (torch.onnx.export())
    I found a way to use setTacticSources inside the torch2trt function (I’m not sure, and I don’t know if it is actually applied)
    (use the builder and config, in https://github.com/NVIDIA-AI-IOT/torch2trt/blob/master/torch2trt/torch2trt.py#L654)

  2. ONNX to TensorRT engine (trtexec) / trtexec option: --tacticSources=-CUDNN.
    (Including the cuDNN tactics raised a warning that memory was not enough.)
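Step 2 looked roughly like the invocation below. The file names are placeholders, and the trtexec path assumes the default JetPack install location; --tacticSources=-CUDNN removes cuDNN from the tactic sources:

```shell
# Build a TensorRT engine from the ONNX model without the cuDNN tactic source
/usr/src/tensorrt/bin/trtexec \
    --onnx=densenet121.onnx \
    --saveEngine=densenet121.engine \
    --tacticSources=-CUDNN

# Quick benchmark run with the saved engine
/usr/src/tensorrt/bin/trtexec --loadEngine=densenet121.engine
```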

So I can use my batch size, and the available RAM increased by almost 1 GB!

Thanks for your support.
