We are trying to convert a model to TensorRT, but it fails with the error below:
We set the following torch2trt options:
=> dla=True, max_workspace_size=1GB
But they don't seem to be applied.
The conversion process is killed, but we can still get an output. That output works, and inference time is about half of what it was before conversion.
Which tool do you use for the conversion?
Is it trtexec or another tool?
The error is raised when converting a PyTorch model to TensorRT using torch2trt (GitHub - NVIDIA-AI-IOT/torch2trt: An easy to use PyTorch to TensorRT converter) in Python code.
Do I understand the question correctly?
More environment info:
Device: Jetson Orin Nano 4GB
Test PyTorch model: densenet121
JetPack version: 5.1.1
It looks like you are trying to convert the model into a DLA engine, but Orin Nano doesn't have DLA hardware.
Please set dla=False and try it again.
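For reference, a minimal torch2trt call with DLA disabled might look like the sketch below. This is only an illustration, assuming torch, torchvision, and torch2trt are installed on the Jetson; the fp16_mode flag and the 256 MB workspace value are example choices, not values from this thread.

```python
# Sketch only: torch2trt conversion with DLA disabled.
# Assumes a CUDA-capable Jetson with torch, torchvision, and torch2trt installed.
import torch
from torch2trt import torch2trt
from torchvision.models import densenet121

model = densenet121(pretrained=True).eval().cuda()
x = torch.randn(1, 3, 224, 224).cuda()  # example input tensor

model_trt = torch2trt(
    model,
    [x],
    dla=False,                   # Orin Nano has no DLA cores
    fp16_mode=True,              # optional: reduces memory use
    max_workspace_size=1 << 28,  # 256 MB, smaller than the 1 GB used above
)

torch.save(model_trt.state_dict(), "densenet121_trt.pth")
```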
There is no error message about DLA after changing the option to dla=False, thanks.
But the process is still killed, and I think the cause of this problem is running out of memory.
When I test on a Jetson Nano 4GB (CLI mode, available RAM 3.2 GB), the conversion works fine.
But on a Jetson Orin Nano 4GB (CLI mode, available RAM 2.2 GB), the process is killed.
Can I solve this problem with a torch2trt option (max_workspace_size, etc.)?
The test PyTorch model is densenet121.
Yes, you can try a smaller batch size and workspace value.
Could you also try adding some swap memory? This will help if compiling consumes host memory.
Would you mind creating some swap space to see if it helps? You will need extra disk space for the swap file.
sudo fallocate -l 8G [/media/mySSD/swapfile]
sudo chmod 600 [/media/mySSD/swapfile]
sudo mkswap [/media/mySSD/swapfile]
sudo /bin/sh -c 'echo "[/media/mySSD/swapfile] \t none \t swap \t defaults \t 0 \t 0" >> /etc/fstab'
sudo swapon -a
I tried the two methods you recommended, but swap memory didn't solve this problem, probably for the reasons below:
NVIDIA product - Jetson Nano 2gb
operating system - Linux
Issue - I have my yolov4 tiny code for object tracking running on Jetson Nano 2gb, but the issue is of lower frame rate on Jetson Nano 2gb, I have already created extra swap memory of 5.9GB on Jetson. Still while running the codes it is not using the Swap memory created and thus it is giving me lower frame rate in outputs.
I would like to have a solution on how to increase the FPS while detection and how do I make the code utilize the …
But changing the batch size worked (from 8 to 4), thanks.
The problem with changing the batch size is that inference must run twice, which takes about 1.4× longer.
I want to keep my batch size to keep my system optimized.
My question is: can you recommend another method to get more free RAM?
(Stopping certain services or other methods, etc. The current available RAM is 2.2 GB in the idle state, without the GUI.)
About 200 MB more seems to be enough to run the PyTorch-to-TensorRT conversion and inference while keeping my batch size.
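As a quick way to see how much headroom is actually free before launching a conversion, you can read MemAvailable from /proc/meminfo (a small Linux-only sketch; the helper name is just for illustration):

```python
# Report the kernel's estimate of available memory (Linux only).
def available_ram_mb():
    with open("/proc/meminfo") as f:
        for line in f:
            if line.startswith("MemAvailable:"):
                kb = int(line.split()[1])  # value is reported in kB
                return kb / 1024.0
    raise RuntimeError("MemAvailable not found in /proc/meminfo")

print(f"Available RAM: {available_ram_mb():.0f} MB")
```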
Thank you for your support.
Loading cuDNN can take 600 MB of memory or more.
There is a function called setTacticSources in TensorRT that allows deployment without loading cuDNN.
That function doesn't seem to be exposed in torch2trt.
Is it possible to use the TensorRT API directly?
This will require you to convert the model into ONNX format first.
Following your advice, I solved it in the following way:
1. PyTorch model to ONNX (torch.onnx.export())
2. I found a way to use setTacticSources inside torch2trt (I'm not sure whether it was actually applied), using the builder and config in https://github.com/NVIDIA-AI-IOT/torch2trt/blob/master/torch2trt/torch2trt.py#L654
3. ONNX to TensorRT engine with trtexec, using the option --tacticSources=-CUDNN (including the CUDNN tactic raised a warning that there was not enough memory)
So I can keep my batch size, and the available RAM increased by almost 1 GB!
Thanks for your support.
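For reference, the full trtexec invocation for step 3 might look like the following (a sketch; the file names are illustrative, and /usr/src/tensorrt/bin is the usual trtexec location on Jetson):

```shell
# Build a TensorRT engine from the ONNX file without the cuDNN tactic source.
# --tacticSources=-CUDNN removes cuDNN from the default tactic set.
/usr/src/tensorrt/bin/trtexec \
    --onnx=model.onnx \
    --saveEngine=model.engine \
    --tacticSources=-CUDNN
```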
This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.