Jetson Nano 2GB custom model live inference?

Hey Team NVIDIA,
I have recently been using a Jetson Nano 2GB for model deployment. I have converted a custom model (a pose estimation application) from PyTorch to ONNX and .trt, and am trying to run a live inference on the Jetson Nano.
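For reference, the PyTorch → ONNX export step was along these lines (a minimal sketch; the small Conv2d network is just a stand-in for my actual pose model, and file names are placeholders):

import torch
import torch.nn as nn

# stand-in for the custom pose network (placeholder, not the real model)
model = nn.Sequential(nn.Conv2d(3, 18, kernel_size=3, padding=1)).eval()

# dummy input matching the network's expected input shape
dummy = torch.randn(1, 3, 256, 192)

# export to ONNX so the model can later be parsed by TensorRT
torch.onnx.export(model, dummy, "pose_model.onnx",
                  input_names=["input"], output_names=["heatmaps"],
                  opset_version=11)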

model reference:

I used the following documentation to convert an ONNX model to .trt format, and it works on random tensors:
> TensorRT/4. Using PyTorch through ONNX.ipynb at master · NVIDIA/TensorRT · GitHub
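The conversion itself boiled down to roughly the following (a sketch using the TensorRT Python API as shipped with JetPack 4.x / TensorRT 7.x; file names are placeholders):

import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

def build_engine(onnx_path, engine_path):
    builder = trt.Builder(TRT_LOGGER)
    network = builder.create_network(
        1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
    parser = trt.OnnxParser(network, TRT_LOGGER)

    # parse the ONNX file into a TensorRT network
    with open(onnx_path, "rb") as f:
        if not parser.parse(f.read()):
            for i in range(parser.num_errors):
                print(parser.get_error(i))
            return None

    config = builder.create_builder_config()
    config.max_workspace_size = 1 << 28    # keep the build workspace small on the 2GB Nano
    config.set_flag(trt.BuilderFlag.FP16)  # FP16 helps on the Nano

    # build the engine and serialize it to disk
    engine = builder.build_engine(network, config)
    with open(engine_path, "wb") as f:
        f.write(engine.serialize())
    return engine

build_engine("pose_model.onnx", "pose_model.trt")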

However, I want to run a live inference using one of these formats and am not able to find a simple working code example in the repo. I have seen that it is possible to load some of the existing models mentioned in the repo below, but how can you do the same for a custom model? Some guidance or sample code would be helpful. Thanks.
GitHub - dusty-nv/jetson-inference: Hello AI World guide to deploying deep-learning inference networks and deep vision primitives with TensorRT and NVIDIA Jetson.

Hi @mukesh.narendran, the models that I run with jetson-inference poseNet are from this project: https://github.com/NVIDIA-AI-IOT/trt_pose

I have not used other custom models with it, and doing so would likely require additional pre/post-processing code to support your model.

Alternatively, you may want to try the torch2trt tool, which you can integrate directly into your PyTorch scripts to accelerate your model with TensorRT without many changes.


Hey, thanks for getting back, and also for the great tutorials online. I have come across this repo and tried to use torch2trt with my model, but the Jetson gets stuck for a long time. So I tried some sample code from the torch2trt website just as a starter; it fails to convert and returns:
Segmentation fault (core dumped)

import torch
from torch2trt import torch2trt
from torchvision.models.alexnet import alexnet

# create some regular pytorch model...
model = alexnet(pretrained=True).eval().cuda()

# create example data
x = torch.ones((1, 3, 224, 224)).cuda()

# convert to TensorRT feeding sample data as input
model_trt = torch2trt(model, [x])
# save the converted module's state dict for later reuse
torch.save(model_trt.state_dict(), 'alexnet_trt.pth')
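For reference, once the conversion works, the saved state dict can be reloaded later without re-converting (assuming torch2trt's standard TRTModule API):

import torch
from torch2trt import TRTModule

# reload the previously converted TensorRT module
model_trt = TRTModule()
model_trt.load_state_dict(torch.load('alexnet_trt.pth'))

# run inference exactly like a normal PyTorch module
x = torch.ones((1, 3, 224, 224)).cuda()
y_trt = model_trt(x)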

I have managed to convert my model to .trt via the ONNX format, but when I use OpenCV or jetson-utils to run a live inference with the model, it exits. How can I overcome this issue?

You would need to modify jetson-inference to use the pre/post-processing that your model expects. In my experience, there can be significant post-processing for pose estimation models. It may be easier for you to just use something like ONNX Runtime, and use your existing Python application to do the pre/post-processing.

Do you mean ONNX Runtime via CUDA? I have a working ONNX solution and tried it on a Raspberry Pi 4GB, but it's slow. I also tried running PyTorch directly via CUDA on the Jetson Nano and it failed.

For example: my model takes in a tensor (1, 3, 256, 192), outputs a tensor (1, 18, 48, 48), and then takes the max values in the heatmaps and applies thresholding. If I have to put this ONNX model through the detectnet example (or the pose estimation model), how can I modify jetson-inference? I did not find much documentation online on how to do this.
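To illustrate, the post-processing I'm doing is roughly this (a NumPy sketch; the threshold is just an example value):

import numpy as np

def heatmaps_to_keypoints(heatmaps, threshold=0.3):
    # heatmaps: model output of shape (1, 18, 48, 48)
    heatmaps = heatmaps[0]                       # -> (18, 48, 48)
    keypoints = []
    for hm in heatmaps:
        idx = np.argmax(hm)                      # flat index of the peak value
        y, x = np.unravel_index(idx, hm.shape)   # peak location in the 48x48 map
        conf = hm[y, x]
        # keep the keypoint only if the peak passes the threshold
        keypoints.append((x, y, conf) if conf > threshold else None)
    return keypoints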

Yes, if you set the ONNX backend to CUDA while running it on Jetson, it should be faster.
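For example, with the onnxruntime-gpu package, selecting the CUDA execution provider looks roughly like this (a sketch; the model path is a placeholder and the shapes are from your earlier post):

import numpy as np
import onnxruntime as ort

# prefer the CUDA provider, fall back to CPU if it is unavailable
sess = ort.InferenceSession("pose_model.onnx",
                            providers=["CUDAExecutionProvider", "CPUExecutionProvider"])

input_name = sess.get_inputs()[0].name
x = np.random.rand(1, 3, 256, 192).astype(np.float32)  # example input

heatmaps = sess.run(None, {input_name: x})[0]  # (1, 18, 48, 48) for this model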

Your model is a pose estimation model, so it wouldn’t run through detectnet. Since your model is of a different architecture, you would need to modify the pre/post-processing here:

It’s not documented because I don’t support arbitrary pose estimation models in jetson-inference.


Hey dusty_nv,
I used a stacked hourglass this time and converted it to ONNX (CUDA), TensorRT, and torch2trt, and was able to run a random input tensor (1, 3, 256, 256) through the model for 50 iterations at about 0.025 s per iteration. The problem arises when I call the inference with the camera (code below / OpenCV) and try to pass in the image: the terminal stops with a RAM-too-low problem.
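The timing loop was roughly the following (a sketch; model_trt stands for the converted hourglass network):

import time
import torch

x = torch.randn(1, 3, 256, 256).cuda()

# warm up once, then time 50 iterations
model_trt(x)
torch.cuda.synchronize()

start = time.time()
for _ in range(50):
    model_trt(x)
    torch.cuda.synchronize()
print((time.time() - start) / 50, "s per iteration")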

How can I call a live camera via jetcam from the terminal (references would help)? Is it something along the lines of the jetcam sketch after the code below? Thanks.

import jetson.inference
import jetson.utils

net = StackedHourglass().eval().cuda()            # example: the custom pose model (placeholder)
camera = jetson.utils.videoSource("csi://0")      # '/dev/video0' for V4L2
display = jetson.utils.videoOutput("display://0") # 'my_video.mp4' for file

while display.IsStreaming():
    img = camera.Capture()
    # preprocessing done here (convert the CUDA image to a tensor, resize, normalize) ...
    detections = net(img)  # passing the image to the model
    display.Render(img)
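Something like this is what I had in mind for jetcam, based on its README (a sketch; model_trt again stands for the converted network), but I'm not sure it's the right approach:

from jetcam.csi_camera import CSICamera
import torch

camera = CSICamera(width=256, height=256, capture_fps=30)

while True:
    frame = camera.read()                         # numpy array, HWC, BGR, uint8
    x = torch.from_numpy(frame).permute(2, 0, 1)  # -> CHW (may need BGR->RGB depending on training)
    x = x.float().div(255.0).unsqueeze(0).cuda()  # -> (1, 3, 256, 256), normalized
    heatmaps = model_trt(x)                       # model_trt: the converted pose network (placeholder)
    # post-processing / display here ...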

Have you tried disabling ZRAM and mounting swap, and if needed disabling the GUI? You can try these suggestions:

If those don’t reduce memory usage enough, then since you are on the Nano 2GB you may need to use a less complex model.

