Deploying a Small Language Model on Jetson Nano

Hi NVIDIA Community,

I’m looking for guidance to deploy a small text-based language model on a Jetson Nano. I’m relatively new to this and would greatly appreciate a detailed, step-by-step guide. Here’s my understanding so far:

  1. Set up JetPack: Install JetPack for the required software stack. Are there specific dependencies or configurations needed for NLP models?
  2. Model Conversion: I plan to use a PyTorch model. What’s the best way to convert it to ONNX for TensorRT? Any tips or limitations for text-based models?
  3. Deployment and Optimization: How do I load and run the optimized ONNX model using TensorRT? Any best practices for maximizing performance and efficiency?

If there are any tutorials, sample scripts, or recommendations for handling NLP models on Jetson Nano, I’d greatly appreciate it.

Thanks for your help!

Hi,

Here are the corresponding replies:

  1. No special configuration is required.
    But if you want to use PyTorch, please install it afterward with the prebuilt packages shared in the link below (a quick install check is sketched after this list):
    PyTorch for Jetson

  2. The default PyTorch ONNX exporter should be fine for text-based models; see the export sketch after this list.

  3. This can be done directly with the trtexec binary; a Python runtime sketch also follows below.

$ /usr/src/tensorrt/bin/trtexec --onnx=[file]
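For item 1, a quick way to confirm that the Jetson PyTorch wheel installed correctly and can see the GPU (a minimal check; the exact version string depends on the wheel you chose):

import torch

# The wheels from the "PyTorch for Jetson" topic are built with CUDA support;
# if is_available() prints False, a CPU-only wheel was installed by mistake.
print(torch.__version__)
print(torch.cuda.is_available())
print(torch.cuda.get_device_name(0))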
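For item 2, a minimal export sketch is below. The tiny model and the tensor names (input_ids, logits) are placeholders for illustration; the points that matter for text models are exporting with example integer token IDs and marking the batch and sequence dimensions as dynamic. Older TensorRT releases on the Nano support a limited range of ONNX opsets, so a modest opset_version such as 11 is a safe starting point.

import torch
import torch.nn as nn

# Toy stand-in for your model (assumption: yours also takes int64 token IDs).
class TinyTextModel(nn.Module):
    def __init__(self, vocab=30000, dim=128, classes=2):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)
        self.fc = nn.Linear(dim, classes)

    def forward(self, input_ids):
        # Mean-pool the token embeddings, then classify.
        return self.fc(self.emb(input_ids).mean(dim=1))

model = TinyTextModel().eval()
dummy = torch.randint(0, 30000, (1, 128), dtype=torch.int64)

torch.onnx.export(
    model, dummy, "model.onnx",
    input_names=["input_ids"], output_names=["logits"],
    dynamic_axes={"input_ids": {0: "batch", 1: "seq"},
                  "logits": {0: "batch"}},
    opset_version=11,
)

Note that if you export with dynamic axes like this, trtexec needs explicit shape ranges when building the engine, e.g. --minShapes=input_ids:1x1 --optShapes=input_ids:1x128 --maxShapes=input_ids:1x256; with fully static shapes the plain --onnx=[file] invocation above is enough.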
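For item 3, trtexec can also serialize the engine for reuse (e.g. adding --saveEngine=model.trt, plus --fp16 for a significant speedup on the Nano's GPU). Loading and running that engine from Python could then look roughly like the sketch below. It assumes the fixed shape (1, 128), the tensor names from the export above, and a two-class output; it also uses PyCUDA for device memory, which is common on Jetson but must be installed separately. TensorRT on the Nano casts int64 ONNX inputs down to int32, hence the int32 buffer.

import numpy as np
import tensorrt as trt
import pycuda.autoinit  # noqa: F401  (initializes a CUDA context)
import pycuda.driver as cuda

logger = trt.Logger(trt.Logger.WARNING)
runtime = trt.Runtime(logger)

# Deserialize the engine built by trtexec --saveEngine=model.trt
with open("model.trt", "rb") as f:
    engine = runtime.deserialize_cuda_engine(f.read())
context = engine.create_execution_context()

# Host buffers; shapes and dtypes must match your engine (assumed here).
input_ids = np.random.randint(0, 30000, (1, 128)).astype(np.int32)
logits = np.empty((1, 2), dtype=np.float32)

# Device buffers, passed in engine binding order (input first here).
# For an engine built with dynamic shapes, first call
# context.set_binding_shape(0, input_ids.shape).
d_input = cuda.mem_alloc(input_ids.nbytes)
d_output = cuda.mem_alloc(logits.nbytes)

cuda.memcpy_htod(d_input, input_ids)
context.execute_v2([int(d_input), int(d_output)])
cuda.memcpy_dtoh(logits, d_output)
print(logits)

For best throughput on the Nano, also consider running sudo nvpmodel -m 0 and sudo jetson_clocks to lock the board at its maximum clocks.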

You can find TensorRT samples on our GitHub.
Thanks.
