Deploying LLMs

Hi,

I’m working on deploying LLMs with Retrieval-Augmented Generation (RAG) for educational purposes at the edge. Initially, we need a single working prototype, and after successful validation we plan to place an order for 200 such units.

Our development setup currently uses Ollama with LangChain and ChromaDB on Ubuntu 22.04 with an RTX 4090 GPU. The application is designed for server environments, but we now need to adapt it for edge deployment.
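For context, here is a simplified sketch of the kind of pipeline we run (the model name, persistence path, and query are placeholders, not our exact code; it assumes langchain, langchain-community, and chromadb are installed and an Ollama server is running with the model pulled):

```python
# Simplified RAG pipeline sketch (placeholders: model name, persistence
# path, and query; assumes a local Ollama server with "llama3" pulled).
from langchain_community.llms import Ollama
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.vectorstores import Chroma
from langchain.chains import RetrievalQA

embeddings = OllamaEmbeddings(model="llama3")   # placeholder model
vectordb = Chroma(
    persist_directory="./chroma_db",            # placeholder path
    embedding_function=embeddings,
)
llm = Ollama(model="llama3")                    # placeholder model

qa = RetrievalQA.from_chain_type(
    llm=llm,
    retriever=vectordb.as_retriever(search_kwargs={"k": 4}),
)
print(qa.invoke({"query": "What does lesson 1 cover?"})["result"])
```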

Given our budget, the NVIDIA Jetson Orin NX 16GB is a feasible option. I would like to know:

  • Does the Jetson Orin NX 16GB support installing and running LangChain and ChromaDB via a standard pip install, as on regular Ubuntu? (A quick smoke test is sketched after this list.)
  • Or is it necessary to use nvidia-docker / the NVIDIA Container Toolkit for proper support and performance?
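If a plain pip install does work, a minimal smoke test like the following (the collection name and documents are arbitrary) is what we would run to validate it:

```python
# Smoke test (assumption: `pip install langchain chromadb` succeeded on
# the Jetson's aarch64 Python; the default embedding model is downloaded
# by ChromaDB on first use).
import langchain
import chromadb

print("langchain", langchain.__version__)
print("chromadb", chromadb.__version__)

client = chromadb.Client()                      # in-memory Chroma instance
col = client.create_collection("smoke_test")    # arbitrary collection name
col.add(ids=["1"], documents=["hello from Jetson"])
print(col.query(query_texts=["hello"], n_results=1))
```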

Any guidance or insights on how best to set up this stack on Jetson Orin NX would be greatly appreciated.

Thank you!

Hi,

We have several prebuilt packages at the link below, but unfortunately we don’t provide prebuilt versions of those two packages:

However, you can build them from source.
To access the GPU on Jetson, you will need to set up the nvidia-container-toolkit.
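Once a container is running with the NVIDIA runtime, a minimal check like the sketch below can confirm the GPU is visible (it assumes an image that bundles PyTorch for Jetson, such as one of the l4t-pytorch images):

```python
# Minimal GPU visibility check, run inside a container started with
# `--runtime nvidia` (assumes a PyTorch-enabled Jetson image,
# e.g. an l4t-pytorch image).
import torch

print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("Device:", torch.cuda.get_device_name(0))
```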

We also have an example that demonstrates a RAG workflow.
Please find the link below:

Thanks.
