Running LLM in jetson agx orin

Hello I am working on jetson agx orin

For some decition making i want the local llm running in the jetson device, i want near real time performance , also there are other vision models already i am utilizing

What i wanted to know is what is the best inbuilt solution there to run llm effectively in jetson
I hered some options as

Hi,

You cab find the vLLM/SGLang package for r36 below:

Thanks.

Is there any tutorial that i can refer , the latest one which explain how we can effectively use llm in jetson agx orin like production use case

Hi,

Yes, please find our tutorial below:

For example, you can find the container and the corresponding command for running the Qwen3 4B model:

Thanks.