Experiences with vision processing on Jetson Thor?

joost-de-v · May 31, 2026, 4:33pm

I myself I’ve been using Deepstream mostly. I like that it uses the widely used gstreamer framework. And that it makes it easy to leverage Jetson Thor hardware like nvenc, nvdec, … Also it is easy to add custom plugins like for mqtt etc.

Also: using Tao toolkit neural networks for early detection is well documented.

I have a Holoscan Sensor Bridge (HSB) fpga with Leopard camera’s.
It was quite a struggle to get HSB working with the Lattice board. But it’s working finally at v2603.

HSB has its own processing framework. Which is cool. But I felt that deepstream is more versatile. As a glue to connect basically anything to anything.

It looks like NVIDIA will not provide a way of integrating deepstream with HSB camera’s.

So I spent some AI tokens on connecting HSB with Deepstream.
This is my PoC of a deepstream source for Camera over Ethernet (CoE)
Maybe of use to you :-)

So what are your experiences with vision processing on Thor?

joost-de-v · May 31, 2026, 4:57pm

As far as VLM goes: I’ve spent a lot of time on running vllm on Thor.
There were quite a lot of PRs for Thor that just weren’t merged. But It has gotten better now. vllm/vllm-openai now supports Thor. Which is nice because we can keep up with vllm releases now with all their model fixes.

I wonder how you view this:
I got the impression that vllm is mostly geared towards highly concurrent cloud serving of models.

So Thor support, being an edge device, seemed patchy. Quite a lot of DGX Sparks fixes nowadays. Thor: not so much.

I’ve been looking into tensorrt-edge-llm though. That seems to be the sweet spot for Jetson Thor: less focus on batching / concurrency. More focus on low latency efficiency.

Again: what are your experiences with VLMs on Thor?

DaneLLL · June 1, 2026, 9:13am

Hi,
Thanks for the sharing. As of now we don’t enable CoE in Deepstream SDK. Thanks for sharing the sample for this.

And for running vLLM, we have the example:
Run VLLM in Thor from VLLM Repository

It is not included in Deepstream SDK. We will check and see if we can include it.

whitesscott · June 7, 2026, 6:45am

I just stumbled across holoscan-sensor-bridge releases 2.6.0

Vllm is now developing in vllm repo a Rust language binary. I have tried it and it is getting frequent updates and will build vllm again soon in its own empty uv venv; using uv to build the repo.

vllm-rs
Rust frontend and managed-engine CLI for vLLM.

Usage: vllm-rs <COMMAND>

Commands:
  frontend  Run the Rust OpenAI frontend as a Python-supervised worker
  serve     Launch a managed Python headless engine, then run the Rust OpenAI frontend
  help      Print this message or the help of the given subcommand(s)

These work but will need tuning

cat scripts/vllm-rs-qwen3.sh
vllm-rs serve Qwen/Qwen3-0.6B  --python "$HOME/.git/uv/vllm-rs/bin/python"  \
  --host 127.0.0.1   --port 8000   --max-model-len 4096   --   --kv-cache-memory-bytes 4G \
  --max-num-seqs 1   --max-num-batched-tokens 1024   --enforce-eager

cat scripts/vllm-rs-nemotron3.sh
vllm-rs serve nvidia/Nemotron-3-Nano-Omni-30B-A3B-Reasoning-NVFP4 \
  --python "/home/scott/.git/uv/vllm-rs/bin/python" \
  --host 127.0.0.1 \
  --port 8000 \
  --max-model-len 4096 \
  --tokenizer-mode auto \
  --reasoning-parser nemotron_v3 \
  --tool-call-parser qwen3_coder \
  --default-chat-template-kwargs '{"enable_thinking": false}' \
  --disable-log-stats \
  --engine-ready-timeout-secs 1800 \
  -- \
  --trust-remote-code \
  --kv-cache-memory-bytes 16G \
  --kv-cache-dtype fp8 \
  --max-num-seqs 1 \
  --max-num-batched-tokens 4096 \
  --enforce-eager \
  --enable-prefix-caching

nvidia-holoscan/holohub looked promising until I tried building the containers or running any of the projects. That repo has, unless it’s been very recently updated, about zero Thor support.

joost-de-v · June 8, 2026, 3:42pm

@whitesscott interesting, I hadn’t noticed the Rust vllm api. I guess compared to the python api it will mean more efficient concurrency, no garbage collection pressure, less runtime memory overhead. I do wonder whether openai endpoints are the most efficient video processing on Thor.

I had been using HSB 2.6.0-EA2 but hadn’t noticed the 2.6.0 GA. I’m revisiting my deepstream-hololink implementation as a result. It seems that full SIPL support is only possible for a sensor if you have an NVIDIA signed NITO file. That wasn’t obvious to me when I bought my Leopard IMX274 + Lattice FPGA. Seems that NVIDIA actively blocks community development / open source. Which is disappointing tbh.
I’m looking into alternatives. I think the strong suit of Jetson Thor lies in hardware optimised video processing. And I do like open source / community development. So I’ll see how far I get with my open SIPL/CoE effort.

Topic		Replies	Views
Cannot un deepstream-app sample on Jetson Thor DeepStream SDK docker , deepstream , jetson-platform-services	5	102	January 27, 2026
vLLM 0.12.x Container for jetson Thor Jetson Thor generative_ai	4	265	January 8, 2026
Facing difficulty in installing Webui related things in Jetson AGX Thor Jetson Thor generative_ai , holoscan	3	219	January 20, 2026
AI-powered Vision and Compute Solutions for NVIDIA Jetson Thor Jetson Thor camera	1	175	September 3, 2025
Missing Deep Learning Framework Support for Thor Jetson Thor	2	151	October 9, 2025
JetPack 7.0/Jetson Linux 38.2 for NVIDIA Jetson Thor is now live Jetson Thor cudnn , llama	20	3613	October 27, 2025
Is it possible to use VLMs with Deepstream DeepStream SDK deepstream	3	274	August 11, 2025
Using Jetson AGX Thor for LLM Finetuning and got some questions Jetson Thor llm	5	344	January 12, 2026
TensorRT Edge-LLM on the AGX Thor Jetson Thor tensorrt , generative_ai	11	1066	December 4, 2025
NVIDIA Jetson Thor–Optimized Cameras & Compute Platforms from e-con Systems Jetson Thor camera	0	185	September 9, 2025

Experiences with vision processing on Jetson Thor?

Related topics