Experiences with vision processing on Jetson Thor?

I myself I’ve been using Deepstream mostly. I like that it uses the widely used gstreamer framework. And that it makes it easy to leverage Jetson Thor hardware like nvenc, nvdec, … Also it is easy to add custom plugins like for mqtt etc.

Also: using Tao toolkit neural networks for early detection is well documented.

I have a Holoscan Sensor Bridge (HSB) fpga with Leopard camera’s.
It was quite a struggle to get HSB working with the Lattice board. But it’s working finally at v2603.

HSB has its own processing framework. Which is cool. But I felt that deepstream is more versatile. As a glue to connect basically anything to anything.

It looks like NVIDIA will not provide a way of integrating deepstream with HSB camera’s.

So I spent some AI tokens on connecting HSB with Deepstream.
This is my PoC of a deepstream source for Camera over Ethernet (CoE)
Maybe of use to you :-)

So what are your experiences with vision processing on Thor?

As far as VLM goes: I’ve spent a lot of time on running vllm on Thor.
There were quite a lot of PRs for Thor that just weren’t merged. But It has gotten better now. vllm/vllm-openai now supports Thor. Which is nice because we can keep up with vllm releases now with all their model fixes.

I wonder how you view this:
I got the impression that vllm is mostly geared towards highly concurrent cloud serving of models.

So Thor support, being an edge device, seemed patchy. Quite a lot of DGX Sparks fixes nowadays. Thor: not so much.

I’ve been looking into tensorrt-edge-llm though. That seems to be the sweet spot for Jetson Thor: less focus on batching / concurrency. More focus on low latency efficiency.

Again: what are your experiences with VLMs on Thor?

Hi,
Thanks for the sharing. As of now we don’t enable CoE in Deepstream SDK. Thanks for sharing the sample for this.

And for running vLLM, we have the example:
Run VLLM in Thor from VLLM Repository

It is not included in Deepstream SDK. We will check and see if we can include it.

I just stumbled across holoscan-sensor-bridge releases 2.6.0

Vllm is now developing in vllm repo a Rust language binary. I have tried it and it is getting frequent updates and will build vllm again soon in its own empty uv venv; using uv to build the repo.

vllm-rs
Rust frontend and managed-engine CLI for vLLM.

Usage: vllm-rs <COMMAND>

Commands:
  frontend  Run the Rust OpenAI frontend as a Python-supervised worker
  serve     Launch a managed Python headless engine, then run the Rust OpenAI frontend
  help      Print this message or the help of the given subcommand(s)

These work but will need tuning

cat scripts/vllm-rs-qwen3.sh
vllm-rs serve Qwen/Qwen3-0.6B  --python "$HOME/.git/uv/vllm-rs/bin/python"  \
  --host 127.0.0.1   --port 8000   --max-model-len 4096   --   --kv-cache-memory-bytes 4G \
  --max-num-seqs 1   --max-num-batched-tokens 1024   --enforce-eager

cat scripts/vllm-rs-nemotron3.sh
vllm-rs serve nvidia/Nemotron-3-Nano-Omni-30B-A3B-Reasoning-NVFP4 \
  --python "/home/scott/.git/uv/vllm-rs/bin/python" \
  --host 127.0.0.1 \
  --port 8000 \
  --max-model-len 4096 \
  --tokenizer-mode auto \
  --reasoning-parser nemotron_v3 \
  --tool-call-parser qwen3_coder \
  --default-chat-template-kwargs '{"enable_thinking": false}' \
  --disable-log-stats \
  --engine-ready-timeout-secs 1800 \
  -- \
  --trust-remote-code \
  --kv-cache-memory-bytes 16G \
  --kv-cache-dtype fp8 \
  --max-num-seqs 1 \
  --max-num-batched-tokens 4096 \
  --enforce-eager \
  --enable-prefix-caching


nvidia-holoscan/holohub looked promising until I tried building the containers or running any of the projects. That repo has, unless it’s been very recently updated, about zero Thor support.

@whitesscott interesting, I hadn’t noticed the Rust vllm api. I guess compared to the python api it will mean more efficient concurrency, no garbage collection pressure, less runtime memory overhead. I do wonder whether openai endpoints are the most efficient video processing on Thor.

I had been using HSB 2.6.0-EA2 but hadn’t noticed the 2.6.0 GA. I’m revisiting my deepstream-hololink implementation as a result. It seems that full SIPL support is only possible for a sensor if you have an NVIDIA signed NITO file. That wasn’t obvious to me when I bought my Leopard IMX274 + Lattice FPGA. Seems that NVIDIA actively blocks community development / open source. Which is disappointing tbh.
I’m looking into alternatives. I think the strong suit of Jetson Thor lies in hardware optimised video processing. And I do like open source / community development. So I’ll see how far I get with my open SIPL/CoE effort.