Summary
I discovered a significant performance cliff when running 9+ concurrent GStreamer pipelines that use nvvidconv with its default compute engine (VIC) on Jetson AGX Thor (JetPack 7.1, L4T R38.4). Per-pipeline throughput drops by ~35% when going from 8 to 9 pipelines and keeps degrading beyond that. Setting compute-hw=GPU on nvvidconv eliminates the issue.
Use Case
I need to decode 20 concurrent 1080p H.264 video streams and resize each to 768x416 to feed a neural network for inference. The video processing should not be the bottleneck.
Environment
- Platform: NVIDIA Jetson AGX Thor Developer Kit
- JetPack: 7.1 (L4T R38.4)
- Ubuntu: 24.04 LTS (Noble)
- Container: nvcr.io/nvidia/deepstream:8.0-samples-multiarch
- GStreamer: 1.24.x
- Kernel: 6.8.12-tegra
Issue Description
When running concurrent pipelines with nvvidconv (which uses VIC by default on Jetson), performance collapses at 9 and 10 pipelines:
Test Pipeline (per stream)
```
filesrc ! qtdemux ! h264parse ! nvv4l2decoder ! tee name=t \
    t. ! queue ! fakesink \
    t. ! queue ! nvvidconv ! video/x-raw(memory:NVMM),width=768,height=416 ! fakesink
```
Results with VIC (default)
| Pipelines | FPS per pipeline | Total throughput |
|---|---|---|
| 8 | ~73 FPS | 584 FPS |
| 9 | ~47 FPS | 423 FPS ⚠️ |
| 10 | ~26 FPS | 260 FPS ⚠️ |
| 15 | ~12 FPS | 180 FPS |
A ~35% drop in per-pipeline FPS from 8→9 pipelines (73 → 47), degrading further with each additional pipeline!
Results with GPU mode (compute-hw=GPU)
| Pipelines | FPS per pipeline | Total throughput |
|---|---|---|
| 8 | ~73 FPS | 584 FPS |
| 9 | ~68 FPS | 612 FPS ✓ |
| 10 | ~65 FPS | 650 FPS ✓ |
| 15 | ~50 FPS | 750 FPS ✓ |
No cliff! Smooth scaling.
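To put numbers on "smooth scaling": the GPU-mode totals can be converted into parallel efficiency against the ~73 FPS per-stream rate measured at 8 pipelines (a quick awk sketch; both the baseline and the totals are taken from the table above):

```shell
# Parallel efficiency of GPU mode: total_fps(N) / (N * baseline),
# with baseline = ~73 FPS per stream (the 8-pipeline figure above).
awk 'BEGIN {
  base = 73
  n[1] = 8;  t[1] = 584
  n[2] = 9;  t[2] = 612
  n[3] = 10; t[3] = 650
  n[4] = 15; t[4] = 750
  for (i = 1; i <= 4; i++)
    printf "%2d pipelines: %3.0f%% efficiency\n", n[i], 100 * t[i] / (n[i] * base)
}'
```

So GPU mode still retains ~68% efficiency at 15 pipelines, versus the outright collapse seen with VIC.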
Diagnostic Observations
- NVDEC is NOT the bottleneck - Decode-only pipelines scale to 15+ without issues
- VIC appears limited to ~8 concurrent operations - The cliff appears at 9 pipelines regardless of how many nvvidconv elements each pipeline contains
- GPU mode works correctly - Using nvvidconv compute-hw=GPU eliminates the cliff
- nvidia-smi shows erratic utilization during the VIC cliff - GPU/decoder usage becomes unstable at 9+ VIC pipelines
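For the VIC-side view, tegrastats is the usual tool on Jetson for watching VIC load alongside GPU/NVDEC. The field name and format vary across JetPack releases, so the sample line and grep pattern below are an illustrative assumption, not a documented format:

```shell
# Pull a VIC load/clock field out of a tegrastats-style status line.
# NOTE: the "VIC 99%@1024" format is an illustrative assumption; check
# your release's actual tegrastats output and adjust the pattern.
sample='RAM 11830/65536MB SWAP 0/32768MB GR3D_FREQ 45% VIC 99%@1024'
echo "$sample" | grep -o 'VIC [0-9]*%@[0-9]*'

# Live monitoring on the device would look like (Jetson only):
#   sudo tegrastats --interval 500 | grep -o 'VIC [0-9]*%@[0-9]*'
```

Locking clocks with `sudo jetson_clocks` before the benchmark may also help rule out DVFS as a confounding factor.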
Reproduction Steps
```bash
# Start DeepStream container
docker run -it --rm --runtime nvidia -e NVIDIA_DRIVER_CAPABILITIES=all \
    -v /tmp:/tmp nvcr.io/nvidia/deepstream:8.0-samples-multiarch bash

# Inside container, create and run the benchmark script below
```
Reproduction Script
```bash
#!/bin/bash
# vic_benchmark.sh - Test VIC vs GPU for nvvidconv scaling
# Run inside nvcr.io/nvidia/deepstream:8.0-samples-multiarch container

FRAMES=500
VIDEO_DIR="/tmp/vic_test"
mkdir -p "$VIDEO_DIR"

# Generate test videos
echo "Generating test videos..."
for i in $(seq 1 15); do
    [ -f "$VIDEO_DIR/test_$i.mp4" ] || \
    gst-launch-1.0 -q videotestsrc num-buffers=$FRAMES pattern=$((i % 18)) \
        ! "video/x-raw,width=1920,height=1080,format=NV12,framerate=30/1" \
        ! nvvidconv ! "video/x-raw(memory:NVMM),format=NV12" \
        ! nvv4l2h264enc ! h264parse ! mp4mux \
        ! filesink location="$VIDEO_DIR/test_$i.mp4" 2>/dev/null
done

run_test() {
    local NUM=$1
    local HW=$2
    local NVVC="nvvidconv $HW"
    echo "=== $NUM pipelines, ${HW:-VIC} ==="
    for i in $(seq 1 "$NUM"); do
        VIDEO="$VIDEO_DIR/test_$((((i-1) % 15) + 1)).mp4"
        PIPELINE="filesrc location=$VIDEO ! qtdemux ! h264parse ! nvv4l2decoder \
            ! tee name=t t. ! queue ! fakesink \
            t. ! queue ! $NVVC ! video/x-raw\(memory:NVMM\),width=768,height=416 ! fakesink"
        (
            start=$(date +%s.%N)
            gst-launch-1.0 -q $PIPELINE 2>/dev/null
            end=$(date +%s.%N)
            awk -v s="$start" -v e="$end" -v f="$FRAMES" \
                'BEGIN { printf "%.1f FPS\n", f/(e-s) }'
        ) &
    done
    wait
    echo ""
}

# Run tests
run_test 8  ""               # VIC, 8 pipelines
run_test 9  ""               # VIC, 9 pipelines (expect cliff)
run_test 9  "compute-hw=GPU" # GPU, 9 pipelines (no cliff)
run_test 15 "compute-hw=GPU" # GPU, 15 pipelines
```
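Each run_test invocation prints one FPS line per pipeline; those can be summed into the total-throughput column used in the tables above with a small awk sketch, assuming lines of the form "73.2 FPS" (the sample values here stand in for captured output):

```shell
# Sum per-pipeline "<fps> FPS" lines into total and average throughput.
printf '%s\n' '73.2 FPS' '72.8 FPS' '74.0 FPS' \
  | awk '{ total += $1; n++ }
         END { printf "%d pipelines: %.1f FPS total, %.1f FPS avg\n", n, total, total/n }'
```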
Questions
- Is there a documented limit on concurrent VIC operations? The cliff at 8→9 suggests a hard limit of ~8 concurrent VIC contexts.
- Is compute-hw=GPU the recommended workaround? It works, but I want to confirm it is the correct approach and won't cause other issues.
- Will this limitation be addressed in future JetPack releases? For multi-stream video analytics, VIC's ~8-pipeline limit is quite restrictive.
- Are there performance implications of using the GPU instead of VIC? In my tests, GPU mode actually delivers higher total throughput.
Workaround
Add compute-hw=GPU to all nvvidconv elements when running 8+ concurrent pipelines:
```
nvvidconv compute-hw=GPU ! video/x-raw(memory:NVMM),width=768,height=416
```
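Since VIC mode is still fine at low concurrency (and offloads work from the GPU), one option is to pick the engine from the stream count. A hedged shell sketch using the 8-pipeline threshold observed above; treat the threshold as platform-specific, not a documented limit:

```shell
# Select the nvvidconv compute engine based on concurrent pipeline count.
# The threshold of 8 comes from the cliff observed on AGX Thor; it may
# differ on other platforms or JetPack versions.
NUM_PIPELINES=12
if [ "$NUM_PIPELINES" -gt 8 ]; then
  NVVC='nvvidconv compute-hw=GPU'
else
  NVVC='nvvidconv'
fi
echo "$NVVC ! video/x-raw(memory:NVMM),width=768,height=416"
```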
Thank you for any insights into this VIC limitation!