Hello, I’m trying to convert an ONNX model to a TensorRT engine inside the DeepStream container, but trtexec apparently cannot access the GPU memory properly.
Memory check:
root@tegra-ubuntu:/app/resources# nvidia-smi
Tue Mar 31 16:50:47 2026
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 540.4.0                Driver Version: 540.4.0      CUDA Version: 12.6     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  Orin (nvgpu)                   N/A |   N/A           N/A |                  N/A |
| N/A   N/A  N/A               N/A /  N/A |        Not Supported |     N/A          N/A |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+
root@tegra-ubuntu:/app/resources# free -m
               total        used        free      shared  buff/cache   available
Mem:           15656        4513        9865          42        1276       10837
Swap:           7828           0        7828
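Since Orin shares DRAM between CPU and GPU and nvidia-smi reports N/A on Tegra, I also wanted to cross-check what the CUDA runtime itself reports inside the container. A minimal sketch for that (the `libcudart.so` name is an assumption for this JetPack image; on a host without CUDA it just returns None):

```python
import ctypes

def to_mib(n_bytes: int) -> int:
    """Convert a byte count to whole MiB (262144000 bytes -> 250 MiB,
    matching the tactic request sizes in the trtexec log)."""
    return n_bytes // (1024 * 1024)

def cuda_mem_info(lib: str = "libcudart.so"):
    """Return (free_MiB, total_MiB) as reported by cudaMemGetInfo,
    or None when the CUDA runtime is unavailable or the call fails.
    The library name is an assumption; adjust to the installed soname."""
    try:
        cudart = ctypes.CDLL(lib)
    except OSError:
        return None  # no CUDA runtime on this machine
    free_b, total_b = ctypes.c_size_t(), ctypes.c_size_t()
    # cudaError_t cudaMemGetInfo(size_t* free, size_t* total); 0 == cudaSuccess
    if cudart.cudaMemGetInfo(ctypes.byref(free_b), ctypes.byref(total_b)) != 0:
        return None
    return to_mib(free_b.value), to_mib(total_b.value)

if __name__ == "__main__":
    print(cuda_mem_info())
```

If this prints a free value near the 6 MB from the warnings below rather than the ~10 GB that `free -m` shows, the limit is inside the container rather than on the device.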
However, the trtexec engine build fails with out-of-memory tactic warnings:
[03/31/2026-16:48:49] [I] [TRT] ----------------------------------------------------------------
[03/31/2026-16:48:49] [I] [TRT] Input filename: yolox_single_channel_orin.onnx
[03/31/2026-16:48:49] [I] [TRT] ONNX IR version: 0.0.6
[03/31/2026-16:48:49] [I] [TRT] Opset version: 11
[03/31/2026-16:48:49] [I] [TRT] Producer name: pytorch
[03/31/2026-16:48:49] [I] [TRT] Producer version: 1.9
[03/31/2026-16:48:49] [I] [TRT] Domain:
[03/31/2026-16:48:49] [I] [TRT] Model version: 0
[03/31/2026-16:48:49] [I] [TRT] Doc string:
[03/31/2026-16:48:49] [I] [TRT] ----------------------------------------------------------------
[03/31/2026-16:48:49] [I] Finished parsing network model. Parse time: 0.0622941
[03/31/2026-16:48:49] [I] Set shape of input tensor input for optimization profile 0 to: MIN=1x1x640x640 OPT=72x1x640x640 MAX=80x1x640x640
[03/31/2026-16:48:49] [W] [TRT] DLA requests all profiles have same min, max, and opt value. All dla layers are falling back to GPU
[03/31/2026-16:48:49] [I] [TRT] Local timing cache in use. Profiling results in this builder pass will not be stored.
[03/31/2026-16:48:49] [W] [TRT] Tactic Device request: 250MB Available: 6MB. Device memory is insufficient to use tactic.
[03/31/2026-16:48:49] [W] [TRT] UNSUPPORTED_STATE: Skipping tactic 0 due to insufficient memory on requested size of 262144000 detected for tactic 0x0000000000000000.
[03/31/2026-16:48:49] [W] [TRT] Tactic Device request: 156MB Available: 6MB. Device memory is insufficient to use tactic.
[03/31/2026-16:48:49] [W] [TRT] UNSUPPORTED_STATE: Skipping tactic 0 due to insufficient memory on requested size of 163840000 detected for tactic 0x0000000000000000.
[03/31/2026-16:48:49] [W] [TRT] Tactic Device request: 125MB Available: 6MB. Device memory is insufficient to use tactic.
[03/31/2026-16:48:49] [W] [TRT] UNSUPPORTED_STATE: Skipping tactic 0 due to insufficient memory on requested size of 131072000 detected for tactic 0x0000000000000000.
[03/31/2026-16:48:49] [W] [TRT] Tactic Device request: 78MB Available: 6MB. Device memory is insufficient to use tactic.
[03/31/2026-16:48:49] [W] [TRT] UNSUPPORTED_STATE: Skipping tactic 0 due to insufficient memory on requested size of 81920000 detected for tactic 0x0000000000000000.
[03/31/2026-16:48:49] [E] Error[10]: IBuilder::buildSerializedNetwork: Error Code 10: Internal Error (Could not find any implementation for node node_of_1552.)
[03/31/2026-16:48:49] [E] Engine could not be created from network
[03/31/2026-16:48:49] [E] Building engine failed
[03/31/2026-16:48:49] [E] Failed to create engine from model or file.
[03/31/2026-16:48:49] [E] Engine set up failed
&&&& FAILED TensorRT.trtexec [TensorRT v100300] # trtexec --onnx=yolox_single_channel_orin.onnx --saveEngine=yolox_single_channel_orin_ds71nx-new.engine --fp16 --minShapes=input:1x1x640x640 --optShapes=input:72x1x640x640 --maxShapes=input:80x1x640x640 --memPoolSize=workspace:8000
Tegrastats output (outside container):
03-31-2026 16:52:52 RAM 4583/15656MB (lfb 310x4MB) SWAP 0/7828MB (cached 0MB) CPU [0%@1497,0%@1497,0%@1497,0%@1497,0%@1497,0%@1497,0%@1497,0%@1497] GR3D_FREQ 0% cv0@68.5C cpu@71.812C soc2@69.625C soc0@70.281C cv1@70.437C gpu@68.468C tj@73.281C soc1@73.281C cv2@69.531C VDD_IN 7654mW/7654mW VDD_CPU_GPU_CV 1243mW/1243mW VDD_SOC 2868mW/2868mW
03-31-2026 16:52:53 RAM 4583/15656MB (lfb 310x4MB) SWAP 0/7828MB (cached 0MB) CPU [0%@1497,0%@1497,0%@1497,0%@1497,0%@1497,0%@1497,0%@1497,0%@1497] GR3D_FREQ 0% cv0@68.531C cpu@72.093C soc2@69.625C soc0@70.25C cv1@70.156C gpu@68.593C tj@73C soc1@73C cv2@69.593C VDD_IN 7654mW/7654mW VDD_CPU_GPU_CV 1243mW/1243mW VDD_SOC 2868mW/2868mW
03-31-2026 16:52:54 RAM 4583/15656MB (lfb 310x4MB) SWAP 0/7828MB (cached 0MB) CPU [0%@1497,0%@1497,0%@1497,0%@1497,0%@1497,0%@1497,0%@1497,0%@1497] GR3D_FREQ 0% cv0@68.562C cpu@71.718C soc2@69.468C soc0@70.312C cv1@70.125C gpu@68.781C tj@73C soc1@73C cv2@69.625C VDD_IN 7654mW/7654mW VDD_CPU_GPU_CV 1243mW/1243mW VDD_SOC 2868mW/2868mW
03-31-2026 16:52:55 RAM 4583/15656MB (lfb 310x4MB) SWAP 0/7828MB (cached 0MB) CPU [0%@1497,0%@1497,0%@1497,0%@1497,0%@1497,0%@1497,0%@1497,0%@1497] GR3D_FREQ 0% cv0@68.625C cpu@71.781C soc2@69.406C soc0@70.375C cv1@70.25C gpu@68.625C tj@73.156C soc1@73.156C cv2@69.5C VDD_IN 7654mW/7654mW VDD_CPU_GPU_CV 1243mW/1243mW VDD_SOC 2868mW/2868mW
Platform info:
• Hardware Platform (Jetson / GPU): Orin NX 16GB
• DeepStream Version 7.1
• JetPack Version (valid for Jetson only) 6.2
• TensorRT Version 10.3
• NVIDIA GPU Driver Version (valid for GPU only) 540.4.0
• Base image: nvcr.io/nvidia/deepstream:7.1-triton-multiarch