Hi. I’m having problems converting an ONNX model to a TensorRT engine. The Jetson has more than 24 GiB of free memory, but trtexec still complains about insufficient memory. I tried increasing the workspace size, but the error persists. A verbose log from a run with the increased workspace is in the attached file below, and a non-verbose run is pasted as formatted text after it. I also tried decreasing the workspace to 4 GB, but that didn’t work either.
16GiB-verbose.txt (396.8 KB)
/./usr/src/tensorrt/bin/trtexec --explicitBatch --onnx=/home/nvidia/repos/128x128_2021-09-21.onnx --saveEngine=model.engine --workspace=6000 --buildOnly --shapes=inputs:160x128x128x3
&&&& RUNNING TensorRT.trtexec [TensorRT v8001] # /./usr/src/tensorrt/bin/trtexec --explicitBatch --onnx=/home/nvidia/repos/128x128_2021-09-21.onnx --saveEngine=model.engine --workspace=6000 --buildOnly --shapes=inputs:160x128x128x3
[11/01/2021-13:22:37] [I] === Model Options ===
[11/01/2021-13:22:37] [I] Format: ONNX
[11/01/2021-13:22:37] [I] Model: /home/nvidia/repos/128x128_2021-09-21.onnx
[11/01/2021-13:22:37] [I] Output:
[11/01/2021-13:22:37] [I] === Build Options ===
[11/01/2021-13:22:37] [I] Max batch: explicit
[11/01/2021-13:22:37] [I] Workspace: 6000 MiB
[11/01/2021-13:22:37] [I] minTiming: 1
[11/01/2021-13:22:37] [I] avgTiming: 8
[11/01/2021-13:22:37] [I] Precision: FP32
[11/01/2021-13:22:37] [I] Calibration:
[11/01/2021-13:22:37] [I] Refit: Disabled
[11/01/2021-13:22:37] [I] Sparsity: Disabled
[11/01/2021-13:22:37] [I] Safe mode: Disabled
[11/01/2021-13:22:37] [I] Restricted mode: Disabled
[11/01/2021-13:22:37] [I] Save engine: model.engine
[11/01/2021-13:22:37] [I] Load engine:
[11/01/2021-13:22:37] [I] NVTX verbosity: 0
[11/01/2021-13:22:37] [I] Tactic sources: Using default tactic sources
[11/01/2021-13:22:37] [I] timingCacheMode: local
[11/01/2021-13:22:37] [I] timingCacheFile:
[11/01/2021-13:22:37] [I] Input(s)s format: fp32:CHW
[11/01/2021-13:22:37] [I] Output(s)s format: fp32:CHW
[11/01/2021-13:22:37] [I] Input build shape: inputs=160x128x128x3+160x128x128x3+160x128x128x3
[11/01/2021-13:22:37] [I] Input calibration shapes: model
[11/01/2021-13:22:37] [I] === System Options ===
[11/01/2021-13:22:37] [I] Device: 0
[11/01/2021-13:22:37] [I] DLACore:
[11/01/2021-13:22:37] [I] Plugins:
[11/01/2021-13:22:37] [I] === Inference Options ===
[11/01/2021-13:22:37] [I] Batch: Explicit
[11/01/2021-13:22:37] [I] Input inference shape: inputs=160x128x128x3
[11/01/2021-13:22:37] [I] Iterations: 10
[11/01/2021-13:22:37] [I] Duration: 3s (+ 200ms warm up)
[11/01/2021-13:22:37] [I] Sleep time: 0ms
[11/01/2021-13:22:37] [I] Streams: 1
[11/01/2021-13:22:37] [I] ExposeDMA: Disabled
[11/01/2021-13:22:37] [I] Data transfers: Enabled
[11/01/2021-13:22:37] [I] Spin-wait: Disabled
[11/01/2021-13:22:37] [I] Multithreading: Disabled
[11/01/2021-13:22:37] [I] CUDA Graph: Disabled
[11/01/2021-13:22:37] [I] Separate profiling: Disabled
[11/01/2021-13:22:37] [I] Time Deserialize: Disabled
[11/01/2021-13:22:37] [I] Time Refit: Disabled
[11/01/2021-13:22:37] [I] Skip inference: Enabled
[11/01/2021-13:22:37] [I] Inputs:
[11/01/2021-13:22:37] [I] === Reporting Options ===
[11/01/2021-13:22:37] [I] Verbose: Disabled
[11/01/2021-13:22:37] [I] Averages: 10 inferences
[11/01/2021-13:22:37] [I] Percentile: 99
[11/01/2021-13:22:37] [I] Dump refittable layers:Disabled
[11/01/2021-13:22:37] [I] Dump output: Disabled
[11/01/2021-13:22:37] [I] Profile: Disabled
[11/01/2021-13:22:37] [I] Export timing to JSON file:
[11/01/2021-13:22:37] [I] Export output to JSON file:
[11/01/2021-13:22:37] [I] Export profile to JSON file:
[11/01/2021-13:22:37] [I]
[11/01/2021-13:22:37] [I] === Device Information ===
[11/01/2021-13:22:37] [I] Selected Device: Xavier
[11/01/2021-13:22:37] [I] Compute Capability: 7.2
[11/01/2021-13:22:37] [I] SMs: 8
[11/01/2021-13:22:37] [I] Compute Clock Rate: 1.377 GHz
[11/01/2021-13:22:37] [I] Device Global Memory: 31928 MiB
[11/01/2021-13:22:37] [I] Shared Memory per SM: 96 KiB
[11/01/2021-13:22:37] [I] Memory Bus Width: 256 bits (ECC disabled)
[11/01/2021-13:22:37] [I] Memory Clock Rate: 1.377 GHz
[11/01/2021-13:22:37] [I]
[11/01/2021-13:22:37] [I] TensorRT version: 8001
[11/01/2021-13:22:39] [I] [TRT] [MemUsageChange] Init CUDA: CPU +353, GPU +0, now: CPU 371, GPU 2730 (MiB)
[11/01/2021-13:22:39] [I] Start parsing network model
[11/01/2021-13:22:39] [I] [TRT] ----------------------------------------------------------------
[11/01/2021-13:22:39] [I] [TRT] Input filename: /home/nvidia/repos/128x128_2021-09-21.onnx
[11/01/2021-13:22:39] [I] [TRT] ONNX IR version: 0.0.4
[11/01/2021-13:22:39] [I] [TRT] Opset version: 9
[11/01/2021-13:22:39] [I] [TRT] Producer name: keras2onnx
[11/01/2021-13:22:39] [I] [TRT] Producer version: 1.8.1
[11/01/2021-13:22:39] [I] [TRT] Domain: onnxmltools
[11/01/2021-13:22:39] [I] [TRT] Model version: 0
[11/01/2021-13:22:39] [I] [TRT] Doc string:
[11/01/2021-13:22:39] [I] [TRT] ----------------------------------------------------------------
[11/01/2021-13:22:39] [I] Finish parsing network model
[11/01/2021-13:22:39] [I] [TRT] [MemUsageChange] Init CUDA: CPU +0, GPU +0, now: CPU 391, GPU 2810 (MiB)
[11/01/2021-13:22:39] [I] [TRT] [MemUsageSnapshot] Builder begin: CPU 391 MiB, GPU 2810 MiB
[11/01/2021-13:22:39] [I] [TRT] ---------- Layers Running on DLA ----------
[11/01/2021-13:22:39] [I] [TRT] ---------- Layers Running on GPU ----------
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] Identity4
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] Identity31
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] Transpose36
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] conv2d
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] Transpose37
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] Identity30
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] Transpose34
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] batch_normalization
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] Transpose35
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] Identity29
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] PWN(elu)
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] Transpose32
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] conv2d_1
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] Transpose33
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] Identity28
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] Transpose30
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] batch_normalization_1
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] Transpose31
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] Identity27
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] PWN(elu_1)
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] Identity26
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] Transpose28
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] conv2d_2
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] Transpose29
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] Identity25
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] Transpose26
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] batch_normalization_2
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] Transpose27
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] Identity24
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] PWN(elu_2)
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] Transpose24
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] conv2d_3
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] Transpose25
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] Identity23
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] Transpose22
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] batch_normalization_3
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] Transpose23
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] Identity22
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] PWN(elu_3)
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] Transpose20
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] conv2d_4
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] Transpose21
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] Identity21
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] Transpose18
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] batch_normalization_4
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] Transpose19
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] Identity20
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] PWN(elu_4)
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] Identity19
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] Transpose16
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] conv2d_transpose
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] Transpose17
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] Identity18
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] Transpose14
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] batch_normalization_5
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] Transpose15
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] Identity17
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] PWN(elu_5)
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] Identity16
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] Transpose12
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] conv2d_transpose_1
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] Transpose13
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] Identity15
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] Transpose10
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] batch_normalization_6
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] Transpose11
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] Identity14
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] PWN(elu_6)
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] Identity13
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] Transpose8
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] conv2d_transpose_2
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] Transpose9
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] Identity12
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] Transpose6
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] batch_normalization_7
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] Transpose7
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] Identity11
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] PWN(elu_7)
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] Identity10
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] Transpose4
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] conv2d_transpose_3
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] Transpose5
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] Identity9
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] Transpose2
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] batch_normalization_8
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] Transpose3
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] Identity8
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] PWN(elu_8)
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] Transpose
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] conv2d_transpose_4
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] Transpose1
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] Identity7
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] PWN(Sigmoid)
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] Identity6
[11/01/2021-13:22:39] [I] [TRT] [GpuLayer] Identity5
[11/01/2021-13:22:40] [I] [TRT] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +227, GPU +334, now: CPU 618, GPU 3150 (MiB)
[11/01/2021-13:22:42] [I] [TRT] [MemUsageChange] Init cuDNN: CPU +307, GPU +513, now: CPU 925, GPU 3663 (MiB)
[11/01/2021-13:22:42] [W] [TRT] Detected invalid timing cache, setup a local cache instead
[11/01/2021-13:23:53] [E] [TRT] Tactic Device request: 2790MB Available: 1536MB. Device memory is insufficient to use tactic.
[11/01/2021-13:23:53] [W] [TRT] Skipping tactic 2 due to oom error on requested size of 2790 detected for tactic 2.
[11/01/2021-13:24:02] [E] [TRT] Tactic Device request: 2790MB Available: 1536MB. Device memory is insufficient to use tactic.
[11/01/2021-13:24:02] [W] [TRT] Skipping tactic 6 due to oom error on requested size of 2790 detected for tactic 58.
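
To summarize the out-of-memory messages without reading the whole log, something like the following Python sketch could be used. It is my own helper, not part of trtexec: the function name `find_oom_tactics` and the regular expression are assumptions based only on the wording of the two error lines above, so they may need adjusting for other TensorRT versions.

```python
import re

# Matches the builder message seen in the log above, e.g.
# "[E] [TRT] Tactic Device request: 2790MB Available: 1536MB. ..."
# The pattern is an assumption derived from those lines only.
OOM_PATTERN = re.compile(r"Tactic Device request: (\d+)MB Available: (\d+)MB")


def find_oom_tactics(log_text):
    """Return a (requested_mb, available_mb) pair for each OOM tactic message."""
    return [(int(req), int(avail)) for req, avail in OOM_PATTERN.findall(log_text)]


# Two lines copied from the log above, used as sample input.
sample = """\
[11/01/2021-13:23:53] [E] [TRT] Tactic Device request: 2790MB Available: 1536MB. Device memory is insufficient to use tactic.
[11/01/2021-13:23:53] [W] [TRT] Skipping tactic 2 due to oom error on requested size of 2790 detected for tactic 2.
"""

for requested, available in find_oom_tactics(sample):
    shortfall = requested - available
    print(f"requested {requested} MB, available {available} MB, short by {shortfall} MB")
```

For the two errors shown here, each request is 2790 MB against 1536 MB reported as available, so every skipped tactic is short by 1254 MB even though the workspace was set to 6000 MiB.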