`trtexec` fails to save the engine to a file on a Jetson Orin board

Hi

I’m trying to deploy the EfficientViT-SAM model to a Jetson Orin as a TensorRT engine.

I saved the pretrained model as an ONNX file and then tried to convert it into a TensorRT .engine file with trtexec, but the conversion failed.

I’ve followed the instructions here ( efficientvit/applications/efficientvit_sam/README.md at master · mit-han-lab/efficientvit · GitHub )

I converted the original torch model to ONNX on my desktop, and then tried to convert the ONNX file to an engine on the Jetson Orin.

According to the log, trtexec seems to have generated the engine successfully but then failed to save it to a file.

I’ve already confirmed that trtexec has permission to write the file

I cannot find any clue in the log as to why saving the file failed, and I don’t know where to start investigating.

Could you help me to resolve this issue?

Environment

TensorRT Version: 10.3.0
GPU Type: Jetson AGX Orin (64GB)
Nvidia Driver Version: 540.4.0
CUDA Version: 12.6
CUDNN Version:
Operating System + Version: ubuntu 22.04
Python Version (if applicable):
TensorFlow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if container which image + tag):

Command

trtexec --verbose --allowGPUFallback --timingCacheFile=~/trt/timing.cache --onnx=assets/export_models/efficientvit_sam/onnx/efficientvit_sam_l0_encoder.onnx --minShapes=input_image:1x3x512x512 --optShapes=input_image:4x3x512x512 --maxShapes=input_image:4x3x512x512 --saveEngine=~/trt/efficientvit_sam_l0_encoder.engine --exportProfile=~/trt/efficientvit_sam_l0_encoder.trace.json

Part of the log

[01/23/2026-21:10:23] [I] [TRT] [MemUsageStats] Peak memory usage of TRT CPU/GPU memory allocators: CPU 12 MiB, GPU 300 MiB
[01/23/2026-21:10:23] [V] [TRT] Adding 1 engine(s) to plan file.
[01/23/2026-21:10:23] [I] [TRT] [MemUsageStats] Peak memory usage during Engine building and serialization: CPU: 1758 MiB
[01/23/2026-21:10:23] [I] Engine built in 995.709 sec.
[01/23/2026-21:10:23] [I] Created engine with size: 119.411 MiB
[01/23/2026-21:10:23] [E] Saving engine to file failed.
[01/23/2026-21:10:23] [E] Engine set up failed
&&&& FAILED TensorRT.trtexec [TensorRT v100300] # /usr/src/tensorrt/bin/trtexec --verbose --allowGPUFallback --timingCacheFile=~/trt/timing.cache --onnx=assets/export_models/efficientvit_sam/onnx/efficientvit_sam_l0_encoder.onnx --minShapes=input_image:1x3x512x512 --optShapes=input_image:4x3x512x512 --maxShapes=input_image:4x3x512x512 --saveEngine=~/trt/efficientvit_sam_l0_encoder.engine --exportProfile=~/trt/efficientvit_sam_l0_encoder.trace.json

I would like to share the ONNX file that I used, but file uploads are blocked in my organization, so I cannot share it. Sorry.

Hi,

Could you enable verbose logging to gather more logs and share them with us?
This can be done by passing the --verbose option to trtexec.

Thanks.

Thank you for your prompt reply!

Actually, I’ve already run trtexec with the --verbose option, but the full log is too long to share here as raw text (the full log exceeds 42k lines 😭)

So I intentionally trimmed it and shared only part of it


Just now I also tried to share a longer part of the log, limited to the information-level messages

But posting it failed with a 403 Forbidden error 🥲
(I’m not sure why it occurred, but I’ll try again a bit later)


Please let me know if there is anything I can do, or what information would be helpful


Ah, I guess the log is too long to share in a single post
I’ll post additional replies to share the information-level log

Log 1

[01/23/2026-20:53:28] [I] === Model Options ===
[01/23/2026-20:53:28] [I] Format: ONNX
[01/23/2026-20:53:28] [I] Model: assets/export_models/efficientvit_sam/onnx/efficientvit_sam_l0_encoder.onnx
[01/23/2026-20:53:28] [I] Output:
[01/23/2026-20:53:28] [I] === Build Options ===
[01/23/2026-20:53:28] [I] Memory Pools: workspace: default, dlaSRAM: default, dlaLocalDRAM: default, dlaGlobalDRAM: default, tacticSharedMem: default
[01/23/2026-20:53:28] [I] avgTiming: 8
[01/23/2026-20:53:28] [I] Precision: FP32
[01/23/2026-20:53:28] [I] LayerPrecisions: 
[01/23/2026-20:53:28] [I] Layer Device Types: 
[01/23/2026-20:53:28] [I] Calibration: 
[01/23/2026-20:53:28] [I] Refit: Disabled
[01/23/2026-20:53:28] [I] Strip weights: Disabled
[01/23/2026-20:53:28] [I] Version Compatible: Disabled
[01/23/2026-20:53:28] [I] ONNX Plugin InstanceNorm: Disabled
[01/23/2026-20:53:28] [I] TensorRT runtime: full
[01/23/2026-20:53:28] [I] Lean DLL Path: 
[01/23/2026-20:53:28] [I] Tempfile Controls: { in_memory: allow, temporary: allow }
[01/23/2026-20:53:28] [I] Exclude Lean Runtime: Disabled
[01/23/2026-20:53:28] [I] Sparsity: Disabled
[01/23/2026-20:53:28] [I] Safe mode: Disabled
[01/23/2026-20:53:28] [I] Build DLA standalone loadable: Disabled
[01/23/2026-20:53:28] [I] Allow GPU fallback for DLA: Enabled
[01/23/2026-20:53:28] [I] DirectIO mode: Disabled
[01/23/2026-20:53:28] [I] Restricted mode: Disabled
[01/23/2026-20:53:28] [I] Skip inference: Disabled
[01/23/2026-20:53:28] [I] Save engine: ~/trt/efficientvit_sam_l0_encoder.engine
[01/23/2026-20:53:28] [I] Load engine: 
[01/23/2026-20:53:28] [I] Profiling verbosity: 0
[01/23/2026-20:53:28] [I] Tactic sources: Using default tactic sources
[01/23/2026-20:53:28] [I] timingCacheMode: global
[01/23/2026-20:53:28] [I] timingCacheFile: ~/trt.timing.cache
[01/23/2026-20:53:28] [I] Enable Compilation Cache: Enabled
[01/23/2026-20:53:28] [I] errorOnTimingCacheMiss: Disabled
[01/23/2026-20:53:28] [I] Preview Features: Use default preview flags.
[01/23/2026-20:53:28] [I] MaxAuxStreams: -1
[01/23/2026-20:53:28] [I] BuilderOptimizationLevel: -1
[01/23/2026-20:53:28] [I] Calibration Profile Index: 0
[01/23/2026-20:53:28] [I] Weight Streaming: Disabled
[01/23/2026-20:53:28] [I] Runtime Platform: Same As Build
[01/23/2026-20:53:28] [I] Debug Tensors: 
[01/23/2026-20:53:28] [I] Input(s)s format: fp32:CHW
[01/23/2026-20:53:28] [I] Output(s)s format: fp32:CHW
[01/23/2026-20:53:28] [I] Input build shape (profile 0): input_image=1x3x512x512+4x3x512x512+4x3x512x512
[01/23/2026-20:53:28] [I] Input calibration shapes: model
[01/23/2026-20:53:28] [I] === System Options ===
[01/23/2026-20:53:28] [I] Device: 0
[01/23/2026-20:53:28] [I] DLACore: 
[01/23/2026-20:53:28] [I] Plugins:
[01/23/2026-20:53:28] [I] setPluginsToSerialize:
[01/23/2026-20:53:28] [I] dynamicPlugins:
[01/23/2026-20:53:28] [I] ignoreParsedPluginLibs: 0
[01/23/2026-20:53:28] [I] 
[01/23/2026-20:53:28] [I] === Inference Options ===
[01/23/2026-20:53:28] [I] Batch: Explicit
[01/23/2026-20:53:28] [I] Input inference shape : input_image=4x3x512x512
[01/23/2026-20:53:28] [I] Iterations: 10
[01/23/2026-20:53:28] [I] Duration: 3s (+ 200ms warm up)
[01/23/2026-20:53:28] [I] Sleep time: 0ms
[01/23/2026-20:53:28] [I] Idle time: 0ms
[01/23/2026-20:53:28] [I] Inference Streams: 1
[01/23/2026-20:53:28] [I] ExposeDMA: Disabled
[01/23/2026-20:53:28] [I] Data transfers: Enabled
[01/23/2026-20:53:28] [I] Spin-wait: Disabled
[01/23/2026-20:53:28] [I] Multithreading: Disabled
[01/23/2026-20:53:28] [I] CUDA Graph: Disabled
[01/23/2026-20:53:28] [I] Separate profiling: Disabled
[01/23/2026-20:53:28] [I] Time Deserialize: Disabled
[01/23/2026-20:53:28] [I] Time Refit: Disabled
[01/23/2026-20:53:28] [I] NVTX verbosity: 0
[01/23/2026-20:53:28] [I] Persistent Cache Ratio: 0
[01/23/2026-20:53:28] [I] Optimization Profile Index: 0
[01/23/2026-20:53:28] [I] Weight Streaming Budget: 100.000000%
[01/23/2026-20:53:28] [I] Inputs:
[01/23/2026-20:53:28] [I] Debug Tensor Save Destinations:
[01/23/2026-20:53:28] [I] === Reporting Options ===
[01/23/2026-20:53:28] [I] Verbose: Enabled
[01/23/2026-20:53:28] [I] Averages: 10 inferences
[01/23/2026-20:53:28] [I] Percentiles: 90,95,99
[01/23/2026-20:53:28] [I] Dump refittable layers:Disabled
[01/23/2026-20:53:28] [I] Dump output: Disabled
[01/23/2026-20:53:28] [I] Profile: Disabled
[01/23/2026-20:53:28] [I] Export timing to JSON file: 
[01/23/2026-20:53:28] [I] Export output to JSON file: 
[01/23/2026-20:53:28] [I] Export profile to JSON file: ~/trt/efficientvit_sam_l0_encoder.trace.json
[01/23/2026-20:53:28] [I] 
[01/23/2026-20:53:28] [I] === Device Information ===
[01/23/2026-20:53:28] [I] Available Devices: 
[01/23/2026-20:53:28] [I]   Device 0: "Orin" UUID: GPU-a04955e3-ad65-59ff-9578-a27db520877b
[01/23/2026-20:53:28] [I] Selected Device: Orin
[01/23/2026-20:53:28] [I] Selected Device ID: 0
[01/23/2026-20:53:28] [I] Selected Device UUID: GPU-a04955e3-ad65-59ff-9578-a27db520877b
[01/23/2026-20:53:28] [I] Compute Capability: 8.7
[01/23/2026-20:53:28] [I] SMs: 8
[01/23/2026-20:53:28] [I] Device Global Memory: 62840 MiB
[01/23/2026-20:53:28] [I] Shared Memory per SM: 164 KiB
[01/23/2026-20:53:28] [I] Memory Bus Width: 256 bits (ECC disabled)
[01/23/2026-20:53:28] [I] Application Compute Clock Rate: 1.3 GHz
[01/23/2026-20:53:28] [I] Application Memory Clock Rate: 0.612 GHz
[01/23/2026-20:53:28] [I] 
[01/23/2026-20:53:28] [I] Note: The application clock rates do not reflect the actual clock rates that the GPU is currently running at.
[01/23/2026-20:53:28] [I] 
[01/23/2026-20:53:28] [I] TensorRT version: 10.3.0
[01/23/2026-20:53:28] [I] Loading standard plugins
[01/23/2026-20:53:28] [I] [TRT] [MemUsageChange] Init CUDA: CPU +2, GPU +0, now: CPU 31, GPU 11833 (MiB)
[01/23/2026-20:53:30] [I] [TRT] [MemUsageChange] Init builder kernel library: CPU +928, GPU +1093, now: CPU 1002, GPU 12971 (MiB)
[01/23/2026-20:53:30] [I] Start parsing network model.
[01/23/2026-20:53:30] [I] [TRT] ----------------------------------------------------------------
[01/23/2026-20:53:30] [I] [TRT] Input filename:   assets/export_models/efficientvit_sam/onnx/efficientvit_sam_l0_encoder.onnx
[01/23/2026-20:53:30] [I] [TRT] ONNX IR version:  0.0.8
[01/23/2026-20:53:30] [I] [TRT] Opset version:    17
[01/23/2026-20:53:30] [I] [TRT] Producer name:    pytorch
[01/23/2026-20:53:30] [I] [TRT] Producer version: 2.10.0
[01/23/2026-20:53:30] [I] [TRT] Domain:           
[01/23/2026-20:53:30] [I] [TRT] Model version:    0
[01/23/2026-20:53:30] [I] [TRT] Doc string:       
[01/23/2026-20:53:30] [I] [TRT] ----------------------------------------------------------------
[01/23/2026-20:53:31] [I] Finished parsing network model. Parse time: 0.276996
[01/23/2026-20:53:31] [I] Set shape of input tensor input_image for optimization profile 0 to: MIN=1x3x512x512 OPT=4x3x512x512 MAX=4x3x512x512
[01/23/2026-20:53:31] [I] [TRT] Global timing cache in use. Profiling results in this builder pass will be stored.

Log 2

[01/23/2026-20:53:45] [I] === Model Options ===
[01/23/2026-20:53:45] [I] Format: ONNX
[01/23/2026-20:53:45] [I] Model: assets/export_models/efficientvit_sam/onnx/efficientvit_sam_l0_encoder.onnx
[01/23/2026-20:53:45] [I] Output:
[01/23/2026-20:53:45] [I] === Build Options ===
[01/23/2026-20:53:45] [I] Memory Pools: workspace: default, dlaSRAM: default, dlaLocalDRAM: default, dlaGlobalDRAM: default, tacticSharedMem: default
[01/23/2026-20:53:45] [I] avgTiming: 8
[01/23/2026-20:53:45] [I] Precision: FP32
[01/23/2026-20:53:45] [I] LayerPrecisions: 
[01/23/2026-20:53:45] [I] Layer Device Types: 
[01/23/2026-20:53:45] [I] Calibration: 
[01/23/2026-20:53:45] [I] Refit: Disabled
[01/23/2026-20:53:45] [I] Strip weights: Disabled
[01/23/2026-20:53:45] [I] Version Compatible: Disabled
[01/23/2026-20:53:45] [I] ONNX Plugin InstanceNorm: Disabled
[01/23/2026-20:53:45] [I] TensorRT runtime: full
[01/23/2026-20:53:45] [I] Lean DLL Path: 
[01/23/2026-20:53:45] [I] Tempfile Controls: { in_memory: allow, temporary: allow }
[01/23/2026-20:53:45] [I] Exclude Lean Runtime: Disabled
[01/23/2026-20:53:45] [I] Sparsity: Disabled
[01/23/2026-20:53:45] [I] Safe mode: Disabled
[01/23/2026-20:53:45] [I] Build DLA standalone loadable: Disabled
[01/23/2026-20:53:45] [I] Allow GPU fallback for DLA: Enabled
[01/23/2026-20:53:45] [I] DirectIO mode: Disabled
[01/23/2026-20:53:45] [I] Restricted mode: Disabled
[01/23/2026-20:53:45] [I] Skip inference: Disabled
[01/23/2026-20:53:45] [I] Save engine: ~/trt/efficientvit_sam_l0_encoder.engine
[01/23/2026-20:53:45] [I] Load engine: 
[01/23/2026-20:53:45] [I] Profiling verbosity: 0
[01/23/2026-20:53:45] [I] Tactic sources: Using default tactic sources
[01/23/2026-20:53:45] [I] timingCacheMode: global
[01/23/2026-20:53:45] [I] timingCacheFile: ~/trt/timing.cache
[01/23/2026-20:53:45] [I] Enable Compilation Cache: Enabled
[01/23/2026-20:53:45] [I] errorOnTimingCacheMiss: Disabled
[01/23/2026-20:53:45] [I] Preview Features: Use default preview flags.
[01/23/2026-20:53:45] [I] MaxAuxStreams: -1
[01/23/2026-20:53:45] [I] BuilderOptimizationLevel: -1
[01/23/2026-20:53:45] [I] Calibration Profile Index: 0
[01/23/2026-20:53:45] [I] Weight Streaming: Disabled
[01/23/2026-20:53:45] [I] Runtime Platform: Same As Build
[01/23/2026-20:53:45] [I] Debug Tensors: 
[01/23/2026-20:53:45] [I] Input(s)s format: fp32:CHW
[01/23/2026-20:53:45] [I] Output(s)s format: fp32:CHW
[01/23/2026-20:53:45] [I] Input build shape (profile 0): input_image=1x3x512x512+4x3x512x512+4x3x512x512
[01/23/2026-20:53:45] [I] Input calibration shapes: model
[01/23/2026-20:53:45] [I] === System Options ===
[01/23/2026-20:53:45] [I] Device: 0
[01/23/2026-20:53:45] [I] DLACore: 
[01/23/2026-20:53:45] [I] Plugins:
[01/23/2026-20:53:45] [I] setPluginsToSerialize:
[01/23/2026-20:53:45] [I] dynamicPlugins:
[01/23/2026-20:53:45] [I] ignoreParsedPluginLibs: 0
[01/23/2026-20:53:45] [I] 
[01/23/2026-20:53:45] [I] === Inference Options ===
[01/23/2026-20:53:45] [I] Batch: Explicit
[01/23/2026-20:53:45] [I] Input inference shape : input_image=4x3x512x512
[01/23/2026-20:53:45] [I] Iterations: 10
[01/23/2026-20:53:45] [I] Duration: 3s (+ 200ms warm up)
[01/23/2026-20:53:45] [I] Sleep time: 0ms
[01/23/2026-20:53:45] [I] Idle time: 0ms
[01/23/2026-20:53:45] [I] Inference Streams: 1
[01/23/2026-20:53:45] [I] ExposeDMA: Disabled
[01/23/2026-20:53:45] [I] Data transfers: Enabled
[01/23/2026-20:53:45] [I] Spin-wait: Disabled
[01/23/2026-20:53:45] [I] Multithreading: Disabled
[01/23/2026-20:53:45] [I] CUDA Graph: Disabled
[01/23/2026-20:53:45] [I] Separate profiling: Disabled
[01/23/2026-20:53:45] [I] Time Deserialize: Disabled
[01/23/2026-20:53:45] [I] Time Refit: Disabled
[01/23/2026-20:53:45] [I] NVTX verbosity: 0
[01/23/2026-20:53:45] [I] Persistent Cache Ratio: 0
[01/23/2026-20:53:45] [I] Optimization Profile Index: 0
[01/23/2026-20:53:45] [I] Weight Streaming Budget: 100.000000%
[01/23/2026-20:53:45] [I] Inputs:
[01/23/2026-20:53:45] [I] Debug Tensor Save Destinations:
[01/23/2026-20:53:45] [I] === Reporting Options ===
[01/23/2026-20:53:45] [I] Verbose: Enabled
[01/23/2026-20:53:45] [I] Averages: 10 inferences
[01/23/2026-20:53:45] [I] Percentiles: 90,95,99
[01/23/2026-20:53:45] [I] Dump refittable layers:Disabled
[01/23/2026-20:53:45] [I] Dump output: Disabled
[01/23/2026-20:53:45] [I] Profile: Disabled
[01/23/2026-20:53:45] [I] Export timing to JSON file: 
[01/23/2026-20:53:45] [I] Export output to JSON file: 
[01/23/2026-20:53:45] [I] Export profile to JSON file: ~/trt/efficientvit_sam_l0_encoder.trace.json
[01/23/2026-20:53:45] [I] 
[01/23/2026-20:53:45] [I] === Device Information ===
[01/23/2026-20:53:45] [I] Available Devices: 
[01/23/2026-20:53:45] [I]   Device 0: "Orin" UUID: GPU-a04955e3-ad65-59ff-9578-a27db520877b
[01/23/2026-20:53:45] [I] Selected Device: Orin
[01/23/2026-20:53:45] [I] Selected Device ID: 0
[01/23/2026-20:53:45] [I] Selected Device UUID: GPU-a04955e3-ad65-59ff-9578-a27db520877b
[01/23/2026-20:53:45] [I] Compute Capability: 8.7
[01/23/2026-20:53:45] [I] SMs: 8
[01/23/2026-20:53:45] [I] Device Global Memory: 62840 MiB
[01/23/2026-20:53:45] [I] Shared Memory per SM: 164 KiB
[01/23/2026-20:53:45] [I] Memory Bus Width: 256 bits (ECC disabled)
[01/23/2026-20:53:45] [I] Application Compute Clock Rate: 1.3 GHz
[01/23/2026-20:53:45] [I] Application Memory Clock Rate: 0.612 GHz
[01/23/2026-20:53:45] [I] 
[01/23/2026-20:53:45] [I] Note: The application clock rates do not reflect the actual clock rates that the GPU is currently running at.
[01/23/2026-20:53:45] [I] 
[01/23/2026-20:53:45] [I] TensorRT version: 10.3.0
[01/23/2026-20:53:45] [I] Loading standard plugins
[01/23/2026-20:53:45] [I] [TRT] [MemUsageChange] Init CUDA: CPU +2, GPU +0, now: CPU 31, GPU 11830 (MiB)
[01/23/2026-20:53:47] [I] [TRT] [MemUsageChange] Init builder kernel library: CPU +928, GPU +1097, now: CPU 1002, GPU 12972 (MiB)
[01/23/2026-20:53:47] [I] Start parsing network model.
[01/23/2026-20:53:47] [I] [TRT] ----------------------------------------------------------------
[01/23/2026-20:53:47] [I] [TRT] Input filename:   assets/export_models/efficientvit_sam/onnx/efficientvit_sam_l0_encoder.onnx
[01/23/2026-20:53:47] [I] [TRT] ONNX IR version:  0.0.8
[01/23/2026-20:53:47] [I] [TRT] Opset version:    17
[01/23/2026-20:53:47] [I] [TRT] Producer name:    pytorch
[01/23/2026-20:53:47] [I] [TRT] Producer version: 2.10.0
[01/23/2026-20:53:47] [I] [TRT] Domain:           
[01/23/2026-20:53:47] [I] [TRT] Model version:    0
[01/23/2026-20:53:47] [I] [TRT] Doc string:       
[01/23/2026-20:53:47] [I] [TRT] ----------------------------------------------------------------
[01/23/2026-20:53:47] [I] Finished parsing network model. Parse time: 0.275097
[01/23/2026-20:53:47] [I] Set shape of input tensor input_image for optimization profile 0 to: MIN=1x3x512x512 OPT=4x3x512x512 MAX=4x3x512x512
[01/23/2026-20:53:48] [I] [TRT] Global timing cache in use. Profiling results in this builder pass will be stored.
[01/23/2026-21:10:19] [I] [TRT] [GraphReduction] The approximate region cut reduction algorithm is called.
[01/23/2026-21:10:19] [I] [TRT] Detected 1 inputs and 1 output network tensors.
[01/23/2026-21:10:23] [I] [TRT] Total Host Persistent Memory: 391872
[01/23/2026-21:10:23] [I] [TRT] Total Device Persistent Memory: 0
[01/23/2026-21:10:23] [I] [TRT] Total Scratch Memory: 16908288
[01/23/2026-21:10:23] [I] [TRT] [BlockAssignment] Started assigning block shifts. This will take 167 steps to complete.
[01/23/2026-21:10:23] [I] [TRT] [BlockAssignment] Algorithm ShiftNTopDown took 8.91597ms to assign 8 blocks to 167 nodes requiring 320864768 bytes.
[01/23/2026-21:10:23] [I] [TRT] Total Activation Memory: 320864256
[01/23/2026-21:10:23] [I] [TRT] Total Weights Memory: 122928896
[01/23/2026-21:10:23] [I] [TRT] Engine generation completed in 995.429 seconds.
[01/23/2026-21:10:23] [I] [TRT] [MemUsageStats] Peak memory usage of TRT CPU/GPU memory allocators: CPU 12 MiB, GPU 300 MiB
[01/23/2026-21:10:23] [I] [TRT] [MemUsageStats] Peak memory usage during Engine building and serialization: CPU: 1758 MiB
[01/23/2026-21:10:23] [I] Engine built in 995.709 sec.
[01/23/2026-21:10:23] [I] Created engine with size: 119.411 MiB

Hi,

Could you save the log to a text file and attach it to the topic?
Thanks.

Sorry, but that is not possible because file uploads are blocked in my organization, as I mentioned above 🥲

However, a colleague told me that he succeeded in saving the .engine file by giving the same sizes for the --minShapes, --optShapes, and --maxShapes options, so I’m guessing the error may be related to those options

Is there any possibility that the dynamic input shape causes the error?

Hi,

Not likely.

If the error were related to the dynamic shape, the engine would not be created at all.
Here the error occurs later, when the serialized engine is saved to a file.

Could you try the command below to see if it is possible to run successfully?

$ /usr/src/tensorrt/bin/trtexec --verbose --allowGPUFallback --timingCacheFile=~/trt/timing.cache --onnx=assets/export_models/efficientvit_sam/onnx/efficientvit_sam_l0_encoder.onnx --minShapes=input_image:1x3x512x512 --optShapes=input_image:4x3x512x512 --maxShapes=input_image:4x3x512x512 

Thanks.

It was successfully done with the command you gave

The log ends with the following

&&&& PASSED TensorRT.trtexec [TensorRT v100300] # /usr/src/tensorrt/bin/trtexec --verbose --allowGPUFallback --timingCacheFile=~/trt/timing.cache --onnx=assets/export_models/efficientvit_sam/onnx/efficientvit_sam_l0_encoder.onnx --minShapes=input_image:1x3x512x512 --optShapes=input_image:4x3x512x512 --maxShapes=input_image:4x3x512x512

Hi,

This indicates that TensorRT can build the engine correctly but fails when saving the serialized engine to a file.
Could you double-check if TensorRT has the write permission for the folder?

For example, could you try to save the log to the folder?

$ /usr/src/tensorrt/bin/trtexec --verbose --allowGPUFallback --timingCacheFile=~/trt/timing.cache --onnx=assets/export_models/efficientvit_sam/onnx/efficientvit_sam_l0_encoder.onnx --minShapes=input_image:1x3x512x512 --optShapes=input_image:4x3x512x512 --maxShapes=input_image:4x3x512x512 >> ~/trt/log
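
One caveat about this check, offered as a possibility rather than something confirmed in this thread: the redirect target `~/trt/log` is expanded by the shell, but in bash a `~` that follows `=` inside an option such as `--saveEngine=~/trt/...` is generally passed to the program unexpanded. The `Save engine: ~/trt/efficientvit_sam_l0_encoder.engine` line in the log above is consistent with that. A small sketch to see what a program actually receives (`show_args` is a hypothetical helper, not part of trtexec):

```shell
# show_args is a hypothetical helper: it prints each argument on its own line,
# exactly as the called program would receive it.
show_args() {
  printf '%s\n' "$@"
}

# Tilde after '=' in an option word: bash typically leaves it as a literal '~'.
show_args --saveEngine=~/trt/efficientvit_sam_l0_encoder.engine

# $HOME is expanded by the shell in any position, so the program gets an absolute path.
show_args --saveEngine="$HOME/trt/efficientvit_sam_l0_encoder.engine"
```

If the first line prints a literal `~`, it may be worth retrying with `$HOME/trt/...` (or another absolute path) for `--saveEngine`, `--timingCacheFile`, and `--exportProfile`.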

Thanks.

As I said, TensorRT has the write permission, and the log was successfully saved as a file

It was the same even if I tried again. The log was saved without a problem

Hi,

Do you also see the ‘timing.cache’ in the ~/trt/ folder?
If yes, could you also check if you have enough storage?
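
These two checks can be sketched in shell as follows. `check_output_dir` is a hypothetical helper name, and `$HOME/trt` is assumed to be the directory meant by `~/trt/` in the commands above.

```shell
# check_output_dir is a hypothetical helper, not part of trtexec.
# It reports whether a directory exists and is writable by the current user,
# then lists its contents and shows the free space on its filesystem.
check_output_dir() {
  dir="$1"
  if [ ! -d "$dir" ]; then
    echo "missing: $dir"
    return 1
  fi
  if [ ! -w "$dir" ]; then
    echo "not writable: $dir"
    return 1
  fi
  echo "ok: $dir"
  ls -l "$dir"   # timing.cache should appear here if trtexec wrote it
  df -h "$dir"   # free space on the filesystem that holds the directory
}

# Assumption: ~/trt in the thread means $HOME/trt for the user running trtexec.
check_output_dir "$HOME/trt" || true
```

Running it as the same user that runs trtexec matters, since the writability test applies to the current user only.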

Could you share the part of the --verbose output log around the write failure (everything after the line "Engine built in xxx sec.")?
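
Since the full verbose log is reported to be over 42k lines, one way to isolate just that part is sketched below. It assumes the output was saved to a file such as `~/trt/log`, as in the redirect command earlier in the thread; `tail_after_build` is a hypothetical helper name.

```shell
# tail_after_build is a hypothetical helper: it prints the log from the first
# line containing "Engine built in" through the end of the file.
tail_after_build() {
  sed -n '/Engine built in/,$p' "$1"
}

# Assumption: the verbose output was redirected to $HOME/trt/log.
log_file="$HOME/trt/log"
if [ -f "$log_file" ]; then
  tail_after_build "$log_file"
fi
```

If the `[E]` lines are missing from the saved file, they may have been written to stderr; redirecting with `>> ~/trt/log 2>&1` should capture both streams.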

Thanks.

Thanks for your support, but we are no longer working on the task related to this issue, so I’m sorry to say that I cannot run these checks or share the results