Please provide complete information as applicable to your setup.
• Hardware Platform (Jetson / GPU)
GPU (RTX 5090)
• DeepStream Version
8.0
• JetPack Version (valid for Jetson only)
• TensorRT Version
10.9
• NVIDIA GPU Driver Version (valid for GPU only)
591.74
• Issue Type (questions, new requirements, bugs)
new requirements
• Requirement details (for a new requirement, include the module name, i.e., which plugin or sample application, and a description of the function)
I implemented a smart caching system for TensorRT engine files in nvdsinfer and want to share it with the community.
THE PROBLEM
The current engine file name encodes only the batch size, GPU index, and precision (model_b8_gpu0_fp16.engine), ignoring input dimensions, GPU model, and TensorRT version. This causes unnecessary rebuilds when you change the input size or move to a different machine, and engines built for different configurations silently overwrite each other because they share the same file name.
Also, if you do not explicitly set model-engine-file in the config, the system always rebuilds, even when a valid engine already exists in the model directory.
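For illustration, a minimal nvinfer config sketch of that scenario (paths and values are placeholders, not taken from the linked repo): onnx-file is set but model-engine-file is not, so stock nvdsinfer rebuilds on every fresh start.

```ini
[property]
gpu-id=0
onnx-file=model.onnx
batch-size=8
# network-mode: 0=FP32, 1=INT8, 2=FP16
network-mode=2
# model-engine-file is intentionally unset: stock nvdsinfer rebuilds here,
# while the patched version first searches the model directory for a match
```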
THE SOLUTION
A new naming format includes all relevant parameters (batch size, input dimensions, compute capability, GPU model, TensorRT version, precision): model_b8_i640x640_sm120_rtx5090_trt10.7_fp16.engine
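As a rough illustration of how such a name can be assembled (a minimal sketch; buildEngineFileName and sanitizeGpuName are hypothetical names, not the actual code from the repo):

```cpp
// Sketch of the extended naming scheme; helper names are illustrative.
#include <NvInferVersion.h>    // NV_TENSORRT_MAJOR / NV_TENSORRT_MINOR
#include <cuda_runtime_api.h>  // cudaGetDeviceProperties
#include <algorithm>
#include <cctype>
#include <sstream>
#include <string>

// "NVIDIA GeForce RTX 5090" -> "rtx5090" (illustrative heuristic)
static std::string sanitizeGpuName(std::string name) {
    std::transform(name.begin(), name.end(), name.begin(),
                   [](unsigned char c) { return std::tolower(c); });
    std::string out;
    for (char c : name)
        if (std::isalnum(static_cast<unsigned char>(c)))
            out += c;
    for (const std::string p : {"nvidia", "geforce"})
        if (out.rfind(p, 0) == 0)
            out.erase(0, p.size());
    return out;
}

std::string buildEngineFileName(const std::string& modelStem, int batch,
                                int inW, int inH, int gpuId,
                                const std::string& precision) {
    cudaDeviceProp prop{};
    cudaGetDeviceProperties(&prop, gpuId);
    std::ostringstream ss;
    ss << modelStem << "_b" << batch            // batch size
       << "_i" << inW << "x" << inH             // input dimensions
       << "_sm" << prop.major << prop.minor     // compute capability
       << "_" << sanitizeGpuName(prop.name)     // GPU model
       << "_trt" << NV_TENSORRT_MAJOR << "." << NV_TENSORRT_MINOR
       << "_" << precision << ".engine";        // e.g. fp16
    return ss.str();
}
```

With modelStem "model", batch 8, a 640x640 input on an RTX 5090 (compute capability 12.0), TensorRT 10.7, and FP16, this yields exactly the example name above.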
The system now auto-discovers existing engines without requiring manual configuration: it searches the model directory for an engine file matching the current parameters and rebuilds only if none is found.
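A minimal sketch of that discovery step (findExistingEngine is an illustrative name, not the actual repo code):

```cpp
#include <filesystem>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Given the model directory and the engine name expected for the current
// parameters, reuse the engine only if a regular, non-empty file exists.
bool findExistingEngine(const fs::path& modelDir,
                        const std::string& expectedName,
                        fs::path& enginePath) {
    enginePath = modelDir / expectedName;
    std::error_code ec;
    return fs::is_regular_file(enginePath, ec) &&
           fs::file_size(enginePath, ec) > 0;
}
```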
If an engine fails to load (corrupted file, TensorRT version mismatch), the system automatically falls back to a rebuild.
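A minimal sketch of the load-or-rebuild fallback, assuming a stand-in buildEngineFromModel for the real build path (both function names are hypothetical):

```cpp
#include <NvInfer.h>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Stand-in for nvdsinfer's actual engine-build path.
nvinfer1::ICudaEngine* buildEngineFromModel(const std::string& enginePath);

nvinfer1::ICudaEngine* loadOrRebuild(nvinfer1::IRuntime& runtime,
                                     const std::string& enginePath) {
    std::ifstream f(enginePath, std::ios::binary);
    if (f) {
        std::vector<char> blob((std::istreambuf_iterator<char>(f)),
                               std::istreambuf_iterator<char>());
        // deserializeCudaEngine returns nullptr on corruption or a
        // TensorRT version mismatch
        if (auto* engine =
                runtime.deserializeCudaEngine(blob.data(), blob.size()))
            return engine;
    }
    // no usable cached engine: fall back to a fresh build
    return buildEngineFromModel(enginePath);
}
```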
RESULT
First run: builds the engine (several minutes).
Subsequent runs: loads the existing engine (2-3 seconds).
Different configurations coexist as separate files; engines no longer overwrite one another.
AVAILABILITY
Complete implementation available here: levipereira/deepstream_8.0_plugins on GitHub, branch feature/nvdsinfer-engine-smart-caching, under libs/nvdsinfer.
For more details about the feature, see the repository linked above.
The code is available without restrictions for NVIDIA to incorporate into official DeepStream if useful. It is compatible with, and tested only on, DeepStream 8.0, and it is fully backward compatible with existing configurations.
Modified files:
- libs/nvdsinfer/nvdsinfer_model_builder.h
- libs/nvdsinfer/nvdsinfer_model_builder.cpp
- libs/nvdsinfer/nvdsinfer_context_impl.cpp