Please provide complete information as applicable to your setup.
• Hardware Platform (Jetson / GPU)
GPU (RTX 5090)
• DeepStream Version
8.0
• JetPack Version (valid for Jetson only)
• TensorRT Version
10.9
• NVIDIA GPU Driver Version (valid for GPU only)
591.74
• Issue Type (questions, new requirements, bugs)
new requirements
• Requirement details (for a new requirement, include the module name, i.e., which plugin or sample application, and a description of the function)
I implemented a smart caching system for TensorRT engine files in nvdsinfer and want to share it with the community.
THE PROBLEM
The current engine file name encodes only the batch size, GPU index, and precision (model_b8_gpu0_fp16.engine), ignoring input dimensions, GPU model, and TensorRT version. This causes unnecessary rebuilds when you change the input size or move to a different machine, and engines built for different configurations silently overwrite each other because they share the same file name.
Also, if you do not explicitly set model-engine-file in the config, the system always rebuilds, even when a valid engine already exists in the model directory.
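For illustration, a minimal nvinfer config sketch of that scenario (paths and values are placeholders, not taken from the linked repo): onnx-file is set but model-engine-file is not, so stock nvdsinfer rebuilds on every fresh start.

```ini
[property]
gpu-id=0
onnx-file=model.onnx
batch-size=8
# network-mode: 0=FP32, 1=INT8, 2=FP16
network-mode=2
# model-engine-file is intentionally unset: stock nvdsinfer rebuilds here,
# while the patched version first searches the model directory for a match
```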
THE SOLUTION
A new naming format includes all relevant parameters (batch size, input dimensions, compute capability, GPU model, TensorRT version, precision): model_b8_i640x640_sm120_rtx5090_trt10.7_fp16.engine
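As a rough illustration of how such a name can be assembled (a minimal sketch; buildEngineFileName and sanitizeGpuName are hypothetical names, not the actual code from the repo):

```cpp
// Sketch of the extended naming scheme; helper names are illustrative.
#include <NvInferVersion.h>    // NV_TENSORRT_MAJOR / NV_TENSORRT_MINOR
#include <cuda_runtime_api.h>  // cudaGetDeviceProperties
#include <algorithm>
#include <cctype>
#include <sstream>
#include <string>

// "NVIDIA GeForce RTX 5090" -> "rtx5090" (illustrative heuristic)
static std::string sanitizeGpuName(std::string name) {
    std::transform(name.begin(), name.end(), name.begin(),
                   [](unsigned char c) { return std::tolower(c); });
    std::string out;
    for (char c : name)
        if (std::isalnum(static_cast<unsigned char>(c)))
            out += c;
    for (const std::string p : {"nvidia", "geforce"})
        if (out.rfind(p, 0) == 0)
            out.erase(0, p.size());
    return out;
}

std::string buildEngineFileName(const std::string& modelStem, int batch,
                                int inW, int inH, int gpuId,
                                const std::string& precision) {
    cudaDeviceProp prop{};
    cudaGetDeviceProperties(&prop, gpuId);
    std::ostringstream ss;
    ss << modelStem << "_b" << batch            // batch size
       << "_i" << inW << "x" << inH             // input dimensions
       << "_sm" << prop.major << prop.minor     // compute capability
       << "_" << sanitizeGpuName(prop.name)     // GPU model
       << "_trt" << NV_TENSORRT_MAJOR << "." << NV_TENSORRT_MINOR
       << "_" << precision << ".engine";        // e.g. fp16
    return ss.str();
}
```

With modelStem "model", batch 8, a 640x640 input on an RTX 5090 (compute capability 12.0), TensorRT 10.7, and FP16, this yields exactly the example name above.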
The system now auto-discovers existing engines without requiring manual configuration: it searches the model directory for an engine file matching the current parameters and rebuilds only if none is found.
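A minimal sketch of that discovery step (findExistingEngine is an illustrative name, not the actual repo code):

```cpp
#include <filesystem>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Given the model directory and the engine name expected for the current
// parameters, reuse the engine only if a regular, non-empty file exists.
bool findExistingEngine(const fs::path& modelDir,
                        const std::string& expectedName,
                        fs::path& enginePath) {
    enginePath = modelDir / expectedName;
    std::error_code ec;
    return fs::is_regular_file(enginePath, ec) &&
           fs::file_size(enginePath, ec) > 0;
}
```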
If an engine fails to load (corrupted file, TensorRT version mismatch), the system automatically falls back to a rebuild.
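A minimal sketch of the load-or-rebuild fallback, assuming a stand-in buildEngineFromModel for the real build path (both function names are hypothetical):

```cpp
#include <NvInfer.h>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Stand-in for nvdsinfer's actual engine-build path.
nvinfer1::ICudaEngine* buildEngineFromModel(const std::string& enginePath);

nvinfer1::ICudaEngine* loadOrRebuild(nvinfer1::IRuntime& runtime,
                                     const std::string& enginePath) {
    std::ifstream f(enginePath, std::ios::binary);
    if (f) {
        std::vector<char> blob((std::istreambuf_iterator<char>(f)),
                               std::istreambuf_iterator<char>());
        // deserializeCudaEngine returns nullptr on corruption or a
        // TensorRT version mismatch
        if (auto* engine =
                runtime.deserializeCudaEngine(blob.data(), blob.size()))
            return engine;
    }
    // no usable cached engine: fall back to a fresh build
    return buildEngineFromModel(enginePath);
}
```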
RESULT
First run: builds the engine (several minutes).
Subsequent runs: loads the existing engine (2-3 seconds).
Different configurations coexist as separate files; engines no longer overwrite one another.
AVAILABILITY
Complete implementation available here: levipereira/deepstream_8.0_plugins on GitHub, branch feature/nvdsinfer-engine-smart-caching, under libs/nvdsinfer.
For more details about the feature, see the repository linked above.
The code is available without restrictions for NVIDIA to incorporate into official DeepStream if useful. It is compatible with, and tested only on, DeepStream 8.0, and it is fully backward compatible with existing configurations.
Modified files:
- libs/nvdsinfer/nvdsinfer_model_builder.h
- libs/nvdsinfer/nvdsinfer_model_builder.cpp
- libs/nvdsinfer/nvdsinfer_context_impl.cpp