I don’t really have an issue, but rather a question about what makes my engine files shareable with another device (same GPU).
I already know that engine files are dependent on:
GPU device
Version of TensorRT
Version of CUDA
But are they dependent on:
NVIDIA drivers?
cuDNN?
Basically, I want to know whether I can share an engine with an otherwise identical machine that differs only in its NVIDIA driver version.
I’m asking because I currently generate the engines from a Docker container with specific versions of TensorRT, CUDA, etc., while the drivers come from the host machine.
It would also help me organize my engine files.
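As a concrete illustration of what I mean by organizing them, here is a minimal sketch of how the build environment could be recorded in the engine filename; it assumes the pynvml and tensorrt Python packages are available, and the naming scheme itself is only an example:

```python
import tensorrt as trt
import pynvml

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)

gpu_name = pynvml.nvmlDeviceGetName(handle)      # e.g. "NVIDIA A10G"
driver = pynvml.nvmlSystemGetDriverVersion()     # host driver, e.g. "470.57.02"
# CUDA version supported by the driver, e.g. 11040 -> 11.4; the toolkit
# version inside the build container may be older than this.
cuda = pynvml.nvmlSystemGetCudaDriverVersion()
pynvml.nvmlShutdown()

# Older pynvml releases return bytes instead of str.
gpu_name = gpu_name.decode() if isinstance(gpu_name, bytes) else gpu_name
driver = driver.decode() if isinstance(driver, bytes) else driver

# Hypothetical naming scheme: bake the parameters the engine is known to
# depend on (GPU model, TensorRT version, CUDA version) into the filename,
# and keep the driver version alongside for reference.
engine_name = "model_{gpu}_trt{trt}_cuda{major}.{minor}_drv{drv}.engine".format(
    gpu=gpu_name.replace(" ", "-"),
    trt=trt.__version__,
    major=cuda // 1000,
    minor=(cuda % 1000) // 10,
    drv=driver,
)
print(engine_name)
```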
I’ve found these two topics, but they don’t really answer the question:
The engine usually doesn’t depend on driver versions or cuDNN versions, except in the following cases:
Some tactics require the driver to be at least some version, so an engine built with a newer driver may fail when running on an older driver. But this is rare.
Some tactics have different runtimes between cuDNN versions, so an engine may be sub-optimal when the cuDNN versions differ between the build phase and the inference phase. But this is also rare, especially if the major version of cuDNN is the same, like cuDNN 8.2 vs 8.3.
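As an illustration of the first case, the simplest check is to try to deserialize the engine on the target machine; a minimal sketch with the TensorRT Python API (the file path is hypothetical):

```python
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

def try_load_engine(path):
    """Try to deserialize an engine on this machine.

    deserialize_cuda_engine returns None (and logs an error) when the
    engine cannot be used here, for example if it contains a tactic that
    needs a newer driver than the one installed on this host.
    """
    with open(path, "rb") as f:
        data = f.read()
    runtime = trt.Runtime(TRT_LOGGER)
    engine = runtime.deserialize_cuda_engine(data)
    if engine is None:
        print("Engine rejected on this machine; rebuild it here instead.")
    return engine

engine = try_load_engine("model.engine")  # hypothetical path
```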
I have a case where I generated some engines on an “A10G” (the name displayed by nvidia-smi) with trtexec in the Docker image nvcr.io/nvidia/tensorrt:21.03-py3, and I deployed these engines with the same version of TensorRT (DeepStream image) on an “A10”.
So I get the warning: WARNING: ../nvdsinfer/nvdsinfer_func_utils.cpp:36 [TRT]: Using an engine plan file across different models of devices is not recommended and is likely to affect performance or even cause errors.
My question is: as the GPUs seem to be very similar (I don’t know the difference between an A10G and an A10), will the engine give significantly different inference results, or is it only a question of lower performance, with the results being the same (or very close)?
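For reference, a small sketch, assuming pynvml is available, that can be run on both machines to compare what each GPU reports:

```python
import pynvml

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)

name = pynvml.nvmlDeviceGetName(handle)
name = name.decode() if isinstance(name, bytes) else name
major, minor = pynvml.nvmlDeviceGetCudaComputeCapability(handle)
total_mib = pynvml.nvmlDeviceGetMemoryInfo(handle).total // (1024 ** 2)
pynvml.nvmlShutdown()

# Run this on both the build machine (A10G) and the deployment machine (A10)
# and compare: the same compute capability means the same instruction set,
# but the tactics TensorRT timed during the build were still chosen on the
# other card, which is what the warning is about.
print(f"{name}: compute capability {major}.{minor}, {total_mib} MiB")
```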