TensorRT’s dependencies (NVIDIA cuDNN and NVIDIA cuBLAS) can occupy large amounts of device memory. TensorRT allows you to control whether these libraries are used for inference via the TacticSources (C++, Python) attribute in the builder configuration. Note that some plugin implementations require these libraries, so when they are excluded, the network may fail to compile.
Each of the tactic sources is explained here, along with its default status.
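As a rough illustration of how the tactic-sources bitmask works: each source corresponds to one bit, and you clear a bit to exclude that source. This sketch hardcodes the enum values so it runs without TensorRT installed; the actual values and names come from `tensorrt.TacticSource` (Python) or `nvinfer1::TacticSource` (C++), so check your TensorRT version rather than relying on the numbers below.

```python
from enum import IntEnum

class TacticSource(IntEnum):
    # Assumed ordering for illustration; the real values live in
    # tensorrt.TacticSource and may differ across TensorRT versions.
    CUBLAS = 0
    CUBLAS_LT = 1
    CUDNN = 2

def exclude(sources: int, source: TacticSource) -> int:
    """Clear one tactic-source bit from the mask."""
    return sources & ~(1 << int(source))

# Start from a mask with all three library-backed sources enabled.
all_sources = (
    (1 << TacticSource.CUBLAS)
    | (1 << TacticSource.CUBLAS_LT)
    | (1 << TacticSource.CUDNN)
)

no_libs = exclude(all_sources, TacticSource.CUDNN)
no_libs = exclude(no_libs, TacticSource.CUBLAS)
no_libs = exclude(no_libs, TacticSource.CUBLAS_LT)

# With real TensorRT you would pass the mask to the builder config:
#   config.set_tactic_sources(no_libs)            # Python
#   builderConfig->setTacticSources(noLibs);      // C++
print(no_libs)  # → 0 (all library tactic sources cleared)
```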
Thanks for pointing me to that section. I would like to lower my device memory usage, so I am inclined to exclude them. Is there any expected performance loss (inference speed) from not using cuDNN and cuBLAS? So far I haven’t observed any difference with the models I’m using, but I don’t know if I’m just getting lucky.
Is there any way to tell if a cuDNN/cuBLAS tactic is being used? For example, from the output of trtexec --exportLayerInfo?
Hi @adam.alcolado ,
Currently, there is no way to tell whether a given tactic source has been selected for an engine, but you can check whether a given tactic source is available for an engine by using the --dumpLayerInfo flag in trtexec.
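For reference, trtexec also accepts a --tacticSources flag for excluding the library-backed sources at build time, which can be combined with the layer-info flags above. A sketch (the model path is a placeholder, and exact flag spellings may vary by TensorRT version, so check `trtexec --help`):

```shell
# Build an engine with cuDNN/cuBLAS tactic sources disabled,
# dumping per-layer information to inspect the chosen tactics.
trtexec --onnx=model.onnx \
        --tacticSources=-CUDNN,-CUBLAS,-CUBLAS_LT \
        --profilingVerbosity=detailed \
        --dumpLayerInfo
```

Comparing the layer info from builds with and without these sources enabled is one way to see whether any of your layers were relying on them.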