What if I don’t want to build it directly with TLT? I would like to build my own C++ application in which the engine is built inside that application. Also, building the engine with TLT happens inside Docker, which runs via nvidia-docker, and that is not supported on Windows.
So there is no way I can use TLT to output a frozen .pb model? Then TLT isn’t the solution for me …
Thanks.
Hi Morganh,
TLT is a very nice tool for training networks, but not having the option to export to .pb or .uff is unfortunate. Having the trained model in such formats is useful for other use cases …
Yes, TLT can only export the model in the .etlt format. Users can also generate a TRT engine based on it.
Usually:
1. Download/copy the tlt-converter tool to the Nano or other board.
2. Copy the .etlt model onto the Nano or other board.
3. Run tlt-converter against the .etlt model.
The TRT engine will be built directly.
For deployment platforms with an x86-based CPU and discrete GPUs, the tlt-converter is distributed within the TLT docker. Therefore, it is suggested to use the docker to generate the engine.
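If the goal, as asked above, is to consume that engine from your own C++ application, a rough sketch of loading and deserializing it with the TensorRT C++ API is shown below. This is only an outline, not an official TLT sample; the file name model.engine is a placeholder for whatever tlt-converter produced, error handling is minimal, and the API style shown matches the TensorRT 7.x releases that TLT targeted.

```cpp
#include "NvInfer.h"
#include <fstream>
#include <iostream>
#include <vector>

// Minimal logger the TensorRT runtime requires.
class Logger : public nvinfer1::ILogger {
    void log(Severity severity, const char* msg) noexcept override {
        if (severity <= Severity::kWARNING) std::cout << msg << std::endl;
    }
};

int main() {
    const char* enginePath = "model.engine";  // placeholder: output of tlt-converter

    // Read the serialized engine from disk.
    std::ifstream file(enginePath, std::ios::binary | std::ios::ate);
    if (!file) { std::cerr << "Cannot open " << enginePath << std::endl; return 1; }
    std::streamsize size = file.tellg();
    file.seekg(0, std::ios::beg);
    std::vector<char> blob(size);
    file.read(blob.data(), size);

    // Deserialize it. This only succeeds when the TensorRT/CUDA versions and the
    // GPU match the environment the engine was generated in.
    Logger logger;
    nvinfer1::IRuntime* runtime = nvinfer1::createInferRuntime(logger);
    nvinfer1::ICudaEngine* engine =
        runtime->deserializeCudaEngine(blob.data(), static_cast<size_t>(size));
    if (!engine) { std::cerr << "Engine deserialization failed" << std::endl; return 1; }

    nvinfer1::IExecutionContext* context = engine->createExecutionContext();
    // ... allocate device buffers and run inference with the context here ...

    context->destroy();
    engine->destroy();
    runtime->destroy();
    return 0;
}
```

Compile and link against TensorRT (e.g. -lnvinfer) in the same environment where the engine was generated.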
Hi Morganh,
The generated engine is optimized for the GPU architecture it was generated on, right? Would it still work well on another GPU architecture?
Machine-specific optimizations are done as part of the engine creation process, so a distinct engine should be generated for each environment and hardware configuration.
If the inference environment’s TensorRT or CUDA libraries are updated – including minor version updates – new engines should be generated.
Running an engine that was generated with a different version of TensorRT and CUDA is not supported and will cause unknown behavior that affects inference speed, accuracy, and stability, or it may fail to run altogether.
The TRT engine plan does depend on the compute capability and the TRT version. If you change GPU architecture, you need an environment that runs the same TRT version you built with, and the engine should be regenerated there.
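Since the plan is tied to the compute capability and to the TensorRT/CUDA versions, a practical habit is to log those details on the target machine before trying to deserialize an engine that was generated elsewhere. A minimal C++ sketch, assuming a single GPU at device index 0:

```cpp
#include "NvInfer.h"          // getInferLibVersion(), NV_TENSORRT_* macros
#include <cuda_runtime_api.h> // cudaGetDeviceProperties, cudaRuntimeGetVersion
#include <iostream>

int main() {
    // TensorRT version from the headers the app was built with,
    // and from the library it actually loads at run time.
    std::cout << "TensorRT (headers): " << NV_TENSORRT_MAJOR << "."
              << NV_TENSORRT_MINOR << "." << NV_TENSORRT_PATCH << std::endl;
    std::cout << "TensorRT (library): " << getInferLibVersion() << std::endl;

    // Installed CUDA runtime version.
    int cudaVersion = 0;
    cudaRuntimeGetVersion(&cudaVersion);
    std::cout << "CUDA runtime: " << cudaVersion << std::endl;

    // Compute capability of the GPU that would run the engine.
    cudaDeviceProp prop{};
    cudaGetDeviceProperties(&prop, 0);
    std::cout << "GPU: " << prop.name << ", compute capability "
              << prop.major << "." << prop.minor << std::endl;
    return 0;
}
```

If any of these differ from the machine where the engine was generated, regenerate the engine on the target rather than copying the plan file over.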