Are TensorRT plan files portable across different GPUs of the same type?

Hi all!

The SDK documentation says: “The generated plan files are not portable across platforms or TensorRT versions. Plans are specific to the exact GPU model they were built on (in addition to platforms and the TensorRT version) and must be re-targeted to the specific GPU in case you want to run them on a different GPU.”

I am confused about what counts as a “different GPU”. Does a different physical GPU of the same model count as a different GPU?

I ran the following experiment, and the results seem odd.

With a GTX 1080 Ti in FP32 mode:
Without a plan file: 123.55 FPS.
With a plan file generated on GPU-0, running inference on GPU-0: 123.94 FPS.
With a plan file generated on GPU-0, running inference on GPU-1: 126.67 FPS, plus the message “WARNING: Using an engine plan file across different models of devices is not recommended and is likely to affect performance or even cause errors.”

With a GTX 1080 Ti in INT8 mode:
Without a plan file: 211 FPS.
With a plan file generated on GPU-0, running inference on GPU-0: 195 FPS.
With a plan file generated on GPU-0, running inference on GPU-1: 194 FPS, plus the same message “WARNING: Using an engine plan file across different models of devices is not recommended and is likely to affect performance or even cause errors.”

The odd part is that in INT8 mode, inference with the plan file is slower than without it, while in FP32 mode the plan file works fine.

Besides that, what does “WARNING: Using an engine plan file across different models of devices is not recommended and is likely to affect performance or even cause errors.” mean? I am porting the plan file between GPUs of the same model, so why do I get this warning?

Hi,

Using the same plan file across multiple GPUs of the same type (1080 Tis in your case) is fine. That WARNING is just a catch-all to warn users not to use the same plan file on a 1080, a 1070, a T4, a V100, etc. and expect it to work the same. Note, however, that TensorRT also takes into account the memory available on the device at the time the plan file is created. So even if you're using the same GPU model (1080 Ti), if the one you built the plan on had background processes taking up half of the GPU's memory, you could see different performance than if all of the memory had been free at build time.
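Because the plan encodes tactics chosen for the device it was built on, the usual pattern is to build and serialize once per target GPU model and deserialize on the matching device. As a rough sketch (not from this thread; it assumes the standard TensorRT Python bindings, and "model.plan" is a placeholder path), loading a plan on the current device looks like:

```python
# Sketch only: deserialize a serialized TensorRT engine ("plan") on the
# currently active GPU. The device-mismatch WARNING discussed above is
# emitted through the logger during deserialization.
try:
    import tensorrt as trt
except ImportError:
    trt = None  # TensorRT not installed; keep the sketch importable anyway


def load_plan(path):
    """Deserialize a plan file for the GPU that is currently active."""
    if trt is None:
        raise RuntimeError("TensorRT is required to deserialize a plan file")
    logger = trt.Logger(trt.Logger.WARNING)
    with open(path, "rb") as f:
        return trt.Runtime(logger).deserialize_cuda_engine(f.read())
```

The safest deployment is one plan per GPU model (and per TensorRT version), rather than copying a single plan everywhere.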

Regarding performance (speed), INT8 optimization is not guaranteed to improve performance; it is very dependent on the model. When INT8 optimization isn't possible for a layer, TensorRT falls back to FP16, and then to FP32 if need be. You can get a rough idea of how much fallback occurred from the file size of your plans: sometimes a model you try to optimize for INT8 ends up roughly the same size as the FP32 version, meaning it mostly fell back to higher precisions in order to maintain accuracy.
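To illustrate that file-size heuristic (a sketch of my own; the plan paths and thresholds are made-up assumptions, not anything TensorRT reports): a fully INT8-quantized engine stores weights in 1 byte instead of 4, so an INT8 plan that is nearly the same size as the FP32 plan suggests heavy fallback to higher precision.

```python
import os


def int8_fallback_hint(int8_plan_path, fp32_plan_path):
    """Rough heuristic: compare serialized engine file sizes.

    A fully INT8-quantized engine stores weights in 1 byte instead of 4,
    so a size ratio near 1.0 suggests most layers fell back to FP16/FP32.
    The 0.9 / 0.4 cutoffs are arbitrary illustrative thresholds.
    """
    ratio = os.path.getsize(int8_plan_path) / os.path.getsize(fp32_plan_path)
    if ratio > 0.9:
        return ratio, "mostly fell back to higher precision"
    if ratio < 0.4:
        return ratio, "largely quantized to INT8"
    return ratio, "mixed precision"
```

This is only a hint, not a measurement; profiling the engine layer by layer is the definitive way to see which precisions were actually chosen.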

Thanks,
NVIDIA Enterprise Support

Funnily enough, I do get that kind of warning when I disconnect the physical monitor from my devices (Jetson Nanos). When the monitor is attached, I do not get it. I built everything while the monitor was connected; could that be the reason?
Should I rebuild everything without a monitor (using only VNC) to avoid that warning?

Best regards, Walter