Hi all!
In the SDK documentation, it says: “The generated plan files are not portable across platforms or TensorRT versions. Plans are specific to the exact GPU model they were built on (in addition to platforms and the TensorRT version) and must be re-targeted to the specific GPU in case you want to run them on a different GPU.”
I am confused about what “a different GPU” means here. Do two physical GPUs of the same model count as different GPUs?
I ran the following experiment, and the results look odd to me.
With a GTX 1080 Ti in float32 mode:
- Without a plan file, the FPS is 123.55.
- Using a plan file generated on GPU-0 for inference on GPU-0, the FPS is 123.94.
- Using a plan file generated on GPU-0 for inference on GPU-1, the FPS is 126.67, and I get: WARNING: Using an engine plan file across different models of devices is not recommended and is likely to affect performance or even cause errors.
With a GTX 1080 Ti in int8 mode:
- Without a plan file, the FPS is 211.
- Using a plan file generated on GPU-0 for inference on GPU-0, the FPS is 195.
- Using a plan file generated on GPU-0 for inference on GPU-1, the FPS is 194, and I get the same warning: WARNING: Using an engine plan file across different models of devices is not recommended and is likely to affect performance or even cause errors.
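For reference, the FPS numbers above are simple throughput measurements, roughly like the sketch below (a generic timing sketch; `run_inference` is a placeholder standing in for the actual TensorRT execution call, not real API):

```python
import time

def measure_fps(run_inference, num_frames=1000):
    """Time num_frames calls to run_inference and return frames per second."""
    start = time.perf_counter()
    for _ in range(num_frames):
        run_inference()
    elapsed = time.perf_counter() - start
    return num_frames / elapsed

# Placeholder for the real per-frame inference step.
def run_inference():
    pass

print("FPS: %.2f" % measure_fps(run_inference, num_frames=100))
```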
The odd part is that in int8 mode, inference is slower when I use the plan file (195 vs. 211 FPS), while in float32 mode using the plan file is fine.
Besides, what does “WARNING: Using an engine plan file across different models of devices is not recommended and is likely to affect performance or even cause errors.” mean? I am loading the plan file on GPUs of the same model, so why do I get this warning?