We are finding that the only way we can use TensorRT (7.2.3.4) on a GPU model we haven't used before is to rebuild our TensorRT engine on that GPU type first.
For example, our software works on an RTX 2070 Max-Q but didn't work on a GTX 1050 Ti. So we got hold of a 1050 Ti to build a TRT engine on that machine, but that engine didn't work on a 1050, and we had to buy a 1050 to build yet another version. We also thought that an engine built on a GTX 1660 would work on an RTX 2080 Ti, but it turned out we were wrong: deserialization returns null when we try to load the engine into memory (see the sketch below).
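For reference, this is roughly what our loading code looks like (a minimal C++ sketch against the TensorRT 7.x API; the file name is illustrative and error handling is trimmed):

```cpp
#include <NvInfer.h>
#include <fstream>
#include <iostream>
#include <vector>

class Logger : public nvinfer1::ILogger {
    void log(Severity severity, const char* msg) noexcept override {
        if (severity <= Severity::kWARNING) std::cerr << msg << std::endl;
    }
};

int main() {
    Logger logger;

    // Read the serialized engine that was built on a *different* GPU model.
    std::ifstream file("model.engine", std::ios::binary);
    std::vector<char> blob((std::istreambuf_iterator<char>(file)),
                           std::istreambuf_iterator<char>());

    nvinfer1::IRuntime* runtime = nvinfer1::createInferRuntime(logger);
    nvinfer1::ICudaEngine* engine =
        runtime->deserializeCudaEngine(blob.data(), blob.size(), nullptr);
    if (!engine) {
        // This is the failure we see on the RTX 2080 Ti.
        std::cerr << "deserializeCudaEngine returned null" << std::endl;
        return 1;
    }

    // ... run inference ...

    engine->destroy();
    runtime->destroy();
    return 0;
}
```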
Is this lack of inter-GPU compatibility expected with TensorRT? If yes, what is the bare minimum set of GPU models we would need to buy to support all of your GPUs above the GTX 1050?
@NVES @spolisetty Please help ASAP, as we have an unhappy customer because of our lack of RTX 2080 Ti support.
Serialized engines are not portable across platforms or TensorRT versions. An engine is specific to the exact GPU model it was built on, in addition to the platform and the TensorRT version. We recommend building serialized engines directly on the target platform.
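For example, assuming your model is available in ONNX form, you can rebuild and serialize the engine on each target machine with the trtexec tool that ships with TensorRT (file names here are illustrative):

```
trtexec --onnx=model.onnx --saveEngine=model.engine
```

The resulting model.engine is then valid for that machine's GPU.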
@spolisetty OK, understood. Can you recommend which GPUs to build on so that we support the largest number of other GPU models?
For example: if we build a TensorRT engine on an RTX 2070, then we should be able to support GPUs x, y and z. Or do we always need to have the EXACT same GPU as every customer?
Also, if we were to build on the non-Ti version of a GPU, could we use TensorRT on the Ti version (or vice versa)? I.e. if we build on a GTX 1050, would TRT work on a 1050 Ti?
@spolisetty @NVES Is there anything you can recommend that would let us keep the faster inference TensorRT provides but that is easier to port between machines? For example, should we be using ONNX or something?
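For context, here is a minimal sketch of the flow we have in mind: ship the network as a portable ONNX file and build the TensorRT engine on each customer's machine at install time or first run (TensorRT 7.x C++ API; file names are illustrative and error handling is trimmed):

```cpp
#include <NvInfer.h>
#include <NvOnnxParser.h>
#include <cstdint>
#include <fstream>
#include <iostream>

class Logger : public nvinfer1::ILogger {
    void log(Severity severity, const char* msg) noexcept override {
        if (severity <= Severity::kWARNING) std::cerr << msg << std::endl;
    }
};

int main() {
    Logger logger;
    auto builder = nvinfer1::createInferBuilder(logger);
    const auto flags = 1U << static_cast<uint32_t>(
        nvinfer1::NetworkDefinitionCreationFlag::kEXPLICIT_BATCH);
    auto network = builder->createNetworkV2(flags);

    // Parse the portable ONNX file that ships with the product.
    auto parser = nvonnxparser::createParser(*network, logger);
    if (!parser->parseFromFile("model.onnx",
            static_cast<int>(nvinfer1::ILogger::Severity::kWARNING))) {
        std::cerr << "failed to parse model.onnx" << std::endl;
        return 1;
    }

    auto config = builder->createBuilderConfig();
    config->setMaxWorkspaceSize(1ULL << 30);  // 1 GiB of build scratch space

    // The engine built here is tuned for *this* machine's GPU.
    auto engine = builder->buildEngineWithConfig(*network, *config);
    if (!engine) {
        std::cerr << "engine build failed" << std::endl;
        return 1;
    }

    // Cache the serialized engine so later runs can just deserialize it.
    auto serialized = engine->serialize();
    std::ofstream out("model.engine", std::ios::binary);
    out.write(static_cast<const char*>(serialized->data()),
              serialized->size());

    serialized->destroy();
    engine->destroy();
    config->destroy();
    parser->destroy();
    network->destroy();
    builder->destroy();
    return 0;
}
```

The idea is that the ONNX file stays portable across GPUs; only the engine built from it is machine-specific.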