I was wondering if there are any portability issues with INT8 calibration caches when TensorRT engines are moved between platforms with different GPUs. In particular, I am seeing that some tensor names have `_####` suffixes appended to them, which differ between builds and cause issues with the cache.
The generated plan files are not portable across platforms or TensorRT versions. Plans are specific to the exact GPU model they were built on (in addition to the platform and the TensorRT version), so they must be rebuilt for the specific GPU if you want to run them on a different one.
In this case I’m thinking more of the calibration cache that is saved to disk before creating any plan.
It might work, but it is not usually done in normal scenarios. If you want, you can try it and run your own precision tests to verify accuracy.
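For what it's worth, the calibration cache is a plain-text file: a header line (e.g. `TRT-8601-EntropyCalibration2`) followed by `tensor_name: hex_scale` entries. A minimal sketch to compare the tensor names in two caches while stripping trailing `_####` suffixes, so you can see whether the caches actually disagree beyond those auto-generated names (the suffix pattern and helper names here are my own assumptions, not part of the TensorRT API):

```python
import re

_SUFFIX = re.compile(r"_\d+$")  # assumed pattern of the appended _#### suffix

def parse_cache(text):
    """Parse a TensorRT calibration cache (text format).

    Returns (header, {tensor_name: hex_scale}).
    """
    lines = [ln for ln in text.strip().splitlines() if ln]
    header, entries = lines[0], {}
    for line in lines[1:]:
        name, _, scale = line.rpartition(":")
        entries[name.strip()] = scale.strip()
    return header, entries

def canonical(name):
    # Strip a trailing _#### that the builder may append to generated tensor names
    return _SUFFIX.sub("", name)

def diff_caches(text_a, text_b):
    """Return tensor names present in one cache but not the other,
    after normalizing the numeric suffixes."""
    _, a = parse_cache(text_a)
    _, b = parse_cache(text_b)
    names_a = {canonical(n) for n in a}
    names_b = {canonical(n) for n in b}
    return names_a - names_b, names_b - names_a
```

If `diff_caches` returns two empty sets, the caches cover the same tensors and only the suffixes differ; in that case the mismatch you are seeing is purely a naming issue, which is still worth validating with precision tests before reusing the cache.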