TensorRT engine compatibility between different devices with the same compute capability.

I’m converting a TensorFlow graph into a TensorRT engine. The target device for deployment is a 1080 Ti, but my desktop machine only has a 1080. Given that both devices have compute capability 6.1, can I optimize the TensorRT engine on the 1080 and still expect optimized performance when it is deployed on the 1080 Ti?
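
For reference, my conversion step looks roughly like the sketch below. This assumes the TF-TRT integration in TensorFlow 1.x contrib; `frozen_graph` and the `"logits"` output node are placeholders for my actual model:

```python
# Rough sketch of the conversion, assuming TensorFlow 1.x with the contrib
# TF-TRT integration; "frozen_graph" and the "logits" output are placeholders.
import tensorflow.contrib.tensorrt as trt

trt_graph = trt.create_inference_graph(
    input_graph_def=frozen_graph,      # a frozen tf.GraphDef of the model
    outputs=["logits"],                # output node name(s) of the model
    max_batch_size=1,
    max_workspace_size_bytes=1 << 30,  # 1 GiB of workspace for TensorRT
    precision_mode="FP16")
```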

We created a new “Deep Learning Training and Inference” section on Devtalk to improve the experience for deep learning, accelerated computing, and HPC users:
https://devtalk.nvidia.com/default/board/301/deep-learning-training-and-inference-/

We are moving active deep learning threads to the new section.

URLs for topics will not change with the re-categorization, so your bookmarks and links will continue to work as before.

-Siddharth

There is no guarantee that the performance characteristics of a network optimized for one device will carry over to another. That said, the two devices are similar enough that performance is unlikely to diverge significantly. The only way to be sure is to test on the target device; a quick functional check is shown below. If you hit a functional issue, please file a bug here: https://developer.nvidia.com/nvidia-developer-program
Please include the steps/files needed to reproduce the problem along with the output of infer_device.
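
As a quick functional check, you can simply try to deserialize the engine on the target device. A minimal sketch, assuming the TensorRT Python API (tensorrt package) and a serialized engine file named model.engine (the filename is a placeholder):

```python
# Minimal functional check: deserialize an engine built on another device.
# Assumes the TensorRT Python API; "model.engine" is a placeholder filename.
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
runtime = trt.Runtime(logger)

with open("model.engine", "rb") as f:
    # Deserialization fails (returns None and logs an error) if the engine
    # is not usable on this device, so this doubles as a compatibility check.
    engine = runtime.deserialize_cuda_engine(f.read())

assert engine is not None, "engine could not be deserialized on this device"
print("engine loaded successfully")
```

If the engine loads, the remaining question is purely performance, which you can only settle by benchmarking on the actual 1080 Ti.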

@mvillmow - I have seen you mention the infer_device tool in many of these threads, but I can’t seem to find it. Does it even exist?

Sorry, infer_device is the name of the deviceQuery sample that we build for inference. Please use deviceQuery from the CUDA samples to get the required information. We don’t ship infer_device separately because it is the same source code.
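
If you’d rather query the same information from Python than build the sample, a small stand-in sketch using pycuda (an assumption on my part; deviceQuery itself lives in the CUDA samples under 1_Utilities/deviceQuery):

```python
# Prints the core information deviceQuery reports for each GPU.
# Assumes pycuda is installed (pip install pycuda).
import pycuda.driver as cuda

cuda.init()
for i in range(cuda.Device.count()):
    dev = cuda.Device(i)
    major, minor = dev.compute_capability()
    print(f"Device {i}: {dev.name()}, "
          f"compute capability {major}.{minor}, "
          f"{dev.total_memory() // (1024 ** 2)} MiB total memory")
```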