Recreating NVIDIA's published TensorRT latency and power numbers

Hey everyone, I’m working with an NVIDIA T4 GPU and I’m trying to compare my system’s performance against NVIDIA’s published numbers here: https://developer.nvidia.com/deep-learning-performance-training-inference

I’m looking for guidance on which tool, open source code, or container to use, and which metrics to collect, so that I can generate comparable numbers. I’m especially interested in latency and power consumption, since I need both to derive the efficiency numbers. I know that some containers are referenced in the links, but it isn’t clear which code to run or which metric to report for Image Classification latency, NMT, and Efficiency. Any hints would be appreciated.
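To make the question concrete, here is a rough sketch of how I assume the efficiency number could be derived: sample board power with `nvidia-smi` while the benchmark runs, then divide throughput by average power. The function names `read_power_watts` and `summarize` are my own, and the formula (items per second per watt, with batches run back to back) is an assumption on my part, not NVIDIA’s documented methodology.

```python
import statistics
import subprocess


def read_power_watts(gpu_index=0):
    """Return the current board power draw in watts, as reported by nvidia-smi."""
    out = subprocess.run(
        ["nvidia-smi", "-i", str(gpu_index),
         "--query-gpu=power.draw", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return float(out.strip().splitlines()[0])


def summarize(latencies_ms, power_samples_w, batch_size=1):
    """Combine per-batch latencies and power samples into summary metrics.

    Assumes batches run one after another with no overlap, so
    throughput = batch_size / mean latency. This is an assumption,
    not a confirmed description of how the published numbers were produced.
    """
    mean_latency_ms = statistics.mean(latencies_ms)
    avg_power_w = statistics.mean(power_samples_w)
    throughput = batch_size / (mean_latency_ms / 1000.0)  # items per second
    return {
        "mean_latency_ms": mean_latency_ms,
        "median_latency_ms": statistics.median(latencies_ms),
        "avg_power_w": avg_power_w,
        "throughput_per_s": throughput,
        "efficiency_per_s_per_w": throughput / avg_power_w,
    }
```

The idea would be to call `read_power_watts()` periodically from a separate thread or process during the run, record per-batch latencies from whatever benchmark tool is recommended, and pass both lists to `summarize`. If the published numbers use a different definition of latency or efficiency, I’d like to know which one.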