I want to test whether a deep learning model can use the GPU's full performance. According to https://developer.nvidia.com/embedded/faq#xavier-performance, the Jetson AGX Xavier has an 11 TFLOPS FP16 GPU. However, CNN convolution workloads are usually measured in TMACS (tera multiply-accumulate operations per second). So for the Jetson AGX Xavier, is the peak throughput 5.5 TMACS or 11 TMACS?
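
For reference, this is where the two candidate figures come from. It is a minimal sketch, assuming the common convention that one multiply-accumulate counts as two floating-point operations (one multiply plus one add). Whether NVIDIA's 11 TFLOPS figure follows that convention is exactly what I am unsure about. The names `tflops_to_tmacs` and `flops_per_mac` are made up for illustration.

```python
def tflops_to_tmacs(tflops, flops_per_mac=2.0):
    """Convert a TFLOPS figure to TMACS.

    flops_per_mac is an assumption: 2.0 treats a MAC as one multiply
    plus one add; 1.0 treats a fused multiply-add as a single operation.
    """
    return tflops / flops_per_mac


peak_tflops_fp16 = 11.0  # FP16 figure quoted in the NVIDIA FAQ

# If the vendor counts a MAC as two FLOPs:
print(tflops_to_tmacs(peak_tflops_fp16))                     # 5.5

# If the vendor counts a fused multiply-add as one FLOP:
print(tflops_to_tmacs(peak_tflops_fp16, flops_per_mac=1.0))  # 11.0
```

To compare against a real model, I would count the MACs of each convolution layer, divide by the measured runtime, and check the result against whichever of these two peaks is correct.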