Mixed Precision (Tensor) vs raw FP16 / raw FP32 Compute Metrics

There is little specificity from NVIDIA (that I’ve been able to find) regarding the performance of the Xavier AGX on several common metrics. Namely, the technical splash page for the AGX specifies an FP16 metric of 16 TFLOPS. Yet I get the sense that this is NOT raw FP16 compute but rather Tensor Core ‘mixed precision’ compute? Is this assumption correct?

Further, there is no mention of raw FP32 or INT4 compute expectations. Since this architecture is Volta, I presume INT4 is not supported? But what should I expect for FP32? Some sources cite 1.4 TFLOPS, but I’m struggling to find anything official from NVIDIA.


Xavier doesn’t support INT4 currently.

Are you referring to the metrics shared below?
Basically, these are measured by low-level instruction type rather than inference precision.


Hey @AastaLLL, thanks for the response.

Yes, those are the metrics I am curious about. There is a fair amount of detail missing in what is reported publicly. Namely, I’m looking for:

  1. Is the FP16 TFLOPS metric for mixed precision? If so, in what configuration, i.e., FP16 accumulate or FP32 accumulate?
  2. What is the expected FP32 TFLOPS? Some sources cite 1.4 TFLOPS, but I cannot find anything from NVIDIA.
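
For context on where a figure like the cited ~1.4 TFLOPS could come from, here is a back-of-envelope peak-throughput sketch. It assumes the publicly listed Xavier GPU configuration (512 Volta CUDA cores, 64 Tensor Cores, ~1377 MHz max GPU clock); these inputs are my assumptions, not NVIDIA-confirmed values for this calculation:

```python
def peak_tflops(units, flops_per_unit_per_clock, clock_ghz):
    """Theoretical peak = units x FLOPs per unit per clock x clock (GHz)."""
    return units * flops_per_unit_per_clock * clock_ghz / 1e3

# CUDA cores: one FMA = 2 FLOPs per core per clock
# (assumed: 512 cores at 1.377 GHz)
fp32_peak = peak_tflops(512, 2, 1.377)          # ~1.41 TFLOPS

# Tensor Cores: each performs a 4x4x4 half-precision matrix FMA per clock,
# i.e. 64 MACs = 128 FLOPs (assumed: 64 Tensor Cores; the resulting figure
# depends on clock and accumulate configuration)
fp16_tensor_peak = peak_tflops(64, 128, 1.377)  # ~11.3 TFLOPS

print(fp32_peak, fp16_tensor_peak)
```

The FP32 result lines up with the ~1.4 TFLOPS figure circulating in third-party sources, which suggests that number is a theoretical CUDA-core peak rather than a measured value.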


No. It is a low-level profiling figure, measured directly with half-precision calculation.

We don’t have an FP32 TFLOPS score.
You can find detailed FP16 and INT8 compute data below:
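
For anyone wanting to sanity-check such numbers on their own device, the usual conversion is straightforward: an (m x k) @ (k x n) matrix multiply performs 2·m·n·k floating-point operations, so achieved throughput follows directly from the wall-clock time of a timed GEMM. A minimal sketch (the function name and example timing are illustrative, not from NVIDIA tooling):

```python
def achieved_tflops(m, n, k, seconds):
    # An (m x k) @ (k x n) GEMM performs 2*m*n*k FLOPs
    # (one multiply + one add per inner-product term).
    return 2 * m * n * k / seconds / 1e12

# e.g. a 4096^3 half-precision GEMM timed at a hypothetical 25 ms:
print(achieved_tflops(4096, 4096, 4096, 0.025))  # ~5.5 TFLOPS
```

Comparing the measured value against the theoretical peak shows how close a given precision/accumulate configuration gets in practice.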