Does the Tensor Core on Jetson AGX Orin support FP32 (IEEE 754 single-precision floating point)?

Hi,

In the above document, there is a link to the GA102 whitepaper, which contains more details about the 3rd-generation Tensor Cores.
They can take FP32 as input and output, but use TF32 internally for acceleration.


NVIDIA Ampere Architecture Tensor Cores Support New DL Data Types

Today, the default math for AI training is FP32, without Tensor Core acceleration. The NVIDIA Ampere architecture introduces new support for TF32, enabling AI training to use Tensor Cores by default with no effort on the user's part. Non-tensor operations continue to use the FP32 datapath, while TF32 Tensor Cores read FP32 data and use the same range as FP32 with reduced internal precision, before producing a standard IEEE FP32 output. TF32 includes an 8-bit exponent (same as FP32), 10-bit mantissa (same precision as FP16) and 1 sign bit. TF32 mode of an Ampere architecture GPU Tensor Core provides up to 4x more throughput than standard FP32 when sparsity is used. Throughput is dependent on modes and SKU information; see Table 2, Table 3, and Appendix A for per-SKU specifications.
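The TF32 layout described above (1 sign bit, 8 exponent bits, 10 mantissa bits) can be simulated in software by rounding an FP32 value's 23-bit mantissa down to 10 bits. Here is a minimal sketch; it assumes round-to-nearest-even for the FP32-to-TF32 conversion, which the excerpt does not specify, so treat it as illustrative rather than a bit-exact model of the hardware:

```python
import struct

def tf32_round(x: float) -> float:
    """Simulate rounding an FP32 value to TF32 precision (10-bit mantissa).

    Assumes round-to-nearest-even; the actual hardware rounding mode is
    not documented in the quoted excerpt.
    """
    # Reinterpret the FP32 bit pattern as a 32-bit unsigned integer.
    bits = struct.unpack('<I', struct.pack('<f', x))[0]
    # TF32 drops the low 13 of FP32's 23 mantissa bits.
    # Round to nearest, ties to even: add half an ulp of the kept
    # precision, plus the parity of the lowest kept bit.
    bits += 0x0FFF + ((bits >> 13) & 1)
    bits &= ~0x1FFF  # clear the 13 discarded mantissa bits
    return struct.unpack('<f', struct.pack('<I', bits))[0]
```

With this truncation, 1.0 + 2**-12 collapses to 1.0, while 1.0 + 2**-10 (the smallest TF32 increment above 1.0) is preserved, which illustrates why the 8-bit exponent keeps FP32's range while the precision drops to FP16's level.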


Thanks.
