What are the specs of GA10b? How do I calculate the FP16 compute capability of Orin's CUDA cores?

According to this Tensor core of Jetson AGX Orin - Jetson & Embedded Systems / Jetson AGX Orin - NVIDIA Developer Forums thread, I know the SM architecture of Orin is GA10x. But if we calculate the FP16 FMA compute capability based on GA10x, it should be 1.3 GHz (64GB dev kit) * 64 * 128 * 2 = 21.3 TFLOPS, and the sparse INT8 compute capability should be 21.3 * 2 * 2 = 85 TOPS. However, the spec says AGX Orin 64GB has 170 TOPS of sparse INT8 compute capability.
After searching further, I found that the SM architecture of Orin is GA10b, and according to this The tensor core performance detail of Jetson AGX Orin 32GB - Jetson & Embedded Systems / Jetson AGX Orin - NVIDIA Developer Forums thread, GA10b performs 256 dense FP16 FMA operations per tensor core per clock (while GA10x does 128), so 85 * 2 = 170 TOPS, which matches the description in Orin's spec doc.
My question is: where can I get the spec doc for GA10b? I'm wondering about the SM architecture of GA10b, and I want to know the design of the CUDA cores and the total FP16 compute capability of the 2048 CUDA cores.
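The peak-throughput arithmetic above can be sketched in a few lines (a sketch only; the 1.3 GHz clock, the tensor-core count, and the per-tensor-core FMA rates are the figures quoted in the forum threads, not values I have independently verified):

```python
# Peak tensor-core throughput arithmetic for Jetson AGX Orin 64GB.
clock_hz = 1.3e9       # max GPU clock of the 64GB dev kit (from the thread)
tensor_cores = 64      # 16 SMs x 4 tensor cores per SM
flops_per_fma = 2      # one FMA counts as a multiply plus an add

# GA10x-style assumption: 128 dense FP16 FMAs per tensor core per clock
fp16_tflops_ga10x = clock_hz * tensor_cores * 128 * flops_per_fma / 1e12
# x2 going FP16 -> INT8, x2 again for 2:4 structured sparsity
int8_sparse_tops_ga10x = fp16_tflops_ga10x * 2 * 2   # ~85 TOPS

# GA10b: 256 dense FP16 FMAs per tensor core per clock
fp16_tflops_ga10b = clock_hz * tensor_cores * 256 * flops_per_fma / 1e12
int8_sparse_tops_ga10b = fp16_tflops_ga10b * 2 * 2   # ~170 TOPS

print(round(fp16_tflops_ga10x, 1), round(int8_sparse_tops_ga10b, 1))
```

This reproduces the 21.3 TFLOPS figure under the GA10x assumption and the 170 TOPS figure under the GA10b one, which is why the doubled per-tensor-core rate resolves the discrepancy.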

Thanks.

Hi,

You can find the spec directly at the link below:

Thanks.

Sorry, I didn’t find any datasheets or whitepapers about GA10b on that website. BTW, I’d also like to know the design details of the DLA. Could you share any docs?

Hi,

GA10b is the Orin series (sm-87).

What kind of design details do you want to find?
Below is the technical brief for the Orin series and you can find DLA info on page 7.

Thanks

Hi,

Yes, I’ve already read all the design docs on the Orin website and some Ampere architecture whitepapers, but never found any detailed information about GA10b.
I’d like to know the SM architecture of GA10b, like in this image:


Or at least a specific description of the compute capability of GA10b’s SM, for both the CUDA cores and the Tensor Cores.
As for the DLA, I’d like to know all the operations it supports, how to tell whether an engine runs on the GPU or the DLA, and in which data format. What I know so far is how to run some CV models on the DLA in sparse INT8 format, but I’m wondering whether the DLA can also be used for dense FP16 compute.

Thanks

Hi,

We need to check with the internal team whether any public SM info can be shared.

Whether an engine runs on the DLA or the GPU is controlled by the user.
When building an engine with the TensorRT API, you also need to specify the placement (default = GPU).
The DLA layer support matrix can be found in the link below:
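As a rough illustration of the placement control described above, setting DLA placement with the TensorRT Python API might look like the sketch below (a sketch only, assuming a TensorRT 8.x-era API on JetPack; `model.onnx` is a hypothetical model file):

```python
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)
with open("model.onnx", "rb") as f:   # hypothetical model file
    parser.parse(f.read())

config = builder.create_builder_config()
# Place supported layers on the DLA instead of the default GPU.
config.default_device_type = trt.DeviceType.DLA
config.DLA_core = 0
# Layers the DLA cannot run fall back to the GPU.
config.set_flag(trt.BuilderFlag.GPU_FALLBACK)
# The DLA requires a reduced-precision mode such as FP16 or INT8.
config.set_flag(trt.BuilderFlag.FP16)

engine_bytes = builder.build_serialized_network(network, config)
```

The same placement can be tried from the command line with `trtexec` using its DLA and fallback options, which is often the quickest way to see which layers end up on the DLA versus the GPU.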

Thanks.

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.