We are looking for benchmarks that can give the peak FLOP/s and memory bandwidth on the Jetson AGX Orin.
https://github.com/NVIDIA-AI-IOT/jetson_benchmarks: We looked at this, but it seems to focus on deep learning workloads. We are interested in measuring peak compute performance and memory bandwidth instead. Please recommend any standard benchmarks.
If these suggestions don’t help and you want to report an issue to us, please share the model, the commands/steps, and any customized app with us so we can reproduce it locally.
I was able to run this and got the following result:
Host to Device Bandwidth, 1 Device(s)
PINNED Memory Transfers
Transfer Size (Bytes) Bandwidth(GB/s)
32000000 36.6
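For context, the GB/s figure that bandwidthTest prints is essentially just bytes transferred divided by elapsed wall-clock time. A minimal sketch of that arithmetic (the elapsed time below is back-calculated from the reported number, not measured):

```python
transfer_bytes = 32_000_000   # transfer size from the result above
elapsed_s = 8.74e-4           # hypothetical elapsed time (~0.87 ms)

# Bandwidth in GB/s (decimal gigabytes, i.e. 1e9 bytes)
bandwidth_gbs = transfer_bytes / elapsed_s / 1e9
# ≈ 36.6 GB/s, matching the reported figure
```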
This is the bandwidth obtained when copying data from the CPU to the GPU, even though in this case both use the same physical DRAM. Is my understanding correct?
Also, theoretical DRAM bandwidth is around 204 GB/s, while this shows 36.6. Is this expected? What kind of overheads are involved here?
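One way to sanity-check the numbers: the ~204 GB/s theoretical figure follows from the AGX Orin's 256-bit LPDDR5 interface at 6400 MT/s, and on a unified-memory SoC a host-to-device copy both reads from and writes to that same DRAM, so the copy itself could see at most roughly half the peak. A sketch of that accounting (the halving argument is an assumption about how the copy traverses DRAM, not a measured fact; copy-engine and memory-controller overheads would lower the achievable figure further):

```python
# Theoretical DRAM bandwidth of Jetson AGX Orin:
# 256-bit LPDDR5 interface at 6400 MT/s.
bus_bytes = 256 // 8                 # bytes moved per transfer
transfers_per_s = 6400e6             # 6400 MT/s
peak_gbs = bus_bytes * transfers_per_s / 1e9   # 204.8 GB/s

# Assumption: a host-to-device copy reads and writes the same DRAM,
# so the copy can see at most about half of peak.
copy_ceiling_gbs = peak_gbs / 2      # ~102.4 GB/s

measured_gbs = 36.6
fraction_of_ceiling = measured_gbs / copy_ceiling_gbs  # ~0.36
```

Under that (assumed) accounting, the measured 36.6 GB/s is about a third of what the copy path could theoretically sustain, so copy-engine limits and per-transfer overheads would have to explain the remaining gap.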