Disk write performance issue on Jetson AGX Orin

Hi,

While testing the Jetson AGX Orin dev board, we repeatedly ran into an I/O bottleneck on the built-in storage. The problem first appeared when we tried to pull and build Docker images: downloading and extracting files took very long, especially for large files.

We then ran the following disk write test on both a Jetson AGX Xavier and a Jetson AGX Orin (both dev boards).

countArr=($(seq 1 1 10))

for count in "${countArr[@]}"
do
    echo "Count: ${count}"
    sudo dd if=/dev/zero of=/tmp/output bs=1G count=$count && sudo rm -f /tmp/output
done

The result for Jetson AGX Xavier:

Count: 1
1+0 records in
1+0 records out
1073741824 bytes (1,1 GB, 1,0 GiB) copied, 4,40018 s, 244 MB/s
Count: 2
2+0 records in
2+0 records out
2147483648 bytes (2,1 GB, 2,0 GiB) copied, 8,1285 s, 264 MB/s
Count: 3
3+0 records in
3+0 records out
3221225472 bytes (3,2 GB, 3,0 GiB) copied, 11,5208 s, 280 MB/s
Count: 4
4+0 records in
4+0 records out
4294967296 bytes (4,3 GB, 4,0 GiB) copied, 15,5986 s, 275 MB/s
Count: 5
5+0 records in
5+0 records out
5368709120 bytes (5,4 GB, 5,0 GiB) copied, 20,1221 s, 267 MB/s
Count: 6
6+0 records in
6+0 records out
6442450944 bytes (6,4 GB, 6,0 GiB) copied, 23,7572 s, 271 MB/s
Count: 7
7+0 records in
7+0 records out
7516192768 bytes (7,5 GB, 7,0 GiB) copied, 30,4741 s, 247 MB/s
Count: 8
8+0 records in
8+0 records out
8589934592 bytes (8,6 GB, 8,0 GiB) copied, 40,2465 s, 213 MB/s
Count: 9
9+0 records in
9+0 records out
9663676416 bytes (9,7 GB, 9,0 GiB) copied, 48,832 s, 198 MB/s
Count: 10
10+0 records in
10+0 records out
10737418240 bytes (11 GB, 10 GiB) copied, 58,0722 s, 185 MB/s

The result for Jetson AGX Orin:

Count: 1
1+0 records in
1+0 records out
1073741824 bytes (1,1 GB, 1,0 GiB) copied, 0,95453 s, 1,1 GB/s
Count: 2
2+0 records in
2+0 records out
2147483648 bytes (2,1 GB, 2,0 GiB) copied, 1,8267 s, 1,2 GB/s
Count: 3
3+0 records in
3+0 records out
3221225472 bytes (3,2 GB, 3,0 GiB) copied, 2,71831 s, 1,2 GB/s
Count: 4
4+0 records in
4+0 records out
4294967296 bytes (4,3 GB, 4,0 GiB) copied, 3,86091 s, 1,1 GB/s
Count: 5
5+0 records in
5+0 records out
5368709120 bytes (5,4 GB, 5,0 GiB) copied, 75,4004 s, 71,2 MB/s
Count: 6
6+0 records in
6+0 records out
6442450944 bytes (6,4 GB, 6,0 GiB) copied, 206,466 s, 31,2 MB/s
Count: 7
7+0 records in
7+0 records out
7516192768 bytes (7,5 GB, 7,0 GiB) copied, 380,17 s, 19,8 MB/s
Count: 8
8+0 records in
8+0 records out
8589934592 bytes (8,6 GB, 8,0 GiB) copied, 477,032 s, 18,0 MB/s
Count: 9
9+0 records in
9+0 records out
9663676416 bytes (9,7 GB, 9,0 GiB) copied, 604,54 s, 16,0 MB/s
Count: 10
10+0 records in
10+0 records out
10737418240 bytes (11 GB, 10 GiB) copied, 826,765 s, 13,0 MB/s

As shown above, write performance on the Orin degrades significantly starting from count=5 with bs=1G. Could this be the cause of the issue? We see the same disk I/O problem on another Jetson AGX Orin dev board as well.
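To check whether the device itself is the bottleneck while the test loop runs, the kernel's per-device counters can be sampled directly. A rough sketch (`mmcblk0` as the eMMC device name is an assumption; adjust it for your setup):

```shell
# Rough device-level write throughput from /proc/diskstats:
# sample the sectors-written counter twice and compute MB/s in between.
# DEV is an assumption: on Jetson the built-in eMMC is usually mmcblk0.
DEV=${DEV:-mmcblk0}
read_sectors() {
    awk -v d="$DEV" '$3 == d { print $10 }' /proc/diskstats
}
s1=$(read_sectors); s1=${s1:-0}
sleep 2
s2=$(read_sectors); s2=${s2:-0}
# Sectors are 512 bytes; divide by the 2 s interval for MB/s.
echo "$DEV: $(( (s2 - s1) * 512 / 2 / 1000000 )) MB/s written"
```

Running this in a second terminal while the dd loop is active shows the throughput the device actually sustains, independent of what dd reports.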

Some system information:

$ cat /proc/version
Linux version 5.10.104-tegra (buildbrain@mobile-u64-5273-d7000) (aarch64-buildroot-linux-gnu-gcc.br_real (Buildroot 2020.08) 9.3.0, GNU ld (GNU Binutils) 2.33.1) #1 SMP PREEMPT Wed Aug 10 20:17:07 PDT 2022
$ cat /etc/nv_tegra_release
# R35 (release), REVISION: 1.0, GCID: 31346300, BOARD: t186ref, EABI: aarch64, DATE: Thu Aug 25 18:41:45 UTC 2022
$ sudo apt-cache show nvidia-jetpack
Package: nvidia-jetpack
Version: 5.0.2-b231
Architecture: arm64
Maintainer: NVIDIA Corporation
Installed-Size: 194
Depends: nvidia-jetpack-runtime (= 5.0.2-b231), nvidia-jetpack-dev (= 5.0.2-b231)
Homepage: http://developer.nvidia.com/jetson
Priority: standard
Section: metapackages
Filename: pool/main/n/nvidia-jetpack/nvidia-jetpack_5.0.2-b231_arm64.deb

I appreciate your help in advance!

Kind regards,
Zhongpin


Hi,
Are you writing to an NVMe SSD in the M.2 M-Key slot? We would like to confirm what the built-in storage is.

Thanks for the reply.

No, we did not install an SSD on the board; we are using the default eMMC flash memory.
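For reference, a quick way to confirm which device backs the test path (a sketch; on the stock image `/tmp` lives on the eMMC root filesystem, typically `mmcblk0p1`):

```shell
# Show which filesystem /tmp (the dd test path above) lives on.
df /tmp
# List block devices; on Jetson the eMMC appears as mmcblk0.
# lsblk ships with util-linux; guard in case it is unavailable.
command -v lsblk >/dev/null && lsblk -d -o NAME,SIZE,MODEL || true
```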

Hi,
The dd command as used does not actually flush the data to disk before reporting the numbers; it largely measures the page cache. Please rerun the command with conv=fdatasync.

Hi,

Thanks a lot for the suggestion. You are right: we reran the write test with conv=fdatasync, and it shows the write speed is already slow at bs=1G and count=1.

The new script:

countArr=($(seq 1 1 10))

for count in "${countArr[@]}"
do
    echo "Count: ${count}"
    sudo dd if=/dev/zero of=/tmp/output bs=1G count=$count conv=fdatasync && sudo rm -f /tmp/output
done

The new result for Jetson AGX Xavier:

Count: 1
1+0 records in
1+0 records out
1073741824 bytes (1,1 GB, 1,0 GiB) copied, 11,486 s, 93,5 MB/s
Count: 2
2+0 records in
2+0 records out
2147483648 bytes (2,1 GB, 2,0 GiB) copied, 21,8295 s, 98,4 MB/s
Count: 3
3+0 records in
3+0 records out
3221225472 bytes (3,2 GB, 3,0 GiB) copied, 31,6822 s, 102 MB/s
Count: 4
4+0 records in
4+0 records out
4294967296 bytes (4,3 GB, 4,0 GiB) copied, 40,2928 s, 107 MB/s
Count: 5
5+0 records in
5+0 records out
5368709120 bytes (5,4 GB, 5,0 GiB) copied, 49,5213 s, 108 MB/s
Count: 6
6+0 records in
6+0 records out
6442450944 bytes (6,4 GB, 6,0 GiB) copied, 57,8196 s, 111 MB/s
Count: 7
7+0 records in
7+0 records out
7516192768 bytes (7,5 GB, 7,0 GiB) copied, 68,6313 s, 110 MB/s
Count: 8
8+0 records in
8+0 records out
8589934592 bytes (8,6 GB, 8,0 GiB) copied, 75,3237 s, 114 MB/s
Count: 9
9+0 records in
9+0 records out
9663676416 bytes (9,7 GB, 9,0 GiB) copied, 86,192 s, 112 MB/s
Count: 10
10+0 records in
10+0 records out
10737418240 bytes (11 GB, 10 GiB) copied, 94,2227 s, 114 MB/s

The new result for Jetson AGX Orin:

Count: 1
1+0 records in
1+0 records out
1073741824 bytes (1,1 GB, 1,0 GiB) copied, 88,581 s, 12,1 MB/s
Count: 2
2+0 records in
2+0 records out
2147483648 bytes (2,1 GB, 2,0 GiB) copied, 189,994 s, 11,3 MB/s
Count: 3
3+0 records in
3+0 records out
3221225472 bytes (3,2 GB, 3,0 GiB) copied, 249,369 s, 12,9 MB/s
Count: 4
4+0 records in
4+0 records out
4294967296 bytes (4,3 GB, 4,0 GiB) copied, 372,454 s, 11,5 MB/s
Count: 5
5+0 records in
5+0 records out
5368709120 bytes (5,4 GB, 5,0 GiB) copied, 416,113 s, 12,9 MB/s
Count: 6
6+0 records in
6+0 records out
6442450944 bytes (6,4 GB, 6,0 GiB) copied, 488,831 s, 13,2 MB/s
Count: 7
7+0 records in
7+0 records out
7516192768 bytes (7,5 GB, 7,0 GiB) copied, 582,144 s, 12,9 MB/s
Count: 8
8+0 records in
8+0 records out
8589934592 bytes (8,6 GB, 8,0 GiB) copied, 633,637 s, 13,6 MB/s
Count: 9
9+0 records in
9+0 records out
9663676416 bytes (9,7 GB, 9,0 GiB) copied, 727,106 s, 13,3 MB/s
Count: 10
10+0 records in
10+0 records out
10737418240 bytes (11 GB, 10 GiB) copied, 828,68 s, 13,0 MB/s

Hi,
The eMMC parts on Orin and Xavier come from different vendors, so different throughput is possible. We will check and confirm whether this result is expected.


Hi,
There will be an improvement for Orin in the next release. The results will look like this:

Count: 4
4+0 records in
4+0 records out
4294967296 bytes (4.3 GB, 4.0 GiB) copied, 28.378 s, 151 MB/s
Count: 5
5+0 records in
5+0 records out
5368709120 bytes (5.4 GB, 5.0 GiB) copied, 33.7264 s, 159 MB/s
Count: 6
6+0 records in
6+0 records out
6442450944 bytes (6.4 GB, 6.0 GiB) copied, 343.346 s, 18.8 MB/s
Count: 7
7+0 records in
7+0 records out
7516192768 bytes (7.5 GB, 7.0 GiB) copied, 119.044 s, 63.1 MB/s

The varying performance is due to the dynamic SLC cache architecture of the Orin's eMMC. Please note this.
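The SLC-cache effect can be probed directly: writing fixed-size chunks back to back shows where the speed drops once the cache fills. A minimal sketch (the path and chunk sizes are examples kept small for illustration; on the board, raise `count` so the total written exceeds the cache):

```shell
# Write chunks back to back, flushing each with fdatasync, and log the
# per-chunk speed dd reports; the drop marks the point where the dynamic
# SLC cache is exhausted. OUT is an example path on the eMMC filesystem.
OUT=/tmp/slc_probe
for i in $(seq 1 8); do
    dd if=/dev/zero of="$OUT.$i" bs=1M count=32 conv=fdatasync 2>&1 | tail -n 1
done
rm -f "$OUT".*
```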

