Degraded performance on new Xavier NX 8GB 900-83668-0000-000 affected by PCN206980

Hello,

We are seeing degraded performance when running our commercial software pipeline on the Jetson Xavier NX module, with a custom-designed carrier board and 6x 4K cameras, on JetPack 4.6.1. It seems that the new board fails to handle the heavy traffic, and CPU utilization quickly caps at 100%.

After contacting our supplier’s technical contact, we were informed that our boards are affected by PCN206980, since Hynix memory and Hynix eMMC components have been introduced into the BOM.

The recommended actions state that we need to include the appropriate BCT and DVFS changes required by the Hynix memory device in the software image and re-flash. However, these changes have been included in JetPack 4.4.1 and later releases, and, as described above, we are using JetPack 4.6.1.

We would appreciate some help understanding what exactly the problem is.

PS: We noticed that after boot, the EMC clock is locked at 204 MHz instead of the expected 1600 MHz. We can’t change it, since the max frequency is also locked at 204 MHz:

$ sudo jetson_clocks --show
SOC family:tegra194 Machine:NVIDIA Jetson Xavier NX Developer Kit
Online CPUs: 0-5
cpu0: Online=1 Governor=schedutil MinFreq=1420800 MaxFreq=1420800 CurrentFreq=1420800 IdleStates: C1=0 c6=0
cpu1: Online=1 Governor=schedutil MinFreq=1420800 MaxFreq=1420800 CurrentFreq=1420800 IdleStates: C1=0 c6=0
cpu2: Online=1 Governor=schedutil MinFreq=1420800 MaxFreq=1420800 CurrentFreq=1420800 IdleStates: C1=0 c6=0
cpu3: Online=1 Governor=schedutil MinFreq=1420800 MaxFreq=1420800 CurrentFreq=1420800 IdleStates: C1=0 c6=0
cpu4: Online=1 Governor=schedutil MinFreq=1420800 MaxFreq=1420800 CurrentFreq=1420800 IdleStates: C1=0 c6=0
cpu5: Online=1 Governor=schedutil MinFreq=1420800 MaxFreq=1420800 CurrentFreq=1420800 IdleStates: C1=0 c6=0
GPU MinFreq=1109250000 MaxFreq=1109250000 CurrentFreq=1109250000
EMC MinFreq=204000000 MaxFreq=204000000 CurrentFreq=204000000 FreqOverride=1
Fan: PWM=0
NV Power Mode: MODE_20W_6CORE
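
For reference, the EMC rate can also be read directly from the BPMP debugfs. This is a quick cross-check assuming the usual tegra194 debugfs layout (the paths can vary between L4T releases):

$ sudo cat /sys/kernel/debug/bpmp/debug/clk/emc/rate
$ sudo cat /sys/kernel/debug/bpmp/debug/clk/emc/max_rate

Both should match the 204 MHz cap that jetson_clocks reports above.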

Hi,
Do you mean your Xavier NX is not a PCN206980 module, and performance is impacted since we enabled PCN206980 support in JetPack 4.6.1?

PCN206980 is the Product Change Notice we received from NVIDIA about the new memory components in the Xavier NX BOM.

Hello. We would appreciate a fast response on this, since it is a serious bottleneck for the production line of our product.

Hi,
Do you mean you observe the performance downgrade on all modules, or only on the PCN206980 modules? It is unclear what the issue is.

We see performance issues specific to the modules affected by PCN206980. Below is a list of modules that don’t work as expected:

1422122080122,48B02D7A929D,699-13668-0001-301
1422122054471,48B02D7A8D56,699-13668-0001-301
1422122055782,48B02D7A928B,699-13668-0001-301
1422122033874,48B02D7A9607,699-13668-0001-301
1422122033872,48B02D7A960D,699-13668-0001-301
1422122055560,48B02D7A9615,699-13668-0001-301
1422122080220,48B02D7A9176,699-13668-0001-301
1422122033870,48B02D7A95FE,699-13668-0001-301
1422122033866,48B02D7A95F9,699-13668-0001-301
1422122080244,48B02D7A91A3,699-13668-0001-301
1422122033877,48B02D7A9611,699-13668-0001-301
1422122055557,48B02D7A961A,699-13668-0001-301
1422122053476,48B02D7A9400,699-13668-0001-301
1422122080429,48B02D7A9013,699-13668-0001-301
1422122080219,48B02D7A9180,699-13668-0001-301

Hi,
Please share a method to replicate the issue on the Xavier NX developer kit. Please insert one of the modules into the developer kit and check whether the issue can be replicated there, so that we can follow the steps to reproduce and check it.

Hello,

Please check the results I get when I run the matrixMul CUDA sample (/usr/local/cuda/samples/0_Simple/matrixMul) on the old and new modules.
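
In case the sample binary is not already built, it can be compiled in place first (assuming the default CUDA samples location on JetPack 4.6.1; adjust the path for your CUDA version):

$ cd /usr/local/cuda/samples/0_Simple/matrixMul
$ sudo make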

Working module: 421821020798,48B02D384B91,699-13668-0001-300

$ sudo /usr/local/cuda/samples/0_Simple/matrixMul/matrixMul
[Matrix Multiply Using CUDA] - Starting...
GPU Device 0: "Xavier" with compute capability 7.2

MatrixA(320,320), MatrixB(640,320)
Computing result using CUDA Kernel…
done
Performance= 207.92 GFlop/s, Time= 0.630 msec, Size= 131072000 Ops, WorkgroupSize= 1024 threads/block
Checking computed result for correctness: Result = PASS

NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.

$ sudo jetson_clocks --show
SOC family:tegra194 Machine:NVIDIA Jetson Xavier NX Developer Kit
Online CPUs: 0-5
cpu0: Online=1 Governor=schedutil MinFreq=1190400 MaxFreq=1420800 CurrentFreq=1190400 IdleStates: C1=0 c6=0
cpu1: Online=1 Governor=schedutil MinFreq=1190400 MaxFreq=1420800 CurrentFreq=1190400 IdleStates: C1=0 c6=0
cpu2: Online=1 Governor=schedutil MinFreq=1190400 MaxFreq=1420800 CurrentFreq=1190400 IdleStates: C1=0 c6=0
cpu3: Online=1 Governor=schedutil MinFreq=1190400 MaxFreq=1420800 CurrentFreq=1344000 IdleStates: C1=0 c6=0
cpu4: Online=1 Governor=schedutil MinFreq=1190400 MaxFreq=1420800 CurrentFreq=1344000 IdleStates: C1=0 c6=0
cpu5: Online=1 Governor=schedutil MinFreq=1190400 MaxFreq=1420800 CurrentFreq=1190400 IdleStates: C1=0 c6=0
GPU MinFreq=1109250000 MaxFreq=1109250000 CurrentFreq=1109250000
EMC MinFreq=204000000 MaxFreq=1600000000 CurrentFreq=1600000000 FreqOverride=1
Fan: PWM=0
NV Power Mode: MODE_15W_6CORE

Non-working module: 1422122054471,48B02D7A8D56,699-13668-0001-301

$ sudo /usr/local/cuda/samples/0_Simple/matrixMul/matrixMul
[Matrix Multiply Using CUDA] - Starting...
GPU Device 0: "Xavier" with compute capability 7.2

MatrixA(320,320), MatrixB(640,320)
Computing result using CUDA Kernel…
done
Performance= 61.73 GFlop/s, Time= 2.123 msec, Size= 131072000 Ops, WorkgroupSize= 1024 threads/block
Checking computed result for correctness: Result = PASS

NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.

$ sudo jetson_clocks --show
SOC family:tegra194 Machine:NVIDIA Jetson Xavier NX Developer Kit
Online CPUs: 0-5
cpu0: Online=1 Governor=schedutil MinFreq=1190400 MaxFreq=1420800 CurrentFreq=1344000 IdleStates: C1=0 c6=0
cpu1: Online=1 Governor=schedutil MinFreq=1190400 MaxFreq=1420800 CurrentFreq=1190400 IdleStates: C1=0 c6=0
cpu2: Online=1 Governor=schedutil MinFreq=1190400 MaxFreq=1420800 CurrentFreq=1190400 IdleStates: C1=0 c6=0
cpu3: Online=1 Governor=schedutil MinFreq=1190400 MaxFreq=1420800 CurrentFreq=1190400 IdleStates: C1=0 c6=0
cpu4: Online=1 Governor=schedutil MinFreq=1190400 MaxFreq=1420800 CurrentFreq=1420800 IdleStates: C1=0 c6=0
cpu5: Online=1 Governor=schedutil MinFreq=1190400 MaxFreq=1420800 CurrentFreq=1190400 IdleStates: C1=0 c6=0
GPU MinFreq=114750000 MaxFreq=1109250000 CurrentFreq=114750000
EMC MinFreq=204000000 MaxFreq=204000000 CurrentFreq=204000000 FreqOverride=1
Fan: PWM=0
NV Power Mode: MODE_15W_6CORE

Update 2

I forgot to run jetson_clocks on the non-working module. After running it, the EMC clock is still low, and I get the same poor performance (Performance= 64.47 GFlop/s, Time= 2.033 msec) in the matrixMul example.

$ sudo jetson_clocks --show
SOC family:tegra194 Machine:NVIDIA Jetson Xavier NX Developer Kit
Online CPUs: 0-5
cpu0: Online=1 Governor=schedutil MinFreq=1420800 MaxFreq=1420800 CurrentFreq=1420800 IdleStates: C1=0 c6=0
cpu1: Online=1 Governor=schedutil MinFreq=1420800 MaxFreq=1420800 CurrentFreq=1420800 IdleStates: C1=0 c6=0
cpu2: Online=1 Governor=schedutil MinFreq=1420800 MaxFreq=1420800 CurrentFreq=1420800 IdleStates: C1=0 c6=0
cpu3: Online=1 Governor=schedutil MinFreq=1420800 MaxFreq=1420800 CurrentFreq=1420800 IdleStates: C1=0 c6=0
cpu4: Online=1 Governor=schedutil MinFreq=1420800 MaxFreq=1420800 CurrentFreq=1420800 IdleStates: C1=0 c6=0
cpu5: Online=1 Governor=schedutil MinFreq=1420800 MaxFreq=1420800 CurrentFreq=1420800 IdleStates: C1=0 c6=0
GPU MinFreq=1109250000 MaxFreq=1109250000 CurrentFreq=1109250000
EMC MinFreq=204000000 MaxFreq=204000000 CurrentFreq=204000000 FreqOverride=1
Fan: PWM=0
NV Power Mode: MODE_15W_6CORE
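
A further way to double-check the cap, assuming the usual tegra194 BPMP debugfs layout, would be to try setting the EMC rate directly and reading it back; on a capped module the written value should not stick:

$ sudo bash -c 'echo 1600000000 > /sys/kernel/debug/bpmp/debug/clk/emc/rate'
$ sudo cat /sys/kernel/debug/bpmp/debug/clk/emc/rate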

Hi,
Thanks for the steps. We need your help to get the UART log of this device:

1422122054471,48B02D7A8D56,699-13668-0001-301

We would like to get the RAM code of the device, which is printed in the UART log.
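
Once the log is captured, you should be able to pull the value out with something like the following (uart.log is a placeholder for your capture file, and the exact wording of the line may vary between BSP versions):

$ grep -i "ram code" uart.log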

Hi,
We tried this on JetPack 4.6.1 but did not hit the issue. The Xavier NX module we used:

$ cat /etc/nv_boot_control.conf 
TNSPEC 3668-301-0001-A.0-1-2-jetson-xavier-nx-devkit-emmc-mmcblk0p1
COMPATIBLE_SPEC 3668-301---1--jetson-xavier-nx-devkit-emmc-
TEGRA_CHIPID 0x19
TEGRA_OTA_BOOT_DEVICE /dev/mtdblock0
TEGRA_OTA_GPT_DEVICE /dev/mtdblock0

Ram Code: 0x1

We ran the flash.sh command to flash the system image.

Hello,

We managed to narrow down the issue.

We see the issue when we flash with the generated mass-flashing script ./nvmflash.sh. We produce this script with the following command:

sudo BOARDID=3668 BOARDSKU=0001 FAB=100 FUSELEVEL=fuselevel_production ./nvmassflashgen.sh jetson-xavier-nx-devkit-emmc mmcblk0p1

We don’t see the issue when we flash with the regular flash command:

sudo ./flash.sh jetson-xavier-nx-devkit-emmc mmcblk0p1

Can you help us understand whether we need to specify something different when we generate the nvmflash.sh script so that it works for all modules?
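
If it helps, we can also run the same check you showed above on a module flashed each way and compare the TNSPEC lines:

$ cat /etc/nv_boot_control.conf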

As mentioned in our previous comment, please share the UART logs for these two cases.
That will help identify the cause.

Of course.
uart.zip (2.8 KB)

Your log is not complete, and we are talking about both the working and the NG case.

You should share two logs, not only one.

Hello.

The non-working module:
48B02D7A8D56-uart.zip (3.6 KB)

And the working module:
48B02D384B91-uart.zip (3.4 KB)

I generated these logs with:

sudo minicom -D /dev/ttyUSB0 -b 115200 -C X-uart.log

I had the program open and then powered on the modules, to make sure I captured the very first messages. Then I waited until the output was stable and no more messages were generated (a few minutes).

Hi,

Are you sure the cable and board are fine? Was this done on the devkit?
The log is incomplete in both the working and the non-working case.

Could you also try another console tool instead of minicom?

For example, using picocom.
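
A capture command along these lines should work, assuming a picocom version that supports --logfile (2.x or later); the log file name is just a placeholder:

$ sudo picocom -b 115200 --logfile X-uart.log /dev/ttyUSB0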

I think I fixed the issue; I can now see the RAM code inside each log.

The non-working module:
48B02D7A8D56-uart.zip (10.0 KB)

And the working module:
48B02D384B91-uart.zip (10.0 KB)

Hi,

Thanks. The log is complete now.
I think there is a misunderstanding here: it is not a matter of “working” and “NG” modules.

The real issue should be that when you run flash.sh, it gives you one RAM code, but when you use the mass flash, it gives you another RAM code.

My point here is that you should use the same module, not two different modules, to prove whether what I said is correct.

So you want me to give you two logs generated with the same module, one for each case:

  • Flash normally with the flash.sh tool
  • Mass flash with the nvmflash.sh script

And we should see a different RAM code in each log. Is that correct?

Yes, that is what I want to prove. The different RAM code causes the performance issue you saw.
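
A minimal way to check it on one module, reusing the capture setup above (the log file names here are placeholders), would be:

$ sudo ./flash.sh jetson-xavier-nx-devkit-emmc mmcblk0p1    # case 1: normal flash
$ grep -i "ram code" flash-case-uart.log
$ sudo ./nvmflash.sh                                        # case 2: mass flash
$ grep -i "ram code" massflash-case-uart.log

If the two values differ, that confirms the mismatch described above.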
