eMMC performance regressions on TX2 SOMs?

Using L4T 28.4, I am observing very different behaviour on two of our TX2 SOMs for certain operations. The TX2 SOMs in question have different eMMC modules.

One example is the blkdiscard command.

On the old TX2 SOM:

$ time sudo blkdiscard /dev/disk/by-partlabel/UDA
real    0m0.156s
user    0m0.020s
sys     0m0.032s

On the new TX2 SOM:

$ time sudo blkdiscard /dev/disk/by-partlabel/UDA
real    0m1.847s
user    0m0.012s
sys     0m0.044s

Note that I am only using /dev/disk/by-partlabel/UDA as an example. The partition layout that we’re using for our product is different, but I wanted a test that runs on the devkit with the default software to prove that it is hardware related.

The performance regression is quite significant on the new TX2 SOMs. Similar numbers can be obtained from other commands, like mkfs.ext4.
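To put a number on the gap, the two `time` transcripts above can be compared programmatically. The following is only a sketch: the helper name `real_seconds` is made up for illustration, and it assumes the `real XmY.YYYs` layout shown in the transcripts.

```python
import re

# Matches the "real" line printed by the shell's `time` keyword,
# e.g. "real    0m1.847s" (layout assumed from the transcripts above).
_REAL_RE = re.compile(r"^real\s+(\d+)m([\d.]+)s$", re.MULTILINE)


def real_seconds(time_output: str) -> float:
    """Return the wall-clock ("real") time in seconds from `time` output."""
    match = _REAL_RE.search(time_output)
    if match is None:
        raise ValueError("no 'real' line found")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


# Transcripts copied from the two runs above.
old_som = """real    0m0.156s
user    0m0.020s
sys     0m0.032s"""
new_som = """real    0m1.847s
user    0m0.012s
sys     0m0.044s"""

slowdown = real_seconds(new_som) / real_seconds(old_som)
print(f"new SOM takes {slowdown:.1f}x as long as the old SOM")
```

The user and sys times are nearly identical on both modules (0.052s vs 0.056s combined), so the extra wall-clock time is spent waiting on the device rather than on the CPU.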

Is there a workaround for these performance regressions?

Thanks,
Robb

hello robb_n,

let’s check what’s the difference between your two TX2s.
please refer to PCN from download center for more details.
thanks

Hi JerryChang,

It is PCN 206440: https://developer.nvidia.com/jetson-tx2-pcn-206440-dramemmc-public

Thanks.

I attempted to flash the new TX2 with the L4T 32.5 release. The blkdiscard command is much faster on this release. It is closer to 0.200s.

Based on this, I assume a software fix is possible for the older L4T 28.X releases. Is it possible for NVIDIA to provide a patch for the older releases (preferably for the Linux kernel in L4T 28.1 or 28.2.1), or at least point to the relevant drivers and changes? Thanks!

hello robb_n,

could you please provide a breakdown of all your experiment results, for example, L4T release vs. TX2 module?
is it also confirmed to be a software issue, i.e. does r32.5 show similar eMMC performance on the different TX2s?
thanks

The command used to test in each case is:

time sudo blkdiscard /dev/disk/by-partlabel/UDA
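A single `time` sample can be noisy, so one option is to take several samples and report the median. The sketch below is an assumption about how that could be scripted and is not part of the original test procedure; `time_command` is a hypothetical helper, and the command shown is a harmless placeholder. Note that blkdiscard destroys the data on the target partition, and discarding an already-discarded partition may not be representative of the first run.

```python
import statistics
import subprocess
import sys
import time


def time_command(argv, runs=3):
    """Run argv `runs` times; return the wall-clock durations in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(argv, check=True)  # raises if the command fails
        durations.append(time.perf_counter() - start)
    return durations


# Harmless placeholder command. To measure the discard instead, substitute
# ["sudo", "blkdiscard", "/dev/disk/by-partlabel/UDA"] (destroys data on UDA).
cmd = [sys.executable, "-c", "pass"]
samples = time_command(cmd)
print(f"median {statistics.median(samples):.3f}s over {len(samples)} runs")
```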

I’ve re-run the command on the old and new TX2, on both L4T 28.4 and L4T 32.5. A summary of the results is in the table below:

     Configuration      ||          Execution Time
 Release    |  TX2 SOM  ||  real     |  user     |  sys
------------+-----------++-----------+-----------+-----------
 L4T 28.4   |  old      ||  0.168s   |  0.020s   |  0.040s
 L4T 28.4   |  new      ||  1.551s   |  0.024s   |  0.052s ***
 L4T 32.5   |  old      ||  0.118s   |  0.012s   |  0.016s
 L4T 32.5   |  new      ||  0.216s   |  0.008s   |  0.004s

Note that the results for the (L4T 28.4, new TX2 SOM) configuration are much worse than for the other combinations. I’ve highlighted this row with ***.

All tests were done on a Jetson devkit.
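To separate the hardware effect from the software effect, the "real" column of the table above can be turned into ratios. This is just a quick sketch using the table values; the dictionary layout and the `ratio` helper are for illustration only.

```python
# "real" times in seconds, copied from the table above.
real_s = {
    ("28.4", "old"): 0.168,
    ("28.4", "new"): 1.551,
    ("32.5", "old"): 0.118,
    ("32.5", "new"): 0.216,
}


def ratio(a, b):
    """How many times longer configuration a takes than configuration b."""
    return real_s[a] / real_s[b]


# Hardware effect: new SOM relative to old SOM on the same release.
for release in ("28.4", "32.5"):
    r = ratio((release, "new"), (release, "old"))
    print(f"L4T {release}: new/old = {r:.2f}x")

# Software effect: L4T 28.4 relative to L4T 32.5 on the same SOM.
for som in ("old", "new"):
    r = ratio(("28.4", som), ("32.5", som))
    print(f"{som} SOM: 28.4/32.5 = {r:.2f}x")
```

On these numbers the new SOM is about 9.2x slower than the old one on L4T 28.4 but only about 1.8x slower on L4T 32.5, which is why the release appears to matter more than the module alone.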

hello robb_n,

I’m curious about the device tree files used by these two TX2s; could you please check $ dmesg | grep DTS on both of your TX2 SOMs?
please also test the eMMC performance with tegrastats utility enabled, you may compare the usage reports.
thanks

Results for dmesg | grep DTS:

OLD TX2 28.4

[    0.047184] DTS File Name: /dvs/git/dirty/git-master_linux/kernel/kernel-4.4/arch/arm64/boot/dts/../../../../../../hardware/nvidia/platform/t18x/quill/kernel-dts/tegra186-quill-p3310-1000-c03-00-base.dts
[    0.169917] DTS File Name: /dvs/git/dirty/git-master_linux/kernel/kernel-4.4/arch/arm64/boot/dts/../../../../../../hardware/nvidia/platform/t18x/quill/kernel-dts/tegra186-quill-p3310-1000-c03-00-base.dts

NEW TX2 28.4

[    0.047108] DTS File Name: /dvs/git/dirty/git-master_linux/kernel/kernel-4.4/arch/arm64/boot/dts/../../../../../../hardware/nvidia/platform/t18x/quill/kernel-dts/tegra186-quill-p3310-1000-c03-00-base.dts
[    0.169900] DTS File Name: /dvs/git/dirty/git-master_linux/kernel/kernel-4.4/arch/arm64/boot/dts/../../../../../../hardware/nvidia/platform/t18x/quill/kernel-dts/tegra186-quill-p3310-1000-c03-00-base.dts

OLD TX2 32.5

[    0.164190] DTS File Name: /dvs/git/dirty/git-master_linux/kernel/kernel-4.9/arch/arm64/boot/dts/../../../../../../hardware/nvidia/platform/t18x/quill/kernel-dts/tegra186-quill-p3310-1000-c03-00-base.dts
[    0.429300] DTS File Name: /dvs/git/dirty/git-master_linux/kernel/kernel-4.9/arch/arm64/boot/dts/../../../../../../hardware/nvidia/platform/t18x/quill/kernel-dts/tegra186-quill-p3310-1000-c03-00-base.dts

NEW TX2 32.5

[    0.164151] DTS File Name: /dvs/git/dirty/git-master_linux/kernel/kernel-4.9/arch/arm64/boot/dts/../../../../../../hardware/nvidia/platform/t18x/quill/kernel-dts/tegra186-quill-p3310-1000-c03-00-base.dts
[    0.429257] DTS File Name: /dvs/git/dirty/git-master_linux/kernel/kernel-4.9/arch/arm64/boot/dts/../../../../../../hardware/nvidia/platform/t18x/quill/kernel-dts/tegra186-quill-p3310-1000-c03-00-base.dts

I’m not sure how useful the tegrastats utility is since the runtime is only a few seconds for each. I’ve used the following command to test:

rm -f /tmp/tegrastats.log
tegrastats --start --logfile /tmp/tegrastats.log
sudo blkdiscard /dev/disk/by-partlabel/UDA
sleep 5
tegrastats --stop

Note that the tegrastats command was replaced with ./tegrastats depending on the L4T version since they each store the utility in different directories.

The results from /tmp/tegrastats.log are:

OLD TX2 28.4

RAM 859/7846MB (lfb 1539x4MB) CPU [0%@652,off,off,0%@652,0%@652,0%@652] BCPU@35.5C MCPU@35.5C GPU@35C PLL@35.5C Tboard@32C Tdiode@34.5C PMIC@100C thermal@35.3C VDD_IN 2025/2025 VDD_CPU 229/229 VDD_GPU 152/152 VDD_SOC 458/458 VDD_WIFI 307/307 VDD_DDR 345/345
RAM 859/7846MB (lfb 1539x4MB) CPU [3%@345,off,off,2%@345,3%@345,4%@345] BCPU@35.5C MCPU@35.5C GPU@35C PLL@35.5C Tboard@32C Tdiode@34.25C PMIC@100C thermal@35.6C VDD_IN 1872/1948 VDD_CPU 229/229 VDD_GPU 152/152 VDD_SOC 458/458 VDD_WIFI 230/268 VDD_DDR 326/335
RAM 859/7846MB (lfb 1539x4MB) CPU [14%@345,off,off,1%@345,2%@345,1%@345] BCPU@36C MCPU@36C GPU@35C PLL@36C Tboard@32C Tdiode@34.25C PMIC@100C thermal@35.6C VDD_IN 1642/1846 VDD_CPU 229/229 VDD_GPU 152/152 VDD_SOC 458/458 VDD_WIFI 19/185 VDD_DDR 345/338
RAM 859/7846MB (lfb 1539x4MB) CPU [0%@345,off,off,1%@345,0%@345,0%@345] BCPU@35.5C MCPU@35.5C GPU@34.5C PLL@35.5C Tboard@32C Tdiode@34.25C PMIC@100C thermal@35.3C VDD_IN 1566/1776 VDD_CPU 152/209 VDD_GPU 152/152 VDD_SOC 458/458 VDD_WIFI 0/139 VDD_DDR 326/335
RAM 859/7846MB (lfb 1539x4MB) CPU [1%@345,off,off,0%@345,1%@345,1%@345] BCPU@36C MCPU@36C GPU@34.5C PLL@36C Tboard@32C Tdiode@34C PMIC@100C thermal@35.3C VDD_IN 1604/1741 VDD_CPU 229/213 VDD_GPU 152/152 VDD_SOC 458/458 VDD_WIFI 0/111 VDD_DDR 326/333
RAM 860/7846MB (lfb 1538x4MB) CPU [3%@345,off,off,2%@345,3%@345,1%@345] BCPU@35.5C MCPU@35.5C GPU@34.5C PLL@35.5C Tboard@32C Tdiode@34.25C PMIC@100C thermal@35.3C VDD_IN 1566/1712 VDD_CPU 229/216 VDD_GPU 152/152 VDD_SOC 458/458 VDD_WIFI 0/92 VDD_DDR 326/332

NEW TX2 28.4

RAM 931/7658MB (lfb 1461x4MB) CPU [0%@499,off,off,0%@499,0%@499,0%@499] BCPU@37C MCPU@37C GPU@35C PLL@37C Tboard@33C Tdiode@34.25C PMIC@100C thermal@35.9C VDD_IN 916/916 VDD_CPU 229/229 VDD_GPU 152/152 VDD_SOC 305/305 VDD_WIFI 0/0 VDD_DDR 190/190
RAM 931/7658MB (lfb 1461x4MB) CPU [11%@345,off,off,2%@345,4%@345,3%@345] BCPU@37C MCPU@37C GPU@35C PLL@37C Tboard@33C Tdiode@34.5C PMIC@100C thermal@35.9C VDD_IN 1299/1107 VDD_CPU 229/229 VDD_GPU 152/152 VDD_SOC 382/343 VDD_WIFI 114/57 VDD_DDR 209/199
RAM 931/7658MB (lfb 1461x4MB) CPU [9%@499,off,off,6%@499,5%@499,12%@499] BCPU@36.5C MCPU@36.5C GPU@35C PLL@36.5C Tboard@33C Tdiode@34.25C PMIC@100C thermal@36.2C VDD_IN 1451/1222 VDD_CPU 229/229 VDD_GPU 152/152 VDD_SOC 382/356 VDD_WIFI 343/152 VDD_DDR 229/209
RAM 931/7658MB (lfb 1461x4MB) CPU [7%@345,off,off,5%@345,3%@345,15%@345] BCPU@36.5C MCPU@36.5C GPU@35C PLL@36.5C Tboard@33C Tdiode@34.25C PMIC@100C thermal@35.9C VDD_IN 1031/1174 VDD_CPU 229/229 VDD_GPU 152/152 VDD_SOC 305/343 VDD_WIFI 114/142 VDD_DDR 171/199
RAM 931/7658MB (lfb 1461x4MB) CPU [2%@345,off,off,6%@345,2%@345,0%@345] BCPU@36.5C MCPU@36.5C GPU@35C PLL@36.5C Tboard@34C Tdiode@34.5C PMIC@100C thermal@35.9C VDD_IN 840/1107 VDD_CPU 229/229 VDD_GPU 152/152 VDD_SOC 305/335 VDD_WIFI 19/118 VDD_DDR 152/190
RAM 931/7658MB (lfb 1461x4MB) CPU [1%@345,off,off,6%@345,2%@345,1%@345] BCPU@36.5C MCPU@36.5C GPU@35C PLL@36.5C Tboard@33C Tdiode@34.25C PMIC@100C thermal@35.9C VDD_IN 878/1069 VDD_CPU 229/229 VDD_GPU 152/152 VDD_SOC 305/330 VDD_WIFI 0/98 VDD_DDR 152/183
RAM 931/7658MB (lfb 1461x4MB) CPU [1%@345,off,off,6%@345,0%@345,2%@345] BCPU@36.5C MCPU@36.5C GPU@35C PLL@36.5C Tboard@33C Tdiode@34.25C PMIC@100C thermal@35.7C VDD_IN 878/1041 VDD_CPU 229/229 VDD_GPU 152/152 VDD_SOC 305/327 VDD_WIFI 0/84 VDD_DDR 171/182

OLD TX2 32.5

RAM 1239/7850MB (lfb 1386x4MB) SWAP 0/3925MB (cached 0MB) CPU [3%@345,off,off,3%@345,2%@345,2%@345] EMC_FREQ 0% GR3D_FREQ 0% PLL@40.5C MCPU@40.5C PMIC@100C Tboard@37C GPU@40C BCPU@40.5C thermal@40.3C Tdiode@39.5C
RAM 1239/7850MB (lfb 1386x4MB) SWAP 0/3925MB (cached 0MB) CPU [1%@345,off,off,1%@345,0%@345,1%@345] EMC_FREQ 0% GR3D_FREQ 0% PLL@40.5C MCPU@40.5C PMIC@100C Tboard@37C GPU@39.5C BCPU@40.5C thermal@40.3C Tdiode@39.5C
RAM 1239/7850MB (lfb 1386x4MB) SWAP 0/3925MB (cached 0MB) CPU [1%@345,off,off,0%@345,0%@345,0%@345] EMC_FREQ 0% GR3D_FREQ 0% PLL@40.5C MCPU@40.5C PMIC@100C Tboard@37C GPU@39.5C BCPU@40.5C thermal@40.1C Tdiode@39.5C
RAM 1239/7850MB (lfb 1386x4MB) SWAP 0/3925MB (cached 0MB) CPU [1%@345,off,off,0%@345,0%@345,0%@345] EMC_FREQ 0% GR3D_FREQ 0% PLL@40.5C MCPU@40.5C PMIC@100C Tboard@37C GPU@40C BCPU@40.5C thermal@40.1C Tdiode@39.5C
RAM 1239/7850MB (lfb 1386x4MB) SWAP 0/3925MB (cached 0MB) CPU [1%@345,off,off,1%@345,0%@345,1%@345] EMC_FREQ 0% GR3D_FREQ 0% PLL@40.5C MCPU@40.5C PMIC@100C Tboard@37C GPU@39.5C BCPU@40.5C thermal@40.1C Tdiode@39.5C

NEW TX2 32.5

RAM 1098/7850MB (lfb 1539x4MB) SWAP 0/3925MB (cached 0MB) CPU [5%@345,off,off,3%@345,3%@345,3%@345] EMC_FREQ 0% GR3D_FREQ 0% PLL@32C MCPU@32C PMIC@100C Tboard@28C GPU@30C BCPU@32C thermal@31.2C Tdiode@29.25C
RAM 1098/7850MB (lfb 1539x4MB) SWAP 0/3925MB (cached 0MB) CPU [1%@345,off,off,0%@345,0%@345,0%@345] EMC_FREQ 0% GR3D_FREQ 0% PLL@32C MCPU@32C PMIC@100C Tboard@28C GPU@30C BCPU@32C thermal@31.2C Tdiode@29C
RAM 1098/7850MB (lfb 1539x4MB) SWAP 0/3925MB (cached 0MB) CPU [1%@345,off,off,0%@345,0%@345,1%@345] EMC_FREQ 0% GR3D_FREQ 0% PLL@32C MCPU@32C PMIC@100C Tboard@28C GPU@30C BCPU@32C thermal@31.2C Tdiode@29C
RAM 1098/7850MB (lfb 1539x4MB) SWAP 0/3925MB (cached 0MB) CPU [0%@345,off,off,1%@345,0%@345,0%@345] EMC_FREQ 0% GR3D_FREQ 0% PLL@32C MCPU@32C PMIC@100C Tboard@28C GPU@30C BCPU@32C thermal@31.2C Tdiode@29C
RAM 1098/7850MB (lfb 1539x4MB) SWAP 0/3925MB (cached 0MB) CPU [2%@345,off,off,4%@345,0%@345,1%@345] EMC_FREQ 0% GR3D_FREQ 9% PLL@32C MCPU@32C PMIC@100C Tboard@28C GPU@30C BCPU@32C thermal@31.2C Tdiode@29.25C

Have you managed to confirm my findings on your end?

Thanks.

hello robb_n,

thanks for sharing the info. however, I don’t see anything strange by checking those.
could you please run the tegrastats commands with sudo? it’ll populate more info for reference.

Here are my results running the tests with sudo tegrastats:

OLD TX2 28.4

RAM 866/7846MB (lfb 1537x4MB) CPU [0%@499,off,off,0%@499,0%@499,0%@499] EMC_FREQ 3%@408 GR3D_FREQ 0%@114 APE 150 BCPU@38C MCPU@38C GPU@37C PLL@38C Tboard@34C Tdiode@36.5C PMIC@100C thermal@37.6C VDD_IN 1566/1566 VDD_CPU 229/229 VDD_GPU 152/152 VDD_SOC 382/382 VDD_WIFI 0/0 VDD_DDR 288/288
RAM 866/7846MB (lfb 1537x4MB) CPU [3%@345,off,off,3%@345,1%@345,6%@345] EMC_FREQ 15%@102 GR3D_FREQ 0%@114 APE 150 BCPU@38C MCPU@38C GPU@37C PLL@38C Tboard@34C Tdiode@36.5C PMIC@100C thermal@37.6C VDD_IN 1528/1547 VDD_CPU 229/229 VDD_GPU 152/152 VDD_SOC 382/382 VDD_WIFI 19/9 VDD_DDR 249/268
RAM 866/7846MB (lfb 1537x4MB) CPU [1%@342,off,off,1%@344,1%@345,0%@345] EMC_FREQ 15%@102 GR3D_FREQ 0%@114 APE 150 BCPU@37.5C MCPU@37.5C GPU@37C PLL@37.5C Tboard@34C Tdiode@36.5C PMIC@100C thermal@37.6C VDD_IN 1451/1515 VDD_CPU 229/229 VDD_GPU 152/152 VDD_SOC 382/382 VDD_WIFI 0/6 VDD_DDR 230/255
RAM 866/7846MB (lfb 1537x4MB) CPU [0%@343,off,off,0%@345,1%@345,0%@345] EMC_FREQ 15%@102 GR3D_FREQ 0%@114 APE 150 BCPU@37.5C MCPU@37.5C GPU@37C PLL@37.5C Tboard@34C Tdiode@36.5C PMIC@100C thermal@37.6C VDD_IN 1451/1499 VDD_CPU 229/229 VDD_GPU 152/152 VDD_SOC 382/382 VDD_WIFI 0/4 VDD_DDR 230/249
RAM 866/7846MB (lfb 1537x4MB) CPU [2%@345,off,off,1%@345,1%@345,1%@345] EMC_FREQ 15%@102 GR3D_FREQ 0%@114 APE 150 BCPU@37.5C MCPU@37.5C GPU@37C PLL@37.5C Tboard@34C Tdiode@36.5C PMIC@100C thermal@37.3C VDD_IN 1795/1558 VDD_CPU 229/229 VDD_GPU 152/152 VDD_SOC 382/382 VDD_WIFI 307/65 VDD_DDR 249/249
RAM 866/7846MB (lfb 1537x4MB) CPU [4%@345,off,off,1%@345,2%@345,2%@345] EMC_FREQ 15%@102 GR3D_FREQ 0%@114 APE 150 BCPU@37.5C MCPU@37.5C GPU@37C PLL@37.5C Tboard@34C Tdiode@36.5C PMIC@100C thermal@37.6C VDD_IN 1681/1578 VDD_CPU 229/229 VDD_GPU 152/152 VDD_SOC 382/382 VDD_WIFI 192/86 VDD_DDR 249/249

NEW TX2 28.4

RAM 868/7658MB (lfb 1490x4MB) CPU [0%@499,off,off,0%@498,0%@498,0%@498] EMC_FREQ 3%@665 GR3D_FREQ 0%@114 APE 150 BCPU@31.5C MCPU@31.5C GPU@29.5C PLL@31.5C Tboard@28C Tdiode@28.75C PMIC@100C thermal@30.7C VDD_IN 1184/1184 VDD_CPU 229/229 VDD_GPU 152/152 VDD_SOC 382/382 VDD_WIFI 0/0 VDD_DDR 247/247
RAM 868/7658MB (lfb 1490x4MB) CPU [10%@345,off,off,4%@345,2%@345,1%@345] EMC_FREQ 5%@408 GR3D_FREQ 0%@114 APE 150 BCPU@31.5C MCPU@31.5C GPU@29.5C PLL@31.5C Tboard@28C Tdiode@28.5C PMIC@100C thermal@30.7C VDD_IN 1375/1279 VDD_CPU 229/229 VDD_GPU 152/152 VDD_SOC 382/382 VDD_WIFI 0/0 VDD_DDR 247/247
RAM 868/7658MB (lfb 1490x4MB) CPU [7%@345,off,off,3%@345,0%@345,1%@345] EMC_FREQ 21%@102 GR3D_FREQ 0%@114 APE 150 BCPU@31.5C MCPU@31.5C GPU@29.5C PLL@31.5C Tboard@28C Tdiode@28.5C PMIC@100C thermal@30.7C VDD_IN 1260/1273 VDD_CPU 152/203 VDD_GPU 152/152 VDD_SOC 382/382 VDD_WIFI 0/0 VDD_DDR 228/240
RAM 868/7658MB (lfb 1490x4MB) CPU [1%@345,off,off,0%@345,1%@345,0%@345] EMC_FREQ 21%@102 GR3D_FREQ 0%@114 APE 150 BCPU@31.5C MCPU@31.5C GPU@29.5C PLL@31.5C Tboard@28C Tdiode@28.5C PMIC@100C thermal@30.7C VDD_IN 1108/1231 VDD_CPU 152/190 VDD_GPU 152/152 VDD_SOC 305/362 VDD_WIFI 0/0 VDD_DDR 228/237
RAM 868/7658MB (lfb 1490x4MB) CPU [1%@345,off,off,0%@345,1%@345,0%@345] EMC_FREQ 21%@102 GR3D_FREQ 0%@114 APE 150 BCPU@31.5C MCPU@31.5C GPU@29.5C PLL@31.5C Tboard@28C Tdiode@28.75C PMIC@100C thermal@30.7C VDD_IN 1108/1207 VDD_CPU 152/182 VDD_GPU 152/152 VDD_SOC 305/351 VDD_WIFI 0/0 VDD_DDR 228/235
RAM 868/7658MB (lfb 1490x4MB) CPU [0%@344,off,off,0%@345,2%@345,0%@345] EMC_FREQ 21%@102 GR3D_FREQ 0%@114 APE 150 BCPU@31.5C MCPU@31.5C GPU@29.5C PLL@31.5C Tboard@28C Tdiode@28.5C PMIC@100C thermal@30.7C VDD_IN 1108/1190 VDD_CPU 152/177 VDD_GPU 152/152 VDD_SOC 305/343 VDD_WIFI 0/0 VDD_DDR 228/234
RAM 868/7658MB (lfb 1490x4MB) CPU [1%@345,off,off,0%@345,0%@346,0%@345] EMC_FREQ 21%@102 GR3D_FREQ 0%@114 APE 150 BCPU@31.5C MCPU@31.5C GPU@29.5C PLL@31.5C Tboard@28C Tdiode@28.75C PMIC@100C thermal@30.7C VDD_IN 1108/1178 VDD_CPU 152/174 VDD_GPU 152/152 VDD_SOC 305/338 VDD_WIFI 0/0 VDD_DDR 228/233

OLD TX2 32.5

RAM 1203/7850MB (lfb 1398x4MB) SWAP 0/3925MB (cached 0MB) CPU [3%@345,off,off,2%@345,0%@345,0%@345] EMC_FREQ 4%@408 GR3D_FREQ 0%@114 VIC_FREQ 0%@115 APE 150 PLL@36.5C MCPU@36.5C PMIC@100C Tboard@33C GPU@35.5C BCPU@36.5C thermal@36.1C Tdiode@35C VDD_SYS_GPU 152/152 VDD_SYS_SOC 458/458 VDD_4V0_WIFI 0/0 VDD_IN 1757/1757 VDD_SYS_CPU 229/229 VDD_SYS_DDR 326/326
RAM 1203/7850MB (lfb 1398x4MB) SWAP 0/3925MB (cached 0MB) CPU [1%@345,off,off,0%@345,0%@345,0%@345] EMC_FREQ 3%@408 GR3D_FREQ 0%@114 VIC_FREQ 0%@115 APE 150 PLL@36.5C MCPU@36.5C PMIC@100C Tboard@33C GPU@35.5C BCPU@36.5C thermal@36.4C Tdiode@35C VDD_SYS_GPU 152/152 VDD_SYS_SOC 458/458 VDD_4V0_WIFI 0/0 VDD_IN 1719/1738 VDD_SYS_CPU 229/229 VDD_SYS_DDR 307/316
RAM 1211/7850MB (lfb 1398x4MB) SWAP 0/3925MB (cached 0MB) CPU [7%@345,off,off,3%@345,2%@345,13%@345] EMC_FREQ 4%@408 GR3D_FREQ 0%@114 VIC_FREQ 0%@115 APE 150 PLL@36.5C MCPU@36.5C PMIC@100C Tboard@33C GPU@35.5C BCPU@36.5C thermal@36.1C Tdiode@35C VDD_SYS_GPU 152/152 VDD_SYS_SOC 458/458 VDD_4V0_WIFI 0/0 VDD_IN 1833/1769 VDD_SYS_CPU 229/229 VDD_SYS_DDR 345/326
RAM 1212/7850MB (lfb 1398x4MB) SWAP 0/3925MB (cached 0MB) CPU [5%@345,off,off,5%@345,4%@345,20%@345] EMC_FREQ 4%@408 GR3D_FREQ 0%@114 VIC_FREQ 0%@115 APE 150 PLL@36.5C MCPU@36.5C PMIC@100C Tboard@33C GPU@35.5C BCPU@36.5C thermal@36.3C Tdiode@35C VDD_SYS_GPU 152/152 VDD_SYS_SOC 458/458 VDD_4V0_WIFI 0/0 VDD_IN 1872/1795 VDD_SYS_CPU 229/229 VDD_SYS_DDR 364/335
RAM 1212/7850MB (lfb 1398x4MB) SWAP 0/3925MB (cached 0MB) CPU [0%@345,off,off,0%@345,0%@345,0%@345] EMC_FREQ 4%@408 GR3D_FREQ 0%@114 VIC_FREQ 0%@115 APE 150 PLL@36.5C MCPU@36.5C PMIC@100C Tboard@33C GPU@35.5C BCPU@36.5C thermal@36.1C Tdiode@35C VDD_SYS_GPU 152/152 VDD_SYS_SOC 458/458 VDD_4V0_WIFI 0/0 VDD_IN 1719/1780 VDD_SYS_CPU 229/229 VDD_SYS_DDR 326/333

NEW TX2 32.5

RAM 1226/7850MB (lfb 1376x4MB) SWAP 0/3925MB (cached 0MB) CPU [4%@345,off,off,3%@345,2%@345,4%@345] EMC_FREQ 0%@408 GR3D_FREQ 0%@114 VIC_FREQ 0%@115 APE 150 PLL@34.5C MCPU@34.5C PMIC@100C Tboard@31C GPU@33C BCPU@34.5C thermal@33.9C Tdiode@32C VDD_SYS_GPU 152/152 VDD_SYS_SOC 382/382 VDD_4V0_WIFI 0/0 VDD_IN 1146/1146 VDD_SYS_CPU 229/229 VDD_SYS_DDR 209/209
RAM 1226/7850MB (lfb 1376x4MB) SWAP 0/3925MB (cached 0MB) CPU [5%@345,off,off,0%@345,0%@345,0%@345] EMC_FREQ 0%@408 GR3D_FREQ 0%@114 VIC_FREQ 0%@115 APE 150 PLL@34.5C MCPU@34.5C PMIC@100C Tboard@31C GPU@33C BCPU@34.5C thermal@33.9C Tdiode@32C VDD_SYS_GPU 152/152 VDD_SYS_SOC 382/382 VDD_4V0_WIFI 0/0 VDD_IN 1108/1127 VDD_SYS_CPU 229/229 VDD_SYS_DDR 209/209
RAM 1226/7850MB (lfb 1376x4MB) SWAP 0/3925MB (cached 0MB) CPU [2%@345,off,off,1%@345,0%@345,0%@345] EMC_FREQ 0%@408 GR3D_FREQ 0%@114 VIC_FREQ 0%@115 APE 150 PLL@34.5C MCPU@34.5C PMIC@100C Tboard@31C GPU@33C BCPU@34.5C thermal@33.9C Tdiode@32C VDD_SYS_GPU 152/152 VDD_SYS_SOC 382/382 VDD_4V0_WIFI 0/0 VDD_IN 1069/1107 VDD_SYS_CPU 229/229 VDD_SYS_DDR 190/202
RAM 1226/7850MB (lfb 1376x4MB) SWAP 0/3925MB (cached 0MB) CPU [2%@345,off,off,0%@498,0%@498,0%@498] EMC_FREQ 0%@408 GR3D_FREQ 0%@114 VIC_FREQ 0%@115 APE 150 PLL@34.5C MCPU@34.5C PMIC@100C Tboard@31C GPU@32.5C BCPU@34.5C thermal@33.7C Tdiode@32C VDD_SYS_GPU 152/152 VDD_SYS_SOC 382/382 VDD_4V0_WIFI 0/0 VDD_IN 1108/1107 VDD_SYS_CPU 152/209 VDD_SYS_DDR 190/199
RAM 1226/7850MB (lfb 1376x4MB) SWAP 0/3925MB (cached 0MB) CPU [1%@345,off,off,1%@345,0%@345,0%@345] EMC_FREQ 0%@408 GR3D_FREQ 0%@114 VIC_FREQ 0%@115 APE 150 PLL@34.5C MCPU@34.5C PMIC@100C Tboard@31C GPU@33C BCPU@34.5C thermal@33.9C Tdiode@32C VDD_SYS_GPU 152/152 VDD_SYS_SOC 382/382 VDD_4V0_WIFI 0/0 VDD_IN 1069/1100 VDD_SYS_CPU 152/198 VDD_SYS_DDR 191/197
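The logs above are easier to compare once the EMC and power-rail fields are pulled out of each line. The parser below is a rough sketch: `parse_tegrastats` is a hypothetical helper, the regular expressions are based only on the line layout shown in this thread, and the units (MHz for the EMC clock, mW for the rails as current/average) are my reading of the tegrastats output rather than something confirmed here.

```python
import re

# "EMC_FREQ 21%@102" -> utilisation percent and clock (assumed MHz).
# The "@clock" part only appears in the logs captured with sudo.
_EMC_RE = re.compile(r"EMC_FREQ (\d+)%@(\d+)")
# "VDD_IN 1260/1273" -> current/average rail reading (assumed mW).
_RAIL_RE = re.compile(r"(VDD_\w+) (\d+)/(\d+)")


def parse_tegrastats(line):
    """Extract EMC and power-rail fields from one tegrastats log line."""
    emc = _EMC_RE.search(line)
    rails = {
        name: (int(current), int(average))
        for name, current, average in _RAIL_RE.findall(line)
    }
    return {
        "emc_percent": int(emc.group(1)) if emc else None,
        "emc_clock": int(emc.group(2)) if emc else None,
        "rails": rails,
    }


# One line from the "NEW TX2 28.4" log above.
sample = (
    "RAM 868/7658MB (lfb 1490x4MB) CPU [7%@345,off,off,3%@345,0%@345,1%@345] "
    "EMC_FREQ 21%@102 GR3D_FREQ 0%@114 APE 150 BCPU@31.5C MCPU@31.5C "
    "GPU@29.5C PLL@31.5C Tboard@28C Tdiode@28.5C PMIC@100C thermal@30.7C "
    "VDD_IN 1260/1273 VDD_CPU 152/203 VDD_GPU 152/152 VDD_SOC 382/382 "
    "VDD_WIFI 0/0 VDD_DDR 228/240"
)
parsed = parse_tegrastats(sample)
print(parsed["emc_percent"], parsed["emc_clock"], parsed["rails"]["VDD_IN"])
```

The same pattern also matches the `VDD_SYS_*` and `VDD_4V0_WIFI` rail names that the L4T 32.5 logs use.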

hello robb_n,

since the new TX2 SOM has a Hynix eMMC, there’s an SDMMC config for speed or tuning that might not be enabled in the rel-28 code-line.
could you please share bootlogs for reference, you may refer to Serial Console - NVIDIA Jetson TX2 - JetsonHacks to gather bootloader messages,
thanks

hello robb_n,

Also want to confirm that you are talking about D0x series TX2 module affected by PCN206440 as the new modules here, right?

Hi robb_n,

Please also use below test steps to compare the result between new and old modules.

  1. apt install iozone3
  2. Sequential Read/Write Perf:
       iozone -ecI -+n -L64 -S32 -s64m -r512k -i0 -i1 -l8 -u8 -m -t8 -F /mnt/file1 /mnt/file2 /mnt/file3 /mnt/file4 /mnt/file5 /mnt/file6 /mnt/file7 /mnt/file8
  3. Random Read/Write Perf:
       iozone -ecI -+n -L64 -S32 -s64m -r4k -i0 -i2 -l8 -u8 -o -m -t8 -F /mnt/file1 /mnt/file2 /mnt/file3 /mnt/file4 /mnt/file5 /mnt/file6 /mnt/file7 /mnt/file8

Hi JerryChang,

could you please share bootlogs for reference,

I’ve attached the bootlogs: tx2-bootlogs.tar.gz (94.5 KB)


Hi WayneWWW,

Also want to confirm that you are talking about D0x series TX2 module affected by PCN206440 as the new modules here, right?

I am not 100% sure how to identify my TX2 series. There is a part number printed on the side of the SOM under the S/N. The “old TX2 SOM” ends with B02, and the “new TX2 SOM” ends with D02. The new TX2 SOMs (D02) are the ones we are observing the regressions on.

Please also use below test steps to compare the result between new and old modules.

I’ve attached the tests run with iozone3: tx2-iozone.tar.gz (2.3 KB)

It appears that iozone tests read/write performance. Read/write performance is very important to us, but so are other eMMC operations. In our case, the biggest regression observed was with the blkdiscard command, which is an order of magnitude slower on the new TX2 SOM when running L4T 28.4.

Hi robb_n,

The blkdev discard is more of an erase operation, rather than a perf test.
Please check if this patch can help or not.

3721cce.diff.zip (1.0 KB)

Thank you, WayneWWW. The provided patch appears to resolve the regressions we were seeing with the new TX2 SOM.

For reference, here are the test results on our hardware (not the devkit):

Pre-patch:

$ time blkdiscard /dev/disk/by-partlabel/reserved 
real    0m4.345s
user    0m0.004s
sys     0m0.000s

Post-patch:

$ time blkdiscard /dev/disk/by-partlabel/reserved 
real    0m0.545s
user    0m0.000s
sys     0m0.000s