Slow SD card access speed (read+write) with Jetson Nano production module

Hi,

A bit of background: we are using Jetson Nano production modules for our project. We use custom carrier boards for flashing the Nanos, which do not have HDMI support or any other way to view serial logs at startup or at any other time. The kernel for the Jetson Nano did not initially have SD card support. From this forum, I managed to get the information needed to generate a patch for SD card support. After applying this patch, however, the boot time was observed to be slow. To fix this, again using information from this forum, I created a patch in which I disable CRC checks for the SD card and also HDMI support (as we are not using it). Besides this, I also pass the partition on which root is to be found (/dev/mmcblk0p1) as a command line argument to the kernel via a conf.common file before flashing. Please find attached the two patches.

The problem we are now facing is that reads from and writes to the SD card are extremely slow on the Nano. The gnome-disks benchmark reported a (surprisingly constant) speed of 1.5 MBps for both read and write. The SD cards we use are SanDisk Extreme microSD UHS-I V30, for which the read and write speeds on an Ubuntu desktop are always above 45 MBps. Please find below the output of the command: cat /sys/kernel/debug/mmc1/ios when run on the Nano:

clock:		50000000 Hz
vdd:		21 (3.3 ~ 3.4 V)
bus mode:	2 (push-pull)
chip select:	0 (don't care)
power mode:	2 (on)
bus width:	2 (4 bits)
timing spec:	2 (sd high-speed)
signal voltage:	0 (3.30 V)
driver type:	0 (driver type B)

Any pointers regarding this? It seems there is an (undesired) cap on the read/write speeds to the SD card. The SD card device is located at /dev/mmcblk1. We use an ext4 filesystem.
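For reference, the ios dump above already bounds what the bus itself could deliver. A minimal Python sketch of that arithmetic follows. It assumes single-data-rate signalling (one bit per data line per clock) and ignores all protocol and filesystem overhead; the function names are illustrative and not part of any tool.

```python
import re

# ios dump as posted above (whitespace is not significant to the parser)
IOS_DUMP = """\
clock:		50000000 Hz
vdd:		21 (3.3 ~ 3.4 V)
bus mode:	2 (push-pull)
chip select:	0 (don't care)
power mode:	2 (on)
bus width:	2 (4 bits)
timing spec:	2 (sd high-speed)
signal voltage:	0 (3.30 V)
driver type:	0 (driver type B)
"""

def parse_ios(text):
    """Map each 'key: value' line of an ios dump to its raw value string."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields

def bus_ceiling_mb_per_s(fields):
    """Upper bound in MB/s: clock * data lines / 8 (assumes single data rate)."""
    clock_hz = int(fields["clock"].split()[0])
    bits = int(re.search(r"\((\d+) bits?\)", fields["bus width"]).group(1))
    return clock_hz * bits / 8 / 1e6

fields = parse_ios(IOS_DUMP)
ceiling = bus_ceiling_mb_per_s(fields)
print(f"theoretical ceiling: {ceiling:.1f} MB/s")            # 25.0 MB/s
print(f"measured 1.5 MB/s is {1.5 / ceiling:.0%} of that")   # 6%
```

Under these assumptions, a 50 MHz, 4-bit high-speed bus tops out at 25 MB/s, so the 45 MBps seen on the desktop would not be reachable in this mode, but the measured 1.5 MBps is still far below what the bus allows.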

Can the swap memory have something to do with this?
patches.zip (2.09 KB)

Hi jetson_user,

I think we should file 2 separate issues

  1. Why would enabling the sdcard cause a boot time regression?
    → Why did you choose to disable HDMI? (If you filed some forum topics before, please remind me…)

  2. Sdcard speed test: Could you share how you installed and ran the sdcard benchmark? Have you tried different cards?

  1. The CRC checks for SD card data integrity were failing during boot, which caused the Nano to reset at boot and prolonged the boot time to more than 2 min 45 seconds.
    The discussion is available here:
    https://devtalk.nvidia.com/default/topic/1065480/jetson-nano-production-module-takes-long-time-to-boot-when-sd-card-is-inserted/?offset=41#5412808

Also, on the above page you can find why HDMI was disabled: We don’t use HDMI and there was constant probing for HDMI hardware leading to I2C timeouts, resulting in further boot delay.

  2. I used a pre-installed utility, gnome-disks, which is available for Ubuntu and hence directly available on the Nano with the installed OS. I run the utility as:
sudo gnome-disks

I select the disk (the 64 GB SD card) and click on “Start Benchmarking” after choosing the default data sample size and rate. It then shows the result in graphical format.
I have tried other SD cards as well, with the same result. I also tried this same SD card on an Ubuntu 16.04 desktop, where it runs fine (45 MBps +).
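As a cross-check on the gnome-disks figure, sequential throughput can also be measured with a short script. The Python sketch below is only an illustration, not what gnome-disks does internally: it writes a file, forces it to the device with fsync, then reads it back. The function name and default sizes are assumptions. The read figure can be inflated by the page cache; the posix_fadvise call only asks the kernel to drop the cached pages and is not guaranteed to do so.

```python
import os
import tempfile
import time

def sequential_throughput(path, total_bytes=64 * 1024 * 1024, block_size=1024 * 1024):
    """Write, then read back, total_bytes at path; return (write_MBps, read_MBps)."""
    block = os.urandom(block_size)
    n_blocks = max(1, total_bytes // block_size)

    with open(path, "wb") as f:
        start = time.perf_counter()
        for _ in range(n_blocks):
            f.write(block)
        f.flush()
        os.fsync(f.fileno())  # push the data to the device before stopping the clock
        write_s = time.perf_counter() - start
        if hasattr(os, "posix_fadvise"):
            # Hint to drop this file's cached pages so the read hits the device.
            os.posix_fadvise(f.fileno(), 0, 0, os.POSIX_FADV_DONTNEED)

    start = time.perf_counter()
    with open(path, "rb") as f:
        while f.read(block_size):
            pass
    read_s = time.perf_counter() - start

    megabytes = n_blocks * block_size / 1e6
    return megabytes / max(write_s, 1e-9), megabytes / max(read_s, 1e-9)

# Demo on a temporary directory; point the path at a file on the mounted
# SD card filesystem (and use a larger total_bytes) to test the card itself.
with tempfile.TemporaryDirectory() as tmp:
    w, r = sequential_throughput(os.path.join(tmp, "bench.bin"),
                                 total_bytes=8 * 1024 * 1024)
    print(f"write: {w:.1f} MB/s, read: {r:.1f} MB/s")
```

If a script like this also reports roughly 1.5 MB/s on the card, the limit is below the benchmark tool (driver, clock, or device tree) rather than in gnome-disks itself.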

Additional notes: Among the attached two patches, I also experimented with keeping just the first patch (sd_card.patch) fearing that the other patch (fast_boot.patch) might have something to do with the delay. But I get the same result, i.e., lower access speeds.

I tried disabling swap partitions (zram*) as well, still no effect.

Hi jetson_user,

I got it. HDMI issue does not matter here. We only need to resolve

  1. why there is a CRC error that causes a reboot.
  2. low read/write speed.

Please enable some debug messages as in this patch:

https://devtalk.nvidia.com/default/topic/1067459/jetson-nano/sd-card-not-detected/post/5408253/#5408253

and then share your dmesg with us. In this test, please make sure you have sdcard connected.

Also, you could try different sdcards too.

Hi WayneWWW,

Thank you for the information. I am now trying this out, but meanwhile, could you please let me know if uhs-mask has got anything to do with it? If you look at the sd_card.patch patch file that I am using to enable SD card support, the uhs-mask is set at two places:

diff --git a/hardware/nvidia/platform/t210/common/kernel-dts/t210-common-platforms/tegra210-p2530-common.dtsi b/hardware/nvidia/platform/t210/common/kernel-dts/t210-common-platforms/tegra210-p2530-common.dtsi
--- a/hardware/nvidia/platform/t210/common/kernel-dts/t210-common-platforms/tegra210-p2530-common.dtsi	2019-09-18 13:21:25.000000000 +0200
+++ b/hardware/nvidia/platform/t210/common/kernel-dts/t210-common-platforms/tegra210-p2530-common.dtsi	2019-09-18 13:26:22.723470000 +0200
@@ -131,7 +131,7 @@
 		uhs-mask = <0x1c>;
 		power-off-rail;
 		nvidia,update-pinctrl-settings;
-		status = "disabled";
+		status = "okay";
 	};
 
 	sdhci@700b0200 {
diff --git a/hardware/nvidia/platform/t210/porg/kernel-dts/porg-plugin-manager/tegra210-porg-plugin-manager.dtsi b/hardware/nvidia/platform/t210/porg/kernel-dts/porg-plugin-manager/tegra210-porg-plugin-manager.dtsi
--- a/hardware/nvidia/platform/t210/porg/kernel-dts/porg-plugin-manager/tegra210-porg-plugin-manager.dtsi	2019-09-18 13:21:25.000000000 +0200
+++ b/hardware/nvidia/platform/t210/porg/kernel-dts/porg-plugin-manager/tegra210-porg-plugin-manager.dtsi	2019-09-18 13:28:08.251948000 +0200
@@ -313,7 +313,8 @@
 			override@1 {
 				target = <&sdhci2>;
 				_overlay_ {
-					vmmc-supply = <&max77620_ldo6>;
+					status = "okay";
+					vqmmc-supply = <&max77620_ldo6>;
 					no-sdio;
 					no-mmc;
 					sd-uhs-sdr104;
diff --git a/hardware/nvidia/platform/t210/porg/kernel-dts/tegra210-porg-p3448-common.dtsi b/hardware/nvidia/platform/t210/porg/kernel-dts/tegra210-porg-p3448-common.dtsi
--- a/hardware/nvidia/platform/t210/porg/kernel-dts/tegra210-porg-p3448-common.dtsi	2019-09-18 13:21:25.000000000 +0200
+++ b/hardware/nvidia/platform/t210/porg/kernel-dts/tegra210-porg-p3448-common.dtsi	2019-09-18 13:30:00.344454000 +0200
@@ -250,9 +250,14 @@
 	};
 
 	sdhci@700b0400 {
-		status = "disabled";
+		status = "okay";
 		/delete-property/ keep-power-in-suspend;
 		/delete-property/ non-removable;
+		mmc-ddr-1_8v;
+		mmc-ocr-mask = <3>;
+		uhs-mask = <0x0>;
+		max-clk-limit = <400000>;
+		tap-delay = <3>;
 	};
 
 	sdhci@700b0200 { /* SDMMC2 for Wifi */

I did have a look inside this file: nvidia/nvidia_sdk/JetPack_4.2.1_Linux_GA_P3448-0020/Linux_for_Tegra/sources/kernel/kernel-4.9/Documentation/devicetree/bindings/mmc/sdhci-tegra.txt,
and the possible values listed there are not the ones used in the patch.
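To make the two uhs-mask values in the patch easier to compare, the Python sketch below decodes a mask into the speed modes it would disable. The bit assignments are an assumption based on the mask constants I recall from NVIDIA's downstream Tegra SDHCI driver; they are not taken from this thread, so please verify them against the headers in your own kernel tree before relying on the result.

```python
# ASSUMED bit layout (verify against your kernel sources): each set bit
# disables the corresponding UHS mode for that controller.
UHS_MASK_BITS = {
    0x01: "SDR12",
    0x02: "SDR25",
    0x04: "SDR50",
    0x08: "DDR50",
    0x10: "SDR104",
}

def masked_modes(uhs_mask):
    """Return the names of the modes disabled by uhs_mask, lowest bit first."""
    return [name for bit, name in UHS_MASK_BITS.items() if uhs_mask & bit]

# The two values that appear in sd_card.patch
for value in (0x1c, 0x0):
    print(hex(value), "->", masked_modes(value) or "nothing masked")
```

If that layout holds, 0x1c would mask the three fastest UHS modes while 0x0 would leave every mode available.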

Hi WayneWWW,

I managed to get the dmesg output with the patch you provided, please find attached the logs.
dmesg.txt (55.8 KB)

Hi jetson_user,

The patch I shared was for CRC error but I don’t see any CRC error in your dmesg… What is the exact problem you have now? If you use any workaround to bypass, please let us know.

Hi WayneWWW,

I do not use any workaround, and I am also surprised to see no CRC errors this time. Out of the two patches I provided, I applied only the first one (sd_card.patch) to simply support SD card, then I applied the debug patch you provided. It was the second patch I provided (fast_boot.patch) which was supposed to solve the CRC errors. But this is not the main concern now.

The main problem, as I mentioned, is that the SD card access is really slow. When benchmarked, the speed (read and write) of this SD card was capped at 1.5 MBps, and on the graph this value was constant over 100 data samples (this number does not matter as long as it is significantly larger than 1). This gives me the feeling that the speed is somehow being forced to 1.5 MBps when it could go higher. I need to know how to debug this, as this is a very strange occurrence, at least for me. We need at least 40 MBps read + write speeds for our system to function properly, and with this category of SD cards, I am sure it is possible. I also changed the SD card (to one of a different category) but I still get this constant speed.

Could you please also have a look at my other questions above?

  1. Could the uhs-mask have something to do with this?

  2. Could the ZRAMs have something to do with this access speed limitation? I switched off these swaps, still the speed did not improve.

Hi WayneWWW,

Any updates on this one yet? I even switched to public sources 32.3.1 from version 32.2, which I had been using until now. Still the same issue.

Hi,

Still working on it. We will update you once we have a solution.

Thanks for your patience.

Hi jetson_user,

Could you directly convert the dtb file to dts file and share it here?

Could you confirm whether you are using the setting uhs-mask=<0x0>? This setting would not block any speed class.
Also, please check whether the following dt properties are present:
cap-mmc-highspeed;
cap-sd-highspeed;
sd-uhs-sdr104;
sd-uhs-sdr50;
sd-uhs-sdr25;
sd-uhs-sdr12;
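A check like this can be run against the decompiled device tree. The Python sketch below assumes the DTS text came from dtc (for example `dtc -I dtb -O dts`) and uses simple brace matching, so braces or semicolons inside strings would confuse it. The sample DTS and the helper names are illustrative; the node name sdhci@700b0400 is the one from the patch earlier in this thread.

```python
import re

WANTED = ["cap-mmc-highspeed", "cap-sd-highspeed",
          "sd-uhs-sdr104", "sd-uhs-sdr50", "sd-uhs-sdr25", "sd-uhs-sdr12"]

# Illustrative sample only; replace with the contents of your decompiled .dts
SAMPLE_DTS = """\
/ {
	sdhci@700b0400 {
		status = "okay";
		uhs-mask = <0x0>;
		cap-sd-highspeed;
		sd-uhs-sdr104;
		child {
			sd-uhs-sdr50;
		};
	};
};
"""

def node_body(dts, node_name):
    """Return the text between the braces of the first node named node_name."""
    match = re.search(re.escape(node_name) + r"\s*\{", dts)
    if not match:
        return None
    depth, i = 1, match.end()
    while i < len(dts) and depth:
        if dts[i] == "{":
            depth += 1
        elif dts[i] == "}":
            depth -= 1
        i += 1
    return dts[match.end():i - 1]

def own_properties(body):
    """Names of properties set directly in the node; child nodes are skipped."""
    names, depth, statement = set(), 0, ""
    for ch in body:
        if ch == "{":
            depth, statement = depth + 1, ""
        elif ch == "}":
            depth, statement = depth - 1, ""
        elif depth == 0:
            if ch == ";":
                name = statement.split("=", 1)[0].strip()
                if name:
                    names.add(name)
                statement = ""
            else:
                statement += ch
    return names

present = own_properties(node_body(SAMPLE_DTS, "sdhci@700b0400"))
missing = [prop for prop in WANTED if prop not in present]
print("missing:", missing)
```

For the sample above, the properties set only in the nested child node are deliberately not counted, since they do not apply to the controller node itself.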

Hi WayneWWW,

Thank you for your response.

I checked, these values are set in the device tree. I generated the dts file from the /boot/dtb/tegra210-p3448-0002-p3449-0000-b00.dtb file.

Please find attached.
tegra210-p3448-0002-p3449-0000-b00.dts.zip (38.2 KB)

Another question:

In order to carry out multiple such experiments in a shorter amount of time, is there a way I could update the device tree binary without flashing? For instance, by using dtc to generate the dtb from .dts/.dtsi files? I would then be able to come back with the results of the suggested tests more quickly.

Could I get to know the exact steps to be followed if such a thing is feasible?

Hi jetson_user,

You didn’t put the properties below in your sdmmc3 controller:

sd-uhs-sdr104;
sd-uhs-sdr50;
sd-uhs-sdr25;
sd-uhs-sdr12;

“Could I get to know the exact steps to be followed if such a thing is feasible?”

This way would not be faster since the dtb needs to be signed before flashing into partition and the only way to sign the dtb is through the host tool.

Thus I would suggest you directly use “sudo ./flash.sh -r -k DTB <board> mmcblk0p1”, where <board> is the board configuration name you normally flash with.

“-r” means you don’t need to re-create the system image.
“-k” means only update specific partition.

Hi WayneWWW,

I edited my patches to include the above uhs data. I can now see sd-uhs-sdrxx under sdhci@700b0400, which I suppose is the SDMMC controller. Please find attached the patches and the updated dts, generated from /boot/dtb/tegra210-p3448-0002-p3449-0000-b00.dtb file. Could you please verify these? I tried to follow as you suggested.

I still get the SD card speed as 1.5MBps, unfortunately. Is it possible you could reproduce this issue at your end and test once more?

Regarding the second question,

If I update the dtb from host using the command you gave, does it mean that the partition will be flashed and that the previous data will be erased? There are other reasons we might not want this, because we would have to call back our systems from our customers in case of an update. Could this dtb update be done remotely somehow, like I could extract the dtb built on host onto the device at a particular location, without needing the micro-USB connection?
patches.zip (1.78 KB)
tegra210-p3448-0002-p3449-0000-b00.dts.zip (38.2 KB)

Hi jetson_user,

Is the sdcard log in dmesg always stable, as in #6?

Hi WayneWWW,

I notice a difference in the dmesg log after testing with the latest update (i.e., the changes I mentioned in my previous reply).

I get a lot of:

nvgpu: 57000000.gpu      gk20a_fecs_dump_falcon_stats:206  [ERR]  FECS_FALCON_REG_SP : 0xbadfbadf

I am attaching this dmesg, but what exactly should I check with regard to stability?

Also, can you reproduce this issue at your end?
dmesg_with_all_sd_mmc_speeds_enabled.txt (64.9 KB)

Hi jetson_user,

[    1.655704] mmc1: hw tuning done ...
[    1.655826] mmc1: new ultra high speed SDR104 SDXC card at address e624
[    1.656222] mmcblk1: mmc1:e624 SN64G 59.5 GiB

We can see the card enumerating in SDR104 mode (max possible mode) which was earlier enumerating in HS mode. With SDR104 mode, the card is expected to achieve recommended perf unless interface clock is restricted to a lower value.
Could you get the ios dump again with the patch in #11?
cat /sys/kernel/debug/mmc1/ios
cat /sys/kernel/debug/mmc1/clock
cat /sys/kernel/debug/mmc1/speed

Hi WayneWWW,

Please find below the outputs:

$cat /sys/kernel/debug/mmc1/ios
clock:		204000000 Hz
vdd:		21 (3.3 ~ 3.4 V)
bus mode:	2 (push-pull)
chip select:	0 (don't care)
power mode:	2 (on)
bus width:	2 (4 bits)
timing spec:	6 (sd uhs SDR104)
signal voltage:	1 (1.80 V)
driver type:	0 (driver type B)

In the above output, I see the clock speed changed from 50MHz previously to 204MHz.

$cat /sys/kernel/debug/mmc1/clock
204000000
$cat /sys/kernel/debug/mmc1/speed
0
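Comparing the two ios dumps field by field makes the change explicit. The Python sketch below uses both dumps as posted in this thread. The bandwidth ceiling it prints assumes single data rate and no protocol overhead, so it is an upper bound on the bus rather than an expected benchmark result.

```python
BEFORE = """\
clock:		50000000 Hz
vdd:		21 (3.3 ~ 3.4 V)
bus mode:	2 (push-pull)
chip select:	0 (don't care)
power mode:	2 (on)
bus width:	2 (4 bits)
timing spec:	2 (sd high-speed)
signal voltage:	0 (3.30 V)
driver type:	0 (driver type B)
"""

AFTER = """\
clock:		204000000 Hz
vdd:		21 (3.3 ~ 3.4 V)
bus mode:	2 (push-pull)
chip select:	0 (don't care)
power mode:	2 (on)
bus width:	2 (4 bits)
timing spec:	6 (sd uhs SDR104)
signal voltage:	1 (1.80 V)
driver type:	0 (driver type B)
"""

def parse_ios(text):
    """Map each 'key: value' line to its value with whitespace normalised."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = " ".join(value.split())
    return fields

before, after = parse_ios(BEFORE), parse_ios(AFTER)

# Keep only the fields whose value differs between the two dumps.
changed = {key: (before[key], after.get(key))
           for key in before if before[key] != after.get(key)}
for key, (old, new) in changed.items():
    print(f"{key}: {old} -> {new}")

# 4 data lines, one bit per line per clock (single data rate assumed)
ceiling = int(after["clock"].split()[0]) * 4 / 8 / 1e6
print(f"bus ceiling at the new clock: {ceiling:.0f} MB/s")  # 102 MB/s
```

Only the clock, timing spec, and signal voltage differ, which is consistent with the card now running in SDR104 mode at 1.8 V signalling.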

Also, please find attached the dmesg. All these outputs and this dmesg are with the debug patch as per the link.

Further attaching the dts as well.

Looking forward to your opinion about this.
dmesg_all_sd_mmc_speeds_enabled_debug.txt (52.7 KB)
tegra210-p3448-0002-p3449-0000-b00.dts.zip (38.2 KB)

Hi jetson_user,

Do you still see slow sdcard read/write speeds when the clk is 204MHz?