Low eMMC performance on SDMMC3

We are trying to bring up a SanDisk SDINADF4-64G-H eMMC on SDMMC3 of our carrier board. The kernel we are using is from L4T R24.2. The device comes up, but the write speed is not very good. For now, we have configured the host controller for HS200 with a 4-bit bus:

sdhci@700b0400 { /* SDMMC3 for EMMC */
		tap-delay = <1>;
		trim-delay = <3>;
		nvidia,is-ddr-tap-delay;
		nvidia,ddr-tap-delay = <0>;
		mmc-ocr-mask = <0>;
		power-off-rail;
		max-clk-limit = <200000000>;
		ddr-clk-limit = <48000000>;
		uhs-mask = <0x5c>;  //Only leave HS200 available
		bus-width = <4>;
		built-in;
		nvidia,is-emmc;
		calib-3v3-offsets = <0x007D>;
		calib-1v8-offsets = <0x7B7B>;
		compad-vref-3v3 = <0x7>;
		compad-vref-1v8 = <0x7>;
		pll_source = "pll_p";
		nvidia,en-io-trim-volt;
		nvidia,en-periodic-calib;
		status = "disabled";
	};

but the clock rate is around 163 MHz:

cat /sys/kernel/debug/clock/sdmmc3/rate

163200000

and the write speed is around 65 MB/s when tested with:
dd bs=8M count=128 if=/dev/zero of=/home/myhome/test.txt conv=fdatasync

Does anyone have an idea why the clock rate is limited to 163 MHz?
I think the write speed should increase if the clock rate can reach 200 MHz.
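
As a rough sanity check on that expectation, here is a minimal sketch of the raw bus ceiling. It assumes HS200 moves one bus-width word per clock (single data rate) and ignores protocol overhead and flash programming time. The function name is only illustrative.

```python
def bus_ceiling_bytes_per_s(clock_hz: int, bus_width_bits: int) -> int:
    """Raw single-data-rate ceiling: one transfer of bus_width_bits per clock."""
    return clock_hz * bus_width_bits // 8


# Compare the observed clock with the 200 MHz target on a 4-bit bus.
for clock_hz in (163_200_000, 200_000_000):
    ceiling = bus_ceiling_bytes_per_s(clock_hz, 4)
    print(f"{clock_hz / 1e6:.1f} MHz x 4 bit -> {ceiling / 1e6:.1f} MB/s")
```

Under these assumptions the ceiling is 81.6 MB/s at 163.2 MHz and 100 MB/s at 200 MHz, so the measured 65 MB/s is below both and the clock may not be the only limit.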

Was this tested on a carrier board of your own design? If so, you first need to follow the layout guidelines in the OEM Design Guide.

We’ve checked the design guide and found that the resistors on the sdio_clk and sdio_cmd lines were swapped. We exchanged the CLK and CMD resistors and tested again; unfortunately, the eMMC clock frequency is still 163.2 MHz.

The frequency depends on the clock source and the clock divider, so the actual frequency may be lower than requested due to clock source/divider limitations.
See the clock divider (CLK_RST_CONTROLLER_CLK_SOURCE_SDMMC1_0_SDMMC1_CLK_DIVISOR) and PLL source (CLK_RST_CONTROLLER_CLK_SOURCE_SDMMC1_0_SDMMC1_CLK_SRC) fields.

You can also check the frequency with the command below (here SDMMC4, the on-board eMMC, is on and its clock is 199.68 MHz during data access):
root@tegra-ubuntu:/sys/kernel/debug/clock# cat clock_tree | grep mmc
sdmmc4.emc off 0 1065600000 (150000000)
sdmmc3.emc off 0 1065600000 (1600000000)
sdmmc4 on 0 1.0 199680000
sdmmc_legacy off 0 34.0 12000000
*sdmmc4_ddr off 0 9.0 45333334
sdmmc2_ddr off 0 9.0 45333334
sdmmc2 off 0 2.0 204000000
sdmmc3_ddr off 0 9.0 45333334
*sdmmc1_ddr off 0 9.0 45333334
sdmmc3 off 0 9.0 45333334
sdmmc1 off 0 4.0 102000000
sdmmc4.sclk $ off 0 12218750 (115000000)
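
The divider arithmetic behind these numbers can be sketched as follows. This is only an illustration under stated assumptions: the divider is taken to be fractional with 0.5 steps (rate = parent * 2 / (n + 2) for a raw field value n), the 408 MHz pll_p rate is inferred from the "2.0 204000000" and "4.0 102000000" entries above, and the 199.68 MHz parent is taken from the sdmmc4 entry. The function and constant names are hypothetical.

```python
def best_divider(parent_hz: int, limit_hz: int, max_field: int = 255):
    """Return (divider, rate_hz) for the fastest rate not above limit_hz.

    Assumes a fractional divider with 0.5 steps:
    rate = parent * 2 / (n + 2), where n is the raw divisor field value.
    """
    for n in range(max_field + 1):  # n = 0 is divide-by-1.0, the fastest
        rate_hz = parent_hz * 2 // (n + 2)
        if rate_hz <= limit_hz:
            return (n + 2) / 2, rate_hz
    raise ValueError("no divider setting satisfies the limit")


PLL_P_HZ = 408_000_000       # assumption: inferred from the clock tree output
ALT_PARENT_HZ = 199_680_000  # assumption: parent behind the sdmmc4 entry
LIMIT_HZ = 200_000_000       # max-clk-limit from the device tree

print(best_divider(PLL_P_HZ, LIMIT_HZ))
print(best_divider(ALT_PARENT_HZ, LIMIT_HZ))
```

With a 408 MHz parent, divide-by-2.0 gives 204 MHz, which exceeds the 200 MHz limit, so the next step (divide-by-2.5) yields 163.2 MHz. A parent that is already just under 200 MHz can be used with divide-by-1.0.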

Hi MKAO,

Did you try another PLL source as below:

pll_source = "pll_p", "pll_c4_out2";

Hi Trumany,

We just tried it, and the clock rate now reaches 199.68 MHz and the write speed reaches ~76 MB/s. It does improve the performance; thanks for the support.

Regards,
Marc

Hi Trumany,

After running a stress test, we found that the write speed is not stable: it varies between 30 MB/s and 75 MB/s.

Here is the command we are using for testing speed:

 dd bs=8M count=128 if=/dev/zero of=/home/myhome/test.txt conv=fdatasync
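
To log the variation over many runs instead of reading individual dd results, a minimal sketch like the following could be used. It is an approximation of the dd test, not a replacement: it writes zeros, calls fsync so the flush time is included (similar in spirit to conv=fdatasync), and records the speed of each run. The function names and default sizes are illustrative.

```python
import os
import time


def timed_write(path: str, block_size: int, count: int) -> float:
    """Write count blocks of zeros, fsync, and return throughput in MB/s."""
    block = bytes(block_size)
    start = time.perf_counter()
    with open(path, "wb") as f:
        for _ in range(count):
            f.write(block)
        f.flush()
        os.fsync(f.fileno())  # include the time needed to reach the device
    elapsed = max(time.perf_counter() - start, 1e-9)
    return block_size * count / elapsed / 1e6


def run_series(path: str, runs: int, block_size: int = 8 << 20, count: int = 128):
    """Repeat the write test and return the per-run speeds in MB/s."""
    speeds = []
    for _ in range(runs):
        speeds.append(timed_write(path, block_size, count))
        os.remove(path)  # start each run from a fresh file
    return speeds
```

For example, `speeds = run_series("/home/myhome/test.txt", runs=20)` followed by `min(speeds)` and `max(speeds)` would show the spread across 20 runs of 1 GiB each.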

We also found a post about the EMMC slow sequential write problem :

 https://devtalk.nvidia.com/default/topic/912497/?comment=4792354#

The post says to set the CPU and EMC frequencies to their maximum and to bring all CPUs online before doing the I/O testing.
Could you please tell us how to set the CPU and EMC frequencies to maximum and bring all CPUs online on the TX1? Thanks.

Regards,
Marc

We’ve found a post about maximizing TX1 performance. We’ll try it first, thanks.

http://elinux.org/Jetson/TX1_Controlling_Performance#3.Maximize_EMC.28Memory_Controller.29
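
For the CPU part, a minimal sketch using only the generic Linux sysfs files (`online` and `cpufreq/scaling_governor`) might look like this. It is an assumption that these generic files are sufficient on the TX1; the EMC override and any hotplug governor that takes CPUs offline again are platform-specific and are left to the linked page. The function name and the `sysfs_root` parameter are hypothetical, and on a real system this must run as root.

```python
import glob
import os


def maximize_cpu(sysfs_root: str = "/sys") -> list:
    """Bring every CPU online and select the 'performance' governor.

    Returns the list of files that were written.
    """
    written = []
    pattern = os.path.join(sysfs_root, "devices/system/cpu/cpu[0-9]*")
    for cpu_dir in sorted(glob.glob(pattern)):
        # Write 'online' first so the cpufreq files of that CPU can appear.
        for rel_path, value in (("online", "1"),
                                ("cpufreq/scaling_governor", "performance")):
            path = os.path.join(cpu_dir, rel_path)
            if os.path.exists(path):  # cpu0 often has no 'online' file
                with open(path, "w") as f:
                    f.write(value)
                written.append(path)
    return written
```

The `sysfs_root` argument exists only so the function can be tried against a scratch directory before touching the real /sys.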

Hi,

The vendor has informed us that the buffer inside the eMMC may be filling up, which causes the low write speed. They want us to set the EXT_CSD[163] register to “2h” to enable BKOPS, so that the buffer is emptied in the background.
I’ve checked the device tree and didn’t see any configuration for the EXT_CSD register. Could you please tell us how to configure it?

Regards,
Marc

BTW,

Does the driver support the “Auto BKOPs” function?

kernel/drivers/mmc/core/mmc.c already supports this feature for eMMC, as shown below.

/* check whether the eMMC card supports BKOPS */
if (ext_csd[EXT_CSD_BKOPS_SUPPORT] & 0x1) {
	card->ext_csd.bkops = 1;
	card->ext_csd.bkops_en = ext_csd[EXT_CSD_BKOPS_EN];
	card->ext_csd.raw_bkops_status =
		ext_csd[EXT_CSD_BKOPS_STATUS];
	if (!card->ext_csd.bkops_en)
		pr_info("%s: BKOPS_EN bit is not set\n",
			mmc_hostname(card->host));
}

Hi Vicky,

I just printed the value of “card->ext_csd.bkops_en”, and it is “1h”. Is there any way to set it to “2h”?
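
To make the difference between “1h” and “2h” concrete, here is a minimal decoding sketch. It assumes the eMMC 5.1 layout of BKOPS_EN (bit 0 = MANUAL_EN, bit 1 = AUTO_EN) and takes the EXT_CSD as a hex dump of its 512 bytes, such as the one some kernels expose through debugfs. The function and constant names are illustrative.

```python
EXT_CSD_BKOPS_EN = 163  # byte index of BKOPS_EN in EXT_CSD
MANUAL_EN = 0x01        # bit 0: host-triggered (manual) BKOPS enabled
AUTO_EN = 0x02          # bit 1: device-initiated (auto) BKOPS enabled


def decode_bkops_en(value: int) -> dict:
    """Split a BKOPS_EN byte into its manual and auto enable flags."""
    return {"manual": bool(value & MANUAL_EN), "auto": bool(value & AUTO_EN)}


def bkops_en_from_ext_csd(hex_dump: str) -> int:
    """Extract byte 163 from a hex dump of the 512-byte EXT_CSD."""
    raw = bytes.fromhex(hex_dump.strip())
    return raw[EXT_CSD_BKOPS_EN]


print(decode_bkops_en(0x1))  # the value currently read back
print(decode_bkops_en(0x2))  # the value the vendor asks for
```

Under that assumption, “1h” means only manual BKOPS is enabled, while “2h” would enable auto BKOPS.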

Regards,
Marc

I’m not familiar with EXT_CSD[163] but I believe you can just check if mmc_start_bkops() is taking effect in kernel/drivers/mmc/core/core.c or not.

We’ve checked the following code related to BKOPS, and none of it is invoked during eMMC reads or writes.
Do you have any idea when these conditions will be satisfied?

In “/drivers/mmc/card/block.c”

if (brq->cmd.resp[0] & EXT_CSD_URGENT_BKOPS)
	mmc_card_set_need_bkops(card);

In “/drivers/mmc/card/queue.c”

if (mmc_card_need_bkops(mq->card))
	mmc_start_bkops(mq->card, true);

In “/drivers/mmc/core/core.c”

/*
 * Check BKOPS urgency for each R1 response
 */
if (host->card && mmc_card_mmc(host->card) &&
    ((mmc_resp_type(host->areq->mrq->cmd) == MMC_RSP_R1) ||
     (mmc_resp_type(host->areq->mrq->cmd) == MMC_RSP_R1B)) &&
    (host->areq->mrq->cmd->resp[0] & R1_EXCEPTION_EVENT))
	mmc_start_bkops(host->card, true);
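
As far as we can tell, all three paths depend on the card itself flagging a need for maintenance in its response, which would explain why none of them fires in normal operation. A minimal model of the core.c check is sketched below. The bit value for R1_EXCEPTION_EVENT (bit 6 of the card status) is taken from the mainline mmc.h header and is an assumption for this kernel; the function name and the sample status words are illustrative.

```python
R1_EXCEPTION_EVENT = 1 << 6  # card status bit tested in core.c (assumed value)


def r1_requests_bkops(resp0: int) -> bool:
    """True when an R1/R1B card status word has the exception-event bit set."""
    return bool(resp0 & R1_EXCEPTION_EVENT)


# 0x900: transfer state with READY_FOR_DATA set, no exception reported.
print(r1_requests_bkops(0x900))
# 0x940: the same status with the exception-event bit raised by the card.
print(r1_requests_bkops(0x940))
```

In this model only the second status word would lead to mmc_start_bkops(); a card that never raises the bit never triggers the host-side path.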

Which EMMC are you talking about? Does it support BKOPS?

We are using “SDINADF4-64G-H”, and it needs BKOPS to clean up its cache, which should resolve the performance drop.

BTW,
I would like to attach an image showing the BKOPS requirement from the data sheet, but I don’t know how to do that.
emmc_bkops.png

hello MKAO,

You can attach files by using “EDIT” on your own comment;
there is a pin icon at the top right of the comment for uploading the attachment.
thanks

Thanks Jerry, I was able to attach the image.

FYI.

I just found a document that briefly introduces the eMMC SLC cache and the driver support it expects (p. 22):
http://events.linuxfoundation.org/sites/events/files/slides/Storage_Alignment_To_System_Behaviour_0_0.pdf

We also see that the R24.1 software feature list mentions BKOPS as available only on SDMMC4. Could you please confirm whether BKOPS is available on SDMMC3? Thanks.

http://developer.download.nvidia.com/embedded/L4T/r24_Release_v1.0/Docs/Tegra_Linux_Driver_Package_Software_Feature_List_24.1_Updated.pdf?autho=1484788406_fcf2dad150466861b2a7e75642ad1341&file=Tegra_Linux_Driver_Package_Software_Feature_List_24.1_Updated.pdf