Replacing Ethernet PHY in JAX vs JAXi

Hi all,
@WayneWWW
We have replaced the AGX PHY with a KSZ switch on our custom board. The switch is connected to the SoM through SPI, and we load the driver manually with modprobe during boot (using a systemd service).

With JAX, we are able to load the Ethernet switch driver at boot using the above-mentioned method. With JAXi, however, we see a kernel panic while loading the Ethernet driver.

As a workaround, we are able to load the Ethernet driver if we delay loading by around 60 seconds during boot.

What is the difference between JAX and JAXi which causes this issue?

Thanks and Regards
Ann

So is this switch on EQOS or on SPI?

If it is on SPI, I wouldn’t use the word “replace”, since there was no PHY on SPI to begin with…

The switch is connected to the EQOS and the initial register configuration of the PHY layer is done through SPI lines.

You can compare the pinmux of the SPI and see if there is any difference.

I compared the pinmux of the commercial and industrial modules; both look similar.

Could you please help to figure out what could be the issue for this Ethernet switch between the commercial and industrial?

Meanwhile, I found an extra dts file (tegra194-p2888-0008-p2822-0000.dtsi) in galen-industrial which contains some additional SPI nodes.

Could this cause any delay in bringing up SPI?

Sorry, I don’t know. We have neither the experience nor the environment to investigate that switch.

You can compare that dts, remove it, and see whether that has any effect.

If I remove the contents of tegra194-p2888-0008-p2822-0000.dtsi, which is not part of galen, I run into boot issues.

It would be helpful if you could tell me what the main difference is between the commercial and industrial Jetson AGX SoMs.

You can check the comparison document first.

I have gone through the below document, since the above document was showing an error. https://developer.download.nvidia.com/assets/embedded/secure/jetson/xavier/docs/Jetson_AGX_Xavier_Series_Comparison_Migration_AN_DA-10566-001_v1.2.pdf?EU4rzv5KCRR1ly0bPNKaasE_kLyXKjXKVuhm-y_y2Ewh4ikHmrLLnU98IqdzBPNYDFU5HXty4kn7HerKtp27iTw6LtYaZKQqsl6WC2x1EBwgfZQW9Kgn_sFiNKh5vFOyqE8ppbjzf5raF6G0PJhSaMiKi8MTLSUhaxOXfb_pbaDgcG6YjOi76s5DFCPwZ_fhZG7rGHYZYYxiip_A7z6ueMOSiWuMG6EfsnVqftjDvA&t=eyJscyI6ImdzZW8iLCJsc2QiOiJodHRwczpcL1wvd3d3Lmdvb2dsZS5jb21cLyJ9

After some workarounds I am able to make the Ethernet switch work for the industrial SoM.

As a workaround I have added a delay of 1 min for the driver to load for the AGX Industrial SoM. Then the Ethernet Switch is working fine.

Could you please help me understand why this delay is needed for the driver to load on the AGX Industrial SoM, when it is not required on the AGX commercial SoM?

Sorry, we cannot help with that since we really don’t have the same PHY/switch as yours.

I would suggest that you at least share the error log you see. That would be better than repeatedly asking me “can you help? can you help?”.

Below is the error we are getting for the Ethernet switch with the AGX Industrial SoM:

[ 7.747497] nvgpu: 17000000.gv11b tpc_pg_mask_store:843 [INFO] no value change, same mask already set
[ 7.762305] ksz9896_dsa: probe() started
[ 7.762367] ksz9897_dsa: ksz_switch_register enter
[ 7.762589] ERROR: could not get clock /spi@c260000:osc(2)
[ 7.763365] ERROR: could not get clock /spi@c260000:osc(2)
[ 7.764245] spi-tegra114 c260000.spi: CpuXfer ERROR bit set 0x8400044
[ 7.764393] spi-tegra114 c260000.spi: CpuXfer 0x43e00827:0x00000000
[ 7.764516] spi-tegra114 c260000.spi: SPI_ERR: CMD_0: 0x43e01827, FIFO_STS: 0x08400044
[ 7.764661] spi-tegra114 c260000.spi: SPI_ERR: DMA_CTL: 0x00000000, TRANS_STS: 0x00ff0040
[ 7.764817] spi-tegra114 c260000.spi: Error in Transfer
[ 7.764945] spi_master spi1: failed to transfer one message from queue
[ 7.765062] retuen value…-5
[ 7.765066] ksz9896_dsa:chip id: ffffffc6
[ 7.765133] ksz9477-switch: probe of spi1.0 failed with error -22

I have also attached the complete log:
complete_log.txt (89.6 KB)
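For triage, the SPI-related lines can be pulled out of a long boot log so they are easier to compare between the commercial and industrial modules. The following is a minimal Python sketch, assuming the log is saved as plain text in the usual dmesg format shown above. The function name `spi_errors` is hypothetical, and the controller address `c260000` is taken from the log excerpt.

```python
import re

# Matches a dmesg-style line such as "[ 7.764245] message text"
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(.*)$")

def spi_errors(log_text, controller="c260000"):
    """Return (timestamp, message) pairs for lines that mention the given
    SPI controller address or come from the generic spi_master layer."""
    hits = []
    for line in log_text.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # not a timestamped kernel log line
        ts, msg = float(m.group(1)), m.group(2)
        if controller in msg or msg.startswith("spi_master"):
            hits.append((ts, msg))
    return hits

# A few lines copied from the excerpt above
sample = """\
[ 7.762305] ksz9896_dsa: probe() started
[ 7.762589] ERROR: could not get clock /spi@c260000:osc(2)
[ 7.764245] spi-tegra114 c260000.spi: CpuXfer ERROR bit set 0x8400044
[ 7.764945] spi_master spi1: failed to transfer one message from queue
"""

for ts, msg in spi_errors(sample):
    print(f"{ts:.6f}  {msg}")
```

Running the same filter over the logs from both modules should make it easier to see which SPI messages appear only on the industrial one.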

So it is indeed coming from SPI. Which JetPack release is in use?

Is this “could not get clock /spi@c260000:osc(2)” error still there when your driver can load up correctly?

The L4T version is 32.6 (JetPack 4.6).

Yes, the error also appears when the driver loads correctly.

Also, while probing the SPI chip select line for the Ethernet switch, I observed a difference between the industrial and commercial SoMs: the initial state of the chip select line is high on the industrial SoM and low on the commercial SoM.

Hi,

Could you flash your board with jetson-agx-xavier-ind-noecc.conf and see if the issue would be gone or not?

Thanks for the reply as always.

After commenting out the lines below in the existing .conf file, the Ethernet driver loads properly:

EMMC_BCT="tegra194-mb1-bct-memcfg-4x-derated-ecc-p2888.cfg";
MISC_CONFIG="tegra194-mb1-bct-misc-flash-jaxi.cfg";
MISC_COLD_BOOT_CONFIG="tegra194-mb1-bct-misc-l4t-jaxi.cfg";

Out of curiosity, I would like to know what difference these cfg files make for the industrial SoM.
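To see exactly which variables change between two board .conf files (for example the original file and the edited one), the assignments can be parsed and compared. This is a rough Python sketch, assuming the files use simple `NAME="value";` lines like the three quoted above. The helper names `parse_conf` and `diff_conf` are hypothetical, and the parser ignores anything more complex than a plain assignment.

```python
import re

# Matches simple shell-style assignments such as: NAME="value";
ASSIGN_RE = re.compile(r'^\s*([A-Z_][A-Z0-9_]*)="?([^";]*)"?;?\s*$')

def parse_conf(text):
    """Collect NAME -> value for uncommented simple assignments."""
    values = {}
    for line in text.splitlines():
        if line.lstrip().startswith("#"):
            continue  # commented-out lines do not take effect
        m = ASSIGN_RE.match(line)
        if m:
            values[m.group(1)] = m.group(2)
    return values

def diff_conf(a_text, b_text):
    """Return {name: (value_in_a, value_in_b)} for every differing variable."""
    a, b = parse_conf(a_text), parse_conf(b_text)
    return {k: (a.get(k), b.get(k))
            for k in sorted(set(a) | set(b)) if a.get(k) != b.get(k)}

original = '''\
EMMC_BCT="tegra194-mb1-bct-memcfg-4x-derated-ecc-p2888.cfg";
MISC_CONFIG="tegra194-mb1-bct-misc-flash-jaxi.cfg";
MISC_COLD_BOOT_CONFIG="tegra194-mb1-bct-misc-l4t-jaxi.cfg";
'''
# Same file with the three lines commented out, as described in the post
edited = "\n".join("#" + line for line in original.splitlines())

for name, (before, after) in diff_conf(original, edited).items():
    print(f"{name}: {before} -> {after}")
```

A variable that shows `None` on one side is not set by that file itself and is presumably inherited from whatever that file sources, so the inherited value is the one to look up next.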

Regards
Ann Rose Antony

Hello,

I did some checking regarding your question, “what is the difference between AGX and AGXi”. With the help of the error log, it turns out there is a difference in SPI.

On AGXi, the SCE firmware (SCE-FW) uses the SPI2 controller before the kernel is ready. Thus, after the kernel is up, you need to wait a while so that the kernel can configure the controller back to a state it can use.
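Given that explanation, one alternative to a fixed one-minute sleep is to poll until the switch is actually bound to its driver and to re-trigger the probe in between. The following is a minimal Python sketch, not a tested fix. It assumes the driver module is already loaded, that the device and driver names are `spi1.0` and `ksz9477-switch` as printed in the probe log, and that an early failed probe is harmless on the board in question (the log shows a clean -22 failure, but the first post mentions a kernel panic, so this needs checking). The function names are hypothetical; the `driver` symlink and the `bind` file are the standard sysfs driver-model interfaces.

```python
import os
import time

SPI_DEV = "spi1.0"         # device name as printed in the probe log
DRIVER = "ksz9477-switch"  # driver name as printed in the probe log

def is_bound(dev=SPI_DEV, sysfs="/sys/bus/spi/devices"):
    """A device that probed successfully has a 'driver' symlink in sysfs."""
    return os.path.exists(os.path.join(sysfs, dev, "driver"))

def try_bind(dev=SPI_DEV, driver=DRIVER, sysfs="/sys/bus/spi/drivers"):
    """Ask the kernel to run probe again by writing the device name to 'bind'."""
    try:
        with open(os.path.join(sysfs, driver, "bind"), "w") as f:
            f.write(dev)
    except OSError:
        pass  # probe failed again or driver not loaded; the caller retries

def wait_until_bound(check=is_bound, rebind=try_bind, attempts=12,
                     delay_s=5.0, sleep=time.sleep):
    """Return the 1-based attempt on which the device was bound, or 0 on timeout."""
    for attempt in range(1, attempts + 1):
        if check():
            return attempt
        rebind()
        if check():
            return attempt
        if attempt < attempts:
            sleep(delay_s)  # give the kernel time to reclaim the controller
    return 0
```

Called from the existing systemd service after modprobe (as root), `wait_until_bound()` would return as soon as the probe succeeds instead of always waiting the full minute. The attempt count and delay are arbitrary starting values.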