Failed to poll MAC Software reset: cause analysis and solution

I ran into this issue while debugging RGMII on ORIN's Ethernet controller at 2310000, and repeatedly searched forums and websites without finding the cause. After working through it with the switch manufacturer, I finally found it. The switch chip has to be initialized first, and only after its initialization completes does it output an RX clock. The ORIN MAC driver needs that RX clock from the switch to complete its own initialization, after which the MAC outputs a TX clock. In other words, the MAC reset failure means the switch is not supplying an RX clock.


Thanks for sharing this with the community!

Check whether the pinmux settings are correct, as described in the documentation.

I have now configured the pinmux correctly and can measure both TX-CLK and RX-CLK. When a PC pings ORIN, I can also measure RX data and RX-CTL. However, no matter how I adjust the device tree, I can never measure TX-CTL or TX data. As a result, ping fails in both directions, whether ORIN pings the PC or the PC pings ORIN, and I don't know where the problem lies.





This is my complete configuration.

Compare the dtsi file, not the pinmux spreadsheet.

https://docs.nvidia.com/jetson/archives/r35.5.0/DeveloperGuide/HR/JetsonModuleAdaptationAndBringUp/JetsonAgxOrinSeries.html?highlight=rgmii#for-rgmii

I have tried everything exactly as configured in the document. My device is now in MAC-to-MAC mode with no PHY chip in between, so I configured it as fixed-link in the device tree. Does ORIN have a way to configure a TX delay of 2 ns? The switch manufacturer said that the delay needs to be set on the TX side.

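My problem is solved: ping works now, and throughput testing reaches gigabit speed. The configuration was not changed; it is still the same as in the images posted above. The cause was that, during debugging, I had commented out the phy_start call in the ether_open function of ether_linux.c in the driver. Early in debugging I assumed that MAC-to-MAC does not need a PHY, and I did not know that fixed-link creates a virtual PHY.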

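Summary:

1. The MAC reset failure is caused by a missing RX clock, which is a switch-side problem. The ORIN MAC driver can only start if the switch has started normally and is outputting an RX clock; the TX clock of the ORIN MAC depends on the RX clock from the switch.

2. When using fixed-link, nothing in the ORIN network driver needs to be changed, and the PHY initialization code must not be commented out.

3. If ping fails at gigabit speed, a TX or RX delay needs to be added on either ORIN or the switch.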
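As background for the 2 ns TX delay the switch manufacturer asked for: at 1 Gb/s, RGMII runs a 125 MHz clock and transfers data on both clock edges, so each data window is half a period wide, and shifting the clock by a quarter period puts the sampling edge in the middle of that window. The sketch below only does this arithmetic. The function names are hypothetical and are not part of the ORIN driver or any NVIDIA API.

```c
#include <assert.h>
#include <stdint.h>

/* Clock period in picoseconds for a clock frequency given in MHz. */
static uint32_t clk_period_ps(uint32_t freq_mhz)
{
    assert(freq_mhz > 0);
    return 1000000u / freq_mhz;
}

/*
 * Quarter-period shift, in picoseconds. With data on both clock edges,
 * this moves the clock edge to the center of the data window.
 * Hypothetical helper, for illustration only.
 */
static uint32_t rgmii_quarter_period_ps(uint32_t freq_mhz)
{
    return clk_period_ps(freq_mhz) / 4u;
}
```

For the 125 MHz gigabit clock this gives an 8000 ps period and a 2000 ps (2 ns) shift, which matches the figure quoted by the manufacturer. Whether that delay is applied on the ORIN side or the switch side is a board-level choice, and only one end of each clock line should add it.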

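I still have one open question: I have not been able to get any clock out of the MDC line of MDIO. I don't know whether the MDIO driver depends on a real PHY being present, so that in my current MAC-to-MAC setup MDIO cannot be driven normally.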


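"The cause was that, during debugging, I had commented out the phy_start call in the ether_open function of ether_linux.c in the driver. Early in debugging I assumed that MAC-to-MAC does not need a PHY, and I did not know that fixed-link creates a virtual PHY."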

Do you mean that you had removed phy_start earlier by mistake, or that you are removing phy_start now?

For the benefit of other developers, here is the same explanation restated in English.
My problem has been resolved: ping works, and the measured network speed reaches gigabit. The configuration has not been changed and is still the same as in the images posted above. The cause was that I had commented out the phy_start call in the ether_open function of the driver's ether_linux.c file during debugging. At the initial debugging stage I thought that MAC-to-MAC did not need a PHY, and I did not know that fixed-link creates a virtual PHY. The switch I am using is the JL6107SC (manufactured by JLSemi).
Summary:

  1. The MAC reset failure is caused by a missing RX clock, which is a problem on the switch side. The ORIN MAC driver can only start if the switch has started normally and is emitting an RX clock. The TX clock of the ORIN MAC depends on the RX clock of the switch.
  2. When using fixed-link, nothing in the ORIN network driver needs to be changed, and the PHY initialization code must not be commented out.
  3. If ping fails at gigabit speed, a TX or RX delay is needed on either ORIN or the switch, depending on the switch.

I still have one question: I have not yet been able to get an MDC clock out of the SMI (MDIO/MDC) interface. I don't know whether the MDIO driver depends on a real PHY being present, so that my current MAC-to-MAC setup cannot drive MDIO normally.

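It was earlier, during debugging. I assumed that MAC-to-MAC no longer uses a PHY, and since the ORIN driver contains PHY initialization code, I thought the driver had to be modified. So I deliberately commented out phy_start to avoid PHY errors. That assumption has now turned out to be wrong!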


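"I still have one open question: I have not been able to get any clock out of the MDC line of MDIO. I don't know whether the MDIO driver depends on a real PHY being present, so that in my current MAC-to-MAC setup MDIO cannot be driven normally."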

Has the MDIO node also been removed from the DT now?

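Whether or not the MDIO node is removed no longer makes any difference. It looks as though simply configuring fixed-link makes the mdio configuration ineffective. phy-handle and fixed-link cannot coexist: phy-handle disables fixed-link, and boot then reports "find phy addr err". I tried many configurations looking for a way to make fixed-link and mdio both take effect, and every attempt failed. I am now initializing the switch over MDIO bit-banged on GPIOs. The only drawback is that the GPIO clock is very slow, only 5 kHz.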
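For readers who want to try the same GPIO approach, here is a minimal sketch of a bit-banged MDIO write. It assumes the switch accepts standard IEEE 802.3 Clause 22 frames, which this thread does not confirm for the JL6107SC. The `mdio_gpio_ops` callbacks are hypothetical placeholders for whatever GPIO access the platform provides, not an existing kernel or NVIDIA API.

```c
#include <assert.h>
#include <stdint.h>

#define MDIO_ST_C22   0x1u  /* start-of-frame pattern 01 (Clause 22) */
#define MDIO_OP_WRITE 0x1u  /* opcode 01 = write */
#define MDIO_TA_WRITE 0x2u  /* turnaround pattern 10 for a write */

/* Hypothetical GPIO hooks supplied by the platform code. */
struct mdio_gpio_ops {
    void (*set_mdc)(void *ctx, int level);
    void (*set_mdio)(void *ctx, int level);
    void (*delay_half_period)(void *ctx);
};

/* Build the 32-bit Clause 22 write frame that follows the preamble. */
static uint32_t mdio_c22_write_frame(uint8_t phy_addr, uint8_t reg_addr,
                                     uint16_t data)
{
    assert(phy_addr < 32 && reg_addr < 32);
    return (MDIO_ST_C22 << 30) | (MDIO_OP_WRITE << 28) |
           ((uint32_t)phy_addr << 23) | ((uint32_t)reg_addr << 18) |
           (MDIO_TA_WRITE << 16) | data;
}

/* Shift out nbits of 'bits', MSB first, one MDC pulse per bit. */
static void mdio_shift_out(const struct mdio_gpio_ops *ops, void *ctx,
                           uint32_t bits, int nbits)
{
    for (int i = nbits - 1; i >= 0; i--) {
        ops->set_mdio(ctx, (int)((bits >> i) & 1u));
        ops->delay_half_period(ctx);
        ops->set_mdc(ctx, 1);   /* the device samples MDIO on the rising edge */
        ops->delay_half_period(ctx);
        ops->set_mdc(ctx, 0);
    }
}

/* Full write: 32-bit preamble of ones, then the 32-bit frame. */
static void mdio_c22_write(const struct mdio_gpio_ops *ops, void *ctx,
                           uint8_t phy_addr, uint8_t reg_addr, uint16_t data)
{
    mdio_shift_out(ops, ctx, 0xFFFFFFFFu, 32);
    mdio_shift_out(ops, ctx, mdio_c22_write_frame(phy_addr, reg_addr, data), 32);
}
```

Each write clocks out 64 bits, so at the 5 kHz MDC rate mentioned above a single register write takes about 12.8 ms. That is slow, but it is usually tolerable for one-time switch initialization.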


One more thing: other developers who find this thread and hit the same problem can first go to the poll_check function in core_common.c and change
if ((*value & bit_check) == OSI_NONE) {
cond = COND_MET;
} else {
osi_core->osd_ops.udelay(OSI_DELAY_1000US);
}
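to just cond = COND_MET;
This masks the error so that the ORIN MAC driver can start successfully, and the TX clock is then generated. Once the switch has been debugged and the RX clock, RX_CTL, and RX data all measure correctly, run ifconfig eth0 down && ifconfig eth0 up to restart the ORIN MAC driver, which will also succeed. The purpose of this workaround is to handle the case where the switch's own initialization depends on ORIN's TX clock.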
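To make the effect of that change easier to see, here is a simplified, self-contained model of such a polling loop. It is not the actual poll_check code from core_common.c: the function names, the retry count, and the `debug_skip_check` flag are assumptions introduced for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t (*reg_read_fn)(void *ctx);

/*
 * Poll until (register & bit_check) reads as zero.
 * Returns 0 when the condition is met and -1 when max_retries runs out.
 * With debug_skip_check set, the condition is treated as met right away,
 * which models replacing the check with an unconditional cond = COND_MET.
 */
static int poll_bit_clear(reg_read_fn read_reg, void *ctx, uint32_t bit_check,
                          unsigned max_retries, bool debug_skip_check)
{
    assert(read_reg != NULL);
    for (unsigned i = 0; i < max_retries; i++) {
        uint32_t value = read_reg(ctx);
        if (debug_skip_check || (value & bit_check) == 0u)
            return 0;
        /* The driver snippet above waits via udelay(OSI_DELAY_1000US) here. */
    }
    return -1;
}

/* Stand-in for a MAC whose software-reset bit never clears (no RX clock). */
static uint32_t stuck_reset_reg(void *ctx)
{
    (void)ctx;
    return 0x1u;
}

/* Stand-in for a MAC whose software-reset bit has already cleared. */
static uint32_t cleared_reset_reg(void *ctx)
{
    (void)ctx;
    return 0x0u;
}
```

With `stuck_reset_reg` the normal poll times out, which corresponds to the "Failed to poll MAC Software reset" error, while the skip flag lets initialization continue even though the reset has not completed. For that reason the bypass is best treated as a bring-up aid rather than a permanent change.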
