What we did:
On a Jetson AGX Xavier board, we replaced the original Ethernet PHY (88E1512PB2) with Realtek's RTL8211FI-CG, the same PHY used in the Jetson Xavier NX reference design.
The hardware connections are kept the same as on AGX Xavier; on the software side we added the following device tree configuration (taken from Xavier NX):
ether_qos@2490000 {
	nvidia,phy-reset-post-delay = <224>;
	nvidia,phy-reset-duration = <10000>;
	mdio {
		compatible = "nvidia,eqos-mdio";
		#address-cells = <1>;
		#size-cells = <0>;
		phy0: ethernet-phy@0 {
			reg = <1>;
		};
	};
};
What is the issue
On this board, the function eqos_open fails.
We added comments (the key ones in bold) to the original code to show what happened:
static int eqos_open(struct net_device *dev)
{
	…
	pr_info("-->eqos_open\n"); // This message appears in the log
	......
	/* Reset the PHY */
	if (gpio_is_valid(pdata->phy_reset_gpio)) {
		gpio_set_value(pdata->phy_reset_gpio, 0);
		usleep_range(pdata->phy_reset_duration,
			     pdata->phy_reset_duration + 1);
		gpio_set_value(pdata->phy_reset_gpio, 1);
		msleep(pdata->phy_reset_post_delay);
	} **// Oscilloscope shows the GPIO going low and then high, with a 10 ms duration**
	ret = eqos_clock_enable(pdata);
	if (ret) {
		dev_err(&dev->dev, "failed to enable clocks\n"); // This error message does not appear
		return ret;
	}
	/* issue CAR reset to device */
	ret = hw_if->car_reset(pdata);
	if (ret < 0) {
		ret = -ENODEV;
		dev_err(&dev->dev, "Failed to reset MAC\n"); **// This error message appears**
		goto err_mac_rst;
	}
	......
}
On the other hand, eqos_car_reset called from eqos_probe works fine: it returns 0, and we can even read back the RTL8211 device ID 0x1cc916 through MDIO.
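For readers checking the ID themselves: a PHY identifier such as 0x1cc916 is normally assembled from the two 16-bit PHY identifier registers (registers 2 and 3 in the standard MDIO register map). The sketch below shows that composition in plain user-space C. The function names are hypothetical, and the two register values are inferred from the ID quoted above, not taken from a log.

```c
#include <assert.h>
#include <stdint.h>

#define MII_PHYSID1 0x02 /* PHY identifier register, upper 16 bits */
#define MII_PHYSID2 0x03 /* PHY identifier register, lower 16 bits */

/* Combine the two 16-bit identifier registers into one 32-bit PHY ID. */
static uint32_t phy_id_from_regs(uint16_t physid1, uint16_t physid2)
{
	return ((uint32_t)physid1 << 16) | physid2;
}

/*
 * Usage check (hypothetical helper): register values 0x001c and 0xc916,
 * inferred from the ID in the post, combine to 0x001cc916.
 */
static void phy_id_example(void)
{
	assert(phy_id_from_regs(0x001c, 0xc916) == 0x001cc916u);
}
```

A successful read of this ID only shows that the MDIO path to the PHY works at probe time; it says nothing about the MAC reset that fails later in eqos_open.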
What are the questions
From the code of eqos_car_reset, the function reset_control_reset first sends the BPMP reset message ID 17 (<bpmp_resets 17U>); the code then waits for 10 usec and checks bit 0 of the register at "eqos_base_addr + 0x1000" (0x2491000).
So, the questions are:
What does the BPMP processor do after receiving reset message ID 17?
What does bit 0 of the register at "eqos_base_addr + 0x1000" (0x2491000) mean?
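To make the sequence described above concrete (request the reset, wait, then check bit 0), here is a minimal user-space sketch of the polling step. It is not the driver code: the fake register, the function names, and the retry count are all hypothetical, and it assumes the check passes once bit 0 reads back as 0. What the bit actually means is exactly the open question.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EQOS_BASE_ADDR   0x02490000u /* from the ether_qos@2490000 node */
#define RESET_REG_OFFSET 0x1000u     /* offset named in the post */
#define RESET_BIT        (1u << 0)   /* bit 0; meaning not established here */

/* Hypothetical stand-in for the hardware register. */
struct fake_eqos {
	uint32_t reset_reg;
	int reads_until_clear; /* negative: the bit never clears */
};

/* Simulated register read: the bit clears after the configured reads. */
static uint32_t fake_read(struct fake_eqos *hw)
{
	if (hw->reads_until_clear == 0)
		hw->reset_reg &= ~RESET_BIT;
	else if (hw->reads_until_clear > 0)
		hw->reads_until_clear--;
	return hw->reset_reg;
}

/* Returns 0 once bit 0 reads back as clear, -1 if it stays set. */
static int poll_reset_done(struct fake_eqos *hw, int max_polls)
{
	assert(hw != NULL);
	for (int i = 0; i < max_polls; i++) {
		if ((fake_read(hw) & RESET_BIT) == 0)
			return 0;
		/* a kernel driver would delay between polls here */
	}
	return -1;
}

/* Address of the polled register: base plus offset. */
static uint32_t reset_reg_addr(void)
{
	return EQOS_BASE_ADDR + RESET_REG_OFFSET;
}
```

Under these assumptions, a -1 from the polling step means the bit never reached the expected state within the retry budget, which would be consistent with the "Failed to reset MAC" message seen in eqos_open.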
The interesting thing is that eqos_probe works well, while eqos_open always fails.
eqos_open stops at the function eqos_car_reset, which sends a message to the BPMP and then waits on one bit, as explained in the first message. However, it seems that neither the BPMP code nor the EQOS manual has been released.
The shared file (tegra194-p2888-0001-p2822-0000.dtb.dts.tmp) is exactly the dts file.
In short, it is exactly the same as the original dts (tegra194-p2888-0001-p2822-0000), with only the following added, as mentioned on the first day.
ether_qos@2490000 {
	nvidia,phy-reset-post-delay = <224>;
	nvidia,phy-reset-duration = <10000>;
	mdio {
		compatible = "nvidia,eqos-mdio";
		#address-cells = <1>;
		#size-cells = <0>;
		phy0: ethernet-phy@0 {
			reg = <1>;
		};
	};
};
By the way, is there anything else you need? Asking for items one at a time is not very efficient.
Could you do a quick check on what the BPMP does after receiving <bpmp_resets 17U> in the function eqos_car_reset?
We are blocked because eqos_car_reset returns -1, even though we did not change any code in it or any device tree entries related to it.
Sorry, but it seems you did not get my point.
My dts file is exactly the same as the original from 32.7.2. I assume you have access to NVIDIA's own dts file? Thanks.
If what I am describing is still unclear, please let me know.
I don't want to argue about that. Can you just use the dtc tool to convert your dtb back to dts and attach that dts?
This is to confirm the dts content on your side. I don't know whether you made any minor changes to it, and this rules out any difference between what I see on my side and what you are running on yours.
Let me explain why we are not comfortable with this: if you had really checked our questions, you would have found that they do not depend on the device tree at all.
The two questions are listed in the first post.
The first question is about the user manual and has nothing to do with the device tree: what does bit 0 of the register at "eqos_base_addr + 0x1000" (0x2491000) mean?
The second question is about the BPMP's behaviour: what does the BPMP processor do after receiving reset message ID 17? We have already confirmed the message ID is 17.
Could you please spend some time on these and answer them directly?
This is just our debug procedure, and we are following it first to collect the necessary information.
I am not able to answer your questions directly right now, but I will pass this information to our internal team, and they will decide whether what you are asking is the right direction for debugging this issue.
For now, we just need you to provide the device tree and the full dmesg. Thank you in advance if you are willing to share them.