Ethernet Cluster Trouble!

josiase · February 5, 2021, 8:26am

Hi all! Really hope someone can lend a hand here :)

I’m having trouble building a small cluster of Nanos (not connected to the internet). The main problem is the erratic behaviour of the network. I’m using one TP-Link 5-port gigabit switch and two Nanos. (I have also tried another switch and a router.)

So when I create a wired network profile and set manual IPs (I used 10.0.0.11-12 with a network mask of 255.255.255.0), pinging the other device gives erratic results. Sometimes it just works, other times there is 100% packet loss, and this morning it was around 50%. Also, every attempt to ssh into either board fails.
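
As a quick sanity check on the addressing itself, here is a small Python sketch using the standard `ipaddress` module. It only confirms that the two manual addresses above share a subnet under that mask; the helper name `same_subnet` is made up for illustration.

```python
import ipaddress


def same_subnet(addr_a, addr_b, netmask):
    """Return True if both addresses fall in the same network under `netmask`."""
    net_a = ipaddress.ip_interface(f"{addr_a}/{netmask}").network
    net_b = ipaddress.ip_interface(f"{addr_b}/{netmask}").network
    return net_a == net_b


if __name__ == "__main__":
    # The two manual addresses and the mask quoted above.
    print(same_subnet("10.0.0.11", "10.0.0.12", "255.255.255.0"))
```

Both addresses fall in 10.0.0.0/24, so the IP settings by themselves should not explain the dropped pings.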

When I tried the router, I used the automatic network settings. For some reason, though, the boards were assigned the exact same IP! I then also set manual IPs (192.168.8.111-112), to no avail. Additionally, I tried connecting my laptop to the same router/switch and found that pinging from the laptop followed the same pattern of working, not working, and somewhat working, with ssh never working.

If anyone can lend me some thoughts, I would be really grateful!

Josias

linuxdev · February 5, 2021, 6:22pm

It does sound odd. After some errors have occurred, can you show the output from “ifconfig” for each Jetson?

Though it probably wouldn’t be too useful right now, you could also include the output of the “route” command for each Jetson (which may become useful later). It would be especially important to know whether this changes between the manual and automatic IP settings.

josiase · February 15, 2021, 1:03pm

Hi linuxdev, thanks for the reply! Please see attached.

Additionally, sorry for the delay in replying. I was shifted to another project last week, but I’m back on the Nanos this week.

NetworkDetails.txt (3.6 KB)

linuxdev · February 15, 2021, 5:13pm

I see that in all cases eth0 is sending and receiving packets without any issue. The route used by each side is correctly set up as well. Whatever the problem is, it seems that the ethernet itself is not the cause. The exception would be if the “ifconfig” output was captured before the errors with dropped packets occurred. Were those logs taken after the problem showed up, or before? If before, can you provide the same information after there has been a significant error/drop?

Honey_Patouceul · February 15, 2021, 5:31pm

I see in your logs that both of your Nanos show the same MAC address, 00:00:00:00:00:01, for wired ethernet, so you may want to have a look at this topic.
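
For anyone hitting the same symptom, this kind of clash can be spotted mechanically. Below is a minimal Python sketch, assuming you have saved each board's “ifconfig” output as text. The host names and the sample text are made up for illustration (only the 00:00:00:00:00:01 address comes from the logs above), and the regex assumes the usual `ether <mac>` or `HWaddr <mac>` layout.

```python
import re
from collections import defaultdict

# Matches the MAC that follows "ether" (newer net-tools) or "HWaddr" (older net-tools).
MAC_RE = re.compile(r"(?:ether|HWaddr)\s+((?:[0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2})")


def find_duplicate_macs(outputs):
    """Map each MAC seen on more than one host to the sorted list of those hosts.

    `outputs` is a dict of {host_name: ifconfig_text}.
    """
    hosts_by_mac = defaultdict(set)
    for host, text in outputs.items():
        for mac in MAC_RE.findall(text):
            hosts_by_mac[mac.lower()].add(host)
    return {mac: sorted(hosts) for mac, hosts in hosts_by_mac.items() if len(hosts) > 1}


# Hypothetical sample data, not the contents of NetworkDetails.txt.
SAMPLE = {
    "nano-a": (
        "eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500\n"
        "        inet 10.0.0.11  netmask 255.255.255.0\n"
        "        ether 00:00:00:00:00:01  txqueuelen 1000  (Ethernet)\n"
    ),
    "nano-b": (
        "eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500\n"
        "        inet 10.0.0.12  netmask 255.255.255.0\n"
        "        ether 00:00:00:00:00:01  txqueuelen 1000  (Ethernet)\n"
    ),
}

if __name__ == "__main__":
    for mac, hosts in find_duplicate_macs(SAMPLE).items():
        print(f"{mac} is used by: {', '.join(hosts)}")
```

Any MAC reported for more than one host is worth fixing before looking at anything else on the network.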

linuxdev · February 15, 2021, 5:34pm

Good catch!!! Oddly though, I would have expected to see collisions, but that would have been true only if both systems were running at the same time.

Honey_Patouceul · February 15, 2021, 5:36pm

That puzzled me, too… I guess you’ve found a simple explanation!

josiase · February 17, 2021, 6:19am

You guys are legends. It was indeed the MAC address, and the cluster is coming together! I really appreciate the help, @linuxdev and @Honey_Patouceul!