I have three Jetson Nano 4GB A02 boards, each flashed with the 4.6.1 (2022/02/23)
SD-card image. I set up static IP addresses and installed Open MPI; ssh, scp, and
ping all work between the systems. However, when I run the CUDA simpleMPI
example program, the second node (192.168.1.52) reports:
mpiexec --hostfile …/clusterfile ./simpleMPI
Running on 12 nodes
Open MPI detected an inbound MPI TCP connection request from a peer
that appears to be part of this MPI job (i.e., it identified itself as
part of this Open MPI job), but it is from an IP address that is
unexpected. This is highly unusual.
The inbound connection has been dropped, and the peer should simply
try again with a different IP interface (i.e., the job should
hopefully be able to continue).
Local host: nano2
Local PID: 6963
Peer hostname: nano2 ([[52067,1],4])
Source IP of socket: 192.168.1.52
Known IPs of peer:
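For context, the clusterfile follows the standard Open MPI hostfile format (one node per line, with a slot count). The slot counts below are an assumption on my part, based on "Running on 12 nodes" with three 4-core Nanos:

```
# Open MPI hostfile: one node per line; slots = processes to launch there.
nano1 slots=4
nano2 slots=4
nano3 slots=4
```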
/etc/hosts on that system:
127.0.0.1 localhost
127.0.1.1 nano2
192.168.1.52 nano2
# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
192.168.1.51 nano1
192.168.1.53 nano3
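To rule out name-resolution problems, this is the kind of check I can run on each node (a sketch; the `btl_tcp_if_include` MCA parameter is a standard Open MPI option, not something from my original run):

```shell
# What does this node's own hostname resolve to?  On nano2, if this
# prints 127.0.1.1 rather than 192.168.1.52, Open MPI may end up
# advertising the loopback address to its peers.
hostname
getent hosts "$(hostname)" || echo "hostname not found in /etc/hosts or DNS"

# Example (assumption): pin Open MPI's TCP transport to the LAN subnet
# with the standard btl_tcp_if_include MCA parameter:
#   mpiexec --mca btl_tcp_if_include 192.168.1.0/24 \
#           --hostfile …/clusterfile ./simpleMPI
```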
Is there something I’m missing in the network setup?