I’m getting this error when running Step 5 of the Run NCCL communication test: bash: IP: No such file or directory. The output shows an IPv6 address rather than an IPv4 one, so I suspect the variable is not being set correctly. How can I get the IPv4 address returned?
sytwo@spark-bca5:~/nccl-tests$ ip addr show enp1s0f0np0
ip addr show enp1s0f1np1
3: enp1s0f0np0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
link/ether 4c:bb:47:2d:bc:a6 brd ff:ff:ff:ff:ff:ff
4: enp1s0f1np1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
link/ether 4c:bb:47:2d:bc:a7 brd ff:ff:ff:ff:ff:ff
sytwo@spark-bca5:~/nccl-tests$ # Set network interface environment variables (use your active interface)
export UCX_NET_DEVICES=enp1s0f1np1
export NCCL_SOCKET_IFNAME=enp1s0f1np1
export OMPI_MCA_btl_tcp_if_include=enp1s0f1np1
# Run the all_gather performance test across both nodes
mpirun -np 2 -H <IP for Node 1>:1,<IP for Node 2>:1 \
--mca plm_rsh_agent "ssh -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no" \
-x LD_LIBRARY_PATH=$LD_LIBRARY_PATH \
$HOME/nccl-tests/build/all_gather_perf -b 16G -e 16G -f 2
bash: IP: No such file or directory
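
A possible reading of the error above, offered as a sketch and not a confirmed diagnosis: the mpirun line still contains the literal placeholders <IP for Node 1> and <IP for Node 2>. Bash treats <IP as an input redirection from a file named IP, which would produce exactly "bash: IP: No such file or directory". The snippet below shows one way to pull the IPv4 address of an interface out of ip output and substitute it for the placeholder. The helper name first_ipv4 and the variables NODE1_IP and NODE2_IP are hypothetical, and the sample address is a documentation-only value, not one taken from this system.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of any guide): read `ip -4 -o addr show` output
# on stdin and print the first IPv4 address, without the /prefix length.
first_ipv4() {
    awk '$3 == "inet" { split($4, a, "/"); print a[1]; exit }'
}

# Demo on a sample line in the one-line (-o) format printed by `ip`.
# 192.0.2.10 is a documentation address used purely for illustration.
sample='4: enp1s0f1np1    inet 192.0.2.10/24 brd 192.0.2.255 scope global enp1s0f1np1'
printf '%s\n' "$sample" | first_ipv4   # prints 192.0.2.10

# On the real nodes this might be used as follows (assumes the interface
# actually has an IPv4 address assigned; run the first line on each node):
#   NODE1_IP=$(ip -4 -o addr show dev enp1s0f1np1 | first_ipv4)
#   mpirun -np 2 -H "${NODE1_IP}:1,${NODE2_IP}:1" ...rest of the command as above...
```

If ip -4 addr show enp1s0f1np1 prints nothing, the interface has no IPv4 address assigned yet, and one would have to be configured before the placeholders can be filled in.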