We are going to buy new deep learning hardware. To accommodate multiple users, the ideal setup would be 4 individual machines, each with an A100 GPU. When we occasionally need to train a very large model, we would like to use fast InfiniBand for distributed training (if this is possible between 4 desktops).
Thank you for posting your inquiry on the NVIDIA Networking Community.
Based on the information provided, we see no issues facilitating 50GbE (Ethernet) or 100Gb/s (InfiniBand) on this setup, as long as the system board provides PCIe Gen3 x16 slots to house the adapters.
For the overall solution, we recommend opening a support ticket with NVIDIA Enterprise Support so the Sales team can validate the configuration against your requirements. The most important aspect for desktop workstations is cooling for the components.
The majority of our certifications are done only on server-grade platforms.
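As an illustration only (not an officially validated configuration), a four-node PyTorch job over InfiniBand with NCCL could be launched roughly as sketched below. The hostname `node0`, the adapter name `mlx5_0`, and the script `train.py` are placeholders for this example:

```shell
# Hedged sketch: 4 nodes, 1 A100 each, NCCL traffic over the InfiniBand adapter.
# "node0", "mlx5_0", and "train.py" are placeholder names, not validated values.
export NCCL_IB_HCA=mlx5_0          # tell NCCL which HCA to use for IB traffic
torchrun --nnodes=4 --nproc_per_node=1 \
  --rdzv_backend=c10d --rdzv_endpoint=node0:29500 \
  train.py                         # your DistributedDataParallel training script
```

The same command would be run on each of the 4 machines, with `node0` reachable from all of them.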
You can send an email to firstname.lastname@example.org
Thank you and regards,
~NVIDIA Networking Technical Support