Has anyone considered clustering several Nanos together, like people do with Raspberry Pis? I’m wondering if there are any applications that would benefit, or even if it would provide a good learning experience for HPC.
Sure, you could leverage the integrated CUDA GPU (up to 472 GFLOPS of FP16 compute) for low-power workloads. Jetson comes with the same NVIDIA libraries that are available on the larger discrete HPC GPUs, so applications are portable.
Would recommend looking into a distributed multiprocessing framework like OpenMPI to help manage it if you plan to have lots of nodes. The primary interconnect would likely be Gigabit Ethernet.
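As a rough illustration of how an MPI-style job divides work across nodes, here is a minimal sketch in plain Python of a block decomposition, where each rank (process) gets a contiguous slice of the work items. The function name `block_range` and the node and item counts are made up for illustration. In a real OpenMPI job the rank and size would come from the MPI runtime; they are just loop variables here so the snippet runs on any machine.

```python
def block_range(rank, size, n_items):
    """Return the half-open range [start, stop) of items owned by `rank`.

    Items are split as evenly as possible; the first `n_items % size`
    ranks each take one extra item.
    """
    if not 0 <= rank < size:
        raise ValueError("rank must be in [0, size)")
    base, extra = divmod(n_items, size)
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


if __name__ == "__main__":
    # Hypothetical cluster: 4 Nanos sharing 10 work items.
    size, n_items = 4, 10
    for rank in range(size):
        start, stop = block_range(rank, size, n_items)
        print(f"rank {rank}: items {start}..{stop - 1}")
```

Each node then works only on its own slice, and results are gathered over the network at the end, which is where the Gigabit Ethernet interconnect becomes the limiting factor.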
Along these lines, have a question about hardware options for the Nano.
Approach 1 - use the Nano developer board and have some mechanism for holding the boards, power connections, Ethernet connections, etc…much like you can find for the Raspberry Pi.
Approach 2 - find, buy, or build a chassis that interfaces with the Nano module itself as part of a multi-module chassis. Is anyone aware of any 3rd-party chassis, available or in development, that can house one or more Nano modules in a single, high-density form factor? Not sure it would be worth the effort, but I can see a nice design where the chassis takes modules as plug-in elements: each module has its own Ethernet port and maybe a single USB port, but they all share a common power supply and cooling fan. PXE booting would make it complete ;-)
< and ignore the fact the dev boards are actually cheaper ;-) >
Have a look at the carrier boards for NXP Colibri modules. I can’t find the module pin interface, but they have multi-module boards and support Nvidia modules (it’s unclear if they mean their own modules with Nvidia chips, or Nvidia modules, or both). Might be worth a gander.
Toradex has some older Tegra modules in DIMM format (a rather nice layout, though it lacks enough pins for newer Tegra). There is a Tegra 3 version, a TK1 version, and several other non-NVIDIA modules with the same layout capable of using similar carrier boards (each version would need its own device tree, but much of their hardware is more or less interchangeable). Along with this they have design docs for carrier boards. Mostly this is incompatible with a Nano or the newer module format Tegra products. NXP happens to be one of the popular embedded solutions available in that same DIMM format, but is unrelated to Tegra.
ConnectTech has something like this for TX1/TX2, their 24-module array server:
You can prototype the architecture using the devkits and wired switches, and then integrate it into a rack system to reduce the size.
What would be a good way to power a cluster like this? I am aware that each Nano would need 5V⎓4A, but how can I distribute that to multiple Nanos from one single power supply, like a desktop PSU for example? What exactly would I need?
Thinking of making a cluster made of Nanos here too.
Each Nano does NOT always need 5V⎓4A, according to JetsonHacks. You can also use the 5V⎓2A power option if you are powering the module only, without any peripherals like a mouse, keyboard, or display.
Currently, I am considering building a cluster with 1 master Nano node and the rest being slave Nano nodes. I will use the master node to control and monitor the cluster performance, so it will need peripherals like a mouse, keyboard, display, etc. attached, and hence will require a 5V⎓4A power supply. The slave nodes will each need 5V⎓2A over micro-USB, so I plan to get a multi-port USB hub rated for 10A output. One I found is this product by Anker, and I am currently checking whether it will be suitable. If anyone can suggest other products or has opinions on the Anker product I listed here, please let me know :)
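For anyone sizing a supply for a setup like this, the arithmetic can be sketched in a few lines of Python. The 4A and 2A per-node figures come from the posts above; the function names and the 80% headroom factor are my own assumptions (a common rule of thumb, not a figure from NVIDIA or any hub vendor), so treat this as a rough estimate only.

```python
def cluster_power(n_workers, master_amps=4.0, worker_amps=2.0, volts=5.0):
    """Return (total_amps, total_watts) for one master plus n_workers nodes."""
    total_amps = master_amps + n_workers * worker_amps
    return total_amps, total_amps * volts


def max_workers(supply_amps, master_amps=4.0, worker_amps=2.0, headroom=0.8):
    """How many workers fit on one 5V supply.

    `headroom` is an assumed derating factor: only that fraction of the
    rated current is budgeted. Set master_amps=0 if the master node has
    its own separate supply.
    """
    usable = supply_amps * headroom - master_amps
    return max(0, int(usable // worker_amps))


if __name__ == "__main__":
    amps, watts = cluster_power(n_workers=4)
    print(f"1 master + 4 workers: {amps} A, {watts} W at 5 V")
    # A 10 A hub powering workers only (master on its own 4 A supply):
    print("workers on a 10 A hub:", max_workers(10.0, master_amps=0.0))
```

With these assumptions, a 10A hub feeding only the 2A nodes would be budgeted for 4 of them, not 5, which leaves some margin for load spikes.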
@ascala: you’ve probably long since solved this, but in case you haven’t, we’d highly recommend PoE as a method of powering your cluster. With only 5V required for each node, you can easily get by with a PoE switch attached to your cluster that both networks and powers each unit. This is one we’ve used lately for the same thing with a custom Pi cluster: https://www.amazon.com/gp/product/B076HZFY3F/ref=ppx_yo_dt_b_asin_title_o03_s02?ie=UTF8&psc=1
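A quick way to sanity-check the PoE route is to compare each node's draw against what the PoE standard guarantees at the device end of the cable. The sketch below uses the standard per-port figures for IEEE 802.3af (12.95 W at the device) and 802.3at (25.5 W at the device). It assumes each node is fed through a PoE splitter or similar converter that steps the voltage down to 5V, and the 85% conversion efficiency is a guess for illustration, not a measured value for any particular product. The helper name `fits_poe` is hypothetical.

```python
# Power guaranteed at the powered device by each PoE standard, in watts.
POE_PD_WATTS = {
    "802.3af": 12.95,
    "802.3at": 25.5,
}


def fits_poe(load_volts, load_amps, standard, splitter_efficiency=0.85):
    """Return True if the load fits within one port of the given standard.

    `splitter_efficiency` is an assumed conversion efficiency for the
    PoE-to-5V splitter; check the datasheet of the actual part.
    """
    load_watts = load_volts * load_amps
    needed_watts = load_watts / splitter_efficiency
    return needed_watts <= POE_PD_WATTS[standard]


if __name__ == "__main__":
    for amps in (2.0, 4.0):
        for standard in POE_PD_WATTS:
            ok = fits_poe(5.0, amps, standard)
            print(f"5 V / {amps} A on {standard}: {'ok' if ok else 'too much'}")
```

Under these assumptions a 5V⎓2A node fits on an 802.3af port, while a 5V⎓4A node needs 802.3at. The switch's total PoE budget across all ports is a separate limit to check.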
So, I actually have a 2-node Nano cluster running Slurm 19. I am having issues getting the gres.conf to recognize the CPUs. The Python code for TensorFlow seems to work OK. Any thoughts?