Setting up a GPU Cluster

Hi everyone! I was hoping to get some advice on building a "personal HPC", but I am far from being a tech guru, so some help would be awesome.

I am trying to develop a code for fluid dynamics simulations in Python. I'm doing this on my own, so the idea was to set up a small cluster with 2 computers (8 or 12 cores). While researching how to parallelize a Python code, I came across CUDA. After reading a little and learning that GPUs are being used for scientific research and fluid dynamics simulations, it caught my interest.

I've been looking up some info on the topic, but I have realized that I have an additional problem… I am from Argentina, and here it is literally impossible to get certain models of Nvidia GPUs; the available hardware is kind of limited. I did manage to find a couple of kits for mining bitcoins (an ASRock H110 Pro motherboard, some mid-range GeForce GPUs). So, the actual question is whether it's possible to set up a cluster with this kind of hardware; after all, mining and simulations are both just mathematical operations. And if it is, what recommendations do you have for RAM, processor…

Just to round out the idea: I'm not looking to set up a supercluster, but it would be nice to have the option to start with one GPU and increase the number of GPUs over time (and, obviously, with money).

I don't see why not. For highly scalable codes (like mine, http://mcx.space/gpubench/ ), a cluster of mining rigs would be ideal: relatively low cost with high GPU core density.

I have a GPU server running on a mining motherboard with 12 PCIe x1 slots. I put 12x 1080 Ti cards on it for Monte Carlo simulation and got almost linear scale-up. My code does not need double precision, so consumer-grade GPUs work fine for me. Total cost was about $9000 last year, including all GPUs and 3x 1200 W power supplies.

The issue with your clustered simulation is that you will need low-latency, high-bandwidth links between your nodes. Most likely each of your GPUs will be simulating just a small volume of the larger fluid domain; wherever these volumes touch, you will have to exchange the simulation state between neighboring GPUs before doing the next iterative step.
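To make the ghost-cell exchange concrete, here is a minimal NumPy sketch of the pattern: a 1-D periodic diffusion grid split across two "devices", modeled here with plain arrays on one machine. All names and sizes are illustrative; on a real cluster the two halves would live on different GPUs/nodes, and the exchange lines would become MPI or NCCL transfers over exactly the links being discussed.

```python
import numpy as np

def step(u, dt=0.1):
    """Explicit diffusion update on interior cells; u[0] and u[-1]
    are ghost cells holding the neighbors' edge values."""
    un = u.copy()
    un[1:-1] = u[1:-1] + dt * (u[2:] - 2.0 * u[1:-1] + u[:-2])
    return un

rng = np.random.default_rng(0)
full = rng.random(10)                        # global periodic domain

left = np.zeros(7);  left[1:-1] = full[:5]   # "rank 0": global cells 0..4
right = np.zeros(7); right[1:-1] = full[5:]  # "rank 1": global cells 5..9

for _ in range(200):
    # Halo exchange before every step (periodic wrap-around). In a
    # cluster, these four assignments are the network traffic.
    left[0], left[-1] = right[-2], right[1]
    right[0], right[-1] = left[-2], left[1]
    left, right = step(left), step(right)

# Stitch the interiors back together and compare against the same
# 200 steps run on the undivided global array.
stitched = np.concatenate([left[1:-1], right[1:-1]])
ref = full.copy()
for _ in range(200):
    ref = ref + 0.1 * (np.roll(ref, -1) - 2.0 * ref + np.roll(ref, 1))

print(np.allclose(stitched, ref))  # split run matches the global run
```

The point of the sketch is that the exchange must happen before every single step, which is why the latency and bandwidth of the link between GPUs ends up dominating once the per-GPU compute is fast.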

Professional supercomputer clusters use interconnects like InfiniBand, and internally in the nodes it's high-bandwidth links like the proprietary NVLink or the more powerful NVSwitch networking fabric.

Without going into the super-high price range, you could consider 10 Gigabit Ethernet to interconnect your nodes. But your GPUs within a single compute node might be bottlenecked by having to talk to each other through the PCIe bus.
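A quick back-of-the-envelope estimate shows why the link matters. All the numbers below (subdomain size, fields per cell) are made-up assumptions for illustration, not measurements:

```python
# Estimate per-step halo traffic for one GPU's subdomain versus a
# 10 GbE link. Sizes are illustrative assumptions.

n = 512                   # cells per side of a cubic subdomain (assumed)
bytes_per_cell = 4 * 5    # assume 5 single-precision fields per cell
faces = 6                 # a cube exchanges one ghost layer per face

halo_bytes = faces * n * n * bytes_per_cell   # one-cell-deep ghost layers
link_bytes_per_s = 10e9 / 8                   # 10 GbE ~ 1.25 GB/s

t_exchange = halo_bytes / link_bytes_per_s
print(f"halo per step: {halo_bytes / 1e6:.1f} MB, "
      f"exchange time over 10 GbE: {t_exchange * 1e3:.1f} ms")
```

With these assumed numbers, each step moves roughly 30 MB of ghost cells and spends tens of milliseconds on the wire; if one iteration of the solver itself takes only a few milliseconds on the GPU, the network, not the GPU, sets the pace.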

Another limiting factor may be the fact that consumer GPUs simply do not have enough RAM for larger simulations. A 32 GB memory, like that found in the professional Quadro and Tesla compute cards, would significantly lower the number of required GPUs and would also lower the communication overhead for a simulation of a given size. But it may also be cost-prohibitive (e.g. the Quadro GV100 costs more than 8000 euros where I live).
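To get a feel for what that RAM limit means in grid size, here is a rough sizing sketch. The field count and the overhead factor for solver scratch buffers are assumptions I picked for illustration, not properties of any particular code:

```python
# Rough estimate of the largest cubic single-precision fluid grid
# that fits in a given amount of GPU RAM. The `fields` and `overhead`
# values are assumed for illustration.

def max_cubic_grid(gpu_ram_gb, fields=5, bytes_per_value=4, overhead=1.5):
    """Largest n such that an n^3 grid with `fields` values per cell
    (plus an overhead factor for solver scratch space) fits in RAM."""
    usable = gpu_ram_gb * 1e9 / overhead
    cells = usable / (fields * bytes_per_value)
    return int(cells ** (1 / 3))

# e.g. a typical GeForce, a 1080 Ti, and a 32 GB Quadro/Tesla card:
for ram in (8, 11, 32):
    print(f"{ram:>2} GB -> up to ~{max_cubic_grid(ram)}^3 cells")
```

Under these assumptions, going from an 8 GB consumer card to a 32 GB compute card roughly only adds ~60% to each side of the cube, since memory grows with the cube of the resolution; that is the trade-off behind the "fewer, bigger GPUs" argument.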