How to Build a GPU-Accelerated Research Cluster

Originally published at:

Some of the fastest computers in the world are cluster computers. A cluster is a computer system comprising two or more computers (“nodes”) connected with a high-speed network. Cluster computers can achieve higher availability, reliability, and scalability than is possible with an individual computer. With the increasing adoption of GPUs in high performance computing (HPC), NVIDIA GPUs…

Hi, I am Medha, a Ph.D. student.

I work in the area of publish/subscribe distributed systems. I am interested in building a GPU-accelerated research cluster for my research on the design of high-performance pub/sub using MPI and CUDA. Can you give specifications for the infrastructure to purchase, such as the nodes or GPUs? I would also like to discuss my research area with you. Can you help?

Thanks for your valuable post.

Thanks for your interest in building a research cluster. The basic inputs for choosing the nodes (workstation or server) and GPUs are given in point 1 of my blog post above. You can either buy any standard OEM machine or assemble a machine that fulfills the specs given. Please let me know if you have any specific questions about choosing the hardware; I will be happy to answer. Please also let me know about your research area and points of discussion; I would be happy to discuss them further.

Thanks for the reply and the interest shown. I am working in the area of publish/subscribe systems, where publishers publish content and subscribers subscribe to the things that interest them. An example is stock trading, where a subscriber can subscribe to any stock and be notified when certain conditions are satisfied. Matching subscriptions against publications is done by a matching algorithm. I am trying to port this pub/sub system to an HPC platform, and I want to use hybrid parallelism with MPI and CUDA. My idea is that one node will do the clustering and send the subscriptions, according to the clusters formed, to individual worker nodes. Every worker node will have a CUDA card, and matching will be done on the GPU. As publications arrive, the node that does the clustering will approximately choose the node where the matching subscriptions can be found. Once this cluster is built, I can measure latency, bandwidth, MPI communication bandwidth, etc.
Now my questions are:

No one has ported a pub/sub system to MPI and CUDA yet; I haven't found any IEEE paper on it. Can I go ahead with this idea of forming a research cluster and deploying a pub/sub system on it, or is my concept itself wrong?
I am pursuing a Ph.D., and my work is to make pub/sub systems parallel and scalable by using HPC.
I have implemented a CUDA content-matching algorithm, and the results are promising. Now I want to make it distributed with a combination of MPI and CUDA.
I also want to test this system on Hadoop and Storm, which is an event processing system, and then conclude which architecture is suitable for a pub/sub system.
Please guide me regarding this. Thanks for everything.
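
The dispatch scheme described above (one node clusters the subscriptions and routes each publication to the worker that holds the relevant cluster) can be sketched in plain Python. This is only an illustration under stated assumptions: the workers are ordinary objects standing in for MPI ranks, the matching loop stands in for the CUDA kernel, and clustering by a hash of the stock symbol is a hypothetical choice, as are all the names (`Subscription`, `Worker`, `Dispatcher`).

```python
from dataclasses import dataclass, field
import zlib


@dataclass(frozen=True)
class Subscription:
    subscriber: str
    symbol: str        # stock the subscriber follows
    min_price: float   # notify when the published price is >= min_price


@dataclass
class Worker:
    """Stands in for one MPI rank with a GPU; matching here is a CPU loop."""
    subscriptions: list = field(default_factory=list)

    def match(self, symbol, price):
        # On a real cluster this loop would be the CUDA matching kernel.
        return [s.subscriber for s in self.subscriptions
                if s.symbol == symbol and price >= s.min_price]


class Dispatcher:
    """Stands in for the clustering node (for example, MPI rank 0)."""

    def __init__(self, num_workers):
        self.workers = [Worker() for _ in range(num_workers)]

    def _cluster_of(self, symbol):
        # Hypothetical clustering rule: a stable hash of the symbol.
        return zlib.crc32(symbol.encode()) % len(self.workers)

    def subscribe(self, sub):
        self.workers[self._cluster_of(sub.symbol)].subscriptions.append(sub)

    def publish(self, symbol, price):
        # Only the worker holding this symbol's cluster is contacted.
        return self.workers[self._cluster_of(symbol)].match(symbol, price)


dispatcher = Dispatcher(num_workers=4)
dispatcher.subscribe(Subscription("alice", "NVDA", 100.0))
dispatcher.subscribe(Subscription("bob", "NVDA", 150.0))
dispatcher.subscribe(Subscription("carol", "IBM", 50.0))
matched = dispatcher.publish("NVDA", 120.0)
print(matched)
```

In a real deployment the `subscribe` and `publish` calls would become MPI messages from the dispatcher rank to the worker ranks, which is where the latency and bandwidth measurements mentioned above would come in.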


Please drop me an email; we can discuss your research work in detail.

Will this cluster provide any acceleration for molecular dynamics (or docking) software (Amber, Schrödinger, MOE, etc.)?

If your application already performs better with GPUs and also scales well across nodes, a cluster can give you good acceleration.
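
One rough way to reason about "scales well" is Amdahl's law: if a fraction p of the runtime is sped up by a factor s, the overall speedup is 1 / ((1 - p) + p / s). The sketch below applies it with a hypothetical parallel fraction of 0.9; that number is an assumption for illustration, not a measurement of Amber or any other package.

```python
def amdahl_speedup(parallel_fraction, factor):
    """Overall speedup when `parallel_fraction` of the runtime is sped up by `factor`."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if factor <= 0:
        raise ValueError("factor must be positive")
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / factor)


# Hypothetical application: 90% of the runtime scales with the node count.
for nodes in (1, 2, 4, 8):
    print(nodes, "nodes ->", round(amdahl_speedup(0.9, nodes), 2), "x")
```

The serial 10% limits the gain no matter how many nodes are added, so it is worth measuring how your own application scales before buying hardware.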

I want to build a low-cost GPU + CPU cluster, and I am very confused about selecting the right board. Can anyone help?

Hi, Hung from Hong Kong.
Teacher in a middle school.

I find that the link has been removed. Can you tell me where I can watch the video recording and slides for your talk?



Please see recording - search at http://on-demand-gtc.gputec...

Search for GTC 2013 and the talk title; it will take you to a page that shows this talk and has the recording link.

Slides are at http://on-demand.gputechcon...

Got it.
Thanks for your kindness.

Chun Hung

On Friday, August 7, 2015, at 8:49 AM China Standard Time, Disqus wrote:

Can I use a GeForce GTX card instead of a Tesla?

I am Karishma Bansole. I am doing my M.Tech., and my dissertation work is in parallel computing. I need to set up an MPI-GPU cluster.
Until now I have made a Rocks cluster of one node. Now I want to add the CUDA roll to the Rocks cluster. How should I do that? Also, I don't have InfiniBand, so I would like to know how to set up an MPI-GPU cluster without InfiniBand. Or is InfiniBand required for making an MPI-GPU cluster?

Could you please email us about your requirements? We will get back to you.

Multiple GPUs on a single node (e.g., 4-in-one or 8-in-one)
One or two GPUs per node
What is the trade-off? Where does it actually make a difference?
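
To make the two layouts concrete, here is a small sketch of how MPI ranks are commonly mapped to GPUs when there is one rank per GPU and ranks are placed node by node. The block placement and the names `gpu_for_rank` and `same_node` are assumptions for illustration; a real job would pass the local GPU index to `cudaSetDevice`. Ranks that land on the same node can exchange data without crossing the network, while ranks on different nodes cannot, which is where the interconnect starts to matter.

```python
def gpu_for_rank(rank, gpus_per_node):
    """Return (node index, local GPU index) for a rank under block placement."""
    if rank < 0 or gpus_per_node < 1:
        raise ValueError("rank must be >= 0 and gpus_per_node >= 1")
    return divmod(rank, gpus_per_node)


def same_node(rank_a, rank_b, gpus_per_node):
    """True when both ranks share a node, so no network hop is needed."""
    return (gpu_for_rank(rank_a, gpus_per_node)[0]
            == gpu_for_rank(rank_b, gpus_per_node)[0])


# Eight GPUs in total: two 4-GPU nodes versus eight 1-GPU nodes.
for gpus_per_node in (4, 1):
    layout = [gpu_for_rank(rank, gpus_per_node) for rank in range(8)]
    print(gpus_per_node, "GPU(s) per node:", layout)
```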

Hi Pradeep,
Any thoughts on how to install a computing cluster for MATLAB distributed computing?