I’m working under a research grant to port a VBA program that models the energies between isomers and a stationary-phase molecule.
The idea is to take the program from VBA to C++ and then to CUDA in order to get results faster. I currently have CUDA set up so that it compiles and runs programs correctly.
The program is fully converted to C++ and I am beginning the move to CUDA, but I have a few questions that I can’t find answers to in the CUDA by Example book or online. I was also hoping for some suggestions on how to convert the program into a parallel solution easily.
The code is designed to run 1 million or more iterations, outputting the calculated energy and the xyz/theta xyz positions. What I want is to run multiple instances of the program at once, while possibly also speeding up each individual run to some extent.
The program is currently under 1000 lines, so for the most part this step should not be very difficult.
Some of these questions may be simple, but I want to be sure I am not confused before beginning.
Question 1. Is there a simple way to have CUDA spawn multiple instances of the program, each running on a separate block so that the whole card (C2050) is used, with variables local to each instance, and print out the results at the end? Basically, I want to take the code already written and, instead of running 1 million iterations once, run say 100 copies at the same time, each doing 1 million iterations.
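To make Question 1 concrete, here is a minimal sketch of what I imagine the launch would look like. It is not my real code: `stepEnergy`, `runInstances` and `launchInstances` are hypothetical names, the energy step is a placeholder random-number update, and I have assumed one thread per instance rather than one block per instance. I am not sure this is the recommended mapping.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical stand-in for one step of the energy model: a small LCG
// update that returns a value in [0, 1). The real calculation would go here.
__device__ float stepEnergy(unsigned int &state)
{
    state = state * 1664525u + 1013904223u;
    return (state >> 8) * (1.0f / 16777216.0f);
}

// Assumption: one thread runs one complete, independent instance.
__global__ void runInstances(float *minEnergy, int numInstances, int iterations)
{
    int instance = blockIdx.x * blockDim.x + threadIdx.x;
    if (instance >= numInstances) return;    // guard the partly filled last block

    unsigned int rng = 12345u + instance;    // per-instance seed
    float best = 1.0e30f;                    // lowest energy seen by this instance
    for (int i = 0; i < iterations; ++i) {
        float e = stepEnergy(rng);
        if (e < best) best = e;
    }
    minEnergy[instance] = best;              // one output slot per instance
}

// Host side: allocate, launch, copy the per-instance results back.
std::vector<float> launchInstances(int numInstances, int iterations)
{
    std::vector<float> host(numInstances, 0.0f);
    float *dev = NULL;
    if (cudaMalloc(&dev, numInstances * sizeof(float)) != cudaSuccess) {
        std::printf("cudaMalloc failed\n");
        return host;
    }
    int threads = 64;
    int blocks = (numInstances + threads - 1) / threads;
    runInstances<<<blocks, threads>>>(dev, numInstances, iterations);
    // cudaMemcpy waits for the kernel on the default stream to finish.
    cudaMemcpy(host.data(), dev, numInstances * sizeof(float),
               cudaMemcpyDeviceToHost);
    cudaFree(dev);
    return host;
}
```

Calling `launchInstances(100, 1000000)` from `main` would then correspond to the "100 programs each running 1 million iterations" case, with the host printing the returned values once the copy has finished.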
Question 2. If multiple instances of the same code run in parallel, how do I keep variables from being overwritten, for example the counter that tracks the current iteration? Is the only solution to use arrays so that each instance stores its values in a different memory location?
Any other suggestions would also be great to hear. As I said, I’m very new to parallel programming, but I have no problem picking up new ideas or ways of working through code.
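For Question 2, my current understanding, which I would like confirmed, is that a variable declared inside a kernel is private to each thread, so a loop counter would not need an array; only data in global or `__shared__` memory is visible to other threads. The sketch below is how I would test that. `countIterations` and `runCounters` are hypothetical names, and the differing loop lengths are only there to show that the counters do not interfere.

```cuda
#include <vector>
#include <cuda_runtime.h>

__global__ void countIterations(int *finalCount, int numInstances)
{
    int instance = blockIdx.x * blockDim.x + threadIdx.x;
    if (instance >= numInstances) return;

    int counter = 0;                    // local variable: one copy per thread
    int limit = 10 * (instance + 1);    // each instance loops a different number of times
    while (counter < limit) {
        ++counter;
    }
    // Only the result that must leave the thread goes into an array,
    // indexed by the instance number so that no two threads share a slot.
    finalCount[instance] = counter;
}

std::vector<int> runCounters(int numInstances)
{
    std::vector<int> host(numInstances, -1);
    int *dev = NULL;
    if (cudaMalloc(&dev, numInstances * sizeof(int)) != cudaSuccess) {
        return host;                    // entries stay at -1 on failure
    }
    int threads = 32;
    int blocks = (numInstances + threads - 1) / threads;
    countIterations<<<blocks, threads>>>(dev, numInstances);
    cudaMemcpy(host.data(), dev, numInstances * sizeof(int),
               cudaMemcpyDeviceToHost);
    cudaFree(dev);
    return host;
}
```

If the understanding above is right, `runCounters(100)` should return 10, 20, 30 and so on, one value per instance, even though every thread used a variable with the same name.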
Thanks for reading!