Hey guys, I’m currently working on enhancing an existing C++ library my team has put together called GSAP (Generic Software Architecture for Prognostics, https://github.com/nasa/GSAP/).
I won’t get too much into the details of the library itself, but most of our prognostic problems have a model associated with them that we use in Monte Carlo simulation. In an attempt to produce more accurate results and also speed up the GSAP models, I’m looking to use CUDA to run the particles from the simulation in parallel. So far I’ve had success porting one of our models over to CUDA, with fantastic performance results.
The problem lies in the need to create a separate module for each prognostic model. This effectively doubles the work each time a model is updated or a new one is created, since each model now has both a GPU and a CPU version. My group has collectively decided we’re better off trying to find a way around this, ideally keeping our library to a single module per model by somehow combining the GPU and CPU models.
I’ve done a bit of looking around on this topic, and the only closely related thing I could find was this article: http://alanpryorjr.com/2017-02-11-Flexible-CUDA/
I was able to extend this a bit, using a bash script to determine whether the machine compilation takes place on has a CUDA-enabled device and pointing to different makefile targets, where the GPU target uses compiler directives to select the proper model version. Unfortunately, while this solves the problem of dynamically deciding which model to use (GPU if possible, otherwise default to CPU), it does not solve the issue of maintaining multiple modules for each model.
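In case it helps frame the question, the detection step I’m describing looks roughly like this (a sketch, not my actual script; the target names “gpu” and “cpu” are placeholders):

```shell
#!/usr/bin/env bash
# Pick a makefile target based on whether the CUDA toolchain is
# available on this build machine. Checking for nvcc is a stand-in
# for whatever device detection you prefer (e.g. nvidia-smi).
if command -v nvcc >/dev/null 2>&1; then
    TARGET=gpu
else
    TARGET=cpu
fi
echo "Selected build target: $TARGET"
# make "$TARGET"   # uncomment to actually invoke the chosen target
```

The GPU target then passes something like `-DUSE_CUDA` so the sources can branch on a preprocessor directive; the CPU target omits it.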
My question to you guys: is this an existing issue that others have had to work around, and if so, what is the recommended approach? Hopefully I’ve described what I’m looking to do well enough above, but if you’d like me to expand on anything, please let me know.