I’m considering rewriting some of my ecological modelling code into a CUDA-accelerated format.
However, I’m wondering whether any speed gains would be significant given my setup; some advice would be welcome.
To sketch my setup:
Ecological model drivers usually come in some gridded format (.nc / .netcdf / .hdf) with a z-axis as time and x, y as space. These data are fed into a function (the model), and the output is either one or more vectors of length z, or some summary statistics (fewer than z values). This is a rather embarrassingly parallel problem.
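To make the shape of the problem concrete, here is a minimal sketch with NumPy. Everything here is hypothetical: the drivers `a` and `b` are random stand-ins for gridded netCDF variables, and `model` is a toy function, not my actual model.

```python
import numpy as np

# Hypothetical drivers with shape (z, y, x): time on the z-axis, space on y, x.
# Real data would be read from a netCDF/HDF file (e.g. via netCDF4 or xarray).
z, ny, nx = 365, 4, 5
rng = np.random.default_rng(0)
a = rng.random((z, ny, nx))
b = rng.random((z, ny, nx))

def model(a_ts, b_ts):
    # Toy per-pixel model: takes two time series, returns one of length z.
    return a_ts * b_ts + 0.01 * np.cumsum(a_ts)

# Each (y, x) pixel is independent of its neighbours -> embarrassingly parallel.
out = np.empty_like(a)
for j in range(ny):
    for i in range(nx):
        out[:, j, i] = model(a[:, j, i], b[:, j, i])

# Alternatively, reduce to summary statistics (< z values per pixel).
summary = out.mean(axis=0)
```

The double loop is only there to make the per-pixel independence explicit; in practice it would be vectorised or dispatched to the GPU.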
However, past experience has led me to believe that if the model is rather simple, the bottleneck of this setup is reading/writing the data from/to disk. For example, when doing an uncertainty analysis I see hardly any slowdown from running my model 30 times with different parameters, as long as I don't read the same data in 30 times.
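What I mean by that, in sketch form (again with stand-in data and a toy one-parameter model, not my real code): load the drivers once, then amortise that cost over all parameter draws.

```python
import numpy as np

# One-time (slow) load of the drivers -- random array standing in for a
# chunk read from disk.
rng = np.random.default_rng(0)
data = rng.random((1000, 50, 50))

def model(data, p):
    # Toy model with a single scalar parameter p.
    return (data * p).sum(axis=0)

# 30 parameter draws reuse the in-memory drivers; the extra 29 runs add
# little on top of the one-time read cost.
params = rng.random(30)
results = [model(data, p) for p in params]
```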
My current workflow looks like this:
- read a data chunk (a, b, c, d) into memory -> this is slow!
- execute function to get results ( results = f(a,b,c,d) ) -> this could be GPU based
- write results and free memory (writing this down, I realize I could probably do this asynchronously)
- move to next chunk
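One way I imagine restructuring the loop above is double buffering: prefetch the next chunk on a background thread while computing on the current one, and hand the write off to a background thread as well. A rough sketch, where `read_chunk`, `f`, and `write_chunk` are all placeholders for my real I/O and model code:

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def read_chunk(k):
    # Stand-in for a slow netCDF/HDF read; returns drivers a, b for chunk k.
    return np.full((100, 10, 10), float(k)), np.ones((100, 10, 10))

def f(a, b):
    # Stand-in for the model; this is the part that could run on the GPU.
    return (a * b).mean(axis=0)

def write_chunk(k, result):
    # Stand-in for writing results to disk.
    pass

n_chunks = 4
results = {}
with ThreadPoolExecutor(max_workers=2) as pool:
    next_read = pool.submit(read_chunk, 0)      # start the first read
    pending_write = None
    for k in range(n_chunks):
        a, b = next_read.result()               # wait for the current chunk
        if k + 1 < n_chunks:
            next_read = pool.submit(read_chunk, k + 1)  # prefetch the next one
        res = f(a, b)                           # compute overlaps with the read
        if pending_write is not None:
            pending_write.result()              # make sure last write finished
        pending_write = pool.submit(write_chunk, k, res)
        results[k] = res
    if pending_write is not None:
        pending_write.result()
```

Threads should be enough here since the reads and writes release the GIL inside the I/O libraries; the compute step stays on the main thread (or the GPU) while the pool handles I/O.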
So, before I take the CUDA plunge, I would like to know how to resolve the disk read/write I/O problem and optimize the pipeline so that I can keep feeding the GPU enough data.
Any help would be appreciated.