Bulky File Processing using GPU


I am in the middle of a problem where 8 files of around 125 MB each (about 1 GB in total) are generated in three minutes. Each file contains 250,000 records to be processed. The data is generated by the SWAT program for a climate model I am working on for the Earth Science Department of IUPUI.
I know some GPU programming and how it is used. I also know that GPUs are based on a SIMD architecture and are not well suited to heavy file/IO-based work.
But I am curious: is there any way I can use the GPU's computing power to process these massive files? A normal Core i3 CPU takes around 3 minutes to process a single file, so eight files take 24 minutes. That makes the total time 24 + 3 = 27 minutes, and I don't want to make my users wait 27 minutes.
My idea is to load the file onto the GPU and create thousands of threads, which will process the entire file from thousands of locations. It is essentially a single instruction applied to different data.

Any suggestions would be helpful.


If the processing you are trying to do can run in parallel, I don't see why you shouldn't see a speedup with the GPU.

One thing to check though is how much time is spent processing and how much time is spent reading and writing your files to the hard disk.
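A rough way to check this is to time the reads and the processing separately, chunk by chunk. The sketch below is host-only code that also compiles in a .cu file. The names `process_chunk`, `Timing` and `time_file` are made up for illustration, and the byte sum is only a placeholder for the real record processing.

```cuda
#include <chrono>
#include <cstdio>
#include <vector>

// Hypothetical stand-in for the real record processing: sums the bytes of a chunk.
static unsigned long long process_chunk(const char* data, size_t n) {
    unsigned long long sum = 0;
    for (size_t i = 0; i < n; ++i) sum += static_cast<unsigned char>(data[i]);
    return sum;
}

struct Timing {
    double read_s;                // seconds spent inside fread
    double process_s;             // seconds spent inside process_chunk
    size_t bytes;                 // total bytes read
    unsigned long long checksum;  // result of the placeholder processing
};

// Reads `path` in chunks of `chunk_bytes` and times reading and processing
// separately. Returns false if the file cannot be opened.
static bool time_file(const char* path, size_t chunk_bytes, Timing* out) {
    using Clock = std::chrono::steady_clock;
    FILE* f = std::fopen(path, "rb");
    if (!f) return false;

    std::vector<char> buf(chunk_bytes);
    Timing t = {0.0, 0.0, 0, 0};
    for (;;) {
        auto t0 = Clock::now();
        size_t n = std::fread(buf.data(), 1, buf.size(), f);
        auto t1 = Clock::now();
        t.read_s += std::chrono::duration<double>(t1 - t0).count();
        if (n == 0) break;  // end of file or read error

        t.checksum += process_chunk(buf.data(), n);
        auto t2 = Clock::now();
        t.process_s += std::chrono::duration<double>(t2 - t1).count();
        t.bytes += n;
    }
    std::fclose(f);
    *out = t;
    return true;
}
```

Calling `time_file` on one of the generated files with a chunk size such as `16u << 20` and printing `read_s` next to `process_s` shows which side dominates. A second run on the same file is often served from the operating system's file cache, so the first run is the more honest measure of disk time. If reading dominates, moving the processing to the GPU will not help much.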


Thanks for responding. The problem is that I am not able to transfer the entire file directly to the GPU; I am just a novice CUDA programmer.

I am using cudaMemcpy to copy the file, but the problem is that the host machine can't read the entire file at once, so loading the entire file onto the GPU is difficult.
I tried loading it in chunks, but I could not get cudaMemcpy to load it in chunks.

The only solution I came up with is loading it in pieces: process one piece, then load the next. But I would like to process the file with all of it loaded, for better speedup.

Is there any way to load everything, directly or indirectly?
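
One indirect way to get the whole file onto the GPU, even when the host only reads a piece at a time, is to allocate a single device buffer for the entire file and copy each chunk to its own offset inside it. The sketch below assumes the whole file fits in GPU memory. `upload_file_in_chunks`, `count_newlines` and `count_newlines_on_gpu` are hypothetical names, and counting newlines is only a placeholder for the real record processing.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: counts newline characters as a stand-in for real processing.
// A grid-stride loop lets a fixed number of threads cover a buffer of any size.
__global__ void count_newlines(const char* data, size_t n, unsigned long long* count) {
    size_t i = blockIdx.x * (size_t)blockDim.x + threadIdx.x;
    size_t stride = (size_t)gridDim.x * blockDim.x;
    unsigned long long local = 0;
    for (; i < n; i += stride)
        if (data[i] == '\n') ++local;
    if (local > 0) atomicAdd(count, local);
}

// Reads `path` in pieces of `chunk_bytes` and copies each piece to its own offset
// in one device buffer. Returns the device pointer (free with cudaFree) or nullptr.
static char* upload_file_in_chunks(const char* path, size_t chunk_bytes, size_t* total_out) {
    FILE* f = std::fopen(path, "rb");
    if (!f) return nullptr;
    std::fseek(f, 0, SEEK_END);
    long size = std::ftell(f);  // file size in bytes
    std::fseek(f, 0, SEEK_SET);

    char* h_chunk = nullptr;    // pinned host staging buffer, one chunk long
    char* d_data = nullptr;     // device buffer, whole file long
    bool ok = size > 0 && chunk_bytes > 0
           && cudaMallocHost((void**)&h_chunk, chunk_bytes) == cudaSuccess
           && cudaMalloc((void**)&d_data, (size_t)size) == cudaSuccess;

    size_t offset = 0;
    while (ok && offset < (size_t)size) {
        size_t want = (size_t)size - offset;
        if (want > chunk_bytes) want = chunk_bytes;
        size_t n = std::fread(h_chunk, 1, want, f);
        if (n == 0) { ok = false; break; }  // short file or read error
        // The destination pointer advances by `offset`, so chunks land back to back.
        ok = cudaMemcpy(d_data + offset, h_chunk, n, cudaMemcpyHostToDevice) == cudaSuccess;
        offset += n;
    }
    std::fclose(f);
    if (h_chunk) cudaFreeHost(h_chunk);
    if (!ok) { if (d_data) cudaFree(d_data); return nullptr; }
    *total_out = offset;
    return d_data;
}

// Launches the kernel once over the whole uploaded buffer and returns the count.
static unsigned long long count_newlines_on_gpu(const char* d_data, size_t n) {
    unsigned long long* d_count = nullptr;
    unsigned long long h_count = 0;
    if (cudaMalloc((void**)&d_count, sizeof(h_count)) != cudaSuccess) return 0;
    cudaMemset(d_count, 0, sizeof(h_count));
    count_newlines<<<256, 256>>>(d_data, n, d_count);
    // cudaMemcpy on the default stream waits for the kernel to finish first.
    cudaMemcpy(&h_count, d_count, sizeof(h_count), cudaMemcpyDeviceToHost);
    cudaFree(d_count);
    return h_count;
}
```

With this layout the host never holds more than one chunk, yet the kernel sees the file as one contiguous buffer. The pinned staging buffer from `cudaMallocHost` is optional (an ordinary host buffer also works with `cudaMemcpy`), but it is what would be needed to overlap disk reads with transfers later using `cudaMemcpyAsync` and streams.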