Time usage of cudaMalloc

Hi!

I played a bit with CUDA, and several time measurements show that cudaMalloc takes about 60 ms, independent of the size (I tried 5,000 bytes, 50,000 bytes and 5,000,000 bytes). This means that a vector addition of two vectors (float a[xxx] + float b[xxx]) is always slower with CUDA. Is that right? Without CUDA, the CPU solves the problem in less than 60 ms.

Is there any alternative that improves the duration of the allocation? Or a special technique to avoid cudaMalloc?

Please help me! It’s very important for me.

Thanks in advance!

Do more things on the GPU, so that you don’t get choked by cudaMalloc (and probably cudaMemcpy as well). Adding two vectors together is trivial, so the extra overhead of the mallocs, frees and copies will eat up any speed gain offered by the GPU. The ultimate goal is to place the data on the GPU at the start of the program and collect the results from it at the end. Not always possible, but it’s where you want to be going.
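To illustrate the point, here is a minimal sketch of that pattern: the cudaMalloc and cudaMemcpy costs are paid once, and the same device buffers can then be reused for as many kernel launches as the program needs. The kernel name vecAdd and the size N are made up for the example, not taken from the original post.

```cuda
#include <cuda_runtime.h>
#include <stdio.h>
#include <stdlib.h>

// Trivial element-wise addition kernel.
__global__ void vecAdd(const float *a, const float *b, float *c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];
}

int main(void) {
    const int N = 1 << 20;
    const size_t bytes = N * sizeof(float);

    float *h_a = (float *)malloc(bytes);
    float *h_b = (float *)malloc(bytes);
    float *h_c = (float *)malloc(bytes);
    for (int i = 0; i < N; ++i) { h_a[i] = (float)i; h_b[i] = 2.0f * i; }

    // Pay the cudaMalloc cost once, at the start of the program.
    float *d_a, *d_b, *d_c;
    cudaMalloc(&d_a, bytes);
    cudaMalloc(&d_b, bytes);
    cudaMalloc(&d_c, bytes);

    // One upload at the start...
    cudaMemcpy(d_a, h_a, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_b, h_b, bytes, cudaMemcpyHostToDevice);

    // ...then reuse the same device buffers for every launch.
    const int threads = 256;
    const int blocks = (N + threads - 1) / threads;
    vecAdd<<<blocks, threads>>>(d_a, d_b, d_c, N);

    // One download at the end.
    cudaMemcpy(h_c, d_c, bytes, cudaMemcpyDeviceToHost);
    printf("c[10] = %f\n", h_c[10]);

    cudaFree(d_a); cudaFree(d_b); cudaFree(d_c);
    free(h_a); free(h_b); free(h_c);
    return 0;
}
```

If the program launches the kernel in a loop, only the launch itself sits inside the loop; allocation and transfer stay outside it.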

Hi!

Thank you for this answer! But this leads me to another question: is there any way to load file data DIRECTLY into a __global__ or __device__ function? I don’t want to use cudaMemcpy or cudaMalloc more than required. I can’t find any advice in the documentation.

Thanks in advance!

One of the recent nVidia seminars talks about “zero copy”. It supposedly allows the GPU to read from the CPU memory - provided it is page-locked. If you try it, please let me know.
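For reference, a hedged sketch of what zero-copy looks like with the runtime API, assuming a device that supports mapped host memory (you should check the canMapHostMemory device property first). The kernel name scale and the size N are illustrative.

```cuda
#include <cuda_runtime.h>

// The GPU reads and writes the page-locked host buffer directly,
// so no cudaMalloc or cudaMemcpy is needed for this data.
__global__ void scale(float *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2.0f;
}

int main(void) {
    const int N = 1024;

    // Must be set before any CUDA context is created.
    cudaSetDeviceFlags(cudaDeviceMapHost);

    // Page-locked, mapped host allocation.
    float *h_data;
    cudaHostAlloc(&h_data, N * sizeof(float), cudaHostAllocMapped);
    for (int i = 0; i < N; ++i) h_data[i] = 1.0f;

    // Get the device-side alias of the host pointer.
    float *d_ptr;
    cudaHostGetDevicePointer(&d_ptr, h_data, 0);

    scale<<<(N + 255) / 256, 256>>>(d_ptr, N);
    cudaDeviceSynchronize();  // kernel writes land in host memory

    cudaFreeHost(h_data);
    return 0;
}
```

Note that every access from the kernel goes over the PCIe bus, so this only pays off when the data is touched once or the transfer would otherwise dominate anyway.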

No (although the above mention of zero-copy could be a way around this). However, I think you need to describe what you’re really trying to do, so people can help you more easily. If shifting data from the disc to the GPU is your bottleneck, then you could probably dump your GPU and perform all computations using a 486 with minimal performance impact. I suspect that this isn’t the case, so… what problem are you really trying to solve?