cudaFree does not seem to work!

Hello all,

I am developing an application in CUDA C. I need to manage a large amount of data, so I decided to free memory on each iteration of my main for loop: on each iteration I allocate memory, run the kernel, copy the result back to host memory, and then free the device memory. But not all of the memory is freed, and after several iterations the program crashes because memory runs out. I do not know what is happening or why this behavior occurs…
I'm using CUDA 6 and Unified Memory. If somebody can help me and needs more information, just let me know.

Thank you in advance.


How do you know not all memory is free? And what constitutes “memory crash at several iterations”?

Post some source code if you want answers ;)

Well, I am checking the allocated memory with these instructions:

cudaMemGetInfo(&allocMem, &totalMem);
printf("Memory available: Free: %lu, Total: %lu, Consumed: %lu\n", allocMem, totalMem, (totalMem - allocMem));
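For reference, the first argument of cudaMemGetInfo receives the number of free bytes on the device, so consumed memory is total minus free. A minimal self-contained sketch of this kind of check (the helper name printMemUsage is mine, not from the post):

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Print current device memory usage. Helper name is illustrative.
static void printMemUsage(const char *tag)
{
    size_t freeBytes = 0, totalBytes = 0;
    cudaMemGetInfo(&freeBytes, &totalBytes);
    printf("[%s] free: %zu, total: %zu, consumed: %zu\n",
           tag, freeBytes, totalBytes, totalBytes - freeBytes);
}

int main()
{
    printMemUsage("before alloc");

    float *p = nullptr;
    cudaMalloc(&p, 100 * sizeof(float));
    printMemUsage("after alloc");   // consumed should grow by at least the allocation granularity

    cudaFree(p);
    printMemUsage("after free");
    return 0;
}
```

Note that the driver allocates in granules, so consumed memory may grow by more than the requested size; comparing before/after values across iterations is still a reasonable way to spot a leak.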

I think the problem is the unified memory

I am doing something like this:

for(int order = 1; order <= MAX_ORDER; order++){

    cudaMallocManaged(&array1, sizeof(mirror));

    Kernel1<<<blocksPerGrid, threadsPerBlock>>>(array1, array2, w, v, order);

    cudaMallocManaged(&array2, sizeof(mirror));

    //copy the array1 into array2

    Kernel2<<<blocks, threads>>>(array2, w, v);
    ...
}


mirror is one of my structures…

and the problem is that if MAX_ORDER is too big, then the program crashes because memory is full. But it shouldn't, because I am freeing some memory on each iteration (there is enough memory).
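For comparison, one pattern that keeps per-iteration memory bounded is to allocate, synchronize, and then free inside each iteration. This is only a sketch of that pattern, not the poster's actual code: the names mirror, Kernel1, Kernel2, array1, and array2 come from the post, while the kernel bodies, error handling, and launch parameters are placeholders of mine.

```cuda
#include <cuda_runtime.h>

struct mirror { int data[64]; };  // placeholder fields; the real structure is not shown in the post

__global__ void Kernel1(mirror *out, const mirror *in, float w, float v, int order) { /* ... */ }
__global__ void Kernel2(mirror *buf, float w, float v) { /* ... */ }

void run(int MAX_ORDER, dim3 blocks, dim3 threads, float w, float v)
{
    mirror *array1 = nullptr, *array2 = nullptr;
    cudaMallocManaged(&array2, sizeof(mirror));  // allocated once, before the loop

    for (int order = 1; order <= MAX_ORDER; order++) {
        cudaMallocManaged(&array1, sizeof(mirror));

        Kernel1<<<blocks, threads>>>(array1, array2, w, v, order);
        cudaDeviceSynchronize();   // finish the kernel before the host touches managed memory

        *array2 = *array1;         // copy array1 into array2 on the host

        Kernel2<<<blocks, threads>>>(array2, w, v);
        cudaDeviceSynchronize();   // make sure no kernel still references array1

        cudaFree(array1);          // release this iteration's allocation
        array1 = nullptr;
    }
    cudaFree(array2);
}
```

The synchronize-before-free steps matter with CUDA 6 unified memory: freeing or touching a managed allocation on the host while a kernel may still be using it is undefined behavior.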

Is this the code you have used?

Your ordering seems out of place: you seemingly reference array2 before having allocated it; you seem to free array2 before completing the copy; and you seem to reference array2 again after having freed it…?

array2 was allocated in unified memory before the start of the for loop. The code works, believe me…

if “the code works”, then why the post…?

I have not really used unified memory much, I admit; so are you saying that, with unified memory, it is possible to call cudaFree() on allocated memory and still reference the same memory thereafter? And that it is possible to reference memory not yet allocated?
I suppose I am missing something here

I made some mistakes when I posted the code. Yes, yes, and no… you have to allocate the memory beforehand. I just rewrote the code using "pinned memory" and it works; now I can do many more iterations…
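For readers following along: a pinned-memory version of the per-iteration pattern replaces unified memory with explicit page-locked host buffers and device-to-host copies. A minimal sketch under that assumption (buffer names and sizes are mine):

```cuda
#include <cuda_runtime.h>

void iteration(size_t bytes)
{
    char *hostBuf = nullptr, *devBuf = nullptr;

    cudaMallocHost(&hostBuf, bytes);  // pinned (page-locked) host memory
    cudaMalloc(&devBuf, bytes);       // ordinary device memory

    // ... launch kernels that write to devBuf ...

    // Copy results back; pinned host memory enables faster, async-capable transfers
    cudaMemcpy(hostBuf, devBuf, bytes, cudaMemcpyDeviceToHost);

    cudaFree(devBuf);
    cudaFreeHost(hostBuf);            // pinned memory has its own free call
}
```

One caveat: pinned memory is a limited resource (it cannot be paged out), so allocating and freeing it every iteration is best kept to modest sizes, or hoisted out of the loop entirely.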

The question is, how does cudaFree() work with Unified Memory? Is it synchronous or not? I think there are still many things to clarify about unified memory.
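For what it's worth, the key rule with CUDA 6 unified memory is that the host must not touch a managed allocation, and should not free one that a kernel may still be using, until in-flight kernels have completed; cudaDeviceSynchronize() before host access or cudaFree() enforces this. A small sketch of the safe ordering (kernel and variable names are mine):

```cuda
#include <cuda_runtime.h>

__global__ void useBuffer(int *buf) { buf[threadIdx.x] = threadIdx.x; }

int main()
{
    int *buf = nullptr;
    cudaMallocManaged(&buf, 32 * sizeof(int));

    useBuffer<<<1, 32>>>(buf);
    cudaDeviceSynchronize();  // required before the host reads buf, and before freeing
                              // memory the kernel may still be writing

    int first = buf[0];       // safe only after the synchronize above
    (void)first;

    cudaFree(buf);            // now safe: no kernel still references buf
    return 0;
}
```

Without the synchronize, touching buf on the host while the kernel is in flight is undefined behavior on pre-Pascal unified memory, which can look exactly like "cudaFree does not work".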

In my experience, the compiler does not always warn you if you are about to step into unallocated memory; the debugger normally throws an exception/error when this occurs, so you could verify such cases by stepping through the code with the debugger.