Yesterday I posted a topic about a memory allocation problem.
The problem is the following: I make 4 cudaMalloc calls. The total memory allocated is a function of 3 variables, N, M, and D. I call my function with different values of M (from 256 to 3072), with N=38400 and D=96.
Before, when I used M=2560, the total memory allocated was 410MB, and one of the 4 cudaMalloc calls tried to allocate 394MB. That call failed even though 750MB of memory was free.
Now I check the value returned by cudaMalloc to handle errors during memory allocation. I also do the biggest allocation first. With these small modifications, the memory allocations seem to work well.
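For illustration, here is a minimal sketch of that pattern rather than my actual code. The names Request, sort_largest_first and alloc_all are hypothetical, and so is the idea of collecting the allocations in a list. It only uses cudaMalloc, cudaFree, cudaMemGetInfo and cudaGetErrorString from the CUDA runtime. Whether the order really matters is exactly what I am asking; the sketch just assumes that a large block is easier to obtain before smaller allocations have split up the free memory.

```cuda
#include <cuda_runtime.h>

#include <algorithm>
#include <cstddef>
#include <cstdio>
#include <vector>

// One pending device allocation (hypothetical helper type):
// where to store the resulting pointer and how many bytes to request.
struct Request {
    void** ptr;
    size_t bytes;
};

// Reorder the requests so that the largest one comes first.
inline void sort_largest_first(std::vector<Request>& reqs) {
    std::sort(reqs.begin(), reqs.end(),
              [](const Request& a, const Request& b) { return a.bytes > b.bytes; });
}

// Allocate every request, largest first, checking each cudaMalloc result.
// On failure: print the error plus the free/total device memory, release
// everything allocated so far, and return the error code to the caller.
inline cudaError_t alloc_all(std::vector<Request> reqs) {
    sort_largest_first(reqs);
    for (size_t i = 0; i < reqs.size(); ++i) {
        cudaError_t err = cudaMalloc(reqs[i].ptr, reqs[i].bytes);
        if (err != cudaSuccess) {
            size_t free_bytes = 0, total_bytes = 0;
            cudaMemGetInfo(&free_bytes, &total_bytes);
            std::fprintf(stderr,
                         "cudaMalloc of %zu bytes failed: %s (free %zu, total %zu)\n",
                         reqs[i].bytes, cudaGetErrorString(err),
                         free_bytes, total_bytes);
            *reqs[i].ptr = nullptr;
            for (size_t j = 0; j < i; ++j) {
                cudaFree(*reqs[j].ptr);
                *reqs[j].ptr = nullptr;
            }
            return err;
        }
    }
    return cudaSuccess;
}
```

To use it, I would fill a std::vector<Request> with the four buffers and their sizes, call alloc_all on it, and stop if the result is not cudaSuccess. The free/total numbers printed on failure make it easy to see a case like mine, where the requested size is well below the reported free memory.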
I don’t understand why the memory allocation failed.
Do you think that the order of the cudaMalloc calls can change anything?
Has anybody else had this kind of problem?