The first call to cudaMalloc in the following code fails with cudaErrorMemoryAllocation because the requested amount of memory is too large. But the second call, for only 1024 bytes, fails too.
cudaError_t err;
cudaThreadSynchronize(); // Initialize

// Determine free memory
size_t free, total;
err = cudaMemGetInfo(&free, &total);
if (err != cudaSuccess) {
    cout << "Error cudaMemGetInfo" << endl;
} else {
    cout << "Free=" << free << ", total=" << total << endl;
}

void* ptr;

// The first malloc fails, because the requested block is too large
err = cudaMalloc(&ptr, free);
if (err != cudaSuccess) {
    cout << "Error cudaMalloc large block" << endl;
}
cudaFree(ptr);

// The second call fails, too
err = cudaMalloc(&ptr, 1024);
if (err != cudaSuccess) {
    cout << "Error cudaMalloc small block" << endl;
}
cudaFree(ptr);
So after the first call the device is in an unusable state. I did not expect this behaviour. Is this a bug or a feature? If it is a feature, then it should be included in the documentation. I ran this on Windows 7 64-bit, CUDA 3.2, WDDM driver 270.81 and Visual Studio 2008.
Is there a way around this? I can't reset the device with cudaThreadExit(), because my application has other buffers already allocated on the device. And I have to use CUDA 3.2 because of my customer.
I read in another thread that the author uses a conservative estimate of 80% of free memory. Is this the only way?
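A minimal sketch of what such a conservative estimate could look like. The helper name conservativeSize, the default of 80 percent and the rounding down to a 256-byte multiple are assumptions made for illustration; none of them comes from the CUDA documentation, and the sketch does not guarantee that the allocation succeeds.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of the CUDA API): returns `percent` percent
// of `freeBytes`, rounded down to a multiple of `alignment`.
// The 256-byte alignment is an assumption chosen for illustration.
std::size_t conservativeSize(std::size_t freeBytes,
                             unsigned percent = 80,
                             std::size_t alignment = 256)
{
    assert(percent <= 100);
    assert(alignment > 0);
    // Divide before multiplying so the product cannot overflow size_t.
    std::size_t scaled = freeBytes / 100 * percent;
    // Round down to the alignment boundary.
    return scaled - scaled % alignment;
}
```

With the Free value reported in the output (1505865728), this yields 1204692480 bytes, which would be passed to cudaMalloc in place of free.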
The output:
Free=1505865728, total=1576468480
Error cudaMalloc large block
Error cudaMalloc small block
Best regards,
Joern Dinkla