<1% Free Memory (Reset Programmatically?)

I had a rogue for loop allocating >500 MB of device memory on each iteration. The program crashed before it could call cudaFree(), and it seems the memory is now locked up on the device. I am compiling with the CUDA 2.0 runtime on CentOS with a Tesla C870 (soon migrating to a C1060). Obviously I can just reboot the system, but my concern is for when this gets deployed: is it possible to free the GPU memory programmatically before initializing the main computations, to ensure both the state of the device and the amount of memory available?
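As a defensive measure, the program could check free device memory at startup and refuse to run if a previous crash appears to have left allocations behind. Here is a minimal sketch using the driver API's cuMemGetInfo() (matching the output shown below); the 90%-used threshold is an arbitrary example value, and on older toolkits cuMemGetInfo takes unsigned int* rather than size_t* arguments:

```c
/* Sketch: bail out early if the device looks like it still holds
 * allocations from a crashed process. Assumes the CUDA driver API;
 * the threshold is illustrative, not a recommendation. */
#include <cuda.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    CUdevice dev;
    CUcontext ctx;
    size_t free_bytes, total_bytes;

    cuInit(0);
    cuDeviceGet(&dev, 0);
    cuCtxCreate(&ctx, 0, dev);

    cuMemGetInfo(&free_bytes, &total_bytes);
    printf("Free: %zu bytes of %zu total\n", free_bytes, total_bytes);

    if (free_bytes < total_bytes / 10) {
        fprintf(stderr, "Device memory looks exhausted; aborting.\n");
        cuCtxDestroy(ctx);
        return EXIT_FAILURE;
    }

    /* ... proceed with the main computation ... */

    cuCtxDestroy(ctx);
    return EXIT_SUCCESS;
}
```

Note that later toolkits added cudaDeviceReset() to the runtime API, which tears down the calling process's context, but no API call can free memory held by *another* (possibly hung) process; for that, the offending process has to exit or be killed.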

Additionally, why does cudaGetDeviceProperties().totalGlobalMem report a different value than cuMemGetInfo()?
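For comparing the two numbers directly, a small sketch that prints both side by side (this mixes the runtime and driver APIs in one process, which works on modern toolkits but behaved inconsistently on very early ones, so treat it as an assumption):

```c
/* Sketch: print the runtime API's totalGlobalMem next to the driver
 * API's cuMemGetInfo totals for device 0. */
#include <cuda.h>
#include <cuda_runtime.h>
#include <stdio.h>

int main(void)
{
    struct cudaDeviceProp prop;
    cudaGetDeviceProperties(&prop, 0);
    printf("totalGlobalMem: %zu bytes\n", (size_t)prop.totalGlobalMem);

    CUdevice dev;
    CUcontext ctx;
    size_t free_b, total_b;
    cuInit(0);
    cuDeviceGet(&dev, 0);
    cuCtxCreate(&ctx, 0, dev);
    cuMemGetInfo(&free_b, &total_b);
    printf("cuMemGetInfo: total %zu bytes, free %zu bytes\n",
           total_b, free_b);
    cuCtxDestroy(ctx);
    return 0;
}
```

This is the output I get: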

Device: Tesla C870, 1350 MHz clock, 1536 MB memory.
^^^^ Free : 42 bytes (0 KB) (0 MB)
^^^^ Total: 2513032896 bytes (2454133 KB) (2396 MB)
^^^^ 0.000002% free, 99.999998% used

The driver is supposed to free device memory automatically after your program terminates, even if you don’t call cudaFree(). Are you sure the crashed program has terminated, and isn’t stuck in a zombie state? If memory is staying allocated after program termination, then that is definitely a driver bug.

ps and top show no “zombie” processes, and I am the only user logged into the machine.
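Beyond ps and top, you can check directly whether any process still holds the NVIDIA device nodes open (this assumes lsof and fuser are installed and the standard /dev/nvidia* device files used by the Linux driver):

```shell
# List processes with the NVIDIA device files open; a crashed-but-
# not-reaped process holding device memory would show up here even
# if the ps output looks clean.
lsof /dev/nvidia* 2>/dev/null

# Alternative using fuser (from the psmisc package):
fuser -v /dev/nvidia* 2>/dev/null
```

If neither command reports anything and the memory is still marked as used, that points back at the driver rather than a lingering process.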

The same thing happened to me, and I had to restart both the host and the devices (I have an S1070).