cudaMemGetInfo() returns error 304 after 50+ hours of stress testing

I’m running a stress test with two CUDA-based rendering server processes and a few other processes pushing lots of data at them. First one server process gets a 304 error from cudaMemGetInfo(), and about 30 minutes later the second server process gets the same error: Runtime API error 304: OS call failed or operation not supported on this OS.

The VM is using 2 Tesla M60 GPUs
CUDA: 10.2
Driver: 443.66
System: Windows Server 2012 R2

The process memory of both servers is very stable. About 1 GB of CUDA memory is still free, and there have been no allocation errors.
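For reference, here is a minimal sketch of the kind of check that fails. It is not the actual server code: the helper names logCudaMemInfo and bytesToMiB and the tag parameter are made up for illustration. It only uses cudaMemGetInfo(), cudaGetErrorName() and cudaGetErrorString() from the runtime API, and prints the numeric code together with its symbolic name, which as far as I can tell is cudaErrorOperatingSystem for 304 on CUDA 10.x.

```cuda
#include <cstddef>
#include <cstdio>
#include <cuda_runtime.h>

// Convert a byte count to MiB so the log lines are easy to read.
static double bytesToMiB(size_t bytes) {
    return static_cast<double>(bytes) / (1024.0 * 1024.0);
}

// Query free/total memory on the current device and log the result.
// 'tag' is a hypothetical label (for example the server name) for the log line.
// Returns the CUDA status so the caller can decide how to react.
static cudaError_t logCudaMemInfo(const char* tag) {
    size_t freeBytes = 0;
    size_t totalBytes = 0;
    cudaError_t err = cudaMemGetInfo(&freeBytes, &totalBytes);
    if (err != cudaSuccess) {
        // Log the numeric code, its symbolic name and the description.
        std::fprintf(stderr, "[%s] cudaMemGetInfo failed: %d (%s): %s\n",
                     tag, static_cast<int>(err),
                     cudaGetErrorName(err), cudaGetErrorString(err));
        return err;
    }
    std::printf("[%s] GPU memory: %.1f MiB free of %.1f MiB\n",
                tag, bytesToMiB(freeBytes), bytesToMiB(totalBytes));
    return err;
}
```

Calling something like logCudaMemInfo("server-1") once a minute from each server gives a timeline of free GPU memory up to the first failure, which can then be lined up against the system memory numbers.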

Available system memory, however, has dropped from 12.6 GB to under 8 GB over the course of the test for no reason that I can see.

Has anyone else seen anything like this, or does anyone have hints on what to look into?

Thanks!