What happens if two processes running with MPS need more memory than available on GPU

Hi, I have a question regarding how much memory processes running on an MPS-enabled GPU can use.

My scenario is this: two different processes each use TensorFlow to run on the GPU. The GPU is configured to use MPS, so the two processes can run concurrently.

What happens if each TensorFlow model needs 5 GB (so the two models need 10 GB of memory in total) but the GPU only has 8 GB available?

Thank you,

One of the processes will fail, on the cudaMalloc call buried in the TF code where the memory is allocated. MPS doesn't fix/sort this out for you; it enables concurrent execution, but the processes still share the same physical memory pool. The situation is possibly worse with TF because TF by default uses a greedy allocator, grabbing most of the GPU's memory up front whether it needs it or not.
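As a practical mitigation for the greedy-allocator part of the problem, each process can cap its own TensorFlow allocation (or switch to on-demand growth) via `tf.config`. This is a sketch, not a guarantee: if the two models genuinely need 10 GB of live tensors on an 8 GB card, an allocation will still fail somewhere; the limits below (3.5 GB each) are illustrative values chosen so the two processes' caps sum to less than 8 GB.

```python
import tensorflow as tf

# Run this at startup in EACH process, before any GPU work,
# since the configuration is fixed once the GPU is initialized.
gpus = tf.config.list_physical_devices('GPU')
if gpus:
    # Option 1: hard-cap this process's share of the card.
    # 3584 MiB is an illustrative value; pick limits so the
    # processes' caps sum to less than the card's usable memory
    # (the driver and MPS reserve some for themselves).
    tf.config.set_logical_device_configuration(
        gpus[0],
        [tf.config.LogicalDeviceConfiguration(memory_limit=3584)],
    )
    # Option 2 (instead of the cap above): disable the greedy
    # allocator so TF only grabs memory as it actually needs it:
    # tf.config.experimental.set_memory_growth(gpus[0], True)
```

Note that a memory cap and memory growth cannot both be set on the same device; pick one per process.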