Program stuck with MPS

I have two programs, let's say program A and program B. Both use CUDA, and program B also uses OpenCV.

When I run A and B in parallel without MPS, they work fine. But if I have MPS enabled, only one program is able to run at a time, and the other gets stuck in cuInit(). For example, if I start program A first and then program B, program B hangs in cuInit() until program A exits, and only then does it resume.

Can someone tell me why this happens?

(Strangely, even with MPS enabled, two separate instances of program A are able to run in parallel).
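
In case it helps anyone reproduce this, here is a minimal sketch of how the time spent in cuInit() could be measured from a separate process while program A is running. It is not taken from program A or B. It assumes Linux with the CUDA driver library loadable as libcuda.so.1, and the function name time_cuinit is just something made up for this illustration.

```python
import ctypes
import time


def time_cuinit(lib_name="libcuda.so.1"):
    """Call cuInit(0) and return (status, seconds), or None if the
    driver library cannot be loaded. The helper name is hypothetical."""
    try:
        libcuda = ctypes.CDLL(lib_name)
    except OSError:
        return None  # CUDA driver library not found on this machine

    # CUresult cuInit(unsigned int Flags); Flags must be 0.
    libcuda.cuInit.argtypes = [ctypes.c_uint]
    libcuda.cuInit.restype = ctypes.c_int

    start = time.monotonic()
    status = libcuda.cuInit(0)  # this is the call that appears to block
    elapsed = time.monotonic() - start
    return status, elapsed


if __name__ == "__main__":
    result = time_cuinit()
    if result is None:
        print("libcuda.so.1 could not be loaded")
    else:
        status, elapsed = result
        print(f"cuInit returned {status} after {elapsed:.3f} s")
```

Running this once with MPS enabled and once without, while program A is active, should show whether the delay is in cuInit() itself or somewhere later in program B.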