MPS does not work as I expected with multiple threads

I am using MPS in a multi-threaded setup where each thread has its own CUDA context. Why does it perform worse than running without MPS, using a single CUDA context and multiple streams?

MPS handles multi-process situations

There is no suggestion that MPS is a better or equivalent approach to doing the same work in a single process with a single context and multiple streams.
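For reference, the single-process, single-context, multi-stream pattern mentioned above might look roughly like the sketch below. It is only an illustration: the `scale` kernel, the `run_streams` helper, and the buffer sizes are hypothetical and not taken from the original poster's code. All streams here belong to the same context (the runtime API's primary context), so kernels issued to different streams may overlap on the GPU when resources allow.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical kernel, for illustration only: scales a buffer in place.
__global__ void scale(float *data, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

// Issues independent work on `num_streams` streams within one context.
// Returns 0 on success, 1 if any CUDA call reported an error.
int run_streams(int num_streams, int n) {
    std::vector<cudaStream_t> streams(num_streams, nullptr);
    std::vector<float *> buffers(num_streams, nullptr);
    bool ok = true;

    // One stream and one device buffer per unit of independent work.
    for (int s = 0; s < num_streams; ++s) {
        ok &= (cudaStreamCreate(&streams[s]) == cudaSuccess);
        ok &= (cudaMalloc(&buffers[s], n * sizeof(float)) == cudaSuccess);
    }

    if (ok) {
        const int threads = 256;
        const int blocks = (n + threads - 1) / threads;
        for (int s = 0; s < num_streams; ++s) {
            // Work in different streams has no implicit ordering between
            // streams, so the GPU is free to overlap it.
            cudaMemsetAsync(buffers[s], 0, n * sizeof(float), streams[s]);
            scale<<<blocks, threads, 0, streams[s]>>>(buffers[s], n, 2.0f);
        }
        ok &= (cudaGetLastError() == cudaSuccess);
        // Wait for all streams to finish before releasing resources.
        ok &= (cudaDeviceSynchronize() == cudaSuccess);
    }

    for (int s = 0; s < num_streams; ++s) {
        cudaFree(buffers[s]);  // freeing a null pointer is a no-op
        if (streams[s]) cudaStreamDestroy(streams[s]);
    }
    return ok ? 0 : 1;
}

int main() {
    int status = run_streams(4, 1 << 16);
    std::printf("run_streams %s\n", status == 0 ? "succeeded" : "failed");
    return status;
}
```

In a multi-threaded application, each host thread could issue work to its own stream in this shared context rather than creating a separate context per thread.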

What does MPS actually do? I thought it would reduce lock contention between multiple CUDA contexts in general, not only between multiple processes. In our tests, it helps a lot when using TensorFlow with multiple CUDA contexts and CUDA streams. Am I right?

There is documentation.

You said:

I said: