Hi,
I am testing MPS performance. On a K80 I can run 4 processes, and the 5th fails with the error: all CUDA-capable devices are busy or unavailable. On an M60, 3 processes work well, but the 4th does not. Why does the M60 support fewer processes than the K80? What about the P40, P100, and V100 with my program?
PS: nvidia-smi shows 11 GB of memory on the K80 and 8 GB on the M60 (16 GB on the V100).
On the M60, when I ran 3 processes, I got:
[2019-07-02 06:23:09.490 Control 1] NEW SERVER 16: Ready
[2019-07-02 06:23:09.490 Other 16] MPS Server: Received new client request
[2019-07-02 06:23:09.491 Other 16] MPS Server: worker created
[2019-07-02 06:25:16.662 Control 1] Accepting connection…
[2019-07-02 06:25:16.662 Control 1] User did not send valid credentials
[2019-07-02 06:25:16.662 Control 1] Accepting connection…
[2019-07-02 06:25:16.662 Control 1] NEW CLIENT 0 from user 0: Server already exists
[2019-07-02 06:25:16.662 Other 16] MPS Server: Received new client request
[2019-07-02 06:25:16.662 Other 16] MPS Server: worker created
[2019-07-02 06:25:32.844 Control 1] Accepting connection…
[2019-07-02 06:25:32.844 Control 1] User did not send valid credentials
[2019-07-02 06:25:32.844 Control 1] Accepting connection…
[2019-07-02 06:25:32.844 Control 1] NEW CLIENT 0 from user 0: Server already exists
[2019-07-02 06:25:32.844 Other 16] MPS Server: Received new client request
[2019-07-02 06:25:32.844 Other 16] MPS Server: worker created
[2019-07-02 06:27:56.316 Control 1] Accepting connection…
[2019-07-02 06:27:56.316 Control 1] NEW UI
[2019-07-02 06:27:56.316 Control 1] Cmd:get_client_list 16
[2019-07-02 06:27:56.316 Control 1] 10
[2019-07-02 06:27:56.316 Control 1] 9
[2019-07-02 06:27:56.316 Control 1] UI closed
[2019-07-02 06:33:07.969 Other 16] Client 10 disconnected
[2019-07-02 06:33:08.031 Other 16] Client 9 disconnected
[2019-07-02 06:33:08.147 Other 16] Client 9 disconnected
Only 2 clients are listed here, and "Client 9" appears twice in the disconnect messages. All 3 processes run the same program under the same user ID (root) but from different ttys: 2 from one tty and 1 from another.
How can I find out the actual MPS client count?
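
For reference, here is a rough way to cross-check the count from the log itself. This is only a sketch: it assumes the two message strings seen in the excerpt above ("MPS Server: Received new client request" and "Client N disconnected") mark a client connecting and leaving, which may not hold for other driver versions. The function name count_mps_clients is made up for illustration.

```python
import re

# Message patterns copied from the log excerpt above. Treating them as
# connect/disconnect markers is an assumption, not documented behavior.
CONNECT = re.compile(r"MPS Server: Received new client request")
DISCONNECT = re.compile(r"Client (\d+) disconnected")


def count_mps_clients(lines):
    """Return (connects, disconnects, active) estimated from MPS log lines.

    Events are counted rather than unique client IDs, because the log
    above reports the same ID ("Client 9") for two separate disconnects.
    """
    connects = 0
    disconnects = 0
    for line in lines:
        if CONNECT.search(line):
            connects += 1
        elif DISCONNECT.search(line):
            disconnects += 1
    # Clamp at zero in case the log starts mid-session.
    active = max(connects - disconnects, 0)
    return connects, disconnects, active


# A few lines from the excerpt above, used as sample input.
sample = [
    "[2019-07-02 06:23:09.490 Other 16] MPS Server: Received new client request",
    "[2019-07-02 06:25:16.662 Other 16] MPS Server: Received new client request",
    "[2019-07-02 06:25:32.844 Other 16] MPS Server: Received new client request",
    "[2019-07-02 06:27:56.316 Control 1] Cmd:get_client_list 16",
]
print(count_mps_clients(sample))
```

Counted this way, the log shows 3 client requests before get_client_list was issued, even though only 2 IDs were listed. To run it on a real log, pass an open file object (it iterates line by line) instead of the sample list.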