K80 vs P100

Hi,

We have two servers: one is equipped with two K80s, and the other with a single P100. Our use case is processing lots of small jobs per second, meaning it's more about the volume of jobs than the load of any single job. Are we better off with two K80s or one P100?

In general, what tool(s) can we use to monitor performance, understand exactly what the GPU is doing and where its bottlenecks are, and perhaps benchmark or calculate its maximum capability?
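For reference, this is roughly the kind of polling we could do ourselves while waiting for better suggestions. It is only a minimal sketch: it assumes `nvidia-smi` is on the PATH and uses its `--query-gpu` CSV output, and the helper names (`parse_gpu_csv`, `query_gpus`) are our own, not part of any NVIDIA tool. Note that `utilization.gpu` only reports the share of the sample period in which some kernel was running, so for many small jobs it says little about how fully the GPU is actually occupied.

```python
import csv
import io
import subprocess
import time

# Field names passed to `nvidia-smi --query-gpu`.
FIELDS = ["index", "name", "utilization.gpu",
          "memory.used", "memory.total", "power.draw"]


def parse_gpu_csv(text):
    """Parse `--format=csv,noheader,nounits` output into a list of dicts."""
    rows = []
    for record in csv.reader(io.StringIO(text), skipinitialspace=True):
        if not record:  # skip blank lines
            continue
        rows.append(dict(zip(FIELDS, (value.strip() for value in record))))
    return rows


def query_gpus():
    """Run nvidia-smi once and return one dict per GPU (values are strings)."""
    cmd = ["nvidia-smi",
           "--query-gpu=" + ",".join(FIELDS),
           "--format=csv,noheader,nounits"]
    out = subprocess.run(cmd, check=True, capture_output=True, text=True).stdout
    return parse_gpu_csv(out)


if __name__ == "__main__":
    # Poll once per second for five samples (arbitrary, illustrative values).
    for _ in range(5):
        for gpu in query_gpus():
            print(gpu["index"], gpu["name"],
                  gpu["utilization.gpu"] + "%",
                  gpu["memory.used"] + "/" + gpu["memory.total"] + " MiB",
                  gpu["power.draw"] + " W")
        time.sleep(1)
```

Logging this while the jobs run would at least show whether the GPUs are idle between jobs, but it does not tell us where the bottleneck is, which is why we are asking about proper profiling tools.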

The K80 server:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 384.90                 Driver Version: 384.90                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla K80           On   | 00000000:83:00.0 Off |                    0 |
| N/A   62C    P0    95W / 149W |   7210MiB / 11439MiB |     50%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla K80           On   | 00000000:84:00.0 Off |                    0 |
| N/A   50C    P0   129W / 149W |   7210MiB / 11439MiB |     90%      Default |
+-------------------------------+----------------------+----------------------+
|   2  Tesla K80           On   | 00000000:87:00.0 Off |                    0 |
| N/A   60C    P0   105W / 149W |   7177MiB / 11439MiB |     62%      Default |
+-------------------------------+----------------------+----------------------+
|   3  Tesla K80           On   | 00000000:88:00.0 Off |                    0 |
| N/A   50C    P0   102W / 149W |   7207MiB / 11439MiB |     30%      Default |
+-------------------------------+----------------------+----------------------+

The P100 server (currently utilized):
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 384.90                 Driver Version: 384.90                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla P100-PCIE…    Off  | 00000000:81:00.0 Off |                    0 |
| N/A   26C    P0    29W / 250W |   8227MiB / 16276MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+