I have a Python program that performs deep-network inference on images using TensorFlow. When a single instance of the program runs, the GPU is not fully utilized. However, when I run several instances simultaneously, the inference time per image is slower than with a single program. What could be the reason for this, and what can be done to improve the per-image inference time when running multiple programs? I am currently on Windows (I can move to Linux if needed) with a GTX 1070, and I have set TensorFlow's gpu_options.per_process_gpu_memory_fraction parameter.
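For reference, here is a minimal sketch of how I set that parameter (TensorFlow 1.x session API; the memory fraction value, model loading, and inference calls are placeholders standing in for my actual code):

```python
import tensorflow as tf

# Cap this process at a fraction of the GTX 1070's memory so that
# several programs can share the GPU at the same time.
gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.3)
config = tf.ConfigProto(gpu_options=gpu_options)

with tf.Session(config=config) as sess:
    # ... load the frozen graph and run inference on each image here ...
    pass
```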