Unable to utilize all GPU memory with TensorFlow: failed to allocate memory

I’m working on live object detection using TensorFlow and pretrained COCO models. It seems that TensorFlow can’t utilize the available GPU memory (24 GB), which leads to poor running times.

I use the commands:

import tensorflow as tf

config = tf.ConfigProto()
config.gpu_options.allow_growth = True  # allocate GPU memory on demand
sess = tf.Session(config=config)

so that some memory is left for other purposes. When checking the performance monitor while the program runs, I observe that only 6/24 GB are in use.

When using the commands

gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.3)
config = tf.ConfigProto(gpu_options=gpu_options)
sess = tf.Session(config=config)

only 30% of the GPU memory may be used (which it is). When I raise per_process_gpu_memory_fraction to 0.33, I get a lot of error messages; these are included below.

Has anyone else experienced these problems?

2018-09-24 13:37:19.803414: I T:\src\github\tensorflow\tensorflow\core\platform\cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2018-09-24 13:37:19.989651: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1405] Found device 0 with properties:
name: GRID P40-24Q major: 6 minor: 1 memoryClockRate(GHz): 1.531
pciBusID: 0000:02:02.0
totalMemory: 24.00GiB freeMemory: 18.95GiB
2018-09-24 13:37:19.999513: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1484] Adding visible gpu devices: 0
2018-09-24 13:37:20.834480: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:965] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-09-24 13:37:20.840108: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:971] 0
2018-09-24 13:37:20.844254: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:984] 0: N
2018-09-24 13:37:20.848161: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1097] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 8110 MB memory) -> physical GPU (device: 0, name: GRID P40-24Q, pci bus id: 0000:02:02.0, compute capability: 6.1)
2018-09-24 13:37:30.541515: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:903] failed to allocate 7.92G (8504035072 bytes) from device: CUDA_ERROR_UNKNOWN
2018-09-24 13:37:30.607998: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:903] failed to allocate 7.13G (7653631488 bytes) from device: CUDA_ERROR_UNKNOWN
2018-09-24 13:37:30.624044: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:903] failed to allocate 6.42G (6888268288 bytes) from device: CUDA_ERROR_UNKNOWN
2018-09-24 13:37:30.796166: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:903] failed to allocate 5.77G (6199441408 bytes) from device: CUDA_ERROR_UNKNOWN
2018-09-24 13:37:30.812917: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:903] failed to allocate 5.20G (5579496960 bytes) from device: CUDA_ERROR_UNKNOWN
2018-09-24 13:37:30.852114: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:903] failed to allocate 4.68G (5021547008 bytes) from device: CUDA_ERROR_UNKNOWN
2018-09-24 13:37:30.860134: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:903] failed to allocate 4.21G (4519392256 bytes) from device: CUDA_ERROR_UNKNOWN
2018-09-24 13:37:30.941254: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:903] failed to allocate 3.79G (4067452928 bytes) from device: CUDA_ERROR_UNKNOWN
2018-09-24 13:37:30.969638: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:903] failed to allocate 3.41G (3660707584 bytes) from device: CUDA_ERROR_UNKNOWN
2018-09-24 13:37:31.022152: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:903] failed to allocate 3.07G (3294636800 bytes) from device: CUDA_ERROR_UNKNOWN
2018-09-24 13:37:31.033550: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:903] failed to allocate 2.76G (2965172992 bytes) from device: CUDA_ERROR_UNKNOWN
2018-09-24 13:37:31.042318: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:903] failed to allocate 2.49G (2668655616 bytes) from device: CUDA_ERROR_UNKNOWN
2018-09-24 13:37:31.050281: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:903] failed to allocate 2.24G (2401789952 bytes) from device: CUDA_ERROR_UNKNOWN
2018-09-24 13:37:31.058889: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:903] failed to allocate 2.01G (2161611008 bytes) from device: CUDA_ERROR_UNKNOWN
2018-09-24 13:37:31.067494: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:903] failed to allocate 1.81G (1945449984 bytes) from device: CUDA_ERROR_UNKNOWN
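For what it’s worth, the first failed allocation seems to line up with the requested fraction of *total* memory (my reading of the log, not something I’ve confirmed in the TensorFlow source): 0.33 × 24 GiB ≈ 7.92 GiB, which matches the first failed request of 8504035072 bytes to within alignment. A quick sanity check in plain Python:

```python
# Check that per_process_gpu_memory_fraction applied to total GPU memory
# matches the first failed allocation in the log (assumption on my part).
total_bytes = 24 * 1024**3           # totalMemory: 24.00GiB from the log
requested = 0.33 * total_bytes       # fraction passed to tf.GPUOptions
first_failed = 8504035072            # "failed to allocate 7.92G" from the log

print(round(requested / 1024**3, 2))             # → 7.92 GiB
print(abs(requested - first_failed) < 1024**2)   # → True (within 1 MiB)
```

So the fraction is requested up front as one big block, and something on this GRID vGPU refuses that single large allocation even though 18.95 GiB is reported free.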

It’s not clear what you are expecting. Using 6/24 GB may be completely normal for the particular model you are running; actual usage is a function of model size, batch size, and many other factors.

If you set the available memory too low, then allocation failures like the ones above may occur.
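As a rough illustration of why low utilization by itself is not a problem (the numbers below are illustrative, not taken from your setup): the weights of a typical detection backbone are tiny compared to 24 GB. ResNet-50, for example, is commonly cited at about 25.6 million parameters, which at 4 bytes per float32 weight is only ~100 MB:

```python
# Back-of-the-envelope weight memory for a hypothetical model.
# 25.6M parameters is the commonly cited ResNet-50 figure (an assumption
# here -- substitute your own model's parameter count).
params = 25_600_000
bytes_per_param = 4                          # float32
weight_mb = params * bytes_per_param / 1024**2
print(round(weight_mb))                      # → 98 (MB of weights)
```

Activations and batch size usually dominate at run time, so peak memory depends heavily on how you run the model, not just on how much memory the GPU has.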