Unable to utilize all GPU memory with TensorFlow: failed to allocate memory

I’m working on live object detection using TensorFlow and pretrained COCO models. It seems that TensorFlow can’t utilize the available GPU memory (24 GB), which leads to poor running times.

I use the commands:

import tensorflow as tf

config = tf.ConfigProto()
config.gpu_options.allow_growth = True  # allocate GPU memory on demand
sess = tf.Session(config=config)

so that some memory is left for other purposes. When checking the performance monitor while the program runs, I observe that only 6/24 GB are in use.

When using the commands

gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.3)
config = tf.ConfigProto(gpu_options=gpu_options)
sess = tf.Session(config=config)

only 30% of the GPU memory may be used (which it is). When I raise per_process_gpu_memory_fraction to 0.33, I get a lot of error messages; these are included below.

Has anyone else experienced these problems?

2018-09-24 13:37:19.803414: I T:\src\github\tensorflow\tensorflow\core\platform\cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2018-09-24 13:37:19.989651: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1405] Found device 0 with properties:
name: GRID P40-24Q major: 6 minor: 1 memoryClockRate(GHz): 1.531
pciBusID: 0000:02:02.0
totalMemory: 24.00GiB freeMemory: 18.95GiB
2018-09-24 13:37:19.999513: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1484] Adding visible gpu devices: 0
2018-09-24 13:37:20.834480: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:965] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-09-24 13:37:20.840108: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:971] 0
2018-09-24 13:37:20.844254: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:984] 0: N
2018-09-24 13:37:20.848161: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1097] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 8110 MB memory) -> physical GPU (device: 0, name: GRID P40-24Q, pci bus id: 0000:02:02.0, compute capability: 6.1)
2018-09-24 13:37:30.541515: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:903] failed to allocate 7.92G (8504035072 bytes) from device: CUDA_ERROR_UNKNOWN
2018-09-24 13:37:30.607998: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:903] failed to allocate 7.13G (7653631488 bytes) from device: CUDA_ERROR_UNKNOWN
2018-09-24 13:37:30.624044: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:903] failed to allocate 6.42G (6888268288 bytes) from device: CUDA_ERROR_UNKNOWN
2018-09-24 13:37:30.796166: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:903] failed to allocate 5.77G (6199441408 bytes) from device: CUDA_ERROR_UNKNOWN
2018-09-24 13:37:30.812917: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:903] failed to allocate 5.20G (5579496960 bytes) from device: CUDA_ERROR_UNKNOWN
2018-09-24 13:37:30.852114: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:903] failed to allocate 4.68G (5021547008 bytes) from device: CUDA_ERROR_UNKNOWN
2018-09-24 13:37:30.860134: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:903] failed to allocate 4.21G (4519392256 bytes) from device: CUDA_ERROR_UNKNOWN
2018-09-24 13:37:30.941254: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:903] failed to allocate 3.79G (4067452928 bytes) from device: CUDA_ERROR_UNKNOWN
2018-09-24 13:37:30.969638: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:903] failed to allocate 3.41G (3660707584 bytes) from device: CUDA_ERROR_UNKNOWN
2018-09-24 13:37:31.022152: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:903] failed to allocate 3.07G (3294636800 bytes) from device: CUDA_ERROR_UNKNOWN
2018-09-24 13:37:31.033550: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:903] failed to allocate 2.76G (2965172992 bytes) from device: CUDA_ERROR_UNKNOWN
2018-09-24 13:37:31.042318: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:903] failed to allocate 2.49G (2668655616 bytes) from device: CUDA_ERROR_UNKNOWN
2018-09-24 13:37:31.050281: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:903] failed to allocate 2.24G (2401789952 bytes) from device: CUDA_ERROR_UNKNOWN
2018-09-24 13:37:31.058889: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:903] failed to allocate 2.01G (2161611008 bytes) from device: CUDA_ERROR_UNKNOWN
2018-09-24 13:37:31.067494: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:903] failed to allocate 1.81G (1945449984 bytes) from device: CUDA_ERROR_UNKNOWN
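For what it’s worth, the first failed allocation seems to line up with the requested fraction of *total* memory (my reading of the log, not something I’ve confirmed in the TensorFlow source): 0.33 × 24 GiB ≈ 7.92 GiB, which matches the first failed request of 8504035072 bytes to within alignment. A quick sanity check in plain Python:

```python
# Check that per_process_gpu_memory_fraction applied to total GPU memory
# matches the first failed allocation in the log (assumption on my part).
total_bytes = 24 * 1024**3           # totalMemory: 24.00GiB from the log
requested = 0.33 * total_bytes       # fraction passed to tf.GPUOptions
first_failed = 8504035072            # "failed to allocate 7.92G" from the log

print(round(requested / 1024**3, 2))             # → 7.92 GiB
print(abs(requested - first_failed) < 1024**2)   # → True (within 1 MiB)
```

So the fraction is requested up front as one big block, and something on this GRID vGPU refuses that single large allocation even though 18.95 GiB is reported free.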

It’s not clear what you are expecting. Using 6/24 GB may be completely normal for the particular model you are running; actual usage is a function of model size, batch size, and many other factors.

If you set the available memory too low, then allocation failures like the ones above may occur.
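As a rough illustration of why low utilization by itself is not a problem (the numbers below are illustrative, not taken from your setup): the weights of a typical detection backbone are tiny compared to 24 GB. ResNet-50, for example, is commonly cited at about 25.6 million parameters, which at 4 bytes per float32 weight is only ~100 MB:

```python
# Back-of-the-envelope weight memory for a hypothetical model.
# 25.6M parameters is the commonly cited ResNet-50 figure (an assumption
# here -- substitute your own model's parameter count).
params = 25_600_000
bytes_per_param = 4                          # float32
weight_mb = params * bytes_per_param / 1024**2
print(round(weight_mb))                      # → 98 (MB of weights)
```

Activations and batch size usually dominate at run time, so peak memory depends heavily on how you run the model, not just on how much memory the GPU has.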