2.5GB of video memory missing in TensorFlow on both Linux and Windows [RTX 3080]

dmitry_hrybov · September 26, 2021, 1:55pm

Description

I have a 10GB RTX 3080 GPU. nvidia-smi reports 10014MiB of memory, but TensorFlow reports:

Created device /job:localhost/replica:0/task:0/device:GPU:0 with 7591 MB memory

After some initial research I was convinced that this was related to Windows 10 OS limitations, so I installed Ubuntu 20.04 in dual boot. It didn’t change anything. I also tried various versions of TensorFlow, CUDA, and cuDNN.
I tried using:

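import tensorflow as tf

# Ask TensorFlow to grow GPU memory on demand instead of reserving it up front
physical_devices = tf.config.list_physical_devices('GPU')
for gpu_instance in physical_devices:
    tf.config.experimental.set_memory_growth(gpu_instance, True)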

It didn’t fix the problem. Also, I tried:

from tensorflow.compat.v1 import ConfigProto
from tensorflow.compat.v1 import InteractiveSession

config = ConfigProto()
config.gpu_options.per_process_gpu_memory_fraction = 1.0
session = InteractiveSession(config=config)

And indeed, TensorFlow started to report the full 10GB of memory in the ‘Created device’ message, so TF should see the memory properly. With this method I was able to push memory usage to about 8GB, and it even allowed me to run a slightly larger batch size. But if I specify a fraction of more than 0.8 (it may vary slightly from run to run), then I get:

2021-09-26 12:48:26.691479: F tensorflow/core/util/cuda_solvers.cc:115] Check failed: cusolverDnCreate(&cusolver_dn_handle) == CUSOLVER_STATUS_SUCCESS Failed to create cuSolverDN instance.
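
For comparison, the same kind of cap can be sketched with the TF 2.x configuration API rather than the compat.v1 session. This is only a sketch and not a confirmed fix for the cuSolver failure above. The helper names `fraction_to_limit_mb` and `cap_gpu_memory` are hypothetical, and the 10014 MiB total is the nvidia-smi figure quoted in this post.

```python
# Sketch only: cap TensorFlow's GPU allocation through the TF 2.x config API.
# Helper names are hypothetical and not part of TensorFlow.
try:
    import tensorflow as tf
except ImportError:  # lets the arithmetic helper run without TensorFlow
    tf = None


def fraction_to_limit_mb(total_mib, fraction):
    """Convert a memory fraction into a whole-number limit in MiB."""
    return int(total_mib * fraction)


def cap_gpu_memory(limit_mb):
    """Apply the limit to every visible GPU and return how many were configured.

    This has to run before TensorFlow initializes the GPUs; otherwise
    TensorFlow raises a RuntimeError.
    """
    if tf is None:
        return 0
    gpus = tf.config.list_physical_devices("GPU")
    for gpu in gpus:
        tf.config.set_logical_device_configuration(
            gpu,
            [tf.config.LogicalDeviceConfiguration(memory_limit=limit_mb)],
        )
    return len(gpus)
```

Calling `cap_gpu_memory(fraction_to_limit_mb(10014, 0.8))` at the very top of a script would request roughly the same 0.8 fraction that was the last stable setting above.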

One important thing to note is that while TensorFlow reports a device with ~7.5GB, nvidia-smi reports more than 9GB used by /usr/bin/python3! I am not running any other Python script in parallel.
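
One way to capture the nvidia-smi side of this comparison from a script is to use its CSV query mode. The sketch below assumes `nvidia-smi` is on the PATH and that both fields come back as plain integers in MiB; `parse_memory_csv` and `query_gpu_memory` are hypothetical helper names.

```python
import shutil
import subprocess

# CSV query for total and used memory per GPU, in MiB, without header or units
QUERY = [
    "nvidia-smi",
    "--query-gpu=memory.total,memory.used",
    "--format=csv,noheader,nounits",
]


def parse_memory_csv(text):
    """Parse 'total, used' lines into a list of (total_mib, used_mib) tuples."""
    rows = []
    for line in text.strip().splitlines():
        total, used = (int(field.strip()) for field in line.split(","))
        rows.append((total, used))
    return rows


def query_gpu_memory():
    """Return per-GPU (total, used) in MiB, or [] if nvidia-smi is not found."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_memory_csv(result.stdout)
```

Running `query_gpu_memory()` once before and once during training gives the two numbers needed to compare against what the framework itself reports.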

So in reality the memory usage is reaching its limit while I am able to use only 7.5GB, which is even less than the known 81% limitation for Windows 10 users! Why am I being allocated almost 2GB extra on top that I can’t use?
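
The numbers quoted so far can be put side by side with a few lines of arithmetic. This assumes the "MB" in TensorFlow's ‘Created device’ message is the same unit as nvidia-smi's MiB, which the post does not confirm.

```python
NVIDIA_SMI_TOTAL_MIB = 10014  # total reported by nvidia-smi
TF_DEVICE_MB = 7591           # from the 'Created device' log line
WINDOWS_LIMIT = 0.81          # the 81% figure mentioned for Windows 10


def usable_fraction(usable, total):
    """Share of the card's memory that TensorFlow exposes."""
    return usable / total


missing_mib = NVIDIA_SMI_TOTAL_MIB - TF_DEVICE_MB
fraction = usable_fraction(TF_DEVICE_MB, NVIDIA_SMI_TOTAL_MIB)
windows_limit_mib = WINDOWS_LIMIT * NVIDIA_SMI_TOTAL_MIB

print(f"missing: {missing_mib} MiB")
print(f"usable fraction: {fraction:.3f}")
print(f"81% of the card: {windows_limit_mib:.0f} MiB")
```

Under that assumption TensorFlow exposes about 76% of the card, roughly 500 MiB below what an 81% cap would give.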

I have been trying to fix this for a long time and really don’t have any idea what to do now. Other people’s problems with missing TF memory that I found on the Internet were related to Windows, and mine is not. Am I missing something? I would really appreciate any idea of what is going on.
Thank you in advance.

Environment

GPU Type: MSI RTX 3080 10GB
Nvidia Driver Version: 470.63.01
CUDA Version: 11.2, 11.4
CUDNN Version: 8.1.0, 8.2.4
Operating System + Version: Ubuntu 20.04, Windows 10
Python Version (if applicable): 3.8.10
TensorFlow Version (if applicable): 2.5.0, 2.5.1, 2.6.0

Steps To Reproduce

The problem arises both with my custom code and with sample TensorFlow scripts from the official tutorials, so it should not be code-dependent.

spolisetty · September 27, 2021, 2:53am

Hi,

This forum is mainly for updates and issues related to TensorRT. We recommend posting your concern on a TensorFlow-related platform to get better help.

Thank you.

dmitry.hrybov · September 28, 2021, 9:02pm

I noticed that other people also have this problem with RTX 3000 series cards. I tried an RTX 2000 series card and I don’t see such a large memory allocation there. Can the problem be related to CUDA/cuDNN?

dmitry.hrybov · September 28, 2021, 9:04pm

I raised an issue in the TensorFlow repository about this problem too. I will post updates here if I get any info.

spolisetty · September 29, 2021, 4:05am

Hi,

We suggest checking with the TensorFlow team first to see whether they have a workaround for this. We found similar issues on TensorFlow-related platforms. Based on their input and suggestions, you can reach out to NVIDIA.

github.com/tensorflow/tensorflow

Performance issue with 3070

opened 10:53AM - 17 Nov 20 UTC

closed 06:49PM - 05 Oct 22 UTC

SestoAle

stat:awaiting response stale comp:gpu type:performance TF 2.5

**System information**
- Have I written custom code (as opposed to using a stock example script provided in TensorFlow): yes
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 20.10
- TensorFlow installed from (source or binary): pip install tf-nightly-gpu
- TensorFlow version (use command below): tf-nightly-gpu==2.5.0.dev20201117
- Python version: 3.8.5
- CUDA/cuDNN version: CUDA 11.1.0 / cuDNN 8.0.4
- GPU model and memory: 3070 driver 455.38

Hi, I recently bought an RTX 3070 and, after some struggling I managed to install tf-nightly-gpu with all dependencies (following issue #43947), but the performances compared to my previous 1050 are way worse. Is this something relating CUDA and/or NVIDIA drivers (for which I must just wait) or there is something I need to do? I also noticed that, in nvidia-smi, the percentage of Volatile GPU-Util is very low

Thank you.

dmitry.hrybov · October 2, 2021, 8:07pm

I tested some networks on PyTorch 1.9.1 + CUDA 11.1 and am facing a similar issue. For example, on Windows, right before running the script I have 520MiB / 10240MiB allocated:

After I run the training I get:

RuntimeError: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 10.00 GiB total capacity; 7.39 GiB already allocated; 0 bytes free; 7.44 GiB reserved in total by PyTorch)

nvidia-smi shows that almost all available memory is allocated:

PyTorch info:

So again, a very similar issue. PyTorch allocated 7.39 GiB, while nvidia-smi shows memory usage increased by ~9.5GiB after I launched the script. An extra 2GB is taken for an unknown reason, as with TensorFlow.
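
The breakdown above can be reproduced by parsing the error message itself. This is a rough sketch: the regular expression is written only against the message format quoted in this post, `parse_oom` and `unexplained_gib` are hypothetical helper names, and the 9.5 GiB increase is the approximate nvidia-smi figure mentioned above.

```python
import re

MESSAGE = (
    "RuntimeError: CUDA out of memory. Tried to allocate 20.00 MiB "
    "(GPU 0; 10.00 GiB total capacity; 7.39 GiB already allocated; "
    "0 bytes free; 7.44 GiB reserved in total by PyTorch)"
)

# Conversion factors from each unit in the message to GiB
UNIT_TO_GIB = {
    "bytes": 1 / 1024**3,
    "KiB": 1 / 1024**2,
    "MiB": 1 / 1024,
    "GiB": 1.0,
}

PATTERN = re.compile(
    r"([\d.]+) (bytes|KiB|MiB|GiB) "
    r"(total capacity|already allocated|free|reserved)"
)


def parse_oom(message):
    """Map each labelled quantity in the OOM message to a value in GiB."""
    return {
        label: float(value) * UNIT_TO_GIB[unit]
        for value, unit, label in PATTERN.findall(message)
    }


def unexplained_gib(stats, smi_increase_gib):
    """Memory seen by nvidia-smi that PyTorch's own reservation does not cover."""
    return smi_increase_gib - stats["reserved"]


stats = parse_oom(MESSAGE)
print(stats)
print(f"unexplained: {unexplained_gib(stats, 9.5):.2f} GiB")
```

With these inputs, about 2 GiB of the increase seen in nvidia-smi lies outside what PyTorch reports as reserved, which matches the estimate in the post.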

spolisetty · October 4, 2021, 9:37am

Hi,

This doesn’t look TensorRT-related. Which container image are you using?

Thank you.

dmitry.hrybov · October 4, 2021, 9:46am

Hi, I’m not using a container. I’m running locally on Windows and on a clean install of Ubuntu 20.04.

spolisetty · October 4, 2021, 9:48am

Hi,

This looks out of scope for TensorRT. We recommend posting your concern on the forum for the library you’re facing the issue with.

Thank you.

AndreiAlexTa · August 6, 2022, 3:21pm

Any updates? I have the same issue with a 3070.
