Dear NVIDIA developer team,
This week I upgraded my graphics card from an RTX 2060 to an RTX 3060 because it has more VRAM, which I hoped would let me train deep learning experiments faster.
The problem is that I now cannot train at all on the new GPU because of constant out-of-memory (OOM) errors. I have confirmed that both PyTorch (1.7.1 cu11.0, 1.8.0 cu11.1) and tensorflow-gpu (2.4.3 cu11.1) give the same OOM error.
From what I observed, GPU memory usage with tensorflow-gpu rises to 9.xx GB of the 12 GB of VRAM available before it eventually fails with OOM. With PyTorch, however, I did not see any spike in GPU memory usage at all.
Hence, I am wondering whether this might be an issue in the CUDA driver itself, which perhaps does not support the RTX 3060 yet, since the card is less than a month old?
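In case it helps with diagnosing driver or toolkit support, here is a minimal sketch of how the relevant version and memory figures can be collected from PyTorch. The helper name `describe_gpu` is hypothetical (it is not part of any library), and the script only reports what the installed PyTorch build sees; it does not prove or rule out a driver problem. As far as I know, an RTX 3060 should report compute capability (8, 6).

```python
import importlib


def describe_gpu(device_index=0):
    """Collect PyTorch/CUDA version info and memory figures for one GPU.

    Hypothetical helper for illustration. Fields stay None when PyTorch
    is not installed or no CUDA device is visible.
    """
    info = {
        "torch_version": None,
        "torch_cuda_version": None,   # CUDA version the PyTorch build targets
        "cuda_available": False,
        "device_name": None,
        "compute_capability": None,   # (major, minor) tuple
        "total_gib": None,
        "allocated_gib": None,        # memory held by live tensors
        "reserved_gib": None,         # memory held by PyTorch's caching allocator
    }
    try:
        torch = importlib.import_module("torch")
    except ImportError:
        return info

    info["torch_version"] = torch.__version__
    info["torch_cuda_version"] = torch.version.cuda
    info["cuda_available"] = torch.cuda.is_available()
    if not info["cuda_available"]:
        return info

    gib = 1024 ** 3
    props = torch.cuda.get_device_properties(device_index)
    info["device_name"] = props.name
    info["compute_capability"] = (props.major, props.minor)
    info["total_gib"] = props.total_memory / gib
    info["allocated_gib"] = torch.cuda.memory_allocated(device_index) / gib
    info["reserved_gib"] = torch.cuda.memory_reserved(device_index) / gib
    return info


if __name__ == "__main__":
    for key, value in describe_gpu().items():
        print(f"{key}: {value}")
```

Running this right before the step that fails would show whether PyTorch had actually reserved any memory at the point of the OOM error.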
Reproduce the issue
To test PyTorch: here.
To test TensorFlow:
test_tf.py (2.5 KB)