Hi, I have a problem with my RTX 2070.
I can’t train TensorFlow models on it. Every time I try to train a model on the GPU I get a “nan” loss (on the CPU I don’t have this problem). I also get the same problem with nvidia-docker when I run the TF-docker examples.
(Screenshot: https://sun9-8.userapi.com/impf/gGLZLZR8i6pw_tLMiveMO6lPEsL82SkYEoXIhA/MVjUFDyBDzo.jpg?size=0x0&quality=90&proxy=1&sign=dd594952260f338228f2a6ee73c3186b)
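For reference, the TF-docker examples I mentioned are roughly the standard Keras MNIST tutorial, i.e. something like this sketch (not my exact code):

import tensorflow as tf

# Standard Keras MNIST example, roughly what the TF tutorials / docker examples run.
(x_train, y_train), _ = tf.keras.datasets.mnist.load_data()
x_train = x_train / 255.0

model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x_train, y_train, epochs=5)

On the GPU the loss comes out as “nan”; on the CPU the same thing trains normally.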
Tue Oct 27 21:23:57 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.95.01    Driver Version: 440.95.01    CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce RTX 2070    Off  | 00000000:07:00.0  On |                  N/A |
|  0%   52C    P8    21W / 185W |    361MiB /  7979MiB |      2%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1105      G   /usr/lib/xorg/Xorg                           133MiB |
|    0      2239      G   cinnamon                                      51MiB |
|    0      2657      G   ...AAAAAAAAAAAACAAAAAAAAAA= --shared-files   174MiB |
+-----------------------------------------------------------------------------+
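In case it helps with diagnosing this, a quick check of whether TF actually sees the card and which CUDA/cuDNN versions the installed wheel was built against (assuming TF 2.3+ for tf.sysconfig.get_build_info(); the “CUDA Version: 10.2” in nvidia-smi above is only the highest version the 440.95.01 driver supports, not what TF itself links against):

import tensorflow as tf

print("TF version:", tf.__version__)
print("GPUs visible to TF:", tf.config.list_physical_devices("GPU"))

# Which CUDA / cuDNN the wheel was compiled against (available in TF >= 2.3).
build = tf.sysconfig.get_build_info()
print("Built against CUDA :", build.get("cuda_version"))
print("Built against cuDNN:", build.get("cudnn_version"))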
uname -a
Linux COMP 5.4.0-48-generic #52~18.04.1-Ubuntu SMP Thu Sep 10 12:50:22 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux