RTX 3070 with CUDA10.0 compatibility [UbuntuOS, any version]

kalagulla111 · February 20, 2021, 2:50am

Hello Nvidia team and community,
I have a new system with an RTX 3070 and am going to install Ubuntu (version suggestions are welcome). Before installing any drivers, I want to know: does the RTX 3070 support CUDA 10.0? And which reference should I follow to install CUDA 10.0 and the NVIDIA display drivers for this GPU?

Thanks!

generix · February 20, 2021, 3:05am

Notebook or desktop system?

kalagulla111 · February 20, 2021, 4:02am

Desktop system
CPU: Intel Core i7 (10th gen)

generix · February 21, 2021, 12:17pm

As a rule of thumb, you should use the CUDA version that was current when a new GPU architecture was released, in your case CUDA 11. So far, no compatibility issues between Ampere GPUs and CUDA 10 are known, so you should be safe.
For CUDA 10, packages are provided for Ubuntu 18.04; on Ubuntu 20.04, you could use the runfile installer.
In both cases, it is important not to install the bundled driver but to use the one that’s provided by the Ubuntu repos instead. For a package-based install, you’re required to run
sudo apt install cuda-toolkit-10-0
instead of
sudo apt install cuda
With the runfile, just skip the driver install when the installer asks for it.

kalagulla111 · February 22, 2021, 9:53am

Hello Sir, thanks for your response.
I have installed CUDA 10.0, the latest NVIDIA display drivers, cuDNN, and TF 1.15.0.
We started running our training script. nvidia-smi shows around 90% GPU memory usage but 0% GPU volatile ECC util.
Does that mean the training has not started on the GPU yet? It is a very slow process. The same script runs very fast, at 100% volatile util, on a GTX 1080 Ti in another machine, so I don’t think the issue is in the script.

Thanks in advance for your help!

generix · February 22, 2021, 9:59am

There’s no such thing as “volatile ECC util.”; I suppose you confused it with volatile ECC errors. GeForce cards don’t have ECC memory, so that value should always be N/A.

kalagulla111 · February 22, 2021, 10:03am

Hello Sir, that was my error. It is Volatile GPU-Util that is showing 0%.
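
To tell memory allocation apart from actual compute load, the two values can be read separately. The sketch below is a minimal, hedged example: it parses the CSV text printed by `nvidia-smi --query-gpu=utilization.gpu,memory.used,memory.total --format=csv,noheader,nounits`. The function names, the 50% threshold, and the sample line are hypothetical and not taken from this thread. Note that TensorFlow 1.x by default reserves most of the GPU memory at startup, so high memory usage alone does not show that the training is computing on the GPU.

```python
import csv
import io

# Command whose output this sketch expects (run it separately on the GPU machine):
#   nvidia-smi --query-gpu=utilization.gpu,memory.used,memory.total \
#              --format=csv,noheader,nounits


def parse_gpu_stats(csv_text):
    """Parse one CSV line per GPU: utilization %, memory used MiB, memory total MiB."""
    stats = []
    for row in csv.reader(io.StringIO(csv_text)):
        if not row:
            continue  # skip blank lines
        # Assumes plain integers; fields such as "[N/A]" are not handled here.
        util, used, total = (int(field.strip()) for field in row)
        stats.append({
            "util_pct": util,
            "mem_used_mib": used,
            "mem_total_mib": total,
            "mem_pct": round(100 * used / total, 1),
        })
    return stats


def looks_idle(stat, mem_threshold=50.0):
    """True when memory is heavily allocated but the GPU reports no compute load."""
    return stat["util_pct"] == 0 and stat["mem_pct"] >= mem_threshold


# Hypothetical sample line resembling the situation described above.
sample = "0, 7300, 7982\n"
for gpu in parse_gpu_stats(sample):
    print(gpu, "-> allocated but idle" if looks_idle(gpu) else "-> busy or empty")
```

Sampling this every few seconds during training gives a clearer picture than a single reading, since utilization can drop to zero briefly between steps.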

generix · February 22, 2021, 12:15pm

I guess it’s not using the GPU at all. It looks like TF 1.15 doesn’t really work out of the box with an RTX 3090:
https://github.com/tensorflow/tensorflow/issues/44200
Here are some instructions on how to make it work:
https://www.pugetsystems.com/labs/hpc/How-To-Install-TensorFlow-1-15-for-NVIDIA-RTX30-GPUs-without-docker-or-CUDA-install-2005/
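
One possible reason for such problems, offered here as an assumption rather than something confirmed in this thread, is that each CUDA toolkit can only generate native code up to a certain compute capability. The sketch below encodes a small lookup table, filled in from memory of NVIDIA's release notes and worth double-checking, and tests whether a toolkit natively targets a GPU. The table and function names are hypothetical.

```python
# Highest compute capability each CUDA toolkit can compile native code for.
# Values are recalled from NVIDIA's release notes; treat them as assumptions.
MAX_NATIVE_CC = {
    "10.0": (7, 5),  # up to Turing
    "11.0": (8, 0),  # adds Ampere GA100
    "11.1": (8, 6),  # adds Ampere GA10x (RTX 30 series)
}

# Compute capability of the two GPUs compared in this thread.
GPU_CC = {
    "GTX 1080 Ti": (6, 1),
    "RTX 3070": (8, 6),
}


def has_native_support(toolkit, gpu):
    """True if the toolkit can emit native code for the GPU's compute capability."""
    return GPU_CC[gpu] <= MAX_NATIVE_CC[toolkit]


for gpu in GPU_CC:
    for toolkit in MAX_NATIVE_CC:
        status = "native" if has_native_support(toolkit, gpu) else "no native code"
        print(f"CUDA {toolkit} on {gpu}: {status}")
```

When native code is missing, a program built against the older toolkit can typically still run if it ships PTX that the driver compiles at load time, which may show up as a very long startup rather than an outright failure.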

kalagulla111 · February 25, 2021, 8:14am

Hello, thanks for your reply. I am now able to use the RTX 3070 with a conda env, TF 1.15, and CUDA 10.0.
But now my concern is that the same model script runs faster on the GTX 1080 Ti than on the RTX 3070.
Performance over 1 hour of training on each machine: GTX 1080 Ti = 25k steps, RTX 3070 = 12k steps.
I use the same batch size (16) on both machines, and both have the same amount of RAM. The only difference is the CPU, but I am sure that the machine with the RTX 3070 has the more powerful CPU.
Why is the RTX 3070 performing slower than the GTX 1080 Ti?

Thanks!
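
The reported step counts can be turned into throughput figures to make the gap concrete. This is a minimal sketch using only the numbers quoted above (25k and 12k steps in one hour, batch size 16); the function name is hypothetical.

```python
def throughput(steps, seconds, batch_size):
    """Return (steps per second, samples per second) for a training run."""
    steps_per_s = steps / seconds
    return steps_per_s, steps_per_s * batch_size


ONE_HOUR = 3600  # seconds
BATCH_SIZE = 16

gtx_steps_s, gtx_samples_s = throughput(25_000, ONE_HOUR, BATCH_SIZE)
rtx_steps_s, rtx_samples_s = throughput(12_000, ONE_HOUR, BATCH_SIZE)

print(f"GTX 1080 Ti: {gtx_steps_s:.2f} steps/s, {gtx_samples_s:.1f} samples/s")
print(f"RTX 3070:    {rtx_steps_s:.2f} steps/s, {rtx_samples_s:.1f} samples/s")
print(f"GTX 1080 Ti is {gtx_steps_s / rtx_steps_s:.2f}x faster in this run")
```

That works out to roughly 6.9 versus 3.3 steps per second, so the GTX 1080 Ti is about twice as fast in this particular setup.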

generix · February 25, 2021, 8:33am

Are you running the machines headless, i.e. without an Xserver being started on the nvidia gpu? If that’s the case, please check if the persistence daemon (nvidia-persistenced) is started.

kalagulla111 · February 25, 2021, 9:38am

Hey, here are the logs of "sudo systemctl status nvidia-persistenced":

admin-u@atig0:~$ sudo systemctl status nvidia-persistenced
● nvidia-persistenced.service - NVIDIA Persistence Daemon
Loaded: loaded (/lib/systemd/system/nvidia-persistenced.service; static; vendor preset: enabled)
Active: active (running) since Thu 2021-02-25 08:58:47 IST; 6h ago
Main PID: 1005 (nvidia-persiste)
Tasks: 1 (limit: 4915)
CGroup: /system.slice/nvidia-persistenced.service
└─1005 /usr/bin/nvidia-persistenced --user nvidia-persistenced --no-persistence-mode --verbose

Feb 25 08:58:47 atig0 systemd[1]: Starting NVIDIA Persistence Daemon…
Feb 25 08:58:47 atig0 nvidia-persistenced[1005]: Verbose syslog connection opened
Feb 25 08:58:47 atig0 nvidia-persistenced[1005]: Now running with user ID 122 and group ID 127
Feb 25 08:58:47 atig0 nvidia-persistenced[1005]: Started (1005)
Feb 25 08:58:47 atig0 nvidia-persistenced[1005]: device 0000:01:00.0 - registered
Feb 25 08:58:47 atig0 nvidia-persistenced[1005]: Local RPC services initialized
Feb 25 08:58:47 atig0 systemd[1]: Started NVIDIA Persistence Daemon.
admin-u@atig0:~$

And the output of "sudo gedit /lib/systemd/system/nvidia-persistenced.service":

[Unit]
Description=NVIDIA Persistence Daemon
Wants=syslog.target
StopWhenUnneeded=true
Before=systemd-backlight@backlight:nvidia_0.service

[Service]
Type=forking
ExecStart=/usr/bin/nvidia-persistenced --user nvidia-persistenced --no-persistence-mode --verbose
ExecStopPost=/bin/rm -rf /var/run/nvidia-persistenced

generix · February 25, 2021, 10:21am

Looks fine.
What’s the gpu utilization during training?

kalagulla111 · February 25, 2021, 10:33am

generix · February 25, 2021, 11:59am

The reason might be the smaller memory size of the 3070 (8 GB versus 11 GB on the 1080 Ti). Furthermore, I guess you should change to fp16 to make use of the Tensor Cores.

kalagulla111 · February 25, 2021, 12:37pm

Can you elaborate on what fp16 is?

generix · February 25, 2021, 3:11pm

Since you’re working with TensorFlow, you should know about that, and about data types and precision in general. fp16 = 16-bit floating point. Though Google just told me that Ampere introduced a new TF32 data type for use on Tensor Cores.
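
To illustrate what 16-bit floating point means in practice, here is a minimal sketch that rounds ordinary Python floats to IEEE 754 half precision (fp16) and single precision (fp32) using only the standard library's `struct` module. The helper names are hypothetical, and this only shows the precision loss; it does not enable fp16 training in TensorFlow.

```python
import struct


def to_fp16(x):
    """Round a Python float to the nearest IEEE 754 half-precision (16-bit) value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_fp32(x):
    """Round a Python float to the nearest IEEE 754 single-precision (32-bit) value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


# fp16 has a 10-bit mantissa, so it keeps only about 3 decimal digits,
# and integers above 2048 can no longer all be represented exactly.
for value in (1.0, 0.1, 2049.0):
    print(f"{value!r}: fp32 -> {to_fp32(value)!r}, fp16 -> {to_fp16(value)!r}")
```

The trade-off is that fp16 halves memory and bandwidth per value at the cost of precision and range. TF32 keeps the 8-bit exponent of fp32, and therefore its range, while using a 10-bit mantissa like fp16.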
