Multiple versions of CUDA side by side in the same environment with Python 3.7 and Ubuntu 22.04

alexeik.wlu · November 8, 2023, 9:05pm

I am running a two-year-old program from GitHub that only works with Python 3.7 (it does not work with Python 3.8, 3.9, or 3.10). It uses tensorflow, torch, and spacy, all with GPU support, plus many other modules. I was able to run the program without a GPU. On a machine without GPU hardware, with torch=1.13.1 and TF=2.9.0, it warns that CUDA is not available but otherwise runs without errors and produces correct results.

I spent a week trying to make it work with the GPU. With Python 3.7, TF is limited to at most CUDA 11.2 and cuDNN 8.1, yet there is no torch+cu112 build. That means I need two different versions of CUDA at the same time.
conda install -c conda-forge cudatoolkit=11.2.2 cudnn=8.1.0 # for TF and Spacy
pip install spacy[cuda112]
pip install torch==1.12.1+cu113 torchvision==0.13.1+cu113 torchaudio==0.12.1 --extra-index-url https://download.pytorch.org/whl/cu113
(py37) $ python -c "import torch; print(f'PyTorch version {torch.__version__} has CUDA : {torch.cuda.is_available()}')"
PyTorch version 1.12.1+cu113 has CUDA : True
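The one-liner above only shows that PyTorch can see a GPU. A slightly fuller report, sketched below, also prints the CUDA version the wheel was built with and the GPU architectures it ships kernels for. The function name `torch_cuda_report` is made up for this illustration; the `torch` calls themselves are standard PyTorch APIs.

```python
def torch_cuda_report():
    """Return PyTorch's view of CUDA, or None if torch is not installed."""
    try:
        import torch
    except ImportError:
        return None
    report = {
        "torch": torch.__version__,
        "built_with_cuda": torch.version.cuda,  # None for CPU-only builds
        "cuda_available": torch.cuda.is_available(),
    }
    if report["cuda_available"]:
        report["device"] = torch.cuda.get_device_name(0)
        report["capability"] = torch.cuda.get_device_capability(0)  # (major, minor)
        report["arch_list"] = torch.cuda.get_arch_list()  # e.g. names like 'sm_86'
    return report


print(torch_cuda_report())
```

Seeing `built_with_cuda` report 11.3 next to a conda `cudatoolkit=11.2.2` is not necessarily a conflict: the `+cu113` pip wheels bundle their own CUDA runtime libraries, so PyTorch does not have to pick up the conda toolkit.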

When I run the program, I get various torch errors, depending on the torch+cuda version. For example:

RuntimeError: CUDA out of memory. Tried to allocate 148.00 MiB (GPU 0; 15.71 GiB total capacity; 1.33 GiB already allocated; 50.50 MiB free; 1.45 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
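The numbers in this message can be checked with simple arithmetic. The sketch below (the helper names are hypothetical) subtracts PyTorch's reserved memory and the free memory from the total, which leaves the amount held by something other than PyTorch's allocator. One possible explanation, an assumption not confirmed in this thread, is that TensorFlow in the same process has claimed the rest: by default TensorFlow maps nearly all GPU memory at start-up unless memory growth is enabled.

```python
def memory_outside_pytorch_gib(total_gib, reserved_gib, free_mib):
    """GPU memory that is neither free nor reserved by PyTorch, in GiB."""
    return total_gib - reserved_gib - free_mib / 1024.0


# Figures copied from the error message above.
print(round(memory_outside_pytorch_gib(15.71, 1.45, 50.50), 2))  # 14.21


def enable_tf_memory_growth():
    """Ask TensorFlow to allocate GPU memory on demand instead of up front.

    Must run before TensorFlow initializes the GPUs (otherwise TensorFlow
    raises RuntimeError). Returns the number of GPUs configured, or 0 if
    TensorFlow is not installed.
    """
    try:
        import tensorflow as tf
    except ImportError:
        return 0
    gpus = tf.config.list_physical_devices("GPU")
    for gpu in gpus:
        tf.config.experimental.set_memory_growth(gpu, True)
    return len(gpus)
```

If TensorFlow is indeed the holder, calling `enable_tf_memory_growth()` at the very top of the program, before any model is built, may leave room for PyTorch on the same card.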

In addition to torch==1.12.1+cu113, I tried other torch+cuda versions. Some of them install fine and appear to recognize my GPU, but all fail with various torch errors when I run the program.

I am aware that Python 3.7 is no longer officially supported, yet I hope to get the GPU working. Is running without a GPU my only option?

Is it a good idea in principle to run multiple versions of CUDA side by side? Would I be better off running a single CUDA version, 11.0 with cuDNN 8.0, for both TF and torch (and spacy)?
conda install -c conda-forge -c nvidia cudatoolkit=11.0 cudnn=8.0 # for TF and Spacy
pip install spacy[cuda110]

pip install torch==1.*+cu110 torchvision==*+cu110 torchaudio==* --extra-index-url https://download.pytorch.org/whl/cu110

Collecting environment information…
PyTorch version: 1.12.1+cu113
CUDA used to build PyTorch: 11.3
ROCM used to build PyTorch: N/A

OS: Ubuntu 22.04.3 LTS (x86_64)
GCC version: (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
Libc version: glibc-2.10

Python version: 3.7.12 | packaged by conda-forge | (default, Oct 26 2021, 06:08:21) [GCC 9.4.0] (64-bit runtime)
Python platform: Linux-6.2.0-36-generic-x86_64-with-debian-bookworm-sid
Is CUDA available: True
CUDA runtime version: Could not collect
CUDA_MODULE_LOADING set to:
GPU models and configuration: GPU 0: NVIDIA GeForce RTX 4060 Ti=16Gb
Nvidia driver version: 535.129.03
cuDNN version: Could not collect
Is XNNPACK available: True

CPU:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 46 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 24
On-line CPU(s) list: 0-23
Vendor ID: GenuineIntel
Model name: 12th Gen Intel(R) Core™ i9-12900K
CPU family: 6
Model: 151
Thread(s) per core: 2
Core(s) per socket: 16
Socket(s): 1
Stepping: 2
CPU max MHz: 5200.0000
CPU min MHz: 800.0000
BogoMIPS: 6374.40

Virtualization: VT-x
L1d cache: 640 KiB (16 instances)
L1i cache: 768 KiB (16 instances)
L2 cache: 14 MiB (10 instances)
L3 cache: 30 MiB (1 instance)
NUMA node(s): 1
NUMA node0 CPU(s): 0-23

Versions of relevant libraries:
[pip3] numpy==1.21.6
[pip3] torch==1.12.1+cu113
[pip3] torchaudio==0.12.1+cu113
[pip3] torchvision==0.13.1+cu113
[conda] cudatoolkit 11.2.2 hc23eb0c_12 conda-forge
[conda] numpy 1.21.6 pypi_0 pypi
[conda] torch 1.12.1+cu113 pypi_0 pypi
[conda] torchaudio 0.12.1+cu113 pypi_0 pypi
[conda] torchvision 0.13.1+cu113 pypi_0 pypi

rs277 · November 8, 2023, 11:53pm

You might be in an unfortunate situation then, as the 4060 (Ada) requires a minimum of CUDA 11.8.
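For context, CUDA 11.8 is the first toolkit that can compile for Ada's compute capability 8.9 (sm_89). CUDA's general compatibility rules still let older binaries run in two cases: a binary built for sm_8x can execute on a device with the same major version and an equal or higher minor version, and embedded PTX (listed as compute_XX) can be JIT-compiled by the driver for newer devices. The sketch below encodes just those two rules. The function name `can_run` and the example architecture lists are hypothetical and are not taken from any particular wheel.

```python
def can_run(capability, arch_list):
    """True if arch_list contains code a device of `capability` could execute.

    capability: (major, minor) tuple, e.g. (8, 9) for Ada.
    arch_list:  names such as 'sm_86' (binary) or 'compute_80' (PTX).
    """
    major, minor = capability
    device = major * 10 + minor
    for arch in arch_list:
        kind, _, number = arch.partition("_")
        if not number.isdigit():
            continue
        target = int(number)
        # Binary code: same major version, minor not newer than the device.
        if kind == "sm" and target // 10 == major and target <= device:
            return True
        # PTX: can be JIT-compiled for any equal or newer device.
        if kind == "compute" and target <= device:
            return True
    return False


print(can_run((8, 9), ["sm_70", "sm_80", "sm_86"]))  # True
print(can_run((8, 9), ["sm_70", "sm_75"]))           # False
```

Running at all is not the same as running well: code built for an older target cannot use features or tuning specific to Ada.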

alexeik.wlu · November 9, 2023, 2:55am

conda install -c conda-forge cudatoolkit=11.2.2 cudnn=8.1.0

pip install tensorflow==2.9.0

python3 -c "import tensorflow as tf; print(tf.reduce_sum(tf.random.normal([1000, 1000])))"
tf.Tensor(-100.2298, shape=(), dtype=float32)

python3 -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"
[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]

Does the above confirm that "GeForce RTX 4060 Ti=16Gb" works with CUDA 11.2.2?
There is also a newer CUDA 11.2.72 available from the conda channels.

How can I check which CUDA version TF is actually using when multiple CUDA versions are present on the system?
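One way to approach this question is sketched below, under two assumptions: TensorFlow 2.3 or newer, where `tf.sysconfig.get_build_info()` is available, and Linux, where `/proc/self/maps` lists the shared libraries loaded into the current process. The build info reports the versions TensorFlow was compiled against, while the maps file shows which library files were actually loaded. Both function names are made up for this example.

```python
import os


def tf_cuda_build_info():
    """CUDA/cuDNN versions TensorFlow was built against, or None without TF."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    info = tf.sysconfig.get_build_info()
    # Listing GPUs makes TensorFlow try to load the CUDA libraries.
    gpus = tf.config.list_physical_devices("GPU")
    return {
        "tensorflow": tf.__version__,
        "cuda_version": info.get("cuda_version"),
        "cudnn_version": info.get("cudnn_version"),
        "gpus": [gpu.name for gpu in gpus],
    }


def loaded_cuda_libraries(maps_path="/proc/self/maps"):
    """Sorted paths of CUDA libraries mapped into this process (Linux only)."""
    if not os.path.exists(maps_path):
        return []
    wanted = ("libcudart", "libcudnn", "libcublas")
    found = set()
    with open(maps_path) as maps:
        for line in maps:
            fields = line.rstrip("\n").split(None, 5)
            # The sixth field, when present, is the mapped file's path.
            if len(fields) == 6 and any(
                name in os.path.basename(fields[5]) for name in wanted
            ):
                found.add(fields[5])
    return sorted(found)


print(tf_cuda_build_info())
print(loaded_cuda_libraries())
```

The paths printed by `loaded_cuda_libraries()` usually carry the version in the file name and show whether the library came from the conda environment or from somewhere else on the system.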

rs277 · November 9, 2023, 4:06am

I'm not familiar with tensorflow, and my comment above was based on 11.8 being the first Toolkit to support your card. If tensorflow is using the GPU (check in nvidia-smi), then it may not be taking full advantage of the features available in your GPU.
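The nvidia-smi check can also be scripted. The sketch below wraps nvidia-smi's `--query-compute-apps` option to list the processes that currently hold a compute context on the GPU. The function name `gpu_compute_processes` is hypothetical, and the function returns None when nvidia-smi is missing or fails.

```python
import shutil
import subprocess


def gpu_compute_processes():
    """List 'pid, process_name, used_memory' rows reported by nvidia-smi."""
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        [
            "nvidia-smi",
            "--query-compute-apps=pid,process_name,used_memory",
            "--format=csv,noheader",
        ],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    if result.returncode != 0:
        return None
    return [row.strip() for row in result.stdout.splitlines() if row.strip()]


print(gpu_compute_processes())
```

Run it from a second terminal while the TensorFlow program is busy; if that program's Python process appears in the list, it has a context on the GPU.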
