Hi all, I am new to AI, and the company where I am interning gave me an older Tesla K40m. I want to train my model and run inference on it. What I have is a server with Server 2016 installed. I have installed the driver, and running nvidia-smi shows information about the GPU. I want to use it with PyTorch and TensorFlow. I installed CUDA 12.0.1 along with cuDNN, but unfortunately neither TensorFlow nor PyTorch detects the GPU. What could be the reason? Can anyone give me proper guidance?
By Server 2016, do you mean Windows Server?
PyTorch with CUDA on Windows is only supported through the packages supplied by PyTorch themselves; check out their download and installation instructions. As far as I can see, they only support up to CUDA 11.8.
How did you determine that TensorFlow did not work? It might still need a specific version with built-in GPU support, but then you can check with
import tensorflow as tf
print("Num GPUs Available: ", len(tf.config.list_physical_devices('GPU')))
if your GPU is detected.
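If that reports 0 GPUs, it can help to see whether the installed TensorFlow binary was built with CUDA at all. Here is a rough sketch, assuming TensorFlow 2.x; the helper name diagnose is just something I made up for illustration and is not part of TensorFlow:

```python
from typing import List


def diagnose(built_with_cuda: bool, gpu_names: List[str]) -> str:
    """Turn the two facts TensorFlow reports into a one-line diagnosis."""
    if not built_with_cuda:
        return "CPU-only TensorFlow build: install a GPU-enabled version"
    if not gpu_names:
        return "GPU build, but no GPU visible: check driver/CUDA/cuDNN versions"
    return "GPU build, devices: " + ", ".join(gpu_names)


if __name__ == "__main__":
    try:
        import tensorflow as tf
    except ImportError:
        print("TensorFlow is not installed in this environment")
    else:
        # Both calls below exist in TensorFlow 2.x.
        gpus = tf.config.list_physical_devices("GPU")
        print(diagnose(tf.test.is_built_with_cuda(), [g.name for g in gpus]))
```

A CPU-only build will never list a GPU, whatever driver or CUDA toolkit is installed, so that case is worth ruling out first.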
Thank you for the response. Yes, it is Windows Server 2016. What I tried today: I installed CUDA 11.2.0 with cuDNN 8.1.0 and driver 461.33. I installed TensorFlow 2.10 and it detected my GPU. Here is the output:
2023-07-11 20:15:29.076862: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-07-11 20:15:33.702636: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1616] Created device /device:GPU:0 with 10807 MB memory: -> device: 0, name: Tesla K40m, pci bus id: 0000:42:00.0, compute capability: 3.5
, name: "/device:GPU:0"
physical_device_desc: "device: 0, name: Tesla K40m, pci bus id: 0000:42:00.0, compute capability: 3.5"
Now the problem is that torch.device('cuda' if torch.cuda.is_available() else 'cpu') does not detect CUDA. My torch version is 2.0.1+cpu. What could be the reason? What is the best combination of CUDA, cuDNN, and driver? I am only using the CLI, no IDE; my server is offline, and the Python version I am using is 3.9.0.
You should follow the instructions on the PyTorch pages I linked above; I can repeat them here:
That should give you a working environment.
If you are using Conda, make sure to start from a fresh Conda environment to avoid any possible residual driver/library mismatches.
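To confirm which build you ended up with, something like the sketch below may help. A version string ending in "+cpu" (like the 2.0.1+cpu you mentioned) is a CPU-only build, so torch.cuda.is_available() will return False regardless of which CUDA toolkit is installed. The helper names is_cpu_only_build and arch_supported are mine, not part of PyTorch, and arch_supported assumes the architecture list uses names such as "sm_35", which is what torch.cuda.get_arch_list() returns as far as I know.

```python
from typing import List, Optional, Tuple


def is_cpu_only_build(version: str, cuda_version: Optional[str]) -> bool:
    """True if the version tag or the missing CUDA version marks a CPU build."""
    return version.endswith("+cpu") or cuda_version is None


def arch_supported(capability: Tuple[int, int], arch_list: List[str]) -> bool:
    """True if the build ships kernels for this exact compute capability."""
    major, minor = capability
    return f"sm_{major}{minor}" in arch_list


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed in this environment")
    else:
        print("torch version:", torch.__version__)
        print("built against CUDA:", torch.version.cuda)
        if is_cpu_only_build(torch.__version__, torch.version.cuda):
            print("CPU-only build: reinstall a CUDA-enabled package")
        elif not torch.cuda.is_available():
            print("CUDA build, but no usable GPU: check the driver version")
        else:
            cap = torch.cuda.get_device_capability(0)
            print("device:", torch.cuda.get_device_name(0), "capability:", cap)
            if not arch_supported(cap, torch.cuda.get_arch_list()):
                print("this build may have no kernels compiled for your GPU")
```

Your TensorFlow log shows the K40m as compute capability 3.5, so once a CUDA-enabled PyTorch is installed it is worth checking that last message too.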