Hello,
I’m trying to set up a specific environment on my university’s HPC, which restricts sudo access. The HPC has Python >=3.9 and CUDA >=11.7. For my project, I need Python 3.6 and PyTorch 0.4.1, compatible with CUDA 9.2 and cuDNN 7.2.1.
What I’ve done:
- Created a conda environment with Python 3.6.
- Installed cudatoolkit=9.2 and cudnn=7.2.1.
- Installed PyTorch 0.4.1 using `conda install pytorch=0.4.1 cuda92 -c pytorch`.
Issues:
- When installing PyTorch 0.4.1 in this env, I got environment conflicts, so I created a Python venv inside the conda env and installed 0.4.1 using pip.
- When running `nvcc --version`, it shows CUDA 9.2. `torch.version.cuda` also shows 9.2, which is good, but `torch.backends.cudnn.version()` returns 7.1 instead of 7.2.1.
- During training, I encounter the error: `RuntimeError: CuDNN error: CUDNN_STATUS_EXECUTION_FAILED`.
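To make the checks above easier to reproduce, here is a minimal sketch of a script that collects what the imported torch build reports. The helper names `decode_cudnn_version` and `collect_versions` are made up for illustration, and the decoding assumes the cuDNN 7.x integer encoding (major*1000 + minor*100 + patch, so 7.2.1 would appear as 7201).

```python
import importlib


def decode_cudnn_version(raw):
    """Split a cuDNN 7.x version integer into (major, minor, patch).

    Assumes the encoding major*1000 + minor*100 + patch.
    """
    if raw is None:
        return None
    major, rest = divmod(int(raw), 1000)
    minor, patch = divmod(rest, 100)
    return (major, minor, patch)


def collect_versions():
    """Return what the imported torch build reports (None values if torch is missing)."""
    info = {
        "torch": None,
        "torch_path": None,
        "cuda": None,
        "cudnn": None,
        "gpu_available": None,
    }
    try:
        torch = importlib.import_module("torch")
    except ImportError:
        return info
    info["torch"] = torch.__version__
    info["torch_path"] = torch.__file__  # shows which environment torch is loaded from
    info["cuda"] = torch.version.cuda
    info["cudnn"] = decode_cudnn_version(torch.backends.cudnn.version())
    info["gpu_available"] = torch.cuda.is_available()
    return info


if __name__ == "__main__":
    for key, value in collect_versions().items():
        print(f"{key}: {value}")
```

The `torch_path` entry should also show whether the interpreter picks up torch from the venv or from the surrounding conda env.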
Questions:
- Why is cuDNN version 7.1 instead of 7.2.1?
- How can I correctly set up the environment to avoid conflicts?
- How can I resolve the cuDNN error?
Thank you for your help!