Same issue as the others.
Running latest on WSL2, Ubuntu 22.04, CUDA 12.2. 4x RTX 6000 Ada.
Using `export CUDA_VISIBLE_DEVICES=0,1,2,3`
Any combination that includes both GPU 1 and GPU 3 fails.
```
0,1,2,3  fail
0,1,2    pass
0,1,3    fail
0,2,3    pass
1,2,3    fail
0,1      pass
0,2      pass
0,3      pass
1,2      pass
1,3      fail
2,3      pass
```
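The pattern above is easy to state in code. A small helper (hypothetical, just to summarize the observations) that reproduces the table — a combination fails exactly when it contains both GPU 1 and GPU 3:

```python
from itertools import combinations

def combo_fails(devices):
    """Predict the observed outcome for a CUDA_VISIBLE_DEVICES combination:
    every failing combination in the table contains both GPU 1 and GPU 3."""
    return {1, 3} <= set(devices)

# Regenerate the table for all 2- to 4-GPU combinations of devices 0-3.
for r in (4, 3, 2):
    for combo in combinations(range(4), r):
        print(",".join(map(str, combo)), "fail" if combo_fails(combo) else "pass")
```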
See related posts here:
**Related issue (opened 07 Jul 2023):**
### Windows Version
Windows11
### WSL Version
1.2.5.0
### Are you using WSL 1 or WSL 2?
- [X] WSL 2
- [ ] WSL 1
### Kernel Version
5.15.90.1
### Distro Version
Ubuntu (default)
### Other Software
I am writing to report an unexpected behavior I’ve encountered when working with PyTorch and CUDA on a WSL2 system on Windows 11 equipped with multiple NVIDIA RTX 3090 GPUs.
Environment Details:
Operating System: Windows 11
CUDA Version: 12.2
WSL Version: 2
GPUs: 4x NVIDIA RTX 3090
PyTorch Version: 2.0.1 (CUDA 11.8)
Problem Statement: When I set the `CUDA_VISIBLE_DEVICES` environment variable to enable all the GPUs (0,1,2,3) and then run a PyTorch script that calls `torch.cuda.is_available()`, I encounter an “Out of Memory” error. Notably, the error does not occur if I enable only GPU 1, or the combination 0,2,3. Furthermore, the error can be circumvented by calling `torch.cuda.device_count()` before `torch.cuda.is_available()`.
### Repro Steps
Steps to Reproduce:
Set the environment variable: `export CUDA_VISIBLE_DEVICES=0,1,2,3`
Run a Python script that imports PyTorch and calls `torch.cuda.is_available()`
### Expected Behavior
Expected Behavior: `torch.cuda.is_available()` should return True when GPUs are available and accessible.
### Actual Behavior
Observed Behavior: An “Out of Memory” error is triggered internally at `../c10/cuda/CUDAFunctions.cpp:109`, and `torch.cuda.is_available()` returns False.
Workaround: Calling `torch.cuda.device_count()` before `torch.cuda.is_available()` circumvents the error, but it requires modifying each script to add the extra call.
While the workaround is effective, it may be beneficial to investigate and address the root cause of this issue. I wanted to bring this to your attention and look forward to any insights or potential solutions you might provide.
### Diagnostic Logs
_No response_
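The workaround from the issue above is easy to package as a drop-in wrapper. A sketch (the function name `safe_cuda_is_available` is mine, not from the issue; it also degrades gracefully when PyTorch is not installed, so the snippet runs anywhere):

```python
def safe_cuda_is_available():
    """Apply the reported workaround: call torch.cuda.device_count()
    before torch.cuda.is_available() to sidestep the spurious
    out-of-memory error seen on multi-GPU WSL2 setups.
    Returns False if torch is not installed."""
    try:
        import torch
    except ImportError:
        return False
    # device_count() first: per the report, this forces full device
    # enumeration and avoids the failing init path in is_available().
    torch.cuda.device_count()
    return torch.cuda.is_available()

print("CUDA available:", safe_cuda_is_available())
```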
**Related issue (opened 21 Dec 2021):**
### Version
Microsoft Windows [Version 10.0.22000.376]
### WSL Version
- [X] WSL 2
- [ ] WSL 1
### Kernel Version
5.10.60.1
### Distro Version
Ubuntu 20.04 and Ubuntu 18.04
### Other Software
CPU: Intel(R) Core(TM) i9-9900X
GPU: 4x NVIDIA Titan RTX (driver 510.06)
RAM: 128GB
### Repro Steps
Install CUDA on WSL
```
sudo apt-key adv --fetch-keys http://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/7fa2af80.pub
sudo sh -c 'echo "deb http://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64 /" > /etc/apt/sources.list.d/cuda.list'
sudo apt-get update
sudo apt-get install -y cuda-toolkit-11-0
```
Run samples
```
cd /usr/local/cuda-11.0/samples/4_Finance/BlackScholes
sudo make
./BlackScholes
```
```
cd /usr/local/cuda-11.0/samples/1_Utilities/deviceQuery
sudo make
./deviceQuery
```
### Expected Behavior
Both samples run successfully.
### Actual Behavior
```
[./BlackScholes] - Starting...
CUDA error at ../../common/inc/helper_cuda.h:777 code=2(cudaErrorMemoryAllocation) "cudaGetDeviceCount(&device_count)"
```
```
./deviceQuery Starting...
CUDA Device Query (Runtime API) version (CUDART static linking)
cudaGetDeviceCount returned 2
-> out of memory
Result = FAIL
```
### Diagnostic Logs
_No response_
I’ve also found an odd partial workaround, which suggests this is some kind of driver-level initialization issue. It doesn’t work reliably, but it can let you query the system’s GPU configuration: toggling GPU performance counter access to unrestricted, applying the setting, and then switching it back allows the GPU enumeration calls (e.g. `cudaGetDeviceCount`) to complete.