Problem Summary:
I run into an issue when starting a container in a remote environment with NVIDIA AI Workbench after building it. The base environment used:
PyTorch
A Pytorch 2.1 Base with CUDA 12.2
v1.0.2 | Ubuntu 22.04 | Python3
Upon launching the environment, I receive the following error message:
No GPUs Available
Not enough GPU resources are available. You can continue without GPUs. Additionally, you can cancel to manually stop projects to free up GPU resources.
This error occurs even though GPUs are available on the server. The output of the nvidia-smi command confirms that both GPUs (NVIDIA GeForce RTX 3060 and GeForce RTX 3060 Ti) are detected and show minimal memory usage:
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.107.02             Driver Version: 550.107.02     CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 3060        Off |   00000000:01:00.0 Off |                  N/A |
|  0%   39C    P8             18W /  170W |       2MiB /  12288MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA GeForce RTX 3060 Ti     Off |   00000000:04:00.0 Off |                  N/A |
|  0%   59C    P8             21W /  200W |       2MiB /   8192MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
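For anyone who wants to capture the same host-side figures in a repeatable way, here is a minimal sketch that reads them through nvidia-smi's CSV query mode. The helper names `parse_gpu_csv` and `query_gpus` are made up for this illustration, and the script only reports what the driver sees on the host. It does not reflect how AI Workbench itself counts available GPUs.

```python
import csv
import io
import shutil
import subprocess

# Fields requested from nvidia-smi; memory values are in MiB with "nounits".
QUERY_FIELDS = "index,name,memory.used,memory.total"


def parse_gpu_csv(text):
    """Parse output of `nvidia-smi --query-gpu=... --format=csv,noheader,nounits`."""
    gpus = []
    for row in csv.reader(io.StringIO(text), skipinitialspace=True):
        if len(row) != 4:
            continue  # skip blank or unexpected lines
        index, name, used, total = (field.strip() for field in row)
        try:
            gpus.append(
                {
                    "index": int(index),
                    "name": name,
                    "used_mib": int(used),
                    "total_mib": int(total),
                }
            )
        except ValueError:
            continue  # a field was not numeric (for example "[N/A]")
    return gpus


def query_gpus():
    """Return the GPUs reported by nvidia-smi, or an empty list if unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY_FIELDS}", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return []
    return parse_gpu_csv(result.stdout)


if __name__ == "__main__":
    for gpu in query_gpus():
        print(f"GPU {gpu['index']}: {gpu['name']} "
              f"({gpu['used_mib']} / {gpu['total_mib']} MiB used)")
```

Running it on the remote host should list the same two cards as the table above if the driver is healthy.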
Interestingly, when I run the container directly from the terminal, it recognizes the GPUs without any problem. This suggests the issue lies in how GPU resources are detected or allocated specifically within NVIDIA AI Workbench.
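A quick way to compare the two launch paths is to run the same check inside the container in both cases. The following is a minimal sketch, assuming PyTorch is installed in the container (as it is in the PyTorch base). The function name `describe_visible_gpus` is hypothetical, and the script only shows what PyTorch can see from inside the container, not why AI Workbench reports no GPUs.

```python
def describe_visible_gpus():
    """Report which CUDA devices PyTorch can see in the current environment."""
    try:
        import torch  # assumed to be present in the PyTorch base container
    except ImportError:
        return {"torch_installed": False, "cuda_available": False, "devices": []}

    available = torch.cuda.is_available()
    count = torch.cuda.device_count() if available else 0
    return {
        "torch_installed": True,
        "cuda_available": available,
        "devices": [torch.cuda.get_device_name(i) for i in range(count)],
    }


if __name__ == "__main__":
    report = describe_visible_gpus()
    print("PyTorch installed:", report["torch_installed"])
    print("CUDA available:   ", report["cuda_available"])
    for i, name in enumerate(report["devices"]):
        print(f"  device {i}: {name}")
```

If the device list is populated when the container is started from the terminal but empty when it is started by AI Workbench, that would point to the GPUs not being passed through at launch rather than to a driver problem.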
I would appreciate any assistance in troubleshooting this issue to ensure the GPUs are available for the container in the AI Workbench environment.
Please tick the appropriate box to help us categorize your post
[0] Bug or Error
Feature Request
Documentation Issue
Other
logs.txt (7.9 KB)