I have two RTX 3090 GPUs, one from Gigabyte and one from Zotac. I use them to run OpenACC/CUDA jobs on Linux Mint with the NVIDIA HPC SDK 23.1. After an application finishes running, nvidia-smi shows ERR! under fan speed and the GPU gets stuck in the P5 state. The only way to reset it is to restart the system. This error happens on the GPU in slot 0, i.e. the Gigabyte one. Any help?
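For anyone trying to reproduce this, a minimal sketch of how the fan and performance-state readings could be captured after each job is below. It assumes `nvidia-smi` is on the PATH and uses its `--query-gpu` CSV output. The helper names (`parse_rows`, `suspicious`, `query_gpus`) are hypothetical, and because the exact text printed for a failed fan reading in this mode is an assumption, the check simply flags anything that is not a plain percentage.

```python
import subprocess

# Fields requested from nvidia-smi (see `nvidia-smi --help-query-gpu`).
FIELDS = ["index", "name", "fan.speed", "pstate", "temperature.gpu"]


def parse_rows(csv_text):
    """Turn `--format=csv,noheader` output into one dict per GPU."""
    rows = []
    for line in csv_text.strip().splitlines():
        values = [v.strip() for v in line.split(",")]
        if len(values) != len(FIELDS):
            continue  # skip lines that do not match the requested fields
        rows.append(dict(zip(FIELDS, values)))
    return rows


def suspicious(row):
    """Flag a GPU whose fan reading is not a plain percentage such as '30 %'."""
    return not row["fan.speed"].rstrip(" %").isdigit()


def query_gpus():
    """Run nvidia-smi once; return an empty list if it is not installed."""
    cmd = [
        "nvidia-smi",
        "--query-gpu=" + ",".join(FIELDS),
        "--format=csv,noheader",
    ]
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    except FileNotFoundError:
        return []
    return parse_rows(result.stdout)


if __name__ == "__main__":
    for row in query_gpus():
        note = "  <-- fan reading looks wrong" if suspicious(row) else ""
        print(f"GPU {row['index']} ({row['name']}): fan={row['fan.speed']} "
              f"pstate={row['pstate']} temp={row['temperature.gpu']}C{note}")
```

Running this right after a job ends, and again a minute later, would show whether the fan reading fails at the same moment the card stops leaving P5.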