Hi there! Problem: the card consistently crashes after about 2-3 days of normal operation (regardless of whether it is in use or not). System spec: CentOS 7.9.2009, driver version: 460.107, CUDA 11.2. I checked the PCI connection and the power supply cables: all OK. I can't find any errors or XIDs in nvidia-bug-report. Please help.
nvidia-bug-report.log.gz (456.3 KB)
NVRM: GPU 0000:02:00.0: RmInitAdapter failed! (0x26:0xffff:1290)
This seems to be a headless system?
Please install/enable the nvidia-persistenced daemon to start on boot.
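For anyone landing here with the same error, a rough sketch of what that looks like on a systemd host such as CentOS 7. The unit name `nvidia-persistenced` is an assumption based on how the driver packages usually ship it (check with `systemctl list-unit-files | grep -i nvidia`), and the two function names are just illustrative helpers, not part of any NVIDIA tool:

```shell
#!/bin/sh
# Sketch only: assumes systemd and a unit called nvidia-persistenced
# (the unit name may differ depending on how the driver was installed).

# Hypothetical helper: exits 0 if the text on stdin contains the
# RmInitAdapter failure quoted above, non-zero otherwise.
has_rminit_failure() {
    grep -q 'RmInitAdapter failed'
}

# Hypothetical helper: enable the persistence daemon (needs root).
enable_persistence() {
    # Start the daemon on every boot.
    systemctl enable nvidia-persistenced
    # Start it right now, without rebooting.
    systemctl start nvidia-persistenced
    # Ask the driver for the current persistence mode of each GPU.
    nvidia-smi --query-gpu=persistence_mode --format=csv,noheader
}
```

Run as root, something like `dmesg | has_rminit_failure && enable_persistence` only touches the service if the failure actually shows up in the kernel log. The last `nvidia-smi` call should then report the persistence mode as enabled for each GPU.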
Yep! Persistence mode solved the problem, thanks!