Hello,
I’m getting an error message when I run nvidia-smi (the machine has two NVIDIA GeForce RTX 2080 Ti GPUs):
- Unable to determine the device handle for GPU 0000:0A:00.0: Unknown Error
Also, when I try to start a container with Docker, I get this error:
- docker: Error response from daemon: OCI runtime create failed: container_linux.go:380: starting container process caused: process_linux.go:545: container init caused: Running hook #1:: error running hook: exit status 1, stdout: , stderr: nvidia-container-cli: detection error: nvml error: unknown error: unknown.
I rebooted and started training a deep learning model again, but after two epochs I got the same error. The log file from the bug report is attached.
Thank you in advance for your help!
nvidia-bug-report.log (1).gz (313.1 KB)