Hi,
I am using an RTX A5000 on Ubuntu to run GPU jobs. The jobs were running fine, but I often see the following error messages when connecting to the GPU:
→ nvidia-smi
Failed to initialize NVML: Driver/library version mismatch
NVML library version: 535.129
→ nvidia-detector
nvidia-driver-535
→ nvidia-settings
ERROR: The control display is undefined; please run nvidia-settings --help
for usage information.
If there is a version problem, why do the jobs run for a long time and only hit this mismatch error every now and then?
Could you please help me fix this issue so that I do not run into such problems in the future? Thanks.
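
For anyone looking into this, one plausible explanation (an assumption, not confirmed on this machine) is that a package update replaced the user-space NVIDIA libraries while the older kernel module stayed loaded, which would explain why jobs run fine for a while and then start failing. The sketch below is a rough way to check that by comparing the loaded kernel module version with the installed libnvidia-ml version. The sysfs path and the library directory are assumptions about a typical Ubuntu x86_64 install, and the function names are made up for illustration.

```python
import glob
import re
from pathlib import Path

# Assumed locations on a typical Ubuntu x86_64 install; adjust if yours differ.
MODULE_VERSION_FILE = "/sys/module/nvidia/version"
LIBRARY_GLOB = "/usr/lib/x86_64-linux-gnu/libnvidia-ml.so.*"


def loaded_module_version(path=MODULE_VERSION_FILE):
    """Return the version of the loaded nvidia kernel module, or None if unreadable."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


def library_version_from_name(filename):
    """Extract e.g. '535.129.03' from 'libnvidia-ml.so.535.129.03'.

    Symlinks such as 'libnvidia-ml.so.1' carry no full version and yield None.
    """
    match = re.search(r"libnvidia-ml\.so\.(\d+(?:\.\d+)+)$", filename)
    return match.group(1) if match else None


def installed_library_versions(pattern=LIBRARY_GLOB):
    """Collect the distinct libnvidia-ml versions found on disk."""
    versions = {library_version_from_name(p) for p in glob.glob(pattern)}
    versions.discard(None)
    return sorted(versions)


def diagnose(module_version, library_versions):
    """Classify the state as 'match', 'mismatch', or 'no module loaded'."""
    if module_version is None:
        return "no module loaded"
    if module_version in library_versions:
        return "match"
    return "mismatch"


if __name__ == "__main__":
    module = loaded_module_version()
    libraries = installed_library_versions()
    print("loaded kernel module:", module)
    print("installed libnvidia-ml:", libraries)
    print("result:", diagnose(module, libraries))
```

If this reports a mismatch, the kernel module in memory is older or newer than the libraries on disk. Rebooting, or unloading and reloading the nvidia modules once no job is using the GPU, normally brings the two back in line.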