I have 6x RTX 4090s installed on a headless Linux machine with the nvidia-headless-535 driver (535.86.05-0ubuntu0.22.04.1 amd64). However, they are all stuck in the P0 state, and an error is reported for the fan speed, even when no process is running on them.
Besides, the GPU utilization is not reported correctly either: it always shows 0%.
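For anyone who wants to check the same values on their own machine, here is a minimal sketch of reading the performance state, fan speed, and utilization per GPU through `nvidia-smi --query-gpu`. The helper names `parse_gpu_status` and `query_gpu_status` are made up for illustration, and the sketch assumes `nvidia-smi` is on the PATH.

```python
import csv
import io
import subprocess

# Query fields understood by `nvidia-smi --query-gpu`.
FIELDS = ["index", "pstate", "fan.speed", "utilization.gpu"]


def parse_gpu_status(csv_text):
    """Turn `--format=csv,noheader` output into one dict per GPU."""
    rows = []
    for record in csv.reader(io.StringIO(csv_text), skipinitialspace=True):
        if not record:
            continue  # skip blank lines
        rows.append(dict(zip(FIELDS, (value.strip() for value in record))))
    return rows


def query_gpu_status():
    """Run nvidia-smi and return the parsed per-GPU status (hypothetical helper)."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=" + ",".join(FIELDS), "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=True,  # raise if nvidia-smi exits with an error
    )
    return parse_gpu_status(result.stdout)
```

Calling `query_gpu_status()` and printing each entry gives one line per GPU, which makes it easy to see whether every card reports the same performance state and utilization.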
I have attached the report generated by nvidia-bug-report.sh here:
nvidia-bug-report.log.gz (1.8 MB)