535.86.05 issue with 4090. 4090 stuck in P0 state, Fan speed shows ERR!

I have 6x RTX 4090s installed on a headless Linux machine with the driver nvidia-headless-535 (535.86.05-0ubuntu0.22.04.1 amd64). However, they are all stuck in the P0 state and report an error for the fan speed, even with no process running on them.

In addition, GPU utilization is not reported correctly either: it always shows 0%.
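
For anyone who wants to check for the same symptoms without reading the full table, here is a minimal sketch that reads the relevant fields through `nvidia-smi --query-gpu`. The helper names (`query_gpu_status`, `parse_gpu_status`, `find_suspicious`) are made up for illustration, and treating any non-numeric fan reading as an error is an assumption, since the exact text printed for a failed sensor may differ between driver versions.

```python
import subprocess

# Fields requested from nvidia-smi; the order must match the parsing below.
FIELDS = ["index", "name", "pstate", "fan.speed", "utilization.gpu"]


def query_gpu_status():
    """Run nvidia-smi and return its raw CSV output (needs the NVIDIA driver)."""
    cmd = [
        "nvidia-smi",
        "--query-gpu=" + ",".join(FIELDS),
        "--format=csv,noheader,nounits",
    ]
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout


def parse_gpu_status(csv_text):
    """Turn the CSV text into one dict per GPU, flagging unreadable fan values."""
    rows = []
    for line in csv_text.strip().splitlines():
        values = [v.strip() for v in line.split(",")]
        if len(values) != len(FIELDS):
            continue  # skip lines that do not have the expected shape
        row = dict(zip(FIELDS, values))
        # Assumption: a healthy fan reading is a plain integer percentage;
        # anything else (an error or N/A marker) counts as unreadable.
        row["fan_ok"] = row["fan.speed"].isdigit()
        rows.append(row)
    return rows


def find_suspicious(rows):
    """Return the indices of GPUs that sit in P0 with an unreadable fan speed."""
    return [r["index"] for r in rows if r["pstate"] == "P0" and not r["fan_ok"]]
```

On a machine with the driver installed, `find_suspicious(parse_gpu_status(query_gpu_status()))` should list the affected GPU indices. Run it while no jobs are active, since P0 is expected under load.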

I attached the report generated by nvidia-bug-report.sh here.
nvidia-bug-report.log.gz (1.8 MB)

It looks like the motherboard cannot properly handle this many PCIe 4.0 devices. I switched to PCIe 3.0 and the problem was solved.
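
To confirm which PCIe generation the cards actually negotiate after a change like this, a rough sketch along these lines may help. It relies on the `pcie.link.gen.current` and `pcie.link.gen.max` query fields of `nvidia-smi`; the function names are hypothetical. Note that the current generation can drop while a GPU idles in a low-power state, so the check below looks at the reported maximum instead.

```python
import subprocess

# index, currently negotiated PCIe generation, maximum generation reported
PCIE_FIELDS = ["index", "pcie.link.gen.current", "pcie.link.gen.max"]


def query_pcie_links():
    """Run nvidia-smi and return its raw CSV output (needs the NVIDIA driver)."""
    cmd = [
        "nvidia-smi",
        "--query-gpu=" + ",".join(PCIE_FIELDS),
        "--format=csv,noheader,nounits",
    ]
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout


def parse_pcie_links(csv_text):
    """Map GPU index -> (current generation, maximum generation)."""
    links = {}
    for line in csv_text.strip().splitlines():
        values = [v.strip() for v in line.split(",")]
        # Skip malformed lines and lines where a value is not a plain integer.
        if len(values) != len(PCIE_FIELDS) or not all(v.isdigit() for v in values):
            continue
        index, current, maximum = map(int, values)
        links[index] = (current, maximum)
    return links


def all_capped_at(links, generation):
    """True if at least one GPU was parsed and none reports a max above `generation`."""
    return bool(links) and all(maximum <= generation for _, maximum in links.values())
```

For example, `all_capped_at(parse_pcie_links(query_pcie_links()), 3)` would be expected to return `True` once every slot is limited to PCIe 3.0, assuming the driver reports the limited generation in `pcie.link.gen.max`.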
