GPU won't boost as high as previously [Driver 460.32.03]

Hi, I’m using a laptop with an NVIDIA GTX 1070 Max-Q inside as a secondary GPU (MSI GS65 Stealth Thin 8RF to be exact) running Arch Linux.

I’m working on a visualizer for high-resolution medical images which utilizes a raytracing-like method to approximate a voxel grid visualization using a proxy mesh to accelerate the raycasting. As such, I need to squeeze all the performance I can get out of my GPU.

Over the weekend, I updated my system (the last known NVIDIA driver was 455.XX, last updated at the beginning of December if I’m not mistaken) to the latest driver, 460.32.03 (2nd revision according to pacman, which reports the version number as 460.32.03-2). I immediately noticed a slight drop in performance for my application, and my whole system is extremely laggy as well (at least 0.25 seconds of latency for mouse movement when the OpenGL application is presenting, a bit less if it’s in the background).

According to nvidia-smi, the GPU won’t draw more than 15-17W even though the laptop is plugged in (it used to draw upwards of 30W under heavy load). Was there any change in the power governor between 455.XX and 460.32? Could I set a variable (in nvidia-settings maybe?) to make it use as much power as before when plugged in?
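For anyone wanting to log these readings rather than eyeball them, here is a minimal sketch that wraps nvidia-smi's CSV query mode. The helper names `parse_gpu_line` and `query_gpu` are made up for illustration; the three fields are standard `--query-gpu` properties, though `power.draw` may come back as `[N/A]` on some laptop GPUs.

```python
import subprocess

# Properties requested from nvidia-smi (see `nvidia-smi --help-query-gpu`).
FIELDS = ["power.draw", "temperature.gpu", "pstate"]


def parse_gpu_line(line):
    """Parse one CSV line such as '16.42, 49, P2' into a dict of strings."""
    values = [v.strip() for v in line.split(",")]
    return dict(zip(FIELDS, values))


def query_gpu():
    """Run nvidia-smi once and return one dict of readings per GPU."""
    result = subprocess.run(
        [
            "nvidia-smi",
            "--query-gpu=" + ",".join(FIELDS),
            "--format=csv,noheader,nounits",
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return [parse_gpu_line(l) for l in result.stdout.splitlines() if l.strip()]
```

Calling `query_gpu()` every second or so while the application is rendering should show whether the power draw really plateaus around 15-17W and which performance state the driver reports at that moment.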

Could this regression in power usage be caused by an external program? I’ve used tlp for as long as I’ve had this laptop to manage battery life, but

nvidia-settings->Powermizer
Make sure AC power is detected and check performance setting Auto/Adaptive/Performance
Please run nvidia-bug-report.sh as root and attach the resulting nvidia-bug-report.log.gz file to your post.

I’ve just tried setting the Powermizer to prefer performance, and it hasn’t changed the power usage reported by nvidia-smi. The GPU still won’t draw more than 15-17W. Attached is a bug report I generated right after closing my application.

EDIT: Disabling my compositor (picom) did improve the performance of the system slightly, although it didn’t change the power usage.

nvidia-bug-report.log.gz (469.4 KB)

        SW Thermal Slowdown               : Active

Please check your GPU temperature using nvidia-settings or nvidia-smi after a fresh boot.
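As a rough way to check for the throttle flag quoted above without generating a full bug report, the sketch below scans the output of `nvidia-smi -q -d PERFORMANCE` for entries marked `Active`. The function names are hypothetical, and the parsing assumes the `Name : Value` layout seen in the log excerpt.

```python
import subprocess


def active_throttle_reasons(report):
    """Return the names of all entries whose value is exactly 'Active'."""
    reasons = []
    for line in report.splitlines():
        # Split on the first colon only: 'SW Thermal Slowdown   : Active'.
        name, sep, value = line.partition(":")
        if sep and value.strip() == "Active":
            reasons.append(name.strip())
    return reasons


def performance_report():
    """Fetch the performance section of the nvidia-smi query output."""
    result = subprocess.run(
        ["nvidia-smi", "-q", "-d", "PERFORMANCE"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Running `active_throttle_reasons(performance_report())` right after a fresh boot and again under load would show whether `SW Thermal Slowdown` is already set while the GPU is still cool.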

Before shutdown, the GPU was at 48-50°C (the slowdown temperature is 94°C according to nvidia-settings). After leaving the laptop shut down during my lunch break, I turned it back on and the GPU was at 38-40°C while idle.

Tried to run my program again, this time the GPU can boost up the way it used to. Temperatures are now in the 70-80°C range under full load, and it can consume up to 60-70W. Seems a reboot was all it needed (weird, since I rebooted this morning before the issue appeared).

Maybe Powermizer checked the wrong temperature probe to initiate a slowdown?

Anyway, I’ll mark this as solved, and open it back up if the issue comes up again. Thank you for your help.

@generix Same issue happened once again this morning.

It appears that if I start my X session on battery, the GPU stays locked on performance level P0 (or P1), no matter which power state is set afterwards. However, (re)starting the X server while plugged in allows it to boost up to P4 (the maximum performance level for my card according to nvidia-settings).
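To rule this out before launching X, a small check of the kernel's power-supply state could look like the sketch below. It assumes the usual Linux sysfs layout, where each entry under `/sys/class/power_supply` has a `type` file (`Mains` for an AC adapter) and AC adapters expose an `online` file; the function name `ac_online` is made up for illustration.

```python
from pathlib import Path


def ac_online(root="/sys/class/power_supply"):
    """Return True if any supply of type 'Mains' reports online == 1."""
    base = Path(root)
    if not base.is_dir():
        return False
    for supply in base.iterdir():
        type_file = supply / "type"
        online_file = supply / "online"
        # Batteries usually have no 'online' file, so they are skipped here.
        if not (type_file.is_file() and online_file.is_file()):
            continue
        if (type_file.read_text().strip() == "Mains"
                and online_file.read_text().strip() == "1"):
            return True
    return False
```

This only reports the AC state at the moment it is called. Whether the driver really latches that state when the X server starts is the hypothesis from this thread, not something the sketch can confirm.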

If you need any more information, feel free to send a message.