What is nv_queue and why is it the top process on my system?

I did an experiment: rebooted the computer in the morning, left it completely idle for 8 hours, and here are the top processes by cumulative CPU time:

$ ps -ef|awk '{print $7 " " $8}'|sort -r|head
TIME CMD
00:25:03 [nv_queue]
00:13:45 /usr/libexec/Xorg
00:06:02 gkrellm
00:02:47 marco
00:01:22 audacious
00:00:52 [nvidia-modeset/]
00:00:35 /usr/libexec/mate-panel/clock-applet
00:00:33 [irq/140-nvidia]
00:00:12 [kworker/2:3-events_power_efficient]
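
(By the way, ps can sort by cumulative CPU time directly, which gives the same listing without the awk/sort pipeline - the exact --sort key name may differ between ps versions:)

ps -eo time,comm --sort=-time | head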

With driver version 430 there's a new kernel thread called nv_queue, and it eats up 5% of a CPU core non-stop.

It's the same with 418.74, except there's no "nv_queue" process; instead there's a "kworker/x:y-events" process, with x corresponding to the CPU that receives IRQ 140 (irq/140-nvidia, as seen above). In 430.14 the nv_queue thread can at least be scheduled on different cores instead of always loading the core that receives the IRQ from the GPU - so that's an improvement over 418, but still: why does it need 5% CPU when the GPU does almost nothing?
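
(If you want to check which CPU services the NVIDIA interrupt on your own box, something like this shows it - the IRQ number will differ from system to system:)

# the NVIDIA IRQ line and its per-CPU counters
grep -i nvidia /proc/interrupts

# which CPUs that IRQ is currently allowed on (replace 140 with your IRQ number)
cat /proc/irq/140/smp_affinity_list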

Interestingly, this 5% load happens ONLY IN P8 performance mode. When the card is in P0/P2/P5 mode, nv_queue (or kworker in older versions) suddenly goes silent. If I set the card to performance mode with nvidia-settings, nv_queue disappears from top, permanently.
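
(For anyone who wants to reproduce that "fix": this is roughly how I force the performance mode from the command line; GPU index 0 assumed, 1 = Prefer Maximum Performance, 0 = Adaptive:)

nvidia-settings -a "[gpu:0]/GpuPowerMizerMode=1"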

So what is nv_queue and why does it use up my CPU?

Or perhaps it shouldn’t do that and my hardware is broken in some way? I did a lot of testing, switching GPU BIOS, disabling PCIeGen3, MSI, ACPI, APM… nothing helps. Unfortunately I don’t have hardware to swap out GPU/CPU/motherboard and try. But I think it’s fine, everything just works ;)

Full specs:

Fedora 30, kernel 5.0.17, nvidia driver 430.14 from rpmfusion for rawhide (yum --releasever=31 update *nvidia*). On 418.74 the same issue manifests with a kworker thread instead of nv_queue. I haven't tried a pre-5.0 kernel or an older nvidia driver (yet).

Hardware is a GTX 1080 Ti paired with an i7-9700K on a Z390 board.
nvidia-bug-report.log.gz (1.2 MB)

I was literally about to ask the same thing. Using Arch and 430.14. I'm using the Adaptive power scheme and it does seem to occur mostly during idle states. 1070 Ti on kernel 5.1.5. There seem to be a lot more IRQs with either the kernel, the driver, or both.

IDK, maybe a new side effect of Force(Full)CompositionPipeline?

Why do you think it could be related to composition pipeline?

Based on experience with other issues,

  1. Force(Full)CompositionPipeline is an option with some unexpected side effects
  2. Lamieur is using the old pipeline (UseNvKmsCompositionPipeline=false) which previously showed signs of degradation
  3. I’m not using that option, I’m not observing that issue
  4. IDK

I’ve commented out Option “UseNvKmsCompositionPipeline” “false”.

Now nv_queue uses up not 5% but 12%-20% of a core! (Again: this pops up only when the card is in P8 state)

Also, nvidia-settings -a CurrentMetaMode=… omitting 'ForceCompositionPipeline=On, ForceFullCompositionPipeline=On' drops it back to ~5% CPU, NOT to 0.

This makes sense - the new NvKmsCompositionPipeline is still problematic, but I can remove its effects by either disabling that in xorg.conf, or not using Full Composition Pipeline options in the first place.
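
(For reference, the runtime toggle I mean is roughly this - the metamode string is from my single-monitor setup, yours will differ:)

# with the composition pipeline flags: nv_queue load in P8 goes way up
nvidia-settings -a CurrentMetaMode="nvidia-auto-select +0+0 {ForceCompositionPipeline=On, ForceFullCompositionPipeline=On}"

# without them: drops back to ~5%, not to 0
nvidia-settings -a CurrentMetaMode="nvidia-auto-select +0+0"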

But even with that thing disabled, nv_queue is still much too active for my liking.

generix, if you don’t see this effect, there must be some other difference between our systems, hardware or software.

Or maybe you’ve just never noticed it - like me, it never bothered me until I saw it and now it triggers the hell out of me! :)

Let the system sit completely idle after logging in - no web browser or anything running, leave it for a minute until nvidia-smi shows P8, then check top.
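
(Something like this, assuming GPU 0:)

# wait until the card has dropped to the idle state...
nvidia-smi --query-gpu=pstate --format=csv,noheader    # should print P8

# ...then check the CPU time accumulated by the nvidia kernel threads
ps -eo time,comm | grep -E 'nv_queue|nvidia'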

I kept an eye on it when you first posted, let it idle and watched top. There are three nv_queue processes and they rarely use any CPU, only briefly jumping to 1% when something moves on the screen (top), and the aggregated running time over the course of some hours is 0-3 seconds. Driver 430.14.
Which DE/WM are you using?

Lamieur, mine is jumping around too, and I really don't think it's your hardware; you'd be experiencing a lot more issues than just a 5% spike here and there. There's nothing you can do to fix or negate it at this point in time. A 5% CPU spike is no cause for concern.

Not so bad here:

2222 root     -51                           S   1.0         2:56.71 irq/39-nvidia

i.e. ~3 minutes of CPU time after 5 hours of uptime.

Section "Device"
    Identifier  "Videocard0"
    BusID       "PCI:1:0:0"
    Driver      "nvidia"
    VendorName  "NVIDIA"
    BoardName   "NVIDIA Corporation GP106 [GeForce GTX 1060 6GB] (rev a1)"
    Option      "Coolbits" "28"
    Option      "metamodes" "nvidia-auto-select +0+0 {ForceCompositionPipeline=On, ForceFullCompositionPipeline=On}"
    Option      "RegistryDwords" "PowerMizerEnable=0x1; PerfLevelSrc=0x2222; PowerMizerLevel=0x3; PowerMizerDefault=0x3; PowerMizerDefaultAC=0x3"
    Option      "UseNvKmsCompositionPipeline" "Off"
    Option      "TripleBuffer" "On"
EndSection

I don’t use/change any kernel module parameters.

# uname -a
Linux localhost 5.0.19-ic64 #10 SMP PREEMPT x86_64 x86_64 x86_64 GNU/Linux

Is there any update on this? I am experiencing the same thing (constant 20% CPU utilisation by nv_queue) with driver 430.40. System is running Centos 7 and the GPU is a Titan V.

Hi there!

Searching Google I found this thread. I'm having the same issue: about 10-12% CPU usage on random cores.

OS: Manjaro 19.01 XFCE
nvidia driver: tried 440.59 then downgraded to 435.21.
kernel: downgraded to the LTS one (5.4).
Did not help.

However, disabling "force composition pipeline" and "force full composition pipeline" helps with this issue. But doing so brings back screen tearing (at least on XFCE).
Is there something I can do?

cheers

Solved by adding these lines to my xorg.conf:

Option      "metamodes" "nvidia-auto-select +0+0 {ForceCompositionPipeline=On, ForceFullCompositionPipeline=On}"
Option      "UseNvKmsCompositionPipeline" "Off"
Option      "TripleBuffer" "On"

As instructed by the great Manjaro folks :)

https://forum.manjaro.org/t/nv-queue-using-10-cpu-on-idle/126997/2

I think you missed the part where I complained that 5% CPU is still too much for doing literally nothing.

What you did was drop from 20% to 5% (numbers from my system). But I already had UseNvKmsCompositionPipeline disabled when I reported this, and I confirmed it's much, much worse with that crap enabled. Which is the default. Which is so bad it made you look for solutions. With that workaround the driver wastes less, but it still wastes a lot of CPU cycles while the GPU is idle.

I know this is a really small amount of actual power wasted (compared to keeping the GPU in P5, P8 is still 20 W+ cheaper on my 1080 Ti, even if the CPU can't go to sleep), but it still triggers my OCD.

(BTW, the behavior hasn't changed since I posted; I'm now on kernel 5.5 and driver 440.64.)

Now in 2021, I’m having a similar problem under CUDA 11.4 and the latest 470.57.02 driver. The nvidia-smi command takes about 20 minutes to run through all eight V100 cards. The cards are completely idle, and this behavior is exhibited even after a system reboot.

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
3423 root 20 0 0 0 0 R 69.2 0.0 345:12.60 nv_queue
2429 root 20 0 0 0 0 R 61.2 0.0 233:26.21 nv_queue
3038 root 20 0 0 0 0 R 58.3 0.0 314:05.02 nv_queue
2375 root 20 0 0 0 0 D 55.8 0.0 120:53.79 nv_queue
2408 root 20 0 0 0 0 S 54.8 0.0 204:47.78 nv_queue
2483 root 20 0 0 0 0 R 45.8 0.0 205:49.71 nv_queue
2624 root 20 0 0 0 0 S 42.6 0.0 186:32.07 nv_queue
2747 root 20 0 0 0 0 S 34.6 0.0 213:28.95 nv_queue

That’s after about eight hours of uptime.

I suppose I’ll try using NVreg_GpuBlacklist to enable one GPU at a time to see if it’s a problem with a particular card.
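
(Rough sketch of the plan - the UUIDs below are placeholders, and it's worth checking modinfo nvidia for the exact parameter name and format on this driver version:)

# list GPU UUIDs
nvidia-smi -L

# time nvidia-smi against a single card to see whether one GPU is the slow one
time nvidia-smi -i 0

# then exclude all but one GPU via a module option, e.g. in /etc/modprobe.d/nvidia-exclude.conf:
#   options nvidia NVreg_GpuBlacklist="GPU-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
# (reboot or reload the nvidia module afterwards)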

Now in 2024, I’m seeing the same thing with an 8 GPU system.
Did you find out what is going on?

PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND                                                                                                               

3459 root 20 0 0 0 0 R 28.1 0.0 366:56.32 nv_queue
2869 root 20 0 0 0 0 S 24.8 0.0 353:25.75 nv_queue
3449 root 20 0 0 0 0 S 21.5 0.0 364:55.66 nv_queue
3398 root 20 0 0 0 0 S 20.2 0.0 290:00.29 nv_queue
3439 root 20 0 0 0 0 S 19.2 0.0 284:34.17 nv_queue
3469 root 20 0 0 0 0 R 19.2 0.0 319:37.85 nv_queue
3228 root 20 0 0 0 0 S 18.5 0.0 304:58.38 nv_queue
3388 root 20 0 0 0 0 S 18.2 0.0 276:26.78 nv_queue

Cheers.