GUI Lag under GPU load with RTX 3090 in Linux

My system’s UI (with an RTX 3090) lags horribly under any heavy CUDA computation, even when only a fraction of the total VRAM is in use. Volatile GPU-Util is at 100%, but I am unclear how that relates to the sluggishness. The system becomes practically unusable, as everything (mouse cursor, windows) lags terribly.
My system specifications are as follows:

System:
  Kernel: 5.10.52-1-MANJARO x86_64 bits: 64 compiler: gcc v: 11.1.0 
  parameters: BOOT_IMAGE=/boot/vmlinuz-5.10-x86_64 
  root=UUID=ea7605c0-b50c-450a-9352-4eb157326167 rw processor.max_cstate=5 
  rcu_nocbs=0-11 quiet splash iommt=pt quiet apparmor=1 security=apparmor 
  resume=UUID=e2bf2cd5-6f9a-461f-aece-916853371e98 udev.log_priority=3 
  pcie_aspm=off idle=nomwait 
  Desktop: KDE Plasma 5.22.4 tk: Qt 5.15.2 wm: kwin_x11 vt: 1 dm: SDDM 
  Distro: Manjaro Linux base: Arch Linux 
Machine:
  Type: Desktop Mobo: Micro-Star model: MPG X570 GAMING EDGE WIFI (MS-7C37) 
  v: 1.0 serial: <filter> UEFI: American Megatrends LLC. v: 1.D0 
  date: 01/25/2021 
Battery:
  Message: No system battery data found. Is one present? 
Memory:
  RAM: total: 31.34 GiB used: 10.77 GiB (34.4%) 
  RAM Report: permissions: Unable to run dmidecode. Root privileges required. 
CPU:
  Info: 8-Core model: AMD Ryzen 7 5800X bits: 64 type: MT MCP arch: Zen 3 
  family: 19 (25) model-id: 21 (33) stepping: 0 microcode: A201009 cache: 
  L2: 4 MiB bogomips: 121654 
  Speed: 3600 MHz min/max: 2200/3800 MHz boost: enabled Core speeds (MHz): 
  1: 3600 2: 2873 3: 2880 4: 2876 5: 2870 6: 2872 7: 2878 8: 2879 9: 2880 
  10: 2874 11: 2880 12: 2879 13: 2880 14: 2881 15: 3584 16: 2974 
  Flags: 3dnowprefetch abm adx aes aperfmperf apic arat avic avx avx2 bmi1 
  bmi2 bpext cat_l3 cdp_l3 clflush clflushopt clwb clzero cmov cmp_legacy 
  constant_tsc cpb cpuid cqm cqm_llc cqm_mbm_local cqm_mbm_total cqm_occup_llc 
  cr8_legacy cx16 cx8 de decodeassists erms extapic extd_apicid f16c 
  flushbyasid fma fpu fsgsbase fsrm fxsr fxsr_opt ht hw_pstate ibpb ibrs ibs 
  invpcid irperf lahf_lm lbrv lm mba mca mce misalignsse mmx mmxext monitor 
  movbe msr mtrr mwaitx nonstop_tsc nopl npt nrip_save nx ospke osvw 
  overflow_recov pae pat pausefilter pclmulqdq pdpe1gb perfctr_core 
  perfctr_llc perfctr_nb pfthreshold pge pku pni popcnt pse pse36 rdpid rdpru 
  rdrand rdseed rdt_a rdtscp rep_good sep sha_ni skinit smap smca smep ssbd 
  sse sse2 sse4_1 sse4_2 sse4a ssse3 stibp succor svm svm_lock syscall tce 
  topoext tsc tsc_scale umip v_vmsave_vmload vaes vgif vmcb_clean vme vmmcall 
  vpclmulqdq wbnoinvd wdt xgetbv1 xsave xsavec xsaveerptr xsaveopt xsaves 
  Vulnerabilities: Type: itlb_multihit status: Not affected 
  Type: l1tf status: Not affected 
  Type: mds status: Not affected 
  Type: meltdown status: Not affected 
  Type: spec_store_bypass 
  mitigation: Speculative Store Bypass disabled via prctl and seccomp 
  Type: spectre_v1 
  mitigation: usercopy/swapgs barriers and __user pointer sanitization 
  Type: spectre_v2 mitigation: Full AMD retpoline, IBPB: conditional, IBRS_FW, 
  STIBP: always-on, RSB filling 
  Type: srbds status: Not affected 
  Type: tsx_async_abort status: Not affected 
Graphics:
  Device-1: NVIDIA GA102 [GeForce RTX 3090] driver: nvidia v: 470.57.02 
  alternate: nouveau,nvidia_drm bus-ID: 2d:00.0 chip-ID: 10de:2204 
  class-ID: 0300 
  Display: x11 server: X.Org 1.20.11 compositor: kwin_x11 driver: 
  loaded: nvidia note: n/a (using device driver) unloaded: nvidia 
  display-ID: :0 screens: 1 
  Screen-1: 0 s-res: 5460x2880 s-dpi: 144 s-size: 963x508mm (37.9x20.0") 
  s-diag: 1089mm (42.9") 
  Monitor-1: HDMI-0 res: 1620x2880 hz: 60 
  Monitor-2: DP-0 res: 3840x2160 hz: 60 dpi: 163 size: 600x340mm (23.6x13.4") 
  diag: 690mm (27.2") 
  OpenGL: renderer: NVIDIA GeForce RTX 3090/PCIe/SSE2 
  v: 4.6.0 NVIDIA 470.57.02 direct render: Yes 
Audio:
  Device-1: NVIDIA GA102 High Definition Audio driver: snd_hda_intel v: kernel 
  bus-ID: 2d:00.1 chip-ID: 10de:1aef class-ID: 0403 
  Device-2: AMD Starship/Matisse HD Audio vendor: Micro-Star MSI 
  driver: snd_hda_intel v: kernel bus-ID: 2f:00.4 chip-ID: 1022:1487 
  class-ID: 0403 
  Sound Server-1: ALSA v: k5.10.52-1-MANJARO running: yes 
  Sound Server-2: JACK v: 1.9.19 running: no 
  Sound Server-3: PulseAudio v: 14.2 running: yes 
  Sound Server-4: PipeWire v: 0.3.32 running: yes 
Network:
  Device-1: Realtek RTL8111/8168/8411 PCI Express Gigabit Ethernet 
  vendor: Micro-Star MSI X570-A PRO driver: r8169 v: kernel port: d000 
  bus-ID: 27:00.0 chip-ID: 10ec:8168 class-ID: 0200 
  IF: enp39s0 state: down mac: <filter> 
  Device-2: Intel Dual Band Wireless-AC 3168NGW [Stone Peak] driver: iwlwifi 
  v: kernel port: d000 bus-ID: 29:00.0 chip-ID: 8086:24fb class-ID: 0280 
  IF: wlp41s0 state: up mac: <filter> 
  IP v4: <filter> type: dynamic noprefixroute scope: global 
  broadcast: <filter> 
  IP v6: <filter> type: noprefixroute scope: link 
  IF-ID-1: tun0 state: unknown speed: 10 Mbps duplex: full mac: N/A 
  IP v4: <filter> scope: global 
  WAN IP: <filter> 
Bluetooth:
  Device-1: Intel Wireless-AC 3168 Bluetooth type: USB driver: btusb v: 0.8 
  bus-ID: 1-4:2 chip-ID: 8087:0aa7 class-ID: e001 
  Report: rfkill ID: hci0 rfk-id: 2 state: up address: see --recommends 
Logical:
  Message: No logical block device data found. 
RAID:
  Message: No RAID data found. 
Drives:
  Local Storage: total: 2.27 TiB used: 736.59 GiB (31.6%) 
  SMART Message: Unable to run smartctl. Root privileges required. 
  ID-1: /dev/nvme0n1 maj-min: 259:0 vendor: Western Digital 
  model: WDS500G3X0C-00SJG0 size: 465.76 GiB block-size: physical: 512 B 
  logical: 512 B speed: 31.6 Gb/s lanes: 4 type: SSD serial: <filter> 
  rev: 111110WD temp: 45.9 C scheme: GPT 
  ID-2: /dev/sda maj-min: 8:0 vendor: Seagate model: ST32000644NS 
  size: 1.82 TiB block-size: physical: 512 B logical: 512 B speed: 3.0 Gb/s 
  type: HDD rpm: 7200 serial: <filter> rev: KA06 scheme: GPT 
  Message: No optical or floppy data found. 
Partition:
  ID-1: / raw-size: 430.99 GiB size: 423.22 GiB (98.20%) 
  used: 147.87 GiB (34.9%) fs: ext4 dev: /dev/nvme0n1p2 maj-min: 259:2 
  label: N/A uuid: ea7605c0-b50c-450a-9352-4eb157326167 
  ID-2: /boot/efi raw-size: 300 MiB size: 299.4 MiB (99.80%) 
  used: 3.8 MiB (1.3%) fs: vfat dev: /dev/nvme0n1p1 maj-min: 259:1 label: N/A 
  uuid: 9566-45E4 
  ID-3: /mnt/sda1 raw-size: 1.82 TiB size: 1.79 TiB (98.38%) 
  used: 588.71 GiB (32.1%) fs: ext4 dev: /dev/sda1 maj-min: 8:1 label: N/A 
  uuid: ec318b9e-67f0-4bfe-a13d-1e7ec0a62e57 
Swap:
  Kernel: swappiness: 60 (default) cache-pressure: 100 (default) 
  ID-1: swap-1 type: partition size: 34.48 GiB used: 512 KiB (0.0%) 
  priority: -2 dev: /dev/nvme0n1p3 maj-min: 259:3 label: N/A 
  uuid: e2bf2cd5-6f9a-461f-aece-916853371e98 
Unmounted:
  Message: No unmounted partitions found. 
USB:
  Hub-1: 1-0:1 info: Full speed (or root) Hub ports: 6 rev: 2.0 
  speed: 480 Mb/s chip-ID: 1d6b:0002 class-ID: 0900 
  Device-1: 1-4:2 info: Intel Wireless-AC 3168 Bluetooth type: Bluetooth 
  driver: btusb interfaces: 2 rev: 2.0 speed: 12 Mb/s power: 100mA 
  chip-ID: 8087:0aa7 class-ID: e001 
  Hub-2: 2-0:1 info: Full speed (or root) Hub ports: 4 rev: 3.1 speed: 10 Gb/s 
  chip-ID: 1d6b:0003 class-ID: 0900 
  Hub-3: 3-0:1 info: Full speed (or root) Hub ports: 6 rev: 2.0 
  speed: 480 Mb/s chip-ID: 1d6b:0002 class-ID: 0900 
  Device-1: 3-5:2 info: Micro Star MYSTIC LIGHT type: HID 
  driver: hid-generic,usbhid interfaces: 1 rev: 1.1 speed: 12 Mb/s 
  power: 500mA chip-ID: 1462:7c37 class-ID: 0300 serial: <filter> 
  Hub-4: 3-6:3 info: Genesys Logic Hub ports: 4 rev: 2.0 speed: 480 Mb/s 
  power: 100mA chip-ID: 05e3:0608 class-ID: 0900 
  Hub-5: 4-0:1 info: Full speed (or root) Hub ports: 4 rev: 3.1 speed: 10 Gb/s 
  chip-ID: 1d6b:0003 class-ID: 0900 
  Hub-6: 5-0:1 info: Full speed (or root) Hub ports: 4 rev: 2.0 
  speed: 480 Mb/s chip-ID: 1d6b:0002 class-ID: 0900 
  Hub-7: 5-3:26 info: Terminus Hub ports: 4 rev: 2.0 speed: 480 Mb/s 
  power: 100mA chip-ID: 1a40:0101 class-ID: 0900 
  Device-1: 5-3.1:27 info: Logitech M105 Optical Mouse type: Mouse 
  driver: hid-generic,usbhid interfaces: 1 rev: 2.0 speed: 1.5 Mb/s 
  power: 100mA chip-ID: 046d:c077 class-ID: 0301 
  Device-2: 5-3.2:28 info: Razer USA BlackWidow Elite type: Keyboard,Mouse 
  driver: razerkbd,usbhid interfaces: 3 rev: 2.0 speed: 12 Mb/s power: 500mA 
  chip-ID: 1532:0228 class-ID: 0300 
  Hub-8: 6-0:1 info: Full speed (or root) Hub ports: 4 rev: 3.1 speed: 10 Gb/s 
  chip-ID: 1d6b:0003 class-ID: 0900 
Sensors:
  System Temperatures: cpu: 45.2 C mobo: 37.0 C gpu: nvidia temp: 42 C 
  Fan Speeds (RPM): fan-1: 1638 fan-2: 798 fan-3: 1114 fan-4: 1040 fan-5: 1073 
  fan-6: 1022 fan-7: 0 gpu: nvidia fan: 0% 
Info:
  Processes: 385 Uptime: 5d 17h 45m wakeups: 0 Init: systemd v: 248 
  tool: systemctl Compilers: gcc: 11.1.0 alt: 10 clang: 12.0.1 Packages: 
  pacman: 1540 lib: 382 flatpak: 0 Shell: Zsh v: 5.8 running-in: yakuake 
  inxi: 3.3.06

Output of nvidia-smi while running CUDA-heavy code (with the UI extremely sluggish):

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.57.02    Driver Version: 470.57.02    CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:2D:00.0  On |                  N/A |
|100%   49C    P0   255W / 350W |   6041MiB / 24234MiB |    100%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A    831702      G   /usr/lib/Xorg                     571MiB |
|    0   N/A  N/A    834035      G   /usr/lib/firefox/firefox          275MiB |
|    0   N/A  N/A    834125      G   /usr/lib/firefox/firefox            4MiB |
|    0   N/A  N/A   1139272      G   /usr/lib/firefox/firefox            4MiB |
|    0   N/A  N/A   1373772      G   /usr/bin/kwin_x11                   9MiB |
|    0   N/A  N/A   1776538      G   /usr/lib/firefox/firefox            4MiB |
|    0   N/A  N/A   2053977      G   ...AAAAAAAAA= --shared-files      183MiB |
|    0   N/A  N/A   2111313      G   /usr/lib/firefox/firefox            4MiB |
|    0   N/A  N/A   2213402      G   /usr/lib/firefox/firefox            4MiB |
|    0   N/A  N/A   2224251      C   nsfminer                         4797MiB |
|    0   N/A  N/A   2225088      G   /usr/bin/plasmashell              149MiB |
|    0   N/A  N/A   2225678      G   ...ellRnoDYD.13.slave-socket        4MiB |
+-----------------------------------------------------------------------------+
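
To make the gap between utilization and memory usage easier to see, here is a minimal sketch of how the same figures could be sampled programmatically. It assumes Python 3 and that `nvidia-smi` is on the PATH; the helper names `parse_sample` and `read_gpu_samples` are hypothetical and only for illustration. The demo line at the end reuses the values from the table above rather than querying the GPU.

```python
import subprocess

# Fields requested from nvidia-smi; with "nounits" the values are plain numbers
# (percent for utilization, MiB for memory).
QUERY = "utilization.gpu,memory.used,memory.total"


def parse_sample(line):
    """Parse one CSV line such as '100, 6041, 24234' into a dict.

    Assumes all three fields are numeric (a field reported as [N/A] would
    raise ValueError).
    """
    util, used, total = (int(field.strip()) for field in line.split(","))
    return {
        "util_pct": util,
        "mem_used_mib": used,
        "mem_total_mib": total,
        "mem_pct": round(100 * used / total, 1),
    }


def read_gpu_samples():
    """Run nvidia-smi once and return one parsed dict per GPU."""
    out = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    return [parse_sample(line) for line in out.splitlines() if line.strip()]


# Demo using the numbers from the nvidia-smi output above:
# 100% utilization while only about a quarter of the VRAM is in use.
print(parse_sample("100, 6041, 24234"))
```

Calling `read_gpu_samples()` in a loop while the CUDA job runs would show whether utilization stays pinned at 100% for the whole time the desktop is sluggish.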

What is the probable cause of this extreme lag? How can I fix this?
