Low performance in newer kernels

Hello everyone!

I have installed Fedora 41 on my Lenovo Legion 5i Pro (2022, Gen 7) as a daily-driver OS. It has worked well out of the box in every scenario so far except gaming. The laptop uses a MUXed Optimus setup, with a BIOS switch that routes the display output to either the iGPU or the dGPU, and Advanced Optimus support in Windows.

Output from ‘inxi -Farzy’ on affected kernel:

System:
  Kernel: 6.13.6-200.fc41.x86_64 arch: x86_64 bits: 64 compiler: gcc v: 14.2.1
    clocksource: tsc avail: acpi_pm
    parameters: BOOT_IMAGE=(hd1,gpt3)/vmlinuz-6.13.6-200.fc41.x86_64
    root=UUID=301a0013-fcf1-4cf4-9b18-9fee9b3fe413 ro rootflags=subvol=root
    resume=UUID=e58e2e02-ecb9-4eae-a716-87ebcd673291 rhgb nouveau.modeset=0
    splash hibernate=nocompress nvidia_drm.modeset=1 nvidia_drm.fbdev=1
    ibt=off rd.driver.blacklist=nouveau modprobe.blacklist=nouveau
    "acpi_osi=Windows 2022"
  Desktop: KDE Plasma v: 6.3.3 tk: Qt v: N/A info: frameworks v: 6.12.0
    wm: kwin_wayland vt: 2 dm: SDDM Distro: Fedora Linux 41 (KDE Plasma)
Machine:
  Type: Laptop System: LENOVO product: 82RF v: Legion 5 Pro 16IAH7H
    serial: <superuser required> Chassis: type: 10 v: Legion 5 Pro 16IAH7H
    serial: <superuser required>
  Mobo: LENOVO model: LNVNB161216 v: NO DPK serial: <superuser required>
    part-nu: LENOVO_MT_82RF_BU_idea_FM_Legion 5 Pro 16IAH7H
    uuid: <superuser required> UEFI: LENOVO v: J2CN57WW date: 01/08/2024
Battery:
  ID-1: BAT0 charge: 62.1 Wh (77.0%) condition: 80.7/80.0 Wh (100.8%)
    volts: 16.2 min: 15.4 model: Sunwoda L21D4PC1 type: Li-poly serial: <filter>
    status: not charging cycles: 26
CPU:
  Info: model: 12th Gen Intel Core i7-12700H bits: 64 type: MST AMCP
    arch: Alder Lake gen: core 12 level: v3 note: check built: 2021+
    process: Intel 7 (10nm ESF) family: 6 model-id: 0x9A (154) stepping: 3
    microcode: 0x436
  Topology: cpus: 1x dies: 1 clusters: 8 cores: 14 threads: 20 mt: 6 tpc: 2
    st: 8 smt: enabled cache: L1: 1.2 MiB desc: d-8x32 KiB, 6x48 KiB; i-6x32
    KiB, 8x64 KiB L2: 11.5 MiB desc: 6x1.2 MiB, 2x2 MiB L3: 24 MiB
    desc: 1x24 MiB
  Speed (MHz): avg: 400 min/max: 400/4600:4700:3500 scaling:
    driver: intel_pstate governor: powersave cores: 1: 400 2: 400 3: 400 4: 400
    5: 400 6: 400 7: 400 8: 400 9: 400 10: 400 11: 400 12: 400 13: 400 14: 400
    15: 400 16: 400 17: 400 18: 400 19: 400 20: 400 bogomips: 107520
  Flags: avx avx2 ht lm nx pae sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx
  Vulnerabilities:
  Type: gather_data_sampling status: Not affected
  Type: itlb_multihit status: Not affected
  Type: l1tf status: Not affected
  Type: mds status: Not affected
  Type: meltdown status: Not affected
  Type: mmio_stale_data status: Not affected
  Type: reg_file_data_sampling mitigation: Clear Register File
  Type: retbleed status: Not affected
  Type: spec_rstack_overflow status: Not affected
  Type: spec_store_bypass mitigation: Speculative Store Bypass disabled via
    prctl
  Type: spectre_v1 mitigation: usercopy/swapgs barriers and __user pointer
    sanitization
  Type: spectre_v2 mitigation: Enhanced / Automatic IBRS; IBPB: conditional;
    RSB filling; PBRSB-eIBRS: SW sequence; BHI: BHI_DIS_S
  Type: srbds status: Not affected
  Type: tsx_async_abort status: Not affected
Graphics:
  Device-1: Intel Alder Lake-P GT2 [Iris Xe Graphics] vendor: Lenovo
    driver: i915 v: kernel alternate: xe arch: Xe process: Intel 10nm
    built: 2021-22+ ports: active: eDP-1 empty: DP-1,DP-2 bus-ID: 00:02.0
    chip-ID: 8086:46a6 class-ID: 0300
  Device-2: NVIDIA GA104M [GeForce RTX 3070 Mobile / Max-Q] vendor: Lenovo
    driver: nvidia v: 570.124.04 alternate: nouveau,nvidia_drm
    non-free: 550/565.xx+ status: current (as of 2025-01; EOL~2026-12-xx)
    arch: Ampere code: GAxxx process: TSMC n7 (7nm) built: 2020-2023 pcie:
    gen: 1 speed: 2.5 GT/s lanes: 8 link-max: gen: 4 speed: 16 GT/s lanes: 16
    ports: active: none empty: DP-3, DP-4, HDMI-A-1, eDP-2 bus-ID: 01:00.0
    chip-ID: 10de:24dd class-ID: 0300
  Display: wayland server: Xwayland v: 24.1.6 compositor: kwin_wayland
    driver: X: loaded: modesetting,nvidia unloaded: nouveau
    alternate: fbdev,nv,vesa dri: iris gpu: i915 display-ID: 0
  Monitor-1: eDP-1 model: California Institute of eDP-1-0x1612 built: 2021
    res: mode: 2560x1600 hz: 165 scale: 120% (1.2) to: 2133x1333 dpi: 188
    gamma: 1.2 size: 345x215mm (13.58x8.46") diag: 407mm (16") ratio: 16:10
    modes: 2560x1600
  API: EGL v: 1.5 hw: drv: intel iris drv: nvidia platforms: device: 0
    drv: nvidia gbm: drv: nvidia surfaceless: drv: nvidia wayland: drv: iris x11:
    drv: iris
  API: OpenGL v: 4.6.0 compat-v: 4.6 vendor: intel mesa v: 25.0.1 glx-v: 1.4
    direct-render: yes renderer: Mesa Intel Iris Xe Graphics (ADL GT2)
    device-ID: 8086:46a6 memory: 15.17 GiB unified: yes display-ID: :0.0
  API: Vulkan v: 1.4.304 layers: 11 device: 0 type: integrated-gpu name: Intel
    Iris Xe Graphics (ADL GT2) driver: N/A device-ID: 8086:46a6
    surfaces: xcb,xlib,wayland device: 1 type: discrete-gpu name: NVIDIA
    GeForce RTX 3070 Laptop GPU driver: N/A device-ID: 10de:24dd
    surfaces: xcb,xlib,wayland device: 2 type: cpu name: llvmpipe (LLVM 19.1.7
    256 bits) driver: N/A device-ID: 10005:0000 surfaces: xcb,xlib,wayland
  Info: Tools: api: clinfo, eglinfo, glxinfo, vulkaninfo
    de: kscreen-console,kscreen-doctor gpu: nvidia-settings,nvidia-smi
    wl: wayland-info x11: xdriinfo, xdpyinfo, xprop, xrandr
Audio:
  Device-1: Intel Alder Lake PCH-P High Definition Audio vendor: Lenovo
    driver: snd_hda_intel v: kernel alternate: snd_soc_avs,snd_sof_pci_intel_tgl
    bus-ID: 00:1f.3 chip-ID: 8086:51c8 class-ID: 0403
  Device-2: NVIDIA GA104 High Definition Audio vendor: Lenovo
    driver: snd_hda_intel v: kernel pcie: gen: 1 speed: 2.5 GT/s lanes: 8
    link-max: gen: 4 speed: 16 GT/s lanes: 16 bus-ID: 01:00.1
    chip-ID: 10de:228b class-ID: 0403
  API: ALSA v: k6.13.6-200.fc41.x86_64 status: kernel-api
    tools: alsactl,alsamixer,amixer
  Server-1: PipeWire v: 1.2.7 status: active with: 1: pipewire-pulse
    status: active 2: wireplumber status: active 3: pipewire-alsa type: plugin
    4: pw-jack type: plugin tools: pactl,pw-cat,pw-cli,wpctl
Network:
  Device-1: Intel Alder Lake-P PCH CNVi WiFi driver: iwlwifi v: kernel
    bus-ID: 00:14.3 chip-ID: 8086:51f0 class-ID: 0280
  IF: wlp0s20f3 state: up mac: <filter>
  Device-2: Realtek RTL8111/8168/8211/8411 PCI Express Gigabit Ethernet
    vendor: Lenovo driver: r8169 v: kernel pcie: gen: 1 speed: 2.5 GT/s lanes: 1
    port: 3000 bus-ID: 34:00.0 chip-ID: 10ec:8168 class-ID: 0200
  IF: enp52s0 state: down mac: <filter>
  Info: services: NetworkManager,wpa_supplicant
Bluetooth:
  Device-1: Intel AX211 Bluetooth driver: btusb v: 0.8 type: USB rev: 2.0
    speed: 12 Mb/s lanes: 1 mode: 1.1 bus-ID: 3-10:6 chip-ID: 8087:0033
    class-ID: e001
  Report: btmgmt ID: hci0 rfk-id: 2 state: down bt-service: enabled,running
    rfk-block: hardware: no software: yes address: <filter> bt-v: 5.3 lmp-v: 12
    status: discoverable: no pairing: no
Drives:
  Local Storage: total: 1.84 TiB used: 568.28 GiB (30.1%)
  SMART Message: Unable to run smartctl. Root privileges required.
  ID-1: /dev/nvme0n1 maj-min: 259:2 vendor: SanDisk model: SC930 PRO 1TB
    size: 931.51 GiB block-size: physical: 512 B logical: 512 B speed: 63.2 Gb/s
    lanes: 4 tech: SSD serial: <filter> fw-rev: SN12362 temp: 44.9 C
    scheme: GPT
  ID-2: /dev/nvme1n1 maj-min: 259:0 vendor: Samsung model: MZVL21T0HCLR-00BL2
    size: 953.87 GiB block-size: physical: 512 B logical: 512 B speed: 63.2 Gb/s
    lanes: 4 tech: SSD serial: <filter> fw-rev: CL1QGXA7 temp: 36.9 C
    scheme: GPT
Partition:
  ID-1: / raw-size: 897.73 GiB size: 897.73 GiB (100.00%)
    used: 567.86 GiB (63.3%) fs: btrfs dev: /dev/nvme0n1p5 maj-min: 259:12
  ID-2: /boot raw-size: 1024 MiB size: 973.4 MiB (95.06%)
    used: 347 MiB (35.6%) fs: ext4 dev: /dev/nvme0n1p3 maj-min: 259:10
  ID-3: /boot/efi raw-size: 800 MiB size: 256 MiB (32.00%)
    used: 80.8 MiB (31.5%) fs: vfat dev: /dev/nvme0n1p1 maj-min: 259:9
  ID-4: /home raw-size: 897.73 GiB size: 897.73 GiB (100.00%)
    used: 567.86 GiB (63.3%) fs: btrfs dev: /dev/nvme0n1p5 maj-min: 259:12
Swap:
  Kernel: swappiness: 60 (default) cache-pressure: 100 (default) zswap: no
  ID-1: swap-1 type: zram size: 8 GiB used: 0 KiB (0.0%) priority: 100
    comp: lzo-rle avail: lzo,lz4,lz4hc,zstd,deflate,842 max-streams: 20
    dev: /dev/zram0
  ID-2: swap-2 type: partition size: 32 GiB used: 0 KiB (0.0%) priority: -2
    dev: /dev/nvme0n1p4 maj-min: 259:11
Sensors:
  System Temperatures: cpu: 61.0 C mobo: N/A
  Fan Speeds (rpm): N/A
Repos:
  Packages: pm: dpkg pkgs: 0 pm: rpm pkgs: N/A note: see --rpm tools: dnf,yum
    pm: flatpak pkgs: 40
  No active dnf repos in: /etc/dnf/dnf.conf
  Active yum repos in: /etc/yum.repos.d/_copr:copr.fedorainfracloud.org:gloriouseggroll:nobara-41.repo
    1: copr:copr.fedorainfracloud.org:gloriouseggroll:nobara-41 ~ https://download.copr.fedorainfracloud.org/results/gloriouseggroll/nobara-41/fedora-$releasever-$basearch/
  Active yum repos in: /etc/yum.repos.d/_copr:copr.fedorainfracloud.org:kwizart:kernel-longterm-6.1.repo
    1: copr:copr.fedorainfracloud.org:kwizart:kernel-longterm-6.1 ~ https://download.copr.fedorainfracloud.org/results/kwizart/kernel-longterm-6.1/fedora-$releasever-$basearch/
  No active yum repos in: /etc/yum.repos.d/_copr:copr.fedorainfracloud.org:kwizart:kernel-longterm-6.6.repo
  Active yum repos in: /etc/yum.repos.d/_copr:copr.fedorainfracloud.org:mrduarte:LenovoLegionLinux.repo
    1: copr:copr.fedorainfracloud.org:mrduarte:LenovoLegionLinux ~ https://download.copr.fedorainfracloud.org/results/mrduarte/LenovoLegionLinux/fedora-$releasever-$basearch/
  Active yum repos in: /etc/yum.repos.d/_copr:copr.fedorainfracloud.org:phracek:PyCharm.repo
    1: copr:copr.fedorainfracloud.org:phracek:PyCharm ~ https://download.copr.fedorainfracloud.org/results/phracek/PyCharm/fedora-$releasever-$basearch/
  No active yum repos in: /etc/yum.repos.d/_copr:copr.fedorainfracloud.org:rmnscnce:kernel-lqx.repo
  Active yum repos in: /etc/yum.repos.d/fedora-cisco-openh264.repo
    1: fedora-cisco-openh264 ~ https://mirrors.fedoraproject.org/metalink?repo=fedora-cisco-openh264-$releasever&arch=$basearch
  No active yum repos in: /etc/yum.repos.d/fedora-rawhide.repo
  No active yum repos in: /etc/yum.repos.d/fedora-updates-testing.repo
  Active yum repos in: /etc/yum.repos.d/fedora-updates.repo
    1: updates ~ https://mirrors.nobaraproject.org/fedora-updates
  Active yum repos in: /etc/yum.repos.d/fedora.repo
    1: fedora ~ https://mirrors.nobaraproject.org/fedora
  Active yum repos in: /etc/yum.repos.d/google-chrome.repo
    1: google-chrome ~ https://dl.google.com/linux/chrome/rpm/stable/x86_64
  Active yum repos in: /etc/yum.repos.d/hardware:razer.repo
    1: hardware_razer ~ https://download.opensuse.org/repositories/hardware:/razer/Fedora_$releasever/
  Active yum repos in: /etc/yum.repos.d/rpmfusion-nonfree-nvidia-driver.repo
    1: rpmfusion-nonfree-nvidia-driver ~ https://mirrors.rpmfusion.org/metalink?repo=nonfree-fedora-nvidia-driver-$releasever&arch=$basearch
  Active yum repos in: /etc/yum.repos.d/rpmfusion-nonfree-steam.repo
    1: rpmfusion-nonfree-steam ~ https://mirrors.rpmfusion.org/metalink?repo=nonfree-fedora-steam-$releasever&arch=$basearch
  Active yum repos in: /etc/yum.repos.d/rpmfusion-nonfree-tainted.repo
    1: rpmfusion-nonfree-tainted ~ https://mirrors.rpmfusion.org/metalink?repo=nonfree-fedora-tainted-$releasever&arch=$basearch
  No active yum repos in: /etc/yum.repos.d/rpmfusion-nonfree-updates-testing.repo
  Active yum repos in: /etc/yum.repos.d/rpmfusion-nonfree-updates.repo
    1: rpmfusion-nonfree-updates ~ https://mirrors.rpmfusion.org/metalink?repo=nonfree-fedora-updates-released-$releasever&arch=$basearch
  Active yum repos in: /etc/yum.repos.d/rpmfusion-nonfree.repo
    1: rpmfusion-nonfree ~ https://mirrors.rpmfusion.org/metalink?repo=nonfree-fedora-$releasever&arch=$basearch
Info:
  Memory: total: 32 GiB note: est. available: 31.06 GiB used: 3.57 GiB (11.5%)
  Processes: 448 Power: uptime: 1m states: freeze,mem,disk suspend: deep
    avail: s2idle wakeups: 0 hibernate: platform avail: shutdown, reboot,
    suspend, test_resume image: 12.41 GiB services: org_kde_powerdevil,upowerd
    Init: systemd v: 256 target: graphical (5) default: graphical
    tool: systemctl
  Compilers: gcc: 14.2.1 Shell: Bash v: 5.2.32 running-in: yakuake
    inxi: 3.3.37

On the latest kernel packages from Fedora, there is a massive drop in gaming performance: Palworld sits at ~50 fps at High-Epic settings with DLSS Quality, whereas on kernel 6.1 it sits at ~72 fps. Even vkcube does not lock to 165 fps. Note also the strangely high power consumption.

Screenshot from 6.13:

Checking the journalctl logs, I find some messages from pnp about I/O overlaps with my NVIDIA GPU that I don’t see with kernel 6.1:

Mar 16 15:40:58 fedora kernel: pnp: PnP ACPI init
Mar 16 15:40:58 fedora kernel: pnp 00:00: disabling [io  0x002e-0x002f] because it overlaps 0000:01:00.0 BAR 5 [io  0x0000-0x007f]
Mar 16 15:40:58 fedora kernel: pnp 00:00: disabling [io  0x004e-0x004f] because it overlaps 0000:01:00.0 BAR 5 [io  0x0000-0x007f]
Mar 16 15:40:58 fedora kernel: pnp 00:00: disabling [io  0x0061] because it overlaps 0000:01:00.0 BAR 5 [io  0x0000-0x007f]
Mar 16 15:40:58 fedora kernel: pnp 00:00: disabling [io  0x0063] because it overlaps 0000:01:00.0 BAR 5 [io  0x0000-0x007f]
Mar 16 15:40:58 fedora kernel: pnp 00:00: disabling [io  0x0065] because it overlaps 0000:01:00.0 BAR 5 [io  0x0000-0x007f]
Mar 16 15:40:58 fedora kernel: pnp 00:00: disabling [io  0x0067] because it overlaps 0000:01:00.0 BAR 5 [io  0x0000-0x007f]
Mar 16 15:40:58 fedora kernel: pnp 00:00: disabling [io  0x0070] because it overlaps 0000:01:00.0 BAR 5 [io  0x0000-0x007f]
Mar 16 15:40:58 fedora kernel: system 00:00: [io  0x0680-0x069f] has been reserved
Mar 16 15:40:58 fedora kernel: system 00:00: [io  0x164e-0x164f] has been reserved
Mar 16 15:40:58 fedora kernel: system 00:01: [io  0x1854-0x1857] has been reserved
Mar 16 15:40:58 fedora kernel: pnp 00:03: disabling [mem 0xc0000000-0xcfffffff] because it overlaps 0000:00:02.0 BAR 9 [mem 0x00000000-0xdfffffff 64bit pref]
Mar 16 15:40:58 fedora kernel: system 00:03: [mem 0xfedc0000-0xfedc7fff] has been reserved
Mar 16 15:40:58 fedora kernel: system 00:03: [mem 0xfeda0000-0xfeda0fff] has been reserved
Mar 16 15:40:58 fedora kernel: system 00:03: [mem 0xfeda1000-0xfeda1fff] has been reserved
Mar 16 15:40:58 fedora kernel: system 00:03: [mem 0xfed20000-0xfed7ffff] could not be reserved
Mar 16 15:40:58 fedora kernel: system 00:03: [mem 0xfed90000-0xfed93fff] could not be reserved
Mar 16 15:40:58 fedora kernel: system 00:03: [mem 0xfed45000-0xfed8ffff] could not be reserved
Mar 16 15:40:58 fedora kernel: system 00:03: [mem 0xfee00000-0xfeefffff] has been reserved
Mar 16 15:40:58 fedora kernel: system 00:04: [io  0x2000-0x20fe] has been reserved
Mar 16 15:40:58 fedora kernel: pnp: PnP ACPI: found 6 devices
0000:01:00.0 -> nvidia dGPU
0000:00:02.0 -> intel iGPU
system 00:03: -> ??? # I have no idea what this is
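To make the pnp messages above easier to compare between boots, they can be grouped by the PCI device and BAR they mention. The following is only a minimal sketch: the function name `summarize_overlaps` and the output layout are illustrative assumptions, and the regular expression is written against the exact message format quoted above. It also flags whether the overlapping BAR range starts at address 0. In the messages above every overlapping BAR range starts at 0, which may simply mean the BAR had not been assigned an address yet when pnp ran (later logs in this thread show BAR 5 assigned at 0x5000-0x507f).

```python
import re

# Matches kernel lines such as:
#   pnp 00:00: disabling [io  0x002e-0x002f] because it overlaps 0000:01:00.0 BAR 5 [io  0x0000-0x007f]
OVERLAP_RE = re.compile(
    r"pnp (?P<pnp>\S+): disabling \[(?:io|mem)\s+(?P<res>[^\]]+)\] "
    r"because it overlaps (?P<pci>\S+) BAR (?P<bar>\d+) "
    r"\[(?:io|mem)\s+0x(?P<start>[0-9a-f]+)-0x(?P<end>[0-9a-f]+)"
)


def summarize_overlaps(log_text):
    """Group 'pnp ... disabling ... overlaps' lines by (PCI address, BAR number)."""
    summary = {}
    for m in OVERLAP_RE.finditer(log_text):
        key = (m["pci"], int(m["bar"]))
        entry = summary.setdefault(
            key,
            {
                # A BAR range starting at 0 hints that it was not assigned yet.
                "bar_starts_at_zero": int(m["start"], 16) == 0,
                "pnp_resources": [],
            },
        )
        entry["pnp_resources"].append(m["res"].strip())
    return summary
```

Feeding it the text of `journalctl -k -b` from an affected and an unaffected boot and comparing the two dictionaries would show whether the set of disabled pnp resources actually differs between kernels.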

I can’t find any other discussions of this, and searching for the pnp overlaps only turns up inactive threads or old patches from 2008. As things are right now, I make do with kernel 6.1, at the expense of S3 sleep, the Xe iGPU driver, and generally any new driver features and optimisations.

Output from ‘lspci -vv’ on 6.13:

01:00.0 VGA compatible controller: NVIDIA Corporation GA104M [GeForce RTX 3070 Mobile / Max-Q] (rev a1) (prog-if 00 [VGA controller])
        Subsystem: Lenovo Device 3833
        Physical Slot: 1
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0
        Interrupt: pin A routed to IRQ 210
        IOMMU group: 16
        Region 0: Memory at 60000000 (32-bit, non-prefetchable) [size=16M]
        Region 1: Memory at 4200000000 (64-bit, prefetchable) [size=8G]
        Region 3: Memory at 4100000000 (64-bit, prefetchable) [size=32M]
        Region 5: I/O ports at 5000 [size=128]
        Expansion ROM at 61000000 [virtual] [disabled] [size=512K]
        Capabilities: <access denied>
        Kernel driver in use: nvidia
        Kernel modules: nouveau, nvidia_drm, nvidia

Full 6.13 dmesg log here: Pastebin

In case it helps, this has happened across multiple distros (Manjaro, Nobara live image, Fedora) on two different Lenovo laptops since about September 2024. I first noticed the issue after a kernel update on Manjaro on the old laptop, which prompted a distro-hop.

Previous laptop specs:
Lenovo Ideapad Gaming 3i 15IMH05
Intel Core i7 10th gen H-series
Nvidia GTX 1650 Ti (MUXless Optimus)

Adding second screenshot using the unaffected 6.1 kernel here due to embedded media limitation for new users:

Screenshot from 6.1:

Was vkcube run the same way in both screenshots? The window borders look different. Is one under X and the other under Wayland?

Anyway, if you run under Wayland, make sure the nvidia-drm module is properly loaded and that modesetting is enabled:
cat /sys/module/nvidia_drm/parameters/modeset

Should return Y
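The same check can be scripted, which is handy when switching between kernels repeatedly. This is a minimal sketch that reads the sysfs file named in the command above; the function name `nvidia_modeset_enabled` and the choice of returning `None` when the file is missing are assumptions made for illustration.

```python
from pathlib import Path


def nvidia_modeset_enabled(param_path="/sys/module/nvidia_drm/parameters/modeset"):
    """Return True if modeset reads 'Y', False otherwise, None if the file is missing.

    A missing file usually means the nvidia_drm module is not loaded at all.
    """
    path = Path(param_path)
    if not path.exists():
        return None
    return path.read_text().strip() == "Y"


if __name__ == "__main__":
    print("nvidia_drm modeset:", nvidia_modeset_enabled())
```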

No, both are under Wayland and modesetting is enabled.
Fedora 41 doesn’t offer an X session anymore.

The command has indeed returned ‘Y’

The window border difference is likely due to my use of Krohnkite, a script that makes Plasma behave like a tiling window manager.

Interesting issue, though. I went to try a 6.1 kernel, but the latest NVIDIA driver just crashed.
There is a PCI kernel parameter you can try:

See pci realloc:

Tried this with kernel 6.13.7. Unfortunately, it made no difference. pnp errors continue and performance remains kneecapped.

cat /proc/cmdline
BOOT_IMAGE=(hd1,gpt3)/vmlinuz-6.13.7-200.fc41.x86_64 root=UUID=301a0013-fcf1-4cf4-9b18-9fee9b3fe413 ro rootflags=subvol=root resume=UUID=e58e2e02-ecb9-4eae-a716-87ebcd673291 rhgb nouveau.modeset=0 splash hibernate=nocompress ibt=off rd.driver.blacklist=nouveau modprobe.blacklist=nouveau nvidia_drm.modeset=1 nvidia_drm.fbdev=0 pci=realloc=off

I don’t seem to be getting any of the PCI I/O region errors from NVRM that this parameter fixes.
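When testing several boot parameters across kernels, it helps to confirm programmatically which ones the running kernel actually received. The sketch below splits a command line like the one shown above into a dictionary; `parse_cmdline` is a hypothetical helper name, and the use of `shlex` to handle quoted values such as `"acpi_osi=Windows 2022"` is an assumption about how the quoting appears in `/proc/cmdline`.

```python
import shlex


def parse_cmdline(cmdline):
    """Split a kernel command line into {name: value}; bare flags map to None.

    Only the first '=' separates name from value, so 'pci=realloc=off'
    becomes {'pci': 'realloc=off'}.
    """
    params = {}
    for token in shlex.split(cmdline):
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


if __name__ == "__main__":
    with open("/proc/cmdline") as f:
        params = parse_cmdline(f.read())
    for name in ("pci", "nvidia_drm.modeset", "nvidia_drm.fbdev"):
        print(name, "=", params.get(name))
```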

Any particular reason for the crash? Any relation to the errors on my end being on every kernel except 6.1?

The 570 driver didn’t run on kernel 6.1 for me; I saw the SDDM login screen briefly, then it disappeared. dmesg contained some RIP traces from the nvidia driver.

But even if I did get it working I might not be able to verify your issue… as I don’t have “resource conflicts”

Interesting development:

I compared dmesg outputs from kernel 6.1 and 6.13. I’m seeing limited PCIe bandwidth with 6.13 for some reason. Is this the cause, rather than the pnp errors? (I see the BARs get assigned regardless of what pnp reports.)

6.1:

abijay@fedora:~$ sudo journalctl -b 0 -g "pci 0000:01"
Mar 28 13:26:19 fedora kernel: pci 0000:01:00.0: [10de:24dd] type 00 class 0x030000
Mar 28 13:26:19 fedora kernel: pci 0000:01:00.0: reg 0x10: [mem 0x00000000-0x00ffffff]
Mar 28 13:26:19 fedora kernel: pci 0000:01:00.0: reg 0x14: [mem 0x00000000-0x1ffffffff 64bit pref]
Mar 28 13:26:19 fedora kernel: pci 0000:01:00.0: reg 0x1c: [mem 0x00000000-0x01ffffff 64bit pref]
Mar 28 13:26:19 fedora kernel: pci 0000:01:00.0: reg 0x24: [io  0x0000-0x007f]
Mar 28 13:26:19 fedora kernel: pci 0000:01:00.0: reg 0x30: [mem 0x00000000-0x0007ffff pref]
Mar 28 13:26:19 fedora kernel: pci 0000:01:00.0: Max Payload Size set to 256 (was 128, max 256)
Mar 28 13:26:19 fedora kernel: pci 0000:01:00.0: PME# supported from D0 D3hot
Mar 28 13:26:19 fedora kernel: pci 0000:01:00.0: 126.024 Gb/s available PCIe bandwidth, limited by 16.0 GT/s PCIe x8 link at 0000:00:01.0 (capable of 252.048 Gb/s with 16.0 GT/s PCIe x16 link)
Mar 28 13:26:19 fedora kernel: pci 0000:01:00.0: Adding to iommu group 19
Mar 28 13:26:19 fedora kernel: pci 0000:01:00.0: vgaarb: bridge control possible
Mar 28 13:26:19 fedora kernel: pci 0000:01:00.0: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
Mar 28 13:26:19 fedora kernel: pci 0000:01:00.1: [10de:228b] type 00 class 0x040300
Mar 28 13:26:19 fedora kernel: pci 0000:01:00.1: reg 0x10: [mem 0x00000000-0x00003fff]
Mar 28 13:26:19 fedora kernel: pci 0000:01:00.1: Max Payload Size set to 256 (was 128, max 256)
Mar 28 13:26:19 fedora kernel: pci 0000:01:00.1: Adding to iommu group 19
Mar 28 13:26:19 fedora kernel: pci 0000:01:00.0: BAR 1: assigned [mem 0x6000000000-0x61ffffffff 64bit pref]
Mar 28 13:26:19 fedora kernel: pci 0000:01:00.0: BAR 3: assigned [mem 0x6200000000-0x6201ffffff 64bit pref]
Mar 28 13:26:19 fedora kernel: pci 0000:01:00.0: BAR 0: assigned [mem 0x60000000-0x60ffffff]
Mar 28 13:26:19 fedora kernel: pci 0000:01:00.0: BAR 6: assigned [mem 0x61000000-0x6107ffff pref]
Mar 28 13:26:19 fedora kernel: pci 0000:01:00.1: BAR 0: assigned [mem 0x61080000-0x61083fff]
Mar 28 13:26:19 fedora kernel: pci 0000:01:00.0: BAR 5: assigned [io  0x5000-0x507f]
Mar 28 13:26:19 fedora kernel: pci 0000:01:00.1: extending delay after power-on from D3hot to 20 msec
Mar 28 13:26:19 fedora kernel: pci 0000:01:00.1: D0 power state depends on 0000:01:00.0
Mar 28 15:50:29 fedora sudo[20814]:   abijay : TTY=pts/4 ; PWD=/home/abijay ; USER=root ; COMMAND=/usr/bin/journalctl -b - -g 'pci 0000:01'
Mar 28 15:50:37 fedora sudo[20848]:   abijay : TTY=pts/4 ; PWD=/home/abijay ; USER=root ; COMMAND=/usr/bin/journalctl -b 0 -g 'pci 0000:01'
Mar 28 15:52:14 fedora sudo[20886]:   abijay : TTY=pts/4 ; PWD=/home/abijay ; USER=root ; COMMAND=/usr/bin/journalctl -b 0 -g 'pci 0000:01'
Mar 28 15:52:24 fedora sudo[20917]:   abijay : TTY=pts/4 ; PWD=/home/abijay ; USER=root ; COMMAND=/usr/bin/journalctl -b -3 -g 'pci 0000:01'
Mar 28 15:54:00 fedora sudo[21388]:   abijay : TTY=pts/4 ; PWD=/home/abijay ; USER=root ; COMMAND=/usr/bin/journalctl -b 0 -g 'pci 0000:01'

6.13:

abijay@fedora:~$ sudo journalctl -b -3 -g "pci 0000:01"
Mar 28 12:15:04 fedora kernel: pci 0000:01:00.0: [10de:24dd] type 00 class 0x030000 PCIe Legacy Endpoint
Mar 28 12:15:04 fedora kernel: pci 0000:01:00.0: BAR 0 [mem 0x00000000-0x00ffffff]
Mar 28 12:15:04 fedora kernel: pci 0000:01:00.0: BAR 1 [mem 0x00000000-0x0fffffff 64bit pref]
Mar 28 12:15:04 fedora kernel: pci 0000:01:00.0: BAR 3 [mem 0x00000000-0x01ffffff 64bit pref]
Mar 28 12:15:04 fedora kernel: pci 0000:01:00.0: BAR 5 [io  0x0000-0x007f]
Mar 28 12:15:04 fedora kernel: pci 0000:01:00.0: ROM [mem 0x00000000-0x0007ffff pref]
Mar 28 12:15:04 fedora kernel: pci 0000:01:00.0: Max Payload Size set to 256 (was 128, max 256)
Mar 28 12:15:04 fedora kernel: pci 0000:01:00.0: PME# supported from D0 D3hot
Mar 28 12:15:04 fedora kernel: pci 0000:01:00.0: 16.000 Gb/s available PCIe bandwidth, limited by 2.5 GT/s PCIe x8 link at 0000:00:01.0 (capable of 252.048 Gb/s with 16.0 GT/s PCIe x16 link)
Mar 28 12:15:04 fedora kernel: pci 0000:01:00.1: [10de:228b] type 00 class 0x040300 PCIe Endpoint
Mar 28 12:15:04 fedora kernel: pci 0000:01:00.1: BAR 0 [mem 0x00000000-0x00003fff]
Mar 28 12:15:04 fedora kernel: pci 0000:01:00.1: Max Payload Size set to 256 (was 128, max 256)
Mar 28 12:15:04 fedora kernel: pci 0000:01:00.0: vgaarb: bridge control possible
Mar 28 12:15:04 fedora kernel: pci 0000:01:00.0: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
Mar 28 12:15:04 fedora kernel: pci 0000:01:00.0: BAR 1 [mem 0x6000000000-0x600fffffff 64bit pref]: assigned
Mar 28 12:15:04 fedora kernel: pci 0000:01:00.0: BAR 3 [mem 0x6010000000-0x6011ffffff 64bit pref]: assigned
Mar 28 12:15:04 fedora kernel: pci 0000:01:00.0: BAR 0 [mem 0x60000000-0x60ffffff]: assigned
Mar 28 12:15:04 fedora kernel: pci 0000:01:00.0: ROM [mem 0x61000000-0x6107ffff pref]: assigned
Mar 28 12:15:04 fedora kernel: pci 0000:01:00.1: BAR 0 [mem 0x61080000-0x61083fff]: assigned
Mar 28 12:15:04 fedora kernel: pci 0000:01:00.0: BAR 5 [io  0x5000-0x507f]: assigned
Mar 28 12:15:04 fedora kernel: pci 0000:01:00.1: extending delay after power-on from D3hot to 20 msec
Mar 28 12:15:04 fedora kernel: pci 0000:01:00.1: D0 power state depends on 0000:01:00.0
Mar 28 12:15:04 fedora kernel: pci 0000:01:00.0: Adding to iommu group 16
Mar 28 12:15:04 fedora kernel: pci 0000:01:00.1: Adding to iommu group 16
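The key difference between the two logs is the "available PCIe bandwidth" line. As a rough sketch, that line can be parsed and checked against the theoretical figure for the reported link speed and width. The names `parse_bandwidth` and `theoretical_gbps` are illustrative; the encoding overheads used (8b/10b for 2.5 and 5 GT/s, 128b/130b for 8 and 16 GT/s) are the standard PCIe ones, and the kernel's own figure is rounded slightly differently.

```python
import re

# Matches e.g. "pci 0000:01:00.0: 16.000 Gb/s available PCIe bandwidth,
# limited by 2.5 GT/s PCIe x8 link at 0000:00:01.0 (capable of 252.048 Gb/s ..."
BANDWIDTH_RE = re.compile(
    r"pci (?P<dev>\S+): (?P<avail>[\d.]+) Gb/s available PCIe bandwidth, "
    r"limited by (?P<speed>[\d.]+) GT/s PCIe x(?P<width>\d+) link at (?P<port>\S+) "
    r"\(capable of (?P<max>[\d.]+) Gb/s"
)

# Fraction of the raw bit rate left after line encoding, per link speed in GT/s.
ENCODING = {2.5: 8 / 10, 5.0: 8 / 10, 8.0: 128 / 130, 16.0: 128 / 130}


def parse_bandwidth(line):
    """Extract the bandwidth figures from one kernel log line, or return None."""
    m = BANDWIDTH_RE.search(line)
    if m is None:
        return None
    return {
        "device": m["dev"],
        "available_gbps": float(m["avail"]),
        "link_speed_gts": float(m["speed"]),
        "link_width": int(m["width"]),
        "limited_by": m["port"],
        "capable_gbps": float(m["max"]),
    }


def theoretical_gbps(speed_gts, lanes):
    """Usable bandwidth in Gb/s for a link of the given speed and lane count."""
    return speed_gts * lanes * ENCODING[speed_gts]
```

Applied to the two lines above, this gives 126.024 Gb/s on 6.1 against 16.000 Gb/s on 6.13, i.e. the GPU link on 6.13 offers roughly one eighth of the bandwidth, consistent with a Gen 1 (2.5 GT/s) x8 link instead of a Gen 4 (16 GT/s) x8 link.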

Okay, looking at lspci -vv outputs between both kernels, there’s clearly an issue with ‘LnkCap’ between the kernels:

6.1:

01:00.0 VGA compatible controller: NVIDIA Corporation GA104M [GeForce RTX 3070 Mobile / Max-Q] (rev a1) (prog-if 00 [VGA controller])
	Subsystem: Lenovo Device 3833
	Physical Slot: 1
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin A routed to IRQ 181
	IOMMU group: 19
	Region 0: Memory at 60000000 (32-bit, non-prefetchable) [size=16M]
	Region 1: Memory at 6000000000 (64-bit, prefetchable) [size=8G]
	Region 3: Memory at 6200000000 (64-bit, prefetchable) [size=32M]
	Region 5: I/O ports at 5000 [size=128]
	Expansion ROM at 61000000 [virtual] [disabled] [size=512K]
	Capabilities: [60] Power Management version 3
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold-)
		Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [68] MSI: Enable+ Count=1/1 Maskable- 64bit+
		Address: 00000000fee009b8  Data: 0000
	Capabilities: [78] Express (v2) Legacy Endpoint, IntMsgNum 0
		DevCap:	MaxPayload 256 bytes, PhantFunc 0, Latency L0s unlimited, L1 <64us
			ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ TEE-IO-
		DevCtl:	CorrErr- NonFatalErr- FatalErr- UnsupReq-
			RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+ FLReset-
			MaxPayload 256 bytes, MaxReadReq 512 bytes
		DevSta:	CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr- TransPend-
		LnkCap:	Port #0, Speed 16GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <1us, L1 <4us
			ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
		LnkCtl:	ASPM Disabled; RCB 64 bytes, LnkDisable- CommClk-
			ExtSynch- ClockPM+ AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed 16GT/s, Width x8 (downgraded)
			TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
		DevCap2: Completion Timeout: Range AB, TimeoutDis+ NROPrPrP- LTR+
			 10BitTagComp+ 10BitTagReq+ OBFF Via message, ExtFmt- EETLPPrefix-
			 EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
			 FRS-
			 AtomicOpsCap: 32bit- 64bit- 128bitCAS-
		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-
			 AtomicOpsCtl: ReqEn-
			 IDOReq- IDOCompl- LTR+ EmergencyPowerReductionReq-
			 10BitTagReq- OBFF Disabled, EETLPPrefixBlk-
		LnkCap2: Supported Link Speeds: 2.5-16GT/s, Crosslink- Retimer+ 2Retimers+ DRS-
		LnkCtl2: Target Link Speed: 16GT/s, EnterCompliance- SpeedDis-
			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
			 Compliance Preset/De-emphasis: -6dB de-emphasis, 0dB preshoot
		LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete+ EqualizationPhase1+
			 EqualizationPhase2+ EqualizationPhase3+ LinkEqualizationRequest-
			 Retimer- 2Retimers- CrosslinkRes: unsupported
	Capabilities: [b4] Vendor Specific Information: Len=14 <?>
	Capabilities: [100 v1] Virtual Channel
		Caps:	LPEVC=0 RefClk=100ns PATEntryBits=1
		Arb:	Fixed- WRR32- WRR64- WRR128-
		Ctrl:	ArbSelect=Fixed
		Status:	InProgress-
		VC0:	Caps:	PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
			Arb:	Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
			Ctrl:	Enable+ ID=0 ArbSelect=Fixed TC/VC=ff
			Status:	NegoPending- InProgress-
	Capabilities: [250 v1] Latency Tolerance Reporting
		Max snoop latency: 34326183936ns
		Max no snoop latency: 34326183936ns
	Capabilities: [258 v1] L1 PM Substates
		L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+
			  PortCommonModeRestoreTime=255us PortTPowerOnTime=10us
		L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2- ASPM_L1.1-
			   T_CommonMode=0us LTR1.2_Threshold=0ns
		L1SubCtl2: T_PwrOn=10us
	Capabilities: [128 v1] Power Budgeting <?>
	Capabilities: [420 v2] Advanced Error Reporting
		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP-
			ECRC- UnsupReq- ACSViol- UncorrIntErr- BlockedTLP- AtomicOpBlocked- TLPBlockedErr-
			PoisonTLPBlocked- DMWrReqBlocked- IDECheck- MisIDETLP- PCRC_CHECK- TLPXlatBlocked-
		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP-
			ECRC- UnsupReq- ACSViol- UncorrIntErr- BlockedTLP- AtomicOpBlocked- TLPBlockedErr-
			PoisonTLPBlocked- DMWrReqBlocked- IDECheck- MisIDETLP- PCRC_CHECK- TLPXlatBlocked-
		UESvrt:	DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+
			ECRC- UnsupReq- ACSViol- UncorrIntErr+ BlockedTLP- AtomicOpBlocked- TLPBlockedErr-
			PoisonTLPBlocked- DMWrReqBlocked- IDECheck- MisIDETLP- PCRC_CHECK- TLPXlatBlocked-
		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr- CorrIntErr- HeaderOF-
		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+ CorrIntErr- HeaderOF+
		AERCap:	First Error Pointer: 00, ECRCGenCap- ECRCGenEn- ECRCChkCap- ECRCChkEn-
			MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
		HeaderLog: 00000000 00000000 00000000 00000000
	Capabilities: [600 v1] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
	Capabilities: [900 v1] Secondary PCI Express
		LnkCtl3: LnkEquIntrruptEn- PerformEqu-
		LaneErrStat: 0
	Capabilities: [bb0 v1] Physical Resizable BAR
		BAR 0: current size: 16MB, supported: 16MB
		BAR 1: current size: 8GB, supported: 64MB 128MB 256MB 512MB 1GB 2GB 4GB 8GB
		BAR 3: current size: 32MB, supported: 32MB
	Capabilities: [c1c v1] Physical Layer 16.0 GT/s <?>
	Capabilities: [d00 v1] Lane Margining at the Receiver
		PortCap: Uses Driver+
		PortSta: MargReady+ MargSoftReady+
	Capabilities: [e00 v1] Data Link Feature <?>
	Kernel driver in use: nvidia
	Kernel modules: nouveau, nvidia_drm, nvidia

6.13:

01:00.0 VGA compatible controller: NVIDIA Corporation GA104M [GeForce RTX 3070 Mobile / Max-Q] (rev a1) (prog-if 00 [VGA controller])
	Subsystem: Lenovo Device 3833
	Physical Slot: 1
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin A routed to IRQ 211
	IOMMU group: 16
	Region 0: Memory at 60000000 (32-bit, non-prefetchable) [size=16M]
	Region 1: Memory at 6000000000 (64-bit, prefetchable) [size=256M]
	Region 3: Memory at 6010000000 (64-bit, prefetchable) [size=32M]
	Region 5: I/O ports at 5000 [size=128]
	Expansion ROM at 61000000 [virtual] [disabled] [size=512K]
	Capabilities: [60] Power Management version 3
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold-)
		Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [68] MSI: Enable+ Count=1/1 Maskable- 64bit+
		Address: 00000000fee00d78  Data: 0000
	Capabilities: [78] Express (v1) Legacy Endpoint, IntMsgNum 0
		DevCap:	MaxPayload 256 bytes, PhantFunc 0, Latency L0s unlimited, L1 <64us
			ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ TEE-IO-
		DevCtl:	CorrErr- NonFatalErr- FatalErr- UnsupReq-
			RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+ FLReset-
			MaxPayload 256 bytes, MaxReadReq 512 bytes
		DevSta:	CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr- TransPend-
		LnkCap:	Port #0, Speed 2.5GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <512ns, L1 <4us
			ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
		LnkCtl:	ASPM Disabled; RCB 64 bytes, LnkDisable- CommClk-
			ExtSynch- ClockPM+ AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed 2.5GT/s, Width x8 (downgraded)
			TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
	Capabilities: [b4] Vendor Specific Information: Len=14 <?>
	Capabilities: [100 v1] Virtual Channel
		Caps:	LPEVC=0 RefClk=100ns PATEntryBits=1
		Arb:	Fixed- WRR32- WRR64- WRR128-
		Ctrl:	ArbSelect=Fixed
		Status:	InProgress-
		VC0:	Caps:	PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
			Arb:	Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
			Ctrl:	Enable+ ID=0 ArbSelect=Fixed TC/VC=ff
			Status:	NegoPending- InProgress-
	Capabilities: [250 v1] Latency Tolerance Reporting
		Max snoop latency: 34326183936ns
		Max no snoop latency: 34326183936ns
	Capabilities: [258 v1] L1 PM Substates
		L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+
			  PortCommonModeRestoreTime=255us PortTPowerOnTime=10us
		L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2- ASPM_L1.1-
			   T_CommonMode=0us LTR1.2_Threshold=0ns
		L1SubCtl2: T_PwrOn=10us
	Capabilities: [128 v1] Power Budgeting <?>
	Capabilities: [420 v2] Advanced Error Reporting
		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP-
			ECRC- UnsupReq- ACSViol- UncorrIntErr- BlockedTLP- AtomicOpBlocked- TLPBlockedErr-
			PoisonTLPBlocked- DMWrReqBlocked- IDECheck- MisIDETLP- PCRC_CHECK- TLPXlatBlocked-
		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP-
			ECRC- UnsupReq- ACSViol- UncorrIntErr- BlockedTLP- AtomicOpBlocked- TLPBlockedErr-
			PoisonTLPBlocked- DMWrReqBlocked- IDECheck- MisIDETLP- PCRC_CHECK- TLPXlatBlocked-
		UESvrt:	DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+
			ECRC- UnsupReq- ACSViol- UncorrIntErr+ BlockedTLP- AtomicOpBlocked- TLPBlockedErr-
			PoisonTLPBlocked- DMWrReqBlocked- IDECheck- MisIDETLP- PCRC_CHECK- TLPXlatBlocked-
		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr- CorrIntErr- HeaderOF-
		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+ CorrIntErr- HeaderOF+
		AERCap:	First Error Pointer: 00, ECRCGenCap- ECRCGenEn- ECRCChkCap- ECRCChkEn-
			MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
		HeaderLog: 00000000 00000000 00000000 00000000
	Capabilities: [600 v1] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
	Capabilities: [900 v1] Null
	Capabilities: [bb0 v1] Physical Resizable BAR
		BAR 0: current size: 16MB, supported: 16MB
		BAR 1: current size: 256MB, supported: 64MB 128MB 256MB 512MB 1GB 2GB 4GB 8GB
		BAR 3: current size: 32MB, supported: 32MB
	Capabilities: [c1c v1] Physical Layer 16.0 GT/s <?>
	Capabilities: [d00 v1] Lane Margining at the Receiver
		PortCap: Uses Driver+
		PortSta: MargReady- MargSoftReady-
	Capabilities: [e00 v1] Data Link Feature <?>
	Kernel driver in use: nvidia
	Kernel modules: nouveau, nvidia_drm, nvidia

Can someone from the Driver team please take a look and confirm?

EDIT: I’ve also looked at other PCIe devices (M.2 NVMe SSD) with the newer kernel, and they are not affected like the dGPU is, i.e. LnkCap reports 16GT/s x4 as expected on 6.13.
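To compare the link fields across kernels or devices without reading the full dumps, the `LnkCap` and `LnkSta` lines can be pulled out of saved `lspci -vv` text. This is a minimal sketch assuming the text covers a single device and uses the line format shown in the dumps above; `link_summary` is a hypothetical helper name.

```python
import re

# Matches the LnkCap / LnkSta lines of `lspci -vv`, but not LnkCap2 / LnkSta2.
LINK_RE = re.compile(
    r"^\s*(LnkCap|LnkSta):\s+(?:Port #\d+, )?Speed ([\d.]+)GT/s, Width x(\d+)",
    re.MULTILINE,
)


def link_summary(lspci_text):
    """Return {'LnkCap': (speed_gts, width), 'LnkSta': (speed_gts, width)}.

    If the text contains several devices, only the first occurrence of each
    field is kept, so pass the output for one device at a time.
    """
    result = {}
    for field, speed, width in LINK_RE.findall(lspci_text):
        result.setdefault(field, (float(speed), int(width)))
    return result
```

For the two dumps above this would give a `LnkCap` of (16.0, 16) on 6.1 and (2.5, 16) on 6.13, which makes the difference easy to spot when checking additional kernels.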

Would you be able to check some kernels between 6.1 and 6.13 to see roughly which kernel version first introduced the issue? Thanks!

Slight wrinkle. Fedora’s koji archive no longer has the kernel versions I want to test with, so I’ll be downloading old Manjaro live ISOs with nvidia drivers preinstalled. Will that work here?

I have checked lspci -vv and NVIDIA X Server Settings on Manjaro live 24.0 (kernel 6.9) and 23.0 (kernel 6.5). The issue is present on both. Screenshots attached:

Kernel 6.5:

Kernel 6.9:
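With the results so far (6.1 unaffected; 6.5, 6.9 and 6.13 affected), the remaining search window can be tracked with a small helper while more kernels are tested. This is only a sketch: `regression_window` is a hypothetical name, and it assumes the issue, once introduced, is present in every later version. Versions are stored as tuples so that 6.13 sorts after 6.9, which a plain string comparison would get wrong.

```python
def regression_window(results):
    """Return (last unaffected version, first affected version after it).

    `results` maps a version tuple such as (6, 13) to True if that kernel
    shows the issue and False if it does not.
    """
    good = [v for v, affected in results.items() if not affected]
    last_good = max(good, default=None)
    bad = [
        v
        for v, affected in results.items()
        if affected and (last_good is None or v > last_good)
    ]
    first_bad = min(bad, default=None)
    return last_good, first_bad


if __name__ == "__main__":
    tested = {(6, 1): False, (6, 5): True, (6, 9): True, (6, 13): True}
    print(regression_window(tested))
```

On the versions tested in this thread it narrows the window to kernels after 6.1 and up to 6.5, so testing 6.2 through 6.4 would be the next step.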

Also, I just checked the lspci -vv output of the PCIe bridges and found that they are limited to Gen1 speeds as well. From what I found on Google, this is the likely cause of my GPU being limited. I’ve pasted the full output for 6.13 here: Pastebin

Output for 6.1 (Bridges at full Gen4 speeds) here: Pastebin
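The bridge check can also be done without root through sysfs, where each PCIe device exposes `current_link_speed` and `max_link_speed` attributes. The sketch below walks from the GPU up through its parent bridges by following the device's resolved sysfs path. The function name `upstream_link_speeds` is hypothetical, the address `0000:01:00.0` is the dGPU from this thread, and the helper assumes the usual sysfs layout in which each parent directory in the resolved path is the upstream bridge.

```python
import os
import re

PCI_ADDR_RE = re.compile(r"[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]")


def _read_attr(device_dir, name):
    """Read one sysfs attribute, returning None if it is absent or unreadable."""
    try:
        with open(os.path.join(device_dir, name)) as f:
            return f.read().strip()
    except OSError:
        return None


def upstream_link_speeds(bdf, sysfs_root="/sys/bus/pci/devices"):
    """List (address, current_link_speed, max_link_speed) for a device and
    every bridge above it, starting at the device itself."""
    path = os.path.realpath(os.path.join(sysfs_root, bdf))
    chain = []
    # Each path component that looks like a PCI address is a device or bridge;
    # the walk stops at the host bridge directory (e.g. 'pci0000:00').
    while PCI_ADDR_RE.fullmatch(os.path.basename(path)):
        chain.append(
            (
                os.path.basename(path),
                _read_attr(path, "current_link_speed"),
                _read_attr(path, "max_link_speed"),
            )
        )
        path = os.path.dirname(path)
    return chain


if __name__ == "__main__":
    for addr, current, maximum in upstream_link_speeds("0000:01:00.0"):
        print(f"{addr}: current={current} max={maximum}")
```

Running it on both kernels should show whether the root port at 0000:00:01.0 already advertises a lower maximum speed on the affected kernel, or only the GPU does.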

As an aside, is the problem described here related? (TLDR: Lenovo discovered a PCIe bug in January for hotpluggable devices that brings them down to Gen 1. Commit IDs included.)


I am at my wit’s end. Can someone on the driver team please help?