This isn’t about fixing your specific issue, but rather this entire thread.
I’ve just read every post, bug report and log extract on this thread.
This is a super easy fix.
Firstly, to clarify:
Linux is not supposed to work out of the box.
That’s a Closed Source Market Standard.
The Open Source End User is “FREE” to “finish” the Open Source product to a Closed Source Market definition of “state of finish”.
“To use VA-API on XWayland, use the --use-gl=egl flag. This currently exhibits choppiness (FS#67035), which could be solved by enabling native Wayland support.
To use VA-API on Xorg, use the --use-gl=desktop flag.
Starting in Chromium 86, there will be support for VA-API when using the ANGLE GL renderer. Use the --enable-accelerated-video-decode flag to enable it on an Intel GPU.”
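As a rough illustration of the three cases quoted above, here is a minimal Python sketch that maps a setup to its flag. Only the flags themselves come from the quoted text; the setup labels ("xwayland", "xorg", "angle-intel"), the function names and the "chromium" binary name are assumptions made for this example.

```python
# Flags exactly as quoted above. The setup labels used as keys are
# hypothetical names chosen for this sketch, not anything Chromium defines.
VAAPI_FLAGS = {
    "xwayland": "--use-gl=egl",
    "xorg": "--use-gl=desktop",
    "angle-intel": "--enable-accelerated-video-decode",
}


def chromium_vaapi_flags(setup):
    """Return the list of extra flags for a setup, or [] if it is unknown."""
    flag = VAAPI_FLAGS.get(setup)
    return [flag] if flag else []


def chromium_command(setup, binary="chromium"):
    """Build an argv list; the binary name is an assumption and varies by distro."""
    return [binary] + chromium_vaapi_flags(setup)


if __name__ == "__main__":
    # Print the command line for each setup instead of launching a browser.
    for name in VAAPI_FLAGS:
        print(name, "->", " ".join(chromium_command(name)))
```

The list returned by chromium_command can be passed to subprocess.run if you actually want to launch the browser that way.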
BTW, how’s Arch working out for ya!?
If your system isn’t configured and integrated as per The Book, and the above browsers aren’t jury-rigged with workarounds, then this WILL exponentially exacerbate and exploit the poor system integration and configuration, coupled with the lack of support in the kernel or other such issues.
Correct BIOS settings are critical.
Correct kernel parameters are critical.
I also saw multiple posts using PowerSave as well in some form.
This affects the nVidia driver as well. It wants to ramp up and is getting choked.
Disable all power management for PCI Express.
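One way to check what PCI Express power management is currently doing, assuming the advice above refers to ASPM (Active State Power Management): the kernel shows the active policy in square brackets in /sys/module/pcie_aspm/parameters/policy, and pcie_aspm=off on the kernel command line turns ASPM off. The sketch below only reads and parses those two sources; the function names are made up for this example.

```python
import re
from pathlib import Path

# Standard locations on Linux; either file may be missing on some systems.
ASPM_POLICY_FILE = Path("/sys/module/pcie_aspm/parameters/policy")
CMDLINE_FILE = Path("/proc/cmdline")


def active_aspm_policy(policy_text):
    """Return the bracketed (active) policy name, or None if none is marked."""
    match = re.search(r"\[(\w+)\]", policy_text)
    return match.group(1) if match else None


def aspm_disabled_on_cmdline(cmdline):
    """True if the kernel was booted with pcie_aspm=off."""
    return "pcie_aspm=off" in cmdline.split()


if __name__ == "__main__":
    if ASPM_POLICY_FILE.exists():
        print("ASPM policy:", active_aspm_policy(ASPM_POLICY_FILE.read_text()))
    if CMDLINE_FILE.exists():
        print("pcie_aspm=off:", aspm_disabled_on_cmdline(CMDLINE_FILE.read_text()))
```

This only reports the current state; changing it is still done through the bootloader's kernel parameters or the BIOS.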
The biggest issue is BIOS DMA Buffer / VM / IOMMU and xHCI settings and support.
Is xHCI handover still enabled in the BIOS?
USB 2 / PS/2 Legacy support uses VM in the BIOS.
Kernel 5.8
In Arch Linux and Manjaro, the 5.8+ kernel has issues with Renesas USB controllers due to a firmware version check issue.
Kernel EDID patch: 20201203
Removes the Skylake/Kabylake platform detection logic and makes the EDID function work on all platforms. Regardless, with the patch, a kernel oops occurs in the function intel_vgpu_reg_rw_edid in drivers/drm/i915/kvmgt.c.
So … the bug began with nvidia driver 450, and not only on Arch Linux: kernels 5.4 and 4.19 are also hit, and in some cases the crash happened without any browser (with vlc, for example).
As for hardware video acceleration, the crash happened without this feature too (yes, I tried again and again; now I just compile the 435 driver for my kernel and I don’t have any crashes).
I don’t use PowerSave (because of nvidia; this problem has been here for a few years now), and the problem with the firmware version has been solved for a few months (and to clarify, previous kernels have the same bug with the nvidia driver).
Why didn’t this bug hit nvidia’s driver before 450, and why does it still happen without Chrome or anything else using hardware acceleration?
My questions are not really questions; they’re just to compare with the post above @abelits.
Please stop. The kernel crash on dereferencing a NULL pointer in a driver’s function is probably the most conclusive and unambiguous indication that a bug is in the driver.
Most likely because it was introduced in that version.
Because a “not using hardware acceleration” option in one userspace program does not reliably prevent any particular piece of the driver’s functionality from being used, especially on a modern desktop that uses compositing for everything. Also, the problem is probably in the implementation of some basic functionality, most likely a race condition in something very common. The number of calls may affect the likelihood of a crash, but the crash can’t be eliminated entirely that way.
If the guess made by @generix is correct, and preemption is either a necessary condition or greatly increases the probability of a crash being triggered, it would strongly indicate a race condition.
Following recent posts here, I have been testing today with kernel 5.10.18 compiled with CONFIG_PREEMPT_NONE=y set (otherwise default config) and the latest 460.39 driver.
So far, in around 6 hours, I have been unable to reproduce the crash whilst watching video in Kodi (for me, always the trigger of the crash).
Of course, that time frame is not long enough to be a reliable test. Having said that, previously I could not get past 3 or 4 days of uptime, using Kodi each day, before getting the crash. So I will see how it goes and report back if I run into it in the coming days.
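For anyone wanting to confirm which preemption model their own kernel was built with before repeating this test, here is a small Python sketch that reads it from a kernel config dump (for example the contents of /boot/config-$(uname -r), or of /proc/config.gz after decompression, where the distro provides them). The option names are the mainline Kconfig ones for kernels of this era; the function name and return labels are made up for this example.

```python
# Checked in this order; in kernels of this era exactly one of them is set to "y".
PREEMPT_OPTIONS = (
    ("CONFIG_PREEMPT_NONE", "none"),
    ("CONFIG_PREEMPT_VOLUNTARY", "voluntary"),
    ("CONFIG_PREEMPT", "full"),
)


def preemption_model(config_text):
    """Return 'none', 'voluntary' or 'full' from kernel config text, else None."""
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        # Lines such as "# CONFIG_PREEMPT is not set" are comments: skip them.
        if line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if value == "y":
            enabled.add(key)
    for option, label in PREEMPT_OPTIONS:
        if option in enabled:
            return label
    return None


if __name__ == "__main__":
    sample = "CONFIG_PREEMPT_NONE=y\n# CONFIG_PREEMPT is not set\n"
    print(preemption_model(sample))
```

A quicker but coarser check is the uname -a output: kernels built with full preemption include the word PREEMPT in the version string.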
I installed 450.56 on Manjaro almost immediately after the notification about it was posted here, and I am still running it. No freezes yet. Will report here in a few days.
Hi, I use Arch Linux with an Intel i5-3450 CPU and a GTX 650 Ti. I know I am almost at the fucking cut-off line of support for the driver… But I reinstalled my Arch Linux last week after a fuck-up with BTRFS. My PC was running fine and was updating every day while also hibernating for the night, until I got an issue with Chrome freezing (it always happens with as many tabs as I have). So I applied more updates and rebooted… Now, with the latest update, I opened Chrome and noticed everything started to fuck up… YouTube was using 80% of the GPU with memory leaking. I soon realised Discord, Steam and all video players were also suffering intensely. It seems the video acceleration issue started with the update… in the last 10 days or so… So of course it was driving me crazy. I downgraded a shitload of video packages and everything’s fine now. All I could find on Google was this thread… I use chaotic-aur’s TKG suite of kernel/Nvidia driver/Mesa and more… So of course when I was reading this thread I realised YOU GUYS FUCKED UP THE PATCH FOR THAT ISSUE… making people who had no issue with the video driver now get thrown in with everyone else having issues… Of course it’s an upstream issue. I am fairly sure it’s not TKG, since all they do is reapply the patches that were working fine for months from your source… Of course I am in dire need of upgrading my GPU; I have a lot of vintage keyboards I plan to trade for PC parts, including a GPU… But idk, I might just go AMD, if they weren’t also having issues once in a while. I kinda just wanna game too… One of my friends offered me a more recent Nvidia GPU, but I am very unsure whether I want to bite the bullet on that upgrade now. Since I have downgraded, I cannot reproduce the issue without upgrading again, but since I am moving on Monday, I need my PC working, so I will abstain from doing it.
TL;DR: You guys introduced a video acceleration issue on my GTX 650 Ti running Arch Linux.
For the sake of everything that is, was, will be or might be sacred, instead of this wall of text, post:
Nvidia driver version number.
Kernel version number (or better, the output of uname -a).
Types of failures (graphics distortion, uneven video playback speed, slowdown, high CPU load, graphics or full computer lockup, kernel panic, plus kernel logs if any were collected).
Software used.
Right, Nvidia recommendations are not very useful and their collection scripts are often not accessible at the time of failure. Nevertheless, please post something that qualifies as a bug report.
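The four items requested above can be gathered with a few lines of Python. This is only a sketch: it assumes the proprietary driver is loaded and exposes /proc/driver/nvidia/version with a line of the form "NVRM version: NVIDIA UNIX x86_64 Kernel Module  460.39  ...", and the function names and report layout are invented for this example. The failure description and software list still have to be typed in by hand.

```python
import platform
import re
from pathlib import Path

# Present only while the proprietary nvidia kernel module is loaded.
NVIDIA_VERSION_FILE = Path("/proc/driver/nvidia/version")


def parse_nvidia_version(text):
    """Extract the driver version number from the NVRM version line, or None."""
    match = re.search(r"Kernel Module\s+(\d+(?:\.\d+)+)", text)
    return match.group(1) if match else None


def build_report(driver, kernel, failure, software):
    """Format the four requested items as plain text, one per line."""
    return "\n".join([
        f"Nvidia driver version: {driver or 'unknown'}",
        f"Kernel: {kernel}",
        f"Type of failure: {failure}",
        f"Software used: {', '.join(software)}",
    ])


if __name__ == "__main__":
    text = NVIDIA_VERSION_FILE.read_text() if NVIDIA_VERSION_FILE.exists() else ""
    u = platform.uname()
    # Roughly the same fields that uname -a prints.
    kernel = f"{u.system} {u.node} {u.release} {u.version} {u.machine}"
    print(build_report(parse_nvidia_version(text), kernel,
                       "<describe the failure>", ["<program used>"]))
```

If a kernel panic or oops was captured, attach that log separately rather than pasting it into the summary.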
“Linux HNT-Quad-ROS 5.10.15-120-tkg-bmq #1 TKG SMP PREEMPT Mon, 15 Feb 2021 02:15:43 +0000 x86_64 GNU/Linux”, which is the Ivy Bridge build of tkg-bmq.
The driver is chaotic-nvidia-dkms-tkg-460.39.6 (the time that update was posted matches when the issue started).
The issue was video acceleration glitching, low frame rate on video, and high CPU load with memory leaking in Chrome when video acceleration was on; I had to disable it for things to be fine.
Software: everything using video acceleration (Chrome, Steam store videos, Discord, VLC and other media players).
Downgrading many packages related to video drivers seems to fix it.
Looks like a problem with 460.39 support of older GPUs. It should be reported separately with this information and last working driver version.
This thread is about a different problem – one that seems to affect all GPUs and causes a kernel panic, is present in 460.39, and might be fixed in 460.56.
I’m jealous. I’ve been reporting nvkms crashdumps in the ‘stable’ driver for over a -year- and you guys got NVidia to fix the issue in less than 5 months!
Just to follow up on my previous post: I have been running the older problematic driver 460.39 with the 5.10 kernel compiled with preemption disabled, and have not had the crash once in almost 7 days of uptime, doing the same activity that was causing a crash every day.
So I would say, from my admittedly limited testing, you guys were quite probably correct here.
I am going to try the latest driver now with my normal kernel config with preemption. It looks good based on the lack of reports here so far, so hopefully they fixed it this time.
I took a peek at your bug report and I don’t think that is the same problem; at least the log looks different from all the others in this thread.