Issue:
Mapping the I/O (physical) address 0xF8610000 causes a system reboot due to:
BUG: unable to handle kernel paging request at 00000000f8610000
Getting the error message requires setting up netconsole and “sysctl kernel.panic_on_oops=0”.
There is no way to run bug-report.sh once the kernel has hit the oops…
But the fix is quite straightforward:
By looking up “RIP: 0010:os_lookup_user_io_memory+0x3e1” from the Oops message with GDB and nvidia.ko,
I found the location of the issue: nvidia/os-mlock.c:59
I believe the original intention was to check whether the mapped physical/bus addresses are contiguous.
But (*pte_array)[i] or (*pte_array)[i-1] accesses the physical address as if it were a kernel-space virtual address…
Although the type of pte_array is “NvU64 **”, its values were assigned with:
“pte_array[i] = (NvU64 *)(pfn << PAGE_SHIFT);”
Each element of pte_array is actually a PHYSICAL/BUS address (a 64-bit integer) cast to NvU64 * (a pointer to a 64-bit integer), which SHOULD NOT be dereferenced directly…
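To illustrate the idea, here is a minimal user-space sketch of a contiguity check that compares the stored address values instead of dereferencing them. It is not the attached patch and not the driver's actual code: the function name pages_are_contiguous, the SKETCH_PAGE_SIZE constant (4 KiB pages assumed) and the use of uint64_t in place of NvU64 are my own assumptions for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption for this sketch: 4 KiB pages. */
#define SKETCH_PAGE_SIZE 4096u

/*
 * Hypothetical helper, not taken from the driver source.
 *
 * Each slot of pte_array holds a physical/bus address that was stored
 * through a pointer-typed cast, in the style of
 *     pte_array[i] = (NvU64 *)(pfn << PAGE_SHIFT);
 * The slots are therefore compared as integer values and are never
 * dereferenced.
 *
 * Returns 1 if every entry is exactly one page after the previous one,
 * 0 otherwise.
 */
static int pages_are_contiguous(uint64_t *const *pte_array, size_t count)
{
    for (size_t i = 1; i < count; i++) {
        uintptr_t prev = (uintptr_t)pte_array[i - 1];
        uintptr_t curr = (uintptr_t)pte_array[i];

        if (curr != prev + SKETCH_PAGE_SIZE)
            return 0;
    }
    return 1;
}
```

The point is only that the comparison works on (uintptr_t)pte_array[i], the value stored in the slot, rather than on (*pte_array)[i], which would read memory at the physical address as though it were a valid kernel virtual address.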
nvidia-460-fix-invalid-memory-access.patch (468 Bytes)