Now with R38.2.1 my Thor continues to freeze

Now with R38.2.1 my Thor still freezes and does not accept any input from the keyboard or mouse for 2 to 3 seconds at a time. If I keep typing, what I typed appears after a 2 to 3 second delay. If I try to scroll up in Firefox, nothing happens for 2 to 3 seconds, then it unfreezes and moves up. If I press the up arrow key three or four times, the browser jumps to the top of the webpage once it unfreezes. Typing this short paragraph took an additional 10 seconds because of the repeated freezes of the Thor.

The freeze occurs about 7 times a minute. It is a significant problem.

When I ssh into Thor, this freezing does not occur.

From syslog:

2025-09-18T17:10:30.393290-07:00 chithor gnome-session[2811]: gnome-session-binary[2811]: GnomeDesktop-WARNING: Could not create transient scope for PID 2825: GDBus.Error:org.freedesktop.DBus.Error.UnixProcessIdUnknown: Failed to set unit properties: No such process
2025-09-18T17:10:30.393331-07:00 chithor gnome-session-binary[2811]: GnomeDesktop-WARNING: Could not create transient scope for PID 2825: GDBus.Error:org.freedesktop.DBus.Error.UnixProcessIdUnknown: Failed to set unit properties: No such process

dmesg.txt (125.8 KB)

lsmod.txt (7.3 KB)



R38.2.1 does resolve the multiple occurrences of:

NVRM: nvAssertFailed: Assertion failed: 0 @ g_kern_bus_nvoc.h:2706


Hi,

Please share the steps to reproduce this issue.

It occurs when I am logged on locally to the Thor in the GUI. At first I was only connecting to Thor via ssh and there were no problems. A week ago I attached a local keyboard, mouse, and DisplayPort display and logged on to the GUI. I encounter the problem when in the GUI.

ssh connection to Thor has never had a problem.

So even after a system reboot, you would still see this issue immediately?

Yes. I will try some more troubleshooting tomorrow.

Would an nvidia-debugdump --dumpall dump.zip help? If so, I’ll attach it tomorrow.

The debug dump would capture a lot of items, but not all of them are actually needed.

I would say the dmesg and Xorg logs are what we need for now, since this sounds like a graphics issue.
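
A minimal sketch for gathering those two logs into one folder before attaching them. The function name collect_display_logs and the output directory are hypothetical, and the two Xorg log paths are only the usual Ubuntu locations (system-wide, or per-user when Xorg runs rootless), which is an assumption about this setup.

```shell
#!/usr/bin/env bash
# Hypothetical helper: copy the kernel log and any Xorg log found into one directory.
collect_display_logs() {
  local out_dir="${1:-./thor-logs}"
  mkdir -p "$out_dir" || return 1

  # dmesg may require root on some systems; keep going if it fails.
  dmesg > "$out_dir/dmesg.txt" 2>/dev/null || echo "dmesg not readable (try sudo)" >&2

  # Assumed candidate locations for the Xorg log; missing files are skipped.
  local f
  for f in /var/log/Xorg.0.log "${HOME:-}/.local/share/xorg/Xorg.0.log"; do
    if [ -r "$f" ]; then
      # Flatten the path into the file name so the two logs cannot collide.
      cp "$f" "$out_dir/$(echo "$f" | tr '/' '_')"
    fi
  done

  # Print the directory so the caller knows where the files ended up.
  echo "$out_dir"
}
```

Calling collect_display_logs ~/thor-logs right after a freeze would leave dmesg.txt plus whichever Xorg log exists in that folder.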

That took a couple of hours of troubleshooting. I won’t attach all the results now.

Enabling Wayland/Weston solved the problem. The problem occurs with X11.

It is fixed, or at least its occurrence is reduced to the point that it is no longer a problem.

Here’s how I set up Wayland/Weston:

# Confirm NVIDIA KMS is on (should print "Y" or "1")
cat /sys/module/nvidia_drm/parameters/modeset

# Create Ubuntu (Wayland) session entry for GDM to use
sudo tee /usr/share/wayland-sessions/ubuntu-wayland.desktop >/dev/null <<'EOF'
[Desktop Entry]
Name=Ubuntu (Wayland)
Comment=Ubuntu session on GNOME Wayland
Exec=env GNOME_SHELL_SESSION_MODE=ubuntu gnome-session --session=ubuntu
TryExec=gnome-session
Type=Application
DesktopNames=ubuntu:GNOME
X-GDM-SessionType=wayland
EOF

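# Tell GDM to default to that Wayland session.
# The keys are written directly under [daemon]; appending them at the end of the
# file would place them in whichever section happens to come last.
# The result goes to a temp file first, because piping awk into tee on the same
# file can truncate it before awk has read it.
tmp="$(mktemp)"
sudo awk '
  function keys(){
    print "WaylandEnable=true"
    print "DefaultSession=ubuntu-wayland"
    print "AutomaticLoginEnable=false"
  }
  /^\[daemon\]/{s=1; print; keys(); next}
  # Drop any older copies of the same keys that follow the [daemon] header
  s && /^(WaylandEnable|DefaultSession|AutomaticLoginEnable)=/{next}
  {print}
  END{
    # No [daemon] section at all: create one at the end
    if(!s){ print "[daemon]"; keys() }
  }
' /etc/gdm3/custom.conf > "$tmp"
sudo install -m0644 -o root -g root "$tmp" /etc/gdm3/custom.conf
rm -f "$tmp"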

# Ensure AccountsService prefers Wayland for user
USER_NAME="$(id -un)"
sudo install -d -m0755 /var/lib/AccountsService/users
sudo tee "/var/lib/AccountsService/users/${USER_NAME}" >/dev/null <<'EOF'
[User]
XSession=ubuntu-wayland
Session=ubuntu-wayland
SystemAccount=false
EOF

sudo chown root:root "/var/lib/AccountsService/users/${USER_NAME}"
sudo chmod 0644 "/var/lib/AccountsService/users/${USER_NAME}"
rm -f ~/.dmrc 2>/dev/null || true

# Restart the display manager
sudo systemctl restart gdm

# Confirm the change (after logging back in, since restarting GDM ends the current session)
echo $XDG_SESSION_TYPE  #should return "wayland"
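
Since these GDM keys are expected under the [daemon] section, a quick check that they landed there can save a reboot cycle. This is only a sketch: the helper name daemon_key is hypothetical, and it assumes a plain INI-style file with section headers at the start of a line.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the value of KEY inside the [daemon] section of an INI-style file.
daemon_key() {
  local file="$1" key="$2"
  awk -v key="$key" '
    # Every section header switches the flag on or off.
    /^\[/ { in_daemon = ($0 == "[daemon]"); next }
    # Inside [daemon], print whatever follows "KEY=".
    in_daemon && index($0, key "=") == 1 { print substr($0, length(key) + 2) }
  ' "$file"
}
```

For example, daemon_key /etc/gdm3/custom.conf WaylandEnable should print true after the edit. Empty output means the key is missing or sits in another section.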


I think the issue may be related to the following dmesg output.

[   12.971239] NVRM: loading NVIDIA UNIX Open Kernel Module for aarch64  TempVersion  Release Build  (bugfix_main)  (buildbrain@5ce570c8-aabd-435b-a438-99bd3943cffd-jf1d-52rgf)  Wed Sep 10 12:34:47 PDT 2025
[   12.998690] nvidia-modeset: Loading NVIDIA UNIX Open Kernel Mode Setting Driver for aarch64  TempVersion  Release Build  (bugfix_main)  (buildbrain@5ce570c8-aabd-435b-a438-99bd3943cffd-jf1d-52rgf)  Wed Sep 10 12:34:37 PDT 2025
[   13.001728] [drm] [nvidia-drm] [GPU ID 0x00000100] Loading driver
[   13.969949] [drm] Initialized nvidia-drm 0.0.0 20160202 for 0000:01:00.0 on minor 1
[   13.969962] nvidia 0000:01:00.0: [drm] No compatible format found
[   13.969977] nvidia 0000:01:00.0: [drm] Cannot find any crtc or sizes
[   13.970007] [drm] [nvidia-drm] [GPU ID 0x00020000] Loading driver
[   14.225121] nvidia-modeset: WARNING: HW supports 8 heads. Limiting to 4 heads
[   14.225562] nvidia-modeset: WARNING: GPU:0: The head configuration (0xff) is inconsistent with the number of heads (4)
[   14.373014] [drm] Initialized nvidia-drm 0.0.0 20160202 for 8808c00000.display on minor 2
[   14.373022] [drm] [nvidia-drm] [GPU ID 0x00020000] Invalid framebuffer console info
[   14.495643] NVRM: nvAssertFailedNoLog: Assertion failed: minRequiredIsoBandwidthKBPS <= clientBwValues[DISPLAY_ICC_BW_CLIENT_EXT].minRequiredIsoBandwidthKBPS @ kern_disp_0402.c:111
[   14.544363] Console: switching to colour frame buffer device 430x90
[   14.570797] nv_platform 8808c00000.display: [drm] fb0: nvidia-drmdrmfb frame buffer device
[   15.244822] kauditd_printk_skb: 170 callbacks suppressed
[   15.244826] audit: type=1326 audit(1758328867.152:176): auid=4294967295 uid=0 gid=0 ses=4294967295 subj=snap.cups.cupsd pid=2243 comm="cupsd" exe="/snap/cups/1102/sbin/cupsd" sig=0 arch=c00000b7 syscall=55 compat=0 ip=0xffff8988910c code=0x50000
[   15.257874] audit: type=1326 audit(1758328867.164:177): auid=4294967295 uid=0 gid=0 ses=4294967295 subj=snap.cups.cups-browsed pid=2526 comm="cups-browsed" exe="/snap/cups/1102/sbin/cups-browsed" sig=0 arch=c00000b7 syscall=274 compat=0 ip=0xffff9e231c28 code=0x50000
[   15.486709] NVRM: rpcRmApiControl_dce: NVRM_RPC_DCE: Failed RM ctrl call cmd:0x731341 result 0xffff: Failure: Generic Error [NV_ERR_GENERIC]
[   15.771212] tegra-i2c 810c630000.i2c: I2C transfer timed out
[   16.633148] r8126: enP2p1s0: link up
[   17.915513] Bluetooth: RFCOMM TTY layer initialized
[   17.915525] Bluetooth: RFCOMM socket layer initialized
[   17.915530] Bluetooth: RFCOMM ver 1.11
[   18.291903] rfkill: input handler disabled
[   23.864348] tegra-mc 8108020000.memory-controller: dispr: non-secure read @0x0000fffffffffa00: EMEM address decode error (EMEM decode error)
[   23.864387] tegra-mc 8108020000.memory-controller: ptcr: @0x0000000000000000: Read response with poison bit error status:0
[   23.874153] tegra-mc 8108020000.memory-controller: dispr: non-secure read @0x0000fffffffffe00: EMEM address decode error (EMEM decode error)
[   23.886725] tegra-mc 8108020000.memory-controller: dispr: non-secure read @0x0000fffffffff600: EMEM address decode error (EMEM decode error)
[   23.899295] tegra-mc 8108020000.memory-controller: dispr: non-secure read @0x0000fffffffffa00: EMEM address decode error (EMEM decode error)
[   23.911867] tegra-mc 8108020000.memory-controller: dispr: non-secure read @0x0000fffffffffc00: EMEM address decode error (EMEM decode error)
[   23.924452] tegra-mc 8108020000.memory-controller: ptcr: @0x0000000000000000: Read response with poison bit error status:0
[   23.935637] arm-smmu-v3 8806000000.iommu: EVTQ overflow detected -- events lost
[   23.942962] arm-smmu-v3 8806000000.iommu: event 0x10 received:
[   23.942963] arm-smmu-v3 8806000000.iommu: 	0x0000090000000010
[   23.942964] arm-smmu-v3 8806000000.iommu: 	0x0000020800000000
[   23.942965] arm-smmu-v3 8806000000.iommu: 	0x0000007ffa53ea00
[   23.942966] arm-smmu-v3 8806000000.iommu: 	0x0000000000000000

I spoke too soon. I am still getting input delays in Firefox, and still some delays in a terminal, though fewer.

I’ve run nvstart-weston.sh on Orin, and it cleanly implemented Weston for that logon session.
I ran it earlier today on Thor, and it ruined the display and locked the computer. I had to reset power to the Thor to get a response.

Xorg.0.log (23.0 KB)

dmesg.txt (132.4 KB)


Hi team, just adding that more people are experiencing the same symptoms and errors. I even tried without any monitor, mouse, or keyboard attached, and the issue still remains (less often, but still).

~$ grep 'NVRM: nvAssertFailed: Assertion failed: 0 @ g_kern_bus_nvoc.h:2706' /var/log/kern.log | wc -l
1713

Today I’ve got a new, different batch of errors:

NVRM: rpcRmApiControl_dce: NVRM_RPC_DCE: Failed RM ctrl call cmd:0x731341 result 0xffff: Failure: Generic Error [NV_ERR_GENERIC]
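
To compare how often each of these errors shows up, the grep above can be generalized to several signatures at once. A minimal sketch, assuming the log is a readable text file. The function name count_signatures is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count lines in LOG that contain each fixed-string signature.
count_signatures() {
  local log="$1"; shift
  local sig
  for sig in "$@"; do
    # -F matches literally (the signatures contain regex metacharacters such as '.' and '[').
    # -c prints the number of matching lines; grep exits 1 on zero matches, hence '|| true'.
    printf '%s\t%s\n' "$(grep -F -c -- "$sig" "$log" || true)" "$sig"
  done
}
```

Usage would look like count_signatures /var/log/kern.log 'g_kern_bus_nvoc.h:2706' 'NVRM_RPC_DCE: Failed RM ctrl call', which prints one tab-separated count per signature.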

dmesg2.txt (129.7 KB)

The problem remains. The things I did to try to fix the "freezing" did not entirely fix the problem.

Hi,

Just to clarify: could someone share the scenario that reproduces this issue?

I need more detail than how you avoid the error by using Weston.

I understand that you want to avoid this problem, but what I want to check here is why this error happens on your side and not on mine.

  1. Are you using an HDMI or DP monitor? Or does it reproduce on both?
  2. How long does it take for the issue to appear? For example, will I see it immediately after the first boot?
  3. Does it need an extra application running to reproduce? For example, Firefox?
  4. Does the issue improve if you run jetson_clocks? (Just as an experiment.)

DisplayPort monitor.

Immediately after the first boot. Two hours ago I re-flashed the Thor from the USB drive installation media. During OEM setup it immediately started freezing (stuttering, pausing for a couple of seconds) about 9 times a minute. The first time, I flashed from sdkmanager.

Nothing was running beyond system processes and a terminal at that time.

jetson_clocks does not improve it.

After that I ran apt update and apt install nvidia-jetpack nvidia-jetpack-dev. When I rebooted, it showed the NVIDIA boot screen, and then there were no graphics, no prompt, and no activity indicating that the Thor was live. I can ssh into it, and that works fine with no freezing.

dmesg.txt (129.8 KB)

Xlog.txt (20.7 KB)

As someone previously noted, running the following in a terminal stops the freezing.

nvidia-smi dmon

I re-flashed my Thor again this morning with sdkmanager r38.2.1. I’ve attached some files that might be helpful. They were acquired either at first boot of my fully flashed Thor or shortly thereafter. Nothing additional had been installed other than nano and python3-pip. No browser at that time.

xrandr-tegrastats.txt (23.7 KB)

Xorg.0.log (27.9 KB)

lsmod.txt (7.3 KB)

journal-2025-09-22-after-flash.log (198.3 KB)


We are checking this internally.

May have found a clue to Thor freezing.

I just enabled Wi-Fi for the first time. I still get momentary freezes, but their duration is significantly shorter. If I turn off Wi-Fi, the longer freezes return.

But now my dmesg is flooded with the following every second:

[Tue Sep 23 09:35:40 2025] nvethernet a808e10000.ethernet: [xpcs_lane_bring_up][827][type:0x4][loga-0x0] PCS block lock SUCCESS
[Tue Sep 23 09:35:41 2025] nvethernet a808e10000.ethernet: [xpcs_lane_bring_up][827][type:0x4][loga-0x0] PCS block lock SUCCESS
[Tue Sep 23 09:35:42 2025] nvethernet a808e10000.ethernet: [xpcs_lane_bring_up][827][type:0x4][loga-0x0] PCS block lock SUCCESS
[Tue Sep 23 09:35:43 2025] nvethernet a808e10000.ethernet: [xpcs_lane_bring_up][827][type:0x4][loga-0x0] PCS block lock SUCCESS
[Tue Sep 23 09:35:44 2025] nvethernet a808e10000.ethernet: [xpcs_lane_bring_up][827][type:0x4][loga-0x0] PCS block lock SUCCESS
[Tue Sep 23 09:35:45 2025] nvethernet a808e10000.ethernet: [xpcs_lane_bring_up][827][type:0x4][loga-0x0] PCS block lock SUCCESS
[Tue Sep 23 09:35:46 2025] nvethernet a808e10000.ethernet: [xpcs_lane_bring_up][827][type:0x4][loga-0x0] PCS block lock SUCCESS
[Tue Sep 23 09:35:47 2025] nvethernet a808e10000.ethernet: [xpcs_lane_bring_up][827][type:0x4][loga-0x0] PCS block lock SUCCESS
[Tue Sep 23 09:35:48 2025] nvethernet a808e10000.ethernet: [xpcs_lane_bring_up][827][type:0x4][loga-0x0] PCS block lock SUCCESS
[Tue Sep 23 09:35:49 2025] nvethernet a808e10000.ethernet: [xpcs_lane_bring_up][827][type:0x4][loga-0x0] PCS block lock SUCCESS
[Tue Sep 23 09:35:50 2025] nvethernet a808e10000.ethernet: [xpcs_lane_bring_up][827][type:0x4][loga-0x0] PCS block lock SUCCESS
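
To put a number on "every second", the timestamps can be bucketed per minute. This is a minimal sketch that assumes input in the human-readable timestamp format shown above (as produced by dmesg -T). The function name per_minute is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `dmesg -T` style lines on stdin and print how many
# lines containing PATTERN were logged in each minute.
per_minute() {
  local pattern="$1"
  awk -v pat="$pattern" '
    # Only consider lines that contain the pattern and start with "[Day".
    index($0, pat) && $1 ~ /^\[[A-Z][a-z][a-z]$/ {
      key = $2 " " $3 " " substr($4, 1, 5)   # e.g. "Sep 23 09:35"
      if (!(key in n)) order[++k] = key      # remember first-seen order
      n[key]++
    }
    END { for (i = 1; i <= k; i++) print order[i], n[order[i]] }
  '
}
```

It could be run as sudo dmesg -T | per_minute xpcs_lane_bring_up. A steady count near 60 per minute would match the once-per-second flood described here.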

Hi,

Do you have MGBE connected on the QSFP port of your board?

No. Just the RJ45 2.5G Ethernet.

It’s a little weird that MGBE spews so many logs when the issue happens.

Are you sure this happens every time? I mean, do the MGBE logs appear with every stutter?

BTW, could you record a video of the stutter you saw? I need to confirm whether we are hitting the same issue.

After running "nvidia-smi dmon" and enabling Wi-Fi, Thor stutters somewhat less often and for a shorter time. I tried to record a video of the webpage stutters, and just tried again. Starting the video recording is yet another process that reduces the number and duration of the stutters, and the second attempt gave the same result. After closing the recording application I am back to the earlier, more significant pauses/stutters.

Here’s a short video showing the cursor stopping blinking. It is the best I can get now. If needed I’ll flash Thor again to get a better representation of what happens.

The xpcs_lane_bring_up dmesg message every second only started after enabling Wi-Fi yesterday, and it is still occurring.

[Tue Sep 23 18:55:47 2025] nvethernet a808e10000.ethernet: [xpcs_lane_bring_up][827][type:0x4][loga-0x0] PCS block lock SUCCESS