Some issues I found trying to start weston automatically in a "kiosk" mode using systemd

Hi guys,

The issues described below come from an analysis I did on an Orin Nano Super devkit.

What I was trying to achieve is to start weston automatically in a “kiosk” mode during system boot (i.e. without the need to log in).

I ran into a few issues that made this more difficult than it should be, and I thought I should report these here in case this can be useful to others, and maybe receive a proper fix from NVIDIA.

To get started I managed, with little effort, to start weston by logging in through SSH and running it manually, after making sure to run “modprobe nvidia_drm” first.

The problems started when I tried to automate this process using systemd.

This involved creating a weston.service file that would autologin under a kiosk user and run weston. (As a starting point I borrowed code from Yocto, which can be found in the weston-init recipe.)

It also involved making sure that the nvidia_drm module was modprobe’d during the boot process prior to starting weston.

What I noticed is the following:

  • If the nvidia_drm module was modprobe’d during the boot process (either through a dependency on modprobe@nvidia_drm.service or an addition in /etc/modules-load.d/):
    • in this case weston would fail to start…
  • If I did NOT modprobe the nvidia_drm module during the boot process and then manually ran “modprobe nvidia_drm”, and started the service manually using “systemctl start weston.service”:
    • in this case weston would start successfully…

That led me to believe that “modprobe nvidia_drm” had different outcomes depending on when it was run.

And indeed this is what the content of the /dev/dri directory looks like when running modprobe manually:

laurent@ubuntu:~$ ls -lR /dev/dri
/dev/dri:
total 0
drwxr-xr-x 2 root root        120 Oct 25 23:56 by-path
crw-rw---- 1 root video  226,   0 Jun  4 16:17 card0
crw-rw---- 1 root video  226,   1 Oct 25 23:56 card1
crw-rw---- 1 root render 226, 128 Jun  4 16:17 renderD128
crw-rw---- 1 root render 226, 129 Oct 25 23:56 renderD129

/dev/dri/by-path:
total 0
lrwxrwxrwx 1 root root  8 Oct 25 23:56 platform-13800000.display-card -> ../card1
lrwxrwxrwx 1 root root 13 Oct 25 23:56 platform-13800000.display-render -> ../renderD129
lrwxrwxrwx 1 root root  8 Jun  4 16:17 platform-13e00000.host1x-card -> ../card0
lrwxrwxrwx 1 root root 13 Jun  4 16:17 platform-13e00000.host1x-render -> ../renderD128
laurent@ubuntu:~$

And this is what it looks like when modprobe is run automatically during boot:

laurent@ubuntu:~$ ls -lR /dev/dri
/dev/dri:
total 0
drwxr-xr-x 2 root root        120 Jun  4 16:17 by-path
crw-rw---- 1 root video  226,   0 Jun  4 16:17 card0
crw-rw---- 1 root video  226,   1 Jun  4 16:17 card1
crw-rw---- 1 root render 226, 128 Jun  4 16:17 renderD128
crw-rw---- 1 root render 226, 129 Jun  4 16:17 renderD129

/dev/dri/by-path:
total 0
lrwxrwxrwx 1 root root  8 Jun  4 16:17 platform-13800000.display-card -> ../card0
lrwxrwxrwx 1 root root 13 Jun  4 16:17 platform-13800000.display-render -> ../renderD128
lrwxrwxrwx 1 root root  8 Jun  4 16:17 platform-13e00000.host1x-card -> ../card1
lrwxrwxrwx 1 root root 13 Jun  4 16:17 platform-13e00000.host1x-render -> ../renderD129
laurent@ubuntu:~$

Notice how the card and render “character devices” that map to the actual hardware are swapped.
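To check which device node a given piece of hardware ended up on without reading the listing by eye, the by-path symlinks can be resolved programmatically. This is a minimal sketch, assuming the /dev/dri/by-path layout shown in the listings above; the function names resolve_by_path and render_node_for are hypothetical and only serve the illustration.

```python
import os


def resolve_by_path(dri_dir="/dev/dri"):
    """Return {symlink name: device node name} for dri_dir/by-path."""
    by_path = os.path.join(dri_dir, "by-path")
    mapping = {}
    for name in sorted(os.listdir(by_path)):
        link = os.path.join(by_path, name)
        if os.path.islink(link):
            # Links are relative (e.g. "../card1"); keep only the node name.
            mapping[name] = os.path.basename(os.readlink(link))
    return mapping


def render_node_for(platform, dri_dir="/dev/dri"):
    """Return the renderD* node name for a platform device, or None."""
    return resolve_by_path(dri_dir).get(f"platform-{platform}-render")


if __name__ == "__main__":
    # Only prints something on a machine that actually has DRM devices.
    if os.path.isdir("/dev/dri/by-path"):
        for name, node in resolve_by_path().items():
            print(f"{name} -> {node}")
```

Calling render_node_for("13800000.display") on the devkit should then return the render node of the display hardware, whichever number it was given at that boot.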

This resulted in weston failing as follows:

weston-0.txt (1.4 KB)

After quite some research, and coming up with nothing, I made a bet and managed to overcome the problem in gl-renderer.so by swapping the renderD128 and renderD129 “character devices” under “/dev/dri/”.

And this solved the problem of starting weston. Here is a successful log:

weston-1.txt (5.3 KB)

In conclusion, it appears that gl-renderer.so will always try to use /dev/dri/renderD129. I found no obvious way to influence this other than swapping the two corresponding character devices.

REQUEST #1:

  • I would hope that there is a way for gl-renderer.so to select the proper render device, but my assumption is that this might require a change to the NVIDIA weston package.
  • That would avoid having to use such a ridiculous hack…

REQUEST #2:

  • In addition to this I would like to request that the NVIDIA weston package include the kiosk-shell.so, in the future, as it would be the simplest way for me to support my kiosk use-case.
  • The alternative to the kiosk-shell is probably to use ivi-shell, but as you probably know this is not especially lightweight to take on.

REQUEST #3:

  • It would also be nice to have the systemd-notify.so module included, but that’s definitely a “nice to have”.

Kind Regards, Laurent.

*** Please note that this reply is generated by LLM automatically ***

The issue described is related to starting Weston in a kiosk mode during system boot on a Jetson Orin Nano Super devkit. The user encountered problems when trying to automate the process using systemd.

The main issue is that the nvidia_drm module, when modprobed during the boot process, causes Weston to fail to start. However, if the module is not modprobed during boot and is instead modprobed manually before starting the service, Weston starts successfully.

Further investigation revealed that the nvidia_drm module has different outcomes depending on when it is run. Specifically, the character devices mapping to the actual hardware are inverted when the module is modprobed during boot.

To overcome this problem, the user swapped the /dev/dri/card0 and /dev/dri/renderD128 character devices under /dev/dri, which solved the issue of starting Weston.

The user’s conclusion is that the nvidia_drm module will always try to use /dev/dri/card0 and that there is no obvious way to influence this other than swapping the two corresponding character devices.

Comparison of manual and automated modprobe

                  | Manual Modprobe | Automated Modprobe
Outcome           | Weston starts successfully | Weston fails to start
Character Devices | /dev/dri/card0 and /dev/dri/renderD128 are correctly mapped | /dev/dri/card0 and /dev/dri/renderD128 are inverted

Recommendations

  1. The user hopes that there is a way for the nvidia_drm module to select the proper render device, which would avoid the need for swapping the character devices. This might require a change on the NVIDIA Weston package.
  2. The user requests that the NVIDIA Weston package include the kiosk-shell in the future, which would be the simplest way to support their kiosk use case.
  3. The user also suggests including the systemd-notify.so module, which is a nice-to-have feature.

Systemd Service File

The user created a weston.service file to automate the process of starting Weston. The file is based on code from Yocto and involves creating a service that autologins under a user and runs Weston.

Modprobe

The nvidia_drm module is modprobed during the boot process, either through a dependency on modprobe@nvidia_drm.service or an addition in /etc/modules-load.d/. However, this causes Weston to fail to start.

Workaround

The user swapped the /dev/dri/card0 and /dev/dri/renderD128 character devices under /dev/dri, which solved the issue of starting Weston.

Logs

The user provided logs from successful and failed attempts to start Weston. The logs show that the character devices are inverted when the nvidia_drm module is modprobed during boot.

In summary, the issue is related to the nvidia_drm module and its interaction with Weston. The user found a workaround by swapping the character devices, but hopes that a more elegant solution can be found in the future.


Something is not right in this summary about the device names that are swapped between the two ways of doing the modprobe:

  • /dev/dri/renderD128 and /dev/dri/renderD129 are swapped.
  • In a similar way /dev/dri/card0 and /dev/dri/card1 are swapped.

Not card0 and renderD128 as it has been described in the summary.

To work around the problem I just had to swap renderD128 and renderD129 (by renaming the files).
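The rename-based swap can be sketched as follows. This is only an illustration of the workaround described above, not a recommended fix: it assumes root privileges when run against /dev/dri, the three renames are not atomic, and the by-path symlinks are left untouched, so after the swap they name the other device. The function name swap_nodes and the ".swap" temporary suffix are hypothetical.

```python
import os


def swap_nodes(dri_dir, a, b):
    """Swap two entries in dri_dir by renaming them through a temporary name."""
    path_a = os.path.join(dri_dir, a)
    path_b = os.path.join(dri_dir, b)
    tmp = path_a + ".swap"  # hypothetical temporary name, must not already exist
    os.rename(path_a, tmp)     # a   -> tmp
    os.rename(path_b, path_a)  # b   -> a
    os.rename(tmp, path_b)     # tmp -> b
```

On the devkit the call would be swap_nodes("/dev/dri", "renderD128", "renderD129"), run after nvidia_drm is loaded and before weston starts.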

Also the key issue is the faulty behaviour of the gl-renderer.so renderer, not necessarily the nvidia_drm module.

Cheers, Laurent.

Hi,
We would suggest following the developer guide to enable Weston:

Weston (Wayland) — NVIDIA Jetson Linux Developer Guide

This is tested and supposed to work well. You can enable Weston by running nvstart-weston.sh at system bootup.

Hi @DaneLLL ,

I used that documentation as a starting point.

I ran into these issues when I tried to automate the process, which led me to share my findings.

This is part of an effort to make software behave like a product with no need to perform any steps from a shell prompt.

Kind Regards, Laurent.

After further analysis (looking at the weston 13.0.x source code), and based on the error messages I got, it seems possible that the issue comes from the NVIDIA-specific gbm module:

/usr/lib/aarch64-linux-gnu/gbm/nvidia-drm_gbm.so

I would say that the symptom is that this code seems, under some conditions, to try and use the renderDXXX character device associated with platform-13e00000.host1x-render instead of always picking the one associated with platform-13800000.display-render.

This cannot be analyzed further from my side as I do not seem to have access to the corresponding source code :-).

PS:

  • It does however seem that using an API such as drmGetDevice() might make it possible to select the renderDXXX device which is associated with the card (but again, with no access to the source code, I am not sure how it is done right now).
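The idea of pairing a card with its own render node can be illustrated without libdrm by reading sysfs. This is a rough sketch and not what nvidia-drm_gbm.so does (its source is not available): it assumes the usual Linux layout in which /sys/class/drm/<card>/device/drm/ lists every DRM node of the same underlying device, which has not been verified on the Jetson here. The function name render_node_for_card is hypothetical.

```python
import os


def render_node_for_card(card, sysfs_drm="/sys/class/drm"):
    """Return the renderD* node sharing a device with `card`, or None."""
    # Assumed layout: <sysfs_drm>/<card>/device/drm/ holds the sibling nodes,
    # e.g. "card1" and "renderD129" for the same hardware device.
    siblings_dir = os.path.join(sysfs_drm, card, "device", "drm")
    try:
        entries = os.listdir(siblings_dir)
    except FileNotFoundError:
        return None
    for name in sorted(entries):
        if name.startswith("renderD"):
            return name
    return None
```

Under that assumption, render_node_for_card("card1") would return the render node belonging to the same device as card1, regardless of the order in which the drivers were loaded.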