One of my TX1s has started to show error messages during boot, and I am having trouble finding the cause. They always look like this:
May 15 09:50:09 seavision-2 systemd[1]: Reached target Sound Card.
May 15 09:50:09 seavision-2 systemd[1]: dev-disk-by\x2dpartuuid-00000000\x2d0001\x2d0000\x2d6708\x2dbd5b00000000.device: Dev dev-disk-by\x2dpartuuid-00000000\x2d0001\x2d0000\x2d6708\x2dbd5b00000000.device appeared twice with different sysfs paths /sys/devices/sdhci-tegra.3/mmc_host/mmc0/mmc0:0001/block/mmcblk0/mmcblk0p15 and /sys/devices/sdhci-tegra.3/mmc_host/mmc0/mmc0:0001/block/mmcblk0/mmcblk0p14
May 15 09:50:10 seavision-2 systemd[1]: dev-disk-by\x2dpartuuid-00000000\x2d0001\x2d0000\x2d6708\x2dbd5b00000000.device: Dev dev-disk-by\x2dpartuuid-00000000\x2d0001\x2d0000\x2d6708\x2dbd5b00000000.device appeared twice with different sysfs paths /sys/devices/sdhci-tegra.3/mmc_host/mmc0/mmc0:0001/block/mmcblk0/mmcblk0p15 and /sys/devices/sdhci-tegra.3/mmc_host/mmc0/mmc0:0001/block/mmcblk0/mmcblk0p18
May 15 09:50:10 seavision-2 systemd[1]: dev-disk-by\x2dpartuuid-00000000\x2d0001\x2d0000\x2d6708\x2dbd5b00000000.device: Dev dev-disk-by\x2dpartuuid-00000000\x2d0001\x2d0000\x2d6708\x2dbd5b00000000.device appeared twice with different sysfs paths /sys/devices/sdhci-tegra.3/mmc_host/mmc0/mmc0:0001/block/mmcblk0/mmcblk0p15 and /sys/devices/sdhci-tegra.3/mmc_host/mmc0/mmc0:0001/block/mmcblk0/mmcblk0p19
May 15 09:50:10 seavision-2 systemd[1]: dev-disk-by\x2dpartuuid-00000000\x2d0001\x2d0000\x2d6708\x2dbd5b00000000.device: Dev dev-disk-by\x2dpartuuid-00000000\x2d0001\x2d0000\x2d6708\x2dbd5b00000000.device appeared twice with different sysfs paths /sys/devices/sdhci-tegra.3/mmc_host/mmc0/mmc0:0001/block/mmcblk0/mmcblk0p15 and /sys/devices/sdhci-tegra.3/mmc_host/mmc0/mmc0:0001/block/mmcblk0/mmcblk0p2
May 15 09:50:10 seavision-2 systemd[1]: dev-disk-by\x2dpartuuid-00000000\x2d0001\x2d0000\x2d6708\x2dbd5b00000000.device: Dev dev-disk-by\x2dpartuuid-00000000\x2d0001\x2d0000\x2d6708\x2dbd5b00000000.device appeared twice with different sysfs paths /sys/devices/sdhci-tegra.3/mmc_host/mmc0/mmc0:0001/block/mmcblk0/mmcblk0p15 and /sys/devices/sdhci-tegra.3/mmc_host/mmc0/mmc0:0001/block/mmcblk0/mmcblk0p13
May 15 09:50:10 seavision-2 systemd-udevd[287]: Process '/bin/rm /var/lib/alsa/asound.state' failed with exit code 1.
May 15 09:50:10 seavision-2 systemd[1]: dev-disk-by\x2dpartuuid-00000000\x2d0001\x2d0000\x2d6708\x2dbd5b00000000.device: Dev dev-disk-by\x2dpartuuid-00000000\x2d0001\x2d0000\x2d6708\x2dbd5b00000000.device appeared twice with different sysfs paths /sys/devices/sdhci-tegra.3/mmc_host/mmc0/mmc0:0001/block/mmcblk0/mmcblk0p15 and /sys/devices/sdhci-tegra.3/mmc_host/mmc0/mmc0:0001/block/mmcblk0/mmcblk0p10
May 15 09:50:10 seavision-2 systemd[1]: dev-disk-by\x2dpartuuid-00000000\x2d0001\x2d0000\x2d6708\x2dbd5b00000000.device: Dev dev-disk-by\x2dpartuuid-00000000\x2d0001\x2d0000\x2d6708\x2dbd5b00000000.device appeared twice with different sysfs paths /sys/devices/sdhci-tegra.3/mmc_host/mmc0/mmc0:0001/block/mmcblk0/mmcblk0p15 and /sys/devices/sdhci-tegra.3/mmc_host/mmc0/mmc0:0001/block/mmcblk0/mmcblk0p9
May 15 09:50:10 seavision-2 systemd-udevd[279]: Could not generate persistent MAC address for dummy0: No such file or directory
May 15 09:50:10 seavision-2 systemd-udevd[282]: Could not generate persistent MAC address for ip6tnl0: No such file or directory
May 15 09:50:10 seavision-2 kernel: dhd_module_init in
May 15 09:50:10 seavision-2 systemd[1]: dev-disk-by\x2dpartuuid-00000000\x2d0001\x2d0000\x2d6708\x2dbd5b00000000.device: Dev dev-disk-by\x2dpartuuid-00000000\x2d0001\x2d0000\x2d6708\x2dbd5b00000000.device appeared twice with different sysfs paths /sys/devices/sdhci-tegra.3/mmc_host/mmc0/mmc0:0001/block/mmcblk0/mmcblk0p15 and /sys/devices/sdhci-tegra.3/mmc_host/mmc0/mmc0:0001/block/mmcblk0/mmcblk0p8
The only difference between boots is which eMMC partition is reported as the cause. So far I have seen problems with mmcblk0p15, mmcblk0p17 and mmcblk0p19.
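To compare boots, it may help to pull the partition names out of the "appeared twice" messages rather than reading the long lines by eye. The following is only a minimal sketch: the function name colliding_partitions and the shortened sample text (with "x.device" standing in for the long escaped unit name) are made up for illustration, and only the sysfs paths are taken from the log above.

```python
import re
from collections import Counter

# Matches the tail of systemd's duplicate-device message and captures both sysfs paths.
DUPLICATE_RE = re.compile(r"appeared twice with different sysfs paths (\S+) and (\S+)")


def colliding_partitions(log_text):
    """Return (first, second) partition-name pairs, one per duplicate-device message."""
    pairs = []
    for match in DUPLICATE_RE.finditer(log_text):
        # Keep only the last path component, e.g. "mmcblk0p15".
        first, second = (path.rsplit("/", 1)[-1] for path in match.groups())
        pairs.append((first, second))
    return pairs


# Shortened sample: "x.device" is a placeholder for the real escaped unit name.
BLOCK = "/sys/devices/sdhci-tegra.3/mmc_host/mmc0/mmc0:0001/block/mmcblk0/"
SAMPLE_LOG = "\n".join(
    f"systemd[1]: Dev x.device appeared twice with different sysfs paths "
    f"{BLOCK}mmcblk0p15 and {BLOCK}{other}"
    for other in ("mmcblk0p14", "mmcblk0p18", "mmcblk0p2")
)

pairs = colliding_partitions(SAMPLE_LOG)
# Which partition is reported first (the one that changes between boots)?
first_seen = Counter(first for first, _ in pairs)
print(first_seen)
print(sorted(second for _, second in pairs))
```

On a real system, the text passed to colliding_partitions could come from a saved copy of the boot log instead of the sample string.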
The system in question is a TX1 on an Auvidea J120 running the standard L4T 28.2 kernel in a very simple rootfs (basically an ubuntu-base).
I have never seen this problem before on a Jetson, and all the other Jetsons we are running (using the same system) don’t report any problems. It is entirely possible that something went very wrong with that particular system, but I am a bit clueless right now about where to start looking.
May 15 09:50:10 seavision-2 systemd-udevd[279]: Could not generate persistent MAC address for dummy0: No such file or directory
May 15 09:50:10 seavision-2 systemd-udevd[282]: Could not generate persistent MAC address for ip6tnl0: No such file or directory
Device special files are not real files, but are instead the result of a running driver. Apparently the driver related to this is gone. Because systemd is looking for a MAC address, this implies networking is gone (you can’t run networking setup without network drivers).
Is it correct that you are experimenting with partitions? In R28.x and newer, much of the boot content is signed, and content that is unsigned or has an incorrect signature is rejected. It isn’t possible to know what is going on without knowing specific details of exactly what is being changed in the partitions.
The network is working fine. I have neither a dummy driver nor the ip6 tunnel device running (I haven’t seen one of those on a Jetson for ages).
My problem is with the partitions. I really don’t do anything with them. What I do is use a custom rootfs inside an L4T 28.2 distribution, then run apply_binaries.sh and flash.sh to flash the Jetson. So the partition table etc. is as provided by L4T.
What I’m pointing out is that kernel drivers (even if unrelated) are apparently missing. Missing kernel features or device tree content caused those drivers to fail. dummy0 and ip6tnl0 are both virtual, and thus there is probably no device tree content related to them…which leaves kernel drivers as missing or misconfigured. I always have to wonder if the base kernel/module setup is installed correctly (it is hard to figure out what goes on if modules are missing…how do you debug a system with part of the kernel missing?).
The error is essentially a complaint about multiple copies in sysfs. Sysfs is itself a reflection in RAM created by various kernel components, e.g., drivers…so I am back to wondering whether something invalid is going on in the kernel config.
However, what do you see from:
sudo gdisk -l /dev/mmcblk0
# And:
lsblk -f
A default TX1 dev kit under R28.2 would show this for gdisk:
You are absolutely right with regard to the kernel and the device drivers. Since I don’t use the dummy driver and the IPv6 tunnel, I never noticed that something could be off. Maybe I am a bit too lax when it comes to the kernel complaining about stuff I don’t use; it might be a relic from my old 2.0.x kernel development times…
A quick check on that front shows that IPv6 and the dummy driver are both compiled directly into the kernel.
# zgrep CONFIG_DUMMY /proc/config.gz
# CONFIG_DUMMY_IRQ is not set
CONFIG_DUMMY=y
CONFIG_DUMMY_CONSOLE=y
CONFIG_DUMMY_CONSOLE_COLUMNS=80
CONFIG_DUMMY_CONSOLE_ROWS=25
and
zgrep IPV6 /proc/config.gz
CONFIG_IPV6=y
CONFIG_IPV6_ROUTER_PREF=y
CONFIG_IPV6_ROUTE_INFO=y
CONFIG_IPV6_OPTIMISTIC_DAD=y
CONFIG_IPV6_MIP6=y
# CONFIG_IPV6_ILA is not set
# CONFIG_IPV6_VTI is not set
CONFIG_IPV6_SIT=y
# CONFIG_IPV6_SIT_6RD is not set
CONFIG_IPV6_NDISC_NODETYPE=y
CONFIG_IPV6_TUNNEL=y
# CONFIG_IPV6_GRE is not set
CONFIG_IPV6_MULTIPLE_TABLES=y
# CONFIG_IPV6_SUBTREES is not set
# CONFIG_IPV6_MROUTE is not set
# CONFIG_IP_VS_IPV6 is not set
CONFIG_NF_DEFRAG_IPV6=y
CONFIG_NF_CONNTRACK_IPV6=y
# CONFIG_NF_DUP_IPV6 is not set
CONFIG_NF_REJECT_IPV6=y
# CONFIG_NF_LOG_IPV6 is not set
# CONFIG_NF_NAT_IPV6 is not set
# CONFIG_IP6_NF_MATCH_IPV6HEADER is not set
And both are built not as modules but directly into the kernel. So I have to dig deeper into why udevd doesn’t find the driver.
With regard to the partition problem, here are the two outputs:
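For checking many options at once, the built-in versus module versus unset distinction can be read mechanically from the config text. This is just a rough sketch: parse_kernel_config is a hypothetical helper name, and the sample lines are copied from the zgrep output above.

```python
import re


def parse_kernel_config(text):
    """Map each option to its value: 'y' (built in), 'm' (module),
    another literal value, or 'n' for a '# ... is not set' line."""
    options = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        unset = re.fullmatch(r"# (CONFIG_\w+) is not set", line)
        if unset:
            options[unset.group(1)] = "n"
        elif line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
    return options


# A few lines taken from the zgrep output shown above.
SAMPLE_CONFIG = """\
# CONFIG_DUMMY_IRQ is not set
CONFIG_DUMMY=y
CONFIG_DUMMY_CONSOLE_COLUMNS=80
CONFIG_IPV6=y
CONFIG_IPV6_TUNNEL=y
# CONFIG_IPV6_GRE is not set
"""

config = parse_kernel_config(SAMPLE_CONFIG)
for name in ("CONFIG_DUMMY", "CONFIG_IPV6_TUNNEL", "CONFIG_IPV6_GRE"):
    print(name, config.get(name, "missing"))
```

On the target itself, the same function could be fed the decompressed contents of /proc/config.gz (for example via Python's gzip.open in text mode) instead of the sample string.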
# gdisk -l /dev/mmcblk0
GPT fdisk (gdisk) version 1.0.1
Partition table scan:
MBR: protective
BSD: not present
APM: not present
GPT: present
Found valid GPT with protective MBR; using GPT.
Disk /dev/mmcblk0: 30777344 sectors, 14.7 GiB
Logical sector size: 512 bytes
Disk identifier (GUID): 00000000-0000-0000-0000-000000000000
Partition table holds up to 19 entries
First usable sector is 34, last usable sector is 30777311
Partitions will be aligned on 2-sector boundaries
Total free space is 1 sectors (512 bytes)
Number Start (sector) End (sector) Size Code Name
1 34 29360161 14.0 GiB 0700 APP
2 29360162 29364257 2.0 MiB 0700 TBC
3 29364258 29372449 4.0 MiB 0700 EBT
4 29372450 29376545 2.0 MiB 0700 BPF
5 29376546 29388833 6.0 MiB 0700 WB0
6 29388834 29397025 4.0 MiB 0700 RP1
7 29397026 29409313 6.0 MiB 0700 TOS
8 29409314 29413409 2.0 MiB 0700 EKS
9 29413410 29417505 2.0 MiB 0700 FX
10 29417506 29679649 128.0 MiB 0700 BMP
11 29679650 29720609 20.0 MiB 0700 SOS
12 29720610 29851681 64.0 MiB 0700 EXI
13 29851682 29982753 64.0 MiB 0700 LNX
14 29982754 29990945 4.0 MiB 0700 DTB
15 29990946 29995041 2.0 MiB 0700 NXT
16 29995042 30007329 6.0 MiB 0700 MXB
17 30007330 30019617 6.0 MiB 0700 MXP
18 30019618 30023713 2.0 MiB 0700 USP
19 30023714 30777310 368.0 MiB 0700 UDA
# lsblk -f
NAME FSTYPE LABEL UUID MOUNTPOINT
sda
└─sda1 ext4 b36f216f-ec68-4ff7-bfcc-bc2ad1861019 /XXXXX
mmcblk0rpmb
mmcblk0
├─mmcblk0p1 ext4 1b572b3d-0658-4ec1-8df3-9b449537e01f /
├─mmcblk0p2
├─mmcblk0p3
├─mmcblk0p4
├─mmcblk0p5
├─mmcblk0p6
├─mmcblk0p7
├─mmcblk0p8
├─mmcblk0p9
├─mmcblk0p10
├─mmcblk0p11
├─mmcblk0p12
├─mmcblk0p13
├─mmcblk0p14
├─mmcblk0p15
├─mmcblk0p16
├─mmcblk0p17
├─mmcblk0p18
└─mmcblk0p19
Now I wonder a bit where the small differences in the table come from, since I really didn’t change anything in that regard. It is pure L4T 28.2 for a TX1. So why is my APP partition 200MB smaller and consequently the user data partition at the end those 200MB larger? My flash environment is on an external SSD I can’t access right now, so I will need to look into that tomorrow…
I have not examined what kernel features are required for dummy0, but if the features you listed are indeed the requirements for the kernel side of dummy0, then it implies something in the boot environment itself is missing (something set up by systemd/init steps, since there is no related hardware and thus it is unlikely anything in the device tree got in the way…but you never know, there might be an inheritance from a step which was a hardware setup step).
On the other hand, the “DUMMY” configs found tend to imply some sort of console, and perhaps this is unrelated to networking dummy devices. I don’t know, since I haven’t actually compared the kernel config items to what is required for the failed networking items.
When a driver does not find its hardware, this can sometimes be due to missing or incorrect firmware. The firmware essentially changes the nature of the hardware, and changing a driver API could imply the need to change to a matching new firmware. Not all hardware uses firmware, but wireless networking does more often than not (those who don’t use firmware must create new hardware to support different regulations throughout the different political regions of the world…else they can only sell the hardware in one location…this makes wireless firmware quite popular).
The size difference of APP is because during my command line flash I specifically set the maximum possible APP size. When I flashed I added “-S 14580MiB”, whereas I think the default is “-S 14GiB”. That is “14580*1024*1024” versus “14*1024*1024*1024” (the byte difference is 255852544 bytes, or 244MiB). The important thing is the size of the non-rootfs partitions…these are the ones used for boot. These other partitions appear to match in size.
Something the partition list does not show is whether the partitions’ signatures are valid. Boot content does need to be signed, but so long as those other partitions were not manipulated (which would change the signature), the APP (rootfs) partition size changes should not be an issue.
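The arithmetic can be double-checked with a few lines of Python. This is only an illustrative sketch: the two "-S" values are the ones mentioned above, and the APP start/end sectors and 512-byte sector size are taken from the gdisk listing earlier in the thread.

```python
MIB = 1024 * 1024
GIB = 1024 * MIB

# Sizes requested at flash time: "-S 14580MiB" versus "-S 14GiB".
larger_app = 14580 * MIB
default_app = 14 * GIB

difference = larger_app - default_app
print(difference, "bytes =", difference // MIB, "MiB")

# Cross-check against the posted gdisk table: APP spans sectors 34..29360161.
SECTOR_SIZE = 512
app_bytes = (29360161 - 34 + 1) * SECTOR_SIZE
print("APP in posted table:", app_bytes / GIB, "GiB")
```

The second calculation suggests the posted table's APP partition is exactly 14 GiB, which would be consistent with a flash that did not override the size.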
Was there any manipulation or change to any of the non-APP partitions?