Cannot use l4t_initrd_flash with Xavier NX on CTI Boson Carrier

We are using a Xavier NX with Connect Tech Boson carrier board.

We have 100% reproducible success when using the flash.sh command and Connect Tech’s BSP.

  • The flash command was bash -x ./flash.sh cti/xavier-nx/boson/base mmcblk0p1
  • A representative successful log is attached as flash.log (998.1 KB)
  • The flashed operating system has been verified to be fully functional as designed.

However, we wish to boot from an attached 256G NVMe.

We have verified that /dev/nvme0n1 exists and is fully functional, even in the initrd of the above.

We understand that flashing our Xavier NX such that:

  • the device boots from QSPI flash, and then
  • uses the kernel and APP partitions from NVMe

requires the use of l4t_initrd_flash.sh.

This is where we began having issues.

To simplify and debug our setup, we decided to try using l4t_initrd_flash.sh to flash the internal eMMC, just like the above, whereupon we have an issue similar but not identical to this one.

Using the command:

bash -x ./tools/kernel_flash/l4t_initrd_flash.sh cti/xavier-nx/boson/base internal

we get the following error:

Waiting for target to boot-up...
Waiting for target to boot-up...
Waiting for target to boot-up...
Waiting for target to boot-up...
Waiting for target to boot-up...
Waiting for device to expose ssh ......RTNETLINK answers: File exists
RTNETLINK answers: File exists
Run command: if [ -f /qspi/l4t_flash_from_kernel.sh ]; then USER=root /qspi/l4t_flash_from_kernel.sh --no-reboot --qspi-only ; fi on root@fe80::1%enp0s20f0u1
4194304
[ 0]: l4t_flash_from_kernel: Starting to create gpt for emmc
Active index file is /home/andrew/Local/nvidia/nvidia-linux-for-tegra/nvidia-l4t-board-support-packages/tools/kernel_flash/images/internal/flash.idx
Number of lines is 68
max_index=67
writing item=43, 1:3:primary_gpt, 512, 19968, gpt_primary_1_3.bin, 16896, fixed-<reserved>-0, 3f7965c0734eee8bc96e67a11ec7994845d0f642
Writing primary_gpt partition with gpt_primary_1_3.bin
Offset is not aligned to K Bytes, no optimization is applied
dd if=/home/andrew/Local/nvidia/nvidia-linux-for-tegra/nvidia-l4t-board-support-packages/tools/kernel_flash/images/internal/gpt_primary_1_3.bin of=/dev/sdc bs=1 skip=0  seek=512 count=16896
16896+0 records in
16896+0 records out
16896 bytes (17 kB, 16 KiB) copied, 0.0107505 s, 1.6 MB/s
Writing primary_gpt partition done
Writing secondary_gpt partition with gpt_secondary_1_3.bin
Offset is not aligned to K Bytes, no optimization is applied
dd if=/home/andrew/Local/nvidia/nvidia-linux-for-tegra/nvidia-l4t-board-support-packages/tools/kernel_flash/images/internal/gpt_secondary_1_3.bin of=/dev/sdc bs=1 skip=0  seek=15757983232 count=16896
16896+0 records in
16896+0 records out
16896 bytes (17 kB, 16 KiB) copied, 0.0169274 s, 998 kB/s
Writing secondary_gpt partition done
[ 1]: l4t_flash_from_kernel: Successfully create gpt for emmc
Run command: partprobe on root@fe80::1%enp0s20f0u1
SSH is not ready
Error flashing non-qspi storage
Error flashing qspi
Cleaning up...

Notes

  • On the host, the udisks2.service is stopped, as recommended
  • On the host, USB network devices have been configured to use stable path-based names
  • The sshd server is reachable at root@fe80::1%enp0s20f0u1
    • This is confirmed by ping, ping6, and ssh -v root@...

On the device:

bash-5.0# ip -c addr
1: lo: <LOOPBACK> mtu 65536 qdisc noop state DOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: dummy0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether f6:f8:18:13:37:a0 brd ff:ff:ff:ff:ff:ff
3: eth0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 00:0c:8b:b5:15:d0 brd ff:ff:ff:ff:ff:ff
4: eth1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 48:b0:2d:67:4e:a8 brd ff:ff:ff:ff:ff:ff
5: rndis0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether e6:ac:93:2f:a0:44 brd ff:ff:ff:ff:ff:ff
    inet6 fc00:1:1::2/64 scope global 
       valid_lft forever preferred_lft forever
    inet6 fe80::e4ac:93ff:fe2f:a044/64 scope link 
       valid_lft forever preferred_lft forever
    inet6 fe80::1/128 scope link 
       valid_lft forever preferred_lft forever

I have a serial console on the device. From the console I know that

  • sshd is running, and
  • the contents of /tmp/sshd.log are:
bash-5.0# cat /tmp/sshd.log
Server listening on :: port 22.
Server listening on 0.0.0.0 port 22.
Connection closed by authenticating user root fe80::2%rndis0 port 55928 [preauth]
Connection closed by authenticating user root fe80::2%rndis0 port 44518 [preauth]
Connection closed by authenticating user root fe80::2%rndis0 port 44524 [preauth]
Connection closed by authenticating user root fe80::2%rndis0 port 44536 [preauth]
Connection closed by authenticating user root fe80::2%rndis0 port 44552 [preauth]
Connection closed by authenticating user root fe80::2%rndis0 port 44560 [preauth]
Connection closed by authenticating user root fe80::2%rndis0 port 44570 [preauth]
Connection closed by authenticating user root fe80::2%rndis0 port 44580 [preauth]
Connection closed by authenticating user root fe80::2%rndis0 port 44588 [preauth]
Connection closed by authenticating user root fe80::2%rndis0 port 44594 [preauth]
Connection closed by authenticating user root fe80::2%rndis0 port 44596 [preauth]
Connection closed by fc00:1:1::1 port 45834 [preauth]
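
To see at a glance how many pre-authentication disconnects came from each peer, a log like the one above can be tallied. This is only a minimal sketch, assuming the OpenSSH message format shown above and that awk and sort are available wherever it is run; the function name count_preauth_closures is made up for illustration.

```shell
#!/usr/bin/env bash
# Count sshd "Connection closed ... [preauth]" lines per peer address.
# Reads an sshd log on stdin and prints "<count> <peer>", highest count first.
# Hypothetical helper, not part of the L4T tools.
count_preauth_closures() {
    awk '
        /^Connection closed by/ && /\[preauth\]/ {
            # The peer address is the word immediately before "port".
            for (i = 2; i <= NF; i++) {
                if ($i == "port") { peers[$(i - 1)]++; break }
            }
        }
        END { for (p in peers) print peers[p], p }
    ' | sort -k1,1nr -k2,2
}
```

Feeding it the log above (for example, count_preauth_closures < /tmp/sshd.log) should report 11 closures from fe80::2%rndis0 and 1 from fc00:1:1::1.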

My hypothesis is that the ssh timeouts might be too short, although that is just a guess.

I tried changing the timeouts in:

  • tools/kernel_flash/l4t_network_flash.func and
  • tools/kernel_flash/l4t_initrd_flash_internal.sh,

but my timeout modifications appeared to be unused, so perhaps these are not the correct files to modify.

So, to summarize:

  • bash -x ./flash.sh cti/xavier-nx/boson/base mmcblk0p1 works
  • bash -x ./tools/kernel_flash/l4t_initrd_flash.sh cti/xavier-nx/boson/base internal fails, reporting:
    • l4t_flash_from_kernel: Successfully create gpt for emmc (success)
    • Run command: partprobe on root@fe80::1%enp0s20f0u1 -> SSH is not ready, but
    • ssh root@fe80::1%enp0s20f0u1 from the host succeeds
    • the full log is flash-initrd.log (187.8 KB)

Can you try using:

bash -x ./tools/kernel_flash/l4t_initrd_flash.sh --network usb0 cti/xavier-nx/boson/base internal

Haha! I actually tried that and forgot to mention it! :-)

Invoking with

bash -x ./tools/kernel_flash/l4t_initrd_flash.sh --network usb0 cti/xavier-nx/boson/base internal

gives a slightly different failure. We do not get l4t_flash_from_kernel: Successful.... Instead:

***************************************
*                                     *
*  Step 3: Start the flashing process *
*                                     *
***************************************
Waiting for target to boot-up...
Waiting for target to boot-up...
Waiting for target to boot-up...
Waiting for target to boot-up...
Waiting for target to boot-up...
Waiting for target to boot-up...
Waiting for target to boot-up...
Waiting for target to boot-up...
Waiting for target to boot-up...
Waiting for target to boot-up...
Waiting for target to boot-up...
Waiting for target to boot-up...
Waiting for target to boot-up...
Waiting for target to boot-up...
Waiting for target to boot-up...
Waiting for target to boot-up...
Waiting for target to boot-up...
Waiting for target to boot-up...
Waiting for target to boot-up...
Waiting for target to boot-up...
Waiting for device to expose ssh ......RTNETLINK answers: File exists
RTNETLINK answers: File exists
Waiting for device to expose ssh ...Run command: flash on fc00:1:1:0::2
SSH is not ready
Cleaning up...

And from the device console:

bash-5.0# cat /tmp/sshd.log
Server listening on :: port 22.
Server listening on 0.0.0.0 port 22.
Connection closed by authenticating user root fc00:1:1::1 port 55116 [preauth]
Connection closed by authenticating user root fc00:1:1::1 port 55130 [preauth]
Connection closed by authenticating user root fc00:1:1::1 port 55132 [preauth]
Connection closed by authenticating user root fc00:1:1::1 port 55138 [preauth]
Connection closed by authenticating user root fc00:1:1::1 port 55144 [preauth]
Connection closed by authenticating user root fc00:1:1::1 port 55150 [preauth]
Connection closed by authenticating user root fc00:1:1::1 port 44474 [preauth]
Connection closed by authenticating user root fc00:1:1::1 port 44482 [preauth]
Connection closed by authenticating user root fc00:1:1::1 port 44484 [preauth]
Connection closed by authenticating user root fc00:1:1::1 port 44498 [preauth]

bash-5.0# ip -c addr
1: lo: <LOOPBACK> mtu 65536 qdisc noop state DOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: dummy0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether ee:84:a9:2a:77:df brd ff:ff:ff:ff:ff:ff
3: eth0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 00:0c:8b:b5:15:d0 brd ff:ff:ff:ff:ff:ff
4: eth1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 48:b0:2d:67:4e:a8 brd ff:ff:ff:ff:ff:ff
5: rndis0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 42:75:2e:15:f0:04 brd ff:ff:ff:ff:ff:ff
    inet6 fc00:1:1::2/64 scope global 
       valid_lft forever preferred_lft forever
    inet6 fe80::4075:2eff:fe15:f004/64 scope link 
       valid_lft forever preferred_lft forever
    inet6 fe80::1/128 scope link 
       valid_lft forever preferred_lft forever

Once again, from the host ssh -v root@fc00:1:1:0::2 does connect.

The full flashing log: flash-emmc-usb0.log (188.9 KB)

Can you change the maxcount variable in this function in tools/kernel_flash/l4t_network_flash.func?

run_flash_commmand_on_target()
{
...
fi
count=0
maxcount=10
while ! sshpass -p root ssh "root@${1}" "${SSH_OPT[@]}" "echo SSH ready"
do
	count=$((count + 1))
	if [ "${count}" -ge "${maxcount}" ]; then
		echo "SSH is not ready"
		exit 1
	fi
	sleep 1
done
if [ -e "${L4T_INITRD_FLASH_DIR}/bin/aarch64/simg2img" ]; then
	cp "${L4T_INITRD_FLASH_DIR}/bin/aarch64/simg2img" "${NFS_IMAGES_DIR}"
fi
if ! sshpass -p root ssh "root@${1}" "${SSH_OPT[@]}" "NFS_ROOTFS_DIR=\"${NFS_ROOTFS_DIR}\" NFS_IMAGES_DIR=\"${NFS_IMAGES_DIR}\" /bin/${FLASH_FROM_NETWORK_SCRIPT} ${cmd[*]}"; then
	echo "Flash failure"
	exit 1
fi
if ! sshpass -p root ssh "root@${1}" "${SSH_OPT[@]}" "nohup reboot &>/dev/null & exit"; then
	echo "Reboot failure"
	exit 1
fi

export LC_ALL="${OLD_LC_ALL}" LANG="${OLD_LANG}" LANGUAGE="${OLD_LANGUAGE}"
}

So I increased the timeout as suggested, but no luck.

Adding an nmap -6 -p 22 --host-timeout 0.5 fc00:1:1::2 to the sshpass loop shows that the ssh port is reachable almost immediately, but sshpass still fails.
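
Since the port is open but the loop still gives up, it can help to log why each attempt fails rather than only that it fails. The following is a minimal sketch of a retry wrapper that reports the exit status of every failed attempt; retry_with_status is a hypothetical helper, not something shipped in l4t_network_flash.func.

```shell
#!/usr/bin/env bash
# Run a command up to max_tries times, sleeping delay seconds between
# attempts, and print the exit status of every failed attempt on stderr.
# Returns 0 on the first success, otherwise the status of the last attempt.
# Hypothetical helper for debugging; not part of the L4T tools.
retry_with_status() {
    local max_tries="$1" delay="$2" attempt rc=1
    shift 2
    for (( attempt = 1; attempt <= max_tries; attempt++ )); do
        rc=0
        # "|| rc=$?" captures the failure status without tripping "set -e".
        "$@" || rc=$?
        if [ "${rc}" -eq 0 ]; then
            return 0
        fi
        echo "attempt ${attempt}/${max_tries} failed with exit status ${rc}" >&2
        if [ "${attempt}" -lt "${max_tries}" ]; then
            sleep "${delay}"
        fi
    done
    return "${rc}"
}
```

Inside the function it could wrap the existing probe, along the lines of retry_with_status 10 1 sshpass -p root ssh "root@${1}" "${SSH_OPT[@]}" "echo SSH ready". The printed status is then the main clue, because the -q in SSH_OPT suppresses most of ssh's own diagnostics.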

Further debugging has shown the cause.

The sshpass -p root ssh "root@${1}" ... commands always fail with the following return code:

6      Host public key is unknown. sshpass exits without confirming the new key.
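
For reference while debugging, the exit statuses documented in the sshpass(1) man page can be turned into readable messages. This is a minimal sketch; the function name describe_sshpass_rc is invented here, and statuses outside 0 to 6 are reported generically because they may come from ssh itself or from a newer sshpass release.

```shell
#!/usr/bin/env bash
# Translate an sshpass exit status into the wording of the sshpass(1)
# man page. Hypothetical helper for debugging; not part of the L4T tools.
describe_sshpass_rc() {
    case "$1" in
        0) echo "success" ;;
        1) echo "invalid command line argument" ;;
        2) echo "conflicting arguments given" ;;
        3) echo "general runtime error" ;;
        4) echo "unrecognized response from ssh (parse error)" ;;
        5) echo "invalid/incorrect password" ;;
        6) echo "host public key is unknown" ;;
        # Anything else may be ssh's own exit status (for example 255)
        # or a code added by a later sshpass version.
        *) echo "other status $1 (possibly from ssh itself)" ;;
    esac
}
```

Calling describe_sshpass_rc "$?" immediately after a failing sshpass command prints the matching description, which is "host public key is unknown" for the status seen here.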

I have tried adding

local ssh_opts=( -o StrictHostKeyChecking=no -o UserKnownHostsFile=$(mktemp) )

and

sshpass -p root ssh "${ssh_opts[@]}" "root@${1}"  ...

to all sshpass instances, but with no luck so far.

@lhoang, I’ve been investigating and the results are curiouser and curiouser!

First, I modified the run_flash_commmand_on_target function to use ssh-keyscan to connect to the device and record its ssh host keys, and then to use that output as the user’s “known hosts” file:

run_flash_commmand_on_target()
{
	set -x
	echo "Run command: flash on ${1}"
	local OLD_LC_ALL="${LC_ALL}"
	local OLD_LANG="${LANG}"
	local OLD_LANGUAGE="${LANGUAGE}"
	export LC_ALL="" LANG="en_US.UTF-8" LANGUAGE=""
	local cmd=()

	local hosts_file=$(mktemp)
	SSH_OPT_EXTRA=( -oUserKnownHostsFile="${hosts_file}" -vvv )

	if [ -n "${target_partname}" ]; then
		cmd+=("-k" "${target_partname}")
	fi

	if [ -n "${external_only}" ]; then
		cmd+=("${external_only}")
	fi
	count=0
	maxcount=10
	echo ">>> DEBUG 'root@${1}' <<<"
	while ! ssh-keyscan -T 1 "${1}" > "${hosts_file}"
	do
		echo -n ">>> nmap ${1} port 22 <<< " ; nmap -6 -p 22 --host-timeout 0.5 "${1}" | grep ssh
		count=$((count + 1))
		if [ "${count}" -ge "${maxcount}" ]; then
			echo "SSH is not ready"
			exit 1
		fi
		sleep 1
	done
	echo ">>> DEBUG contents of ${hosts_file} <<<"
	cat "${hosts_file}"
	if [ -e "${L4T_INITRD_FLASH_DIR}/bin/aarch64/simg2img" ]; then
		cp "${L4T_INITRD_FLASH_DIR}/bin/aarch64/simg2img" "${NFS_IMAGES_DIR}"
	fi
	if ! sshpass -p root ssh "root@${1}" "${SSH_OPT[@]}" "${SSH_OPT_EXTRA[@]}" "NFS_ROOTFS_DIR=\"${NFS_ROOTFS_DIR}\" NFS_IMAGES_DIR=\"${NFS_IMAGES_DIR}\" /bin/${FLASH_FROM_NETWORK_SCRIPT} ${cmd[*]}"; then
		echo "Flash failure"
		exit 1
	fi
	if ! sshpass -p root ssh "root@${1}" "${SSH_OPT[@]}" "${SSH_OPT_EXTRA[@]}" "nohup reboot &>/dev/null & exit"; then
		echo "Reboot failure"
		exit 1
	fi

	export LC_ALL="${OLD_LC_ALL}" LANG="${OLD_LANG}" LANGUAGE="${OLD_LANGUAGE}"
	set +x
}

This works as expected and lets us move to the next phase.

At this point, we are correctly talking to the sshd of the recovery device, but cannot log in with the password root.

Waiting for target to boot-up...
Waiting for target to boot-up...
Waiting for device to expose ssh ......RTNETLINK answers: File exists
RTNETLINK answers: File exists
Waiting for device to expose ssh ...+ echo 'Run command: flash on fc00:1:1:0::2'
Run command: flash on fc00:1:1:0::2
+ local OLD_LC_ALL=
+ local OLD_LANG=en_US.UTF-8
+ local OLD_LANGUAGE=
+ export LC_ALL= LANG=en_US.UTF-8 LANGUAGE=
+ LC_ALL=
+ LANG=en_US.UTF-8
+ LANGUAGE=
+ cmd=()
+ local cmd
++ mktemp
+ local hosts_file=/tmp/tmp.hyEIV2hQX6
+ SSH_OPT_EXTRA=(-oUserKnownHostsFile="${hosts_file}" -vvv)
+ '[' -n '' ']'
+ '[' -n '' ']'
+ count=0
+ maxcount=10
+ echo '>>> DEBUG '\''root@fc00:1:1:0::2'\'' <<<'
>>> DEBUG 'root@fc00:1:1:0::2' <<<
+ ssh-keyscan -T 1 fc00:1:1:0::2
# fc00:1:1:0::2:22 SSH-2.0-OpenSSH_8.2p1 Ubuntu-4ubuntu0.7
# fc00:1:1:0::2:22 SSH-2.0-OpenSSH_8.2p1 Ubuntu-4ubuntu0.7
# fc00:1:1:0::2:22 SSH-2.0-OpenSSH_8.2p1 Ubuntu-4ubuntu0.7
# fc00:1:1:0::2:22 SSH-2.0-OpenSSH_8.2p1 Ubuntu-4ubuntu0.7
# fc00:1:1:0::2:22 SSH-2.0-OpenSSH_8.2p1 Ubuntu-4ubuntu0.7
+ echo '>>> DEBUG contents of /tmp/tmp.hyEIV2hQX6 <<<'
>>> DEBUG contents of /tmp/tmp.hyEIV2hQX6 <<<
+ cat /tmp/tmp.hyEIV2hQX6
fc00:1:1:0::2 ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBOmh+T7E7zuPe5bARIVjEQxkj4WEunDc2CqnHRhbsD5R5VI9mNdUAWM5h0MgibjX+wVhNrEUalZ0i4ahTVGfCqo=
fc00:1:1:0::2 ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABgQDU8OdDzisuMr1HH4XyB61gC14+RjfSKq1qcfnPt5bcWXr1N+zGtmNU6u3t+CJZ1pg9Ah4oaWF6O24koh0Anc3EBnHe7xR37EmhgA8XbvzAdM1DKin92iDiJDYvoL9MswNRl8yecJkoj+WJVmzaTfJMkCWuqIJf9cONu78tJXNqhsWTO1pQfKgkTX1bkNklIjgwEs79BeMKFLKT0egLS4JeIsuNbHiYeSn2gmOOAntxuCBKLiru/mIlPjtAvezJbdjmJ6R9CHXxEwxktOShywnx2Xi2FPVd/aa2dNqT/KIgAvdX9UWZBvGxla7LWEuj665zltRixHZbhTT4Anh3qemFeCMsOCtfQ//E9GkAkwFTCIaZ4B8LjbKayx1s543u20nmWI6D2kSXSDLScxmlqpp0mMTj5CUxRM66djyRseHEUQ/w5rUY6exARGksUTyMPZ0z0Au/FvyMufQPuxb19JMTKO3QTbUjsRUVnU8EUdqjG+m/23ye6oGY4XcRr5ZdjC0=
fc00:1:1:0::2 ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIJx0jsQRgq/tRO5jXZ+1RMzUgP3pWtpLPK26UuWiDdLz
+ '[' -e /home/andrew/Local/nvidia/nvidia-linux-for-tegra/nvidia-l4t-board-support-packages/tools/kernel_flash/bin/aarch64/simg2img ']'
+ cp /home/andrew/Local/nvidia/nvidia-linux-for-tegra/nvidia-l4t-board-support-packages/tools/kernel_flash/bin/aarch64/simg2img /home/andrew/Local/nvidia/nvidia-linux-for-tegra/nvidia-l4t-board-support-packages/tools/kernel_flash/images
+ sshpass -p root ssh root@fc00:1:1:0::2 -q -oServerAliveInterval=90 -oServerAliveCountMax=3 -oStrictHostKeyChecking=no -oUserKnownHostsFile=/dev/null -oUserKnownHostsFile=/tmp/tmp.hyEIV2hQX6 -vvv 'NFS_ROOTFS_DIR="/home/andrew/Local/nvidia/nvidia-linux-for-tegra/nvidia-l4t-board-support-packages/rootfs" NFS_IMAGES_DIR="/home/andrew/Local/nvidia/nvidia-linux-for-tegra/nvidia-l4t-board-support-packages/tools/kernel_flash/images" /bin/nv_flash_from_network.sh '
OpenSSH_8.9p1 Ubuntu-3ubuntu0.1, OpenSSL 3.0.2 15 Mar 2022
debug1: Reading configuration data /etc/ssh/ssh_config
debug1: /etc/ssh/ssh_config line 19: include /etc/ssh/ssh_config.d/*.conf matched no files
debug1: /etc/ssh/ssh_config line 21: Applying options for *
debug2: resolve_canonicalize: hostname fc00:1:1:0::2 is address
debug2: resolve_canonicalize: canonicalised address "fc00:1:1:0::2" => "fc00:1:1::2"
debug3: ssh_connect_direct: entering
debug1: Connecting to fc00:1:1::2 [fc00:1:1::2] port 22.
debug3: set_sock_tos: set socket 3 IPV6_TCLASS 0x10
debug1: Connection established.
debug1: identity file /root/.ssh/id_rsa type -1
debug1: identity file /root/.ssh/id_rsa-cert type -1
debug1: identity file /root/.ssh/id_ecdsa type -1
debug1: identity file /root/.ssh/id_ecdsa-cert type -1
debug1: identity file /root/.ssh/id_ecdsa_sk type -1
debug1: identity file /root/.ssh/id_ecdsa_sk-cert type -1
debug1: identity file /root/.ssh/id_ed25519 type -1
debug1: identity file /root/.ssh/id_ed25519-cert type -1
debug1: identity file /root/.ssh/id_ed25519_sk type -1
debug1: identity file /root/.ssh/id_ed25519_sk-cert type -1
debug1: identity file /root/.ssh/id_xmss type -1
debug1: identity file /root/.ssh/id_xmss-cert type -1
debug1: identity file /root/.ssh/id_dsa type -1
debug1: identity file /root/.ssh/id_dsa-cert type -1
debug1: Local version string SSH-2.0-OpenSSH_8.9p1 Ubuntu-3ubuntu0.1
debug1: Remote protocol version 2.0, remote software version OpenSSH_8.2p1 Ubuntu-4ubuntu0.7
debug1: compat_banner: match: OpenSSH_8.2p1 Ubuntu-4ubuntu0.7 pat OpenSSH* compat 0x04000000
debug2: fd 3 setting O_NONBLOCK
debug1: Authenticating to fc00:1:1::2:22 as 'root'
debug1: load_hostkeys: fopen /etc/ssh/ssh_known_hosts: No such file or directory
debug1: load_hostkeys: fopen /etc/ssh/ssh_known_hosts2: No such file or directory
debug3: order_hostkeyalgs: no algorithms matched; accept original
debug3: send packet: type 20
debug1: SSH2_MSG_KEXINIT sent
debug3: receive packet: type 20
debug1: SSH2_MSG_KEXINIT received
debug2: local client KEXINIT proposal
debug2: KEX algorithms: curve25519-sha256,curve25519-sha256@libssh.org,ecdh-sha2-nistp256,ecdh-sha2-nistp384,ecdh-sha2-nistp521,sntrup761x25519-sha512@openssh.com,diffie-hellman-group-exchange-sha256,diffie-hellman-group16-sha512,diffie-hellman-group18-sha512,diffie-hellman-group14-sha256,ext-info-c
debug2: host key algorithms: ssh-ed25519-cert-v01@openssh.com,ecdsa-sha2-nistp256-cert-v01@openssh.com,ecdsa-sha2-nistp384-cert-v01@openssh.com,ecdsa-sha2-nistp521-cert-v01@openssh.com,sk-ssh-ed25519-cert-v01@openssh.com,sk-ecdsa-sha2-nistp256-cert-v01@openssh.com,rsa-sha2-512-cert-v01@openssh.com,rsa-sha2-256-cert-v01@openssh.com,ssh-ed25519,ecdsa-sha2-nistp256,ecdsa-sha2-nistp384,ecdsa-sha2-nistp521,sk-ssh-ed25519@openssh.com,sk-ecdsa-sha2-nistp256@openssh.com,rsa-sha2-512,rsa-sha2-256
debug2: ciphers ctos: chacha20-poly1305@openssh.com,aes128-ctr,aes192-ctr,aes256-ctr,aes128-gcm@openssh.com,aes256-gcm@openssh.com
debug2: ciphers stoc: chacha20-poly1305@openssh.com,aes128-ctr,aes192-ctr,aes256-ctr,aes128-gcm@openssh.com,aes256-gcm@openssh.com
debug2: MACs ctos: umac-64-etm@openssh.com,umac-128-etm@openssh.com,hmac-sha2-256-etm@openssh.com,hmac-sha2-512-etm@openssh.com,hmac-sha1-etm@openssh.com,umac-64@openssh.com,umac-128@openssh.com,hmac-sha2-256,hmac-sha2-512,hmac-sha1
debug2: MACs stoc: umac-64-etm@openssh.com,umac-128-etm@openssh.com,hmac-sha2-256-etm@openssh.com,hmac-sha2-512-etm@openssh.com,hmac-sha1-etm@openssh.com,umac-64@openssh.com,umac-128@openssh.com,hmac-sha2-256,hmac-sha2-512,hmac-sha1
debug2: compression ctos: none,zlib@openssh.com,zlib
debug2: compression stoc: none,zlib@openssh.com,zlib
debug2: languages ctos: 
debug2: languages stoc: 
debug2: first_kex_follows 0 
debug2: reserved 0 
debug2: peer server KEXINIT proposal
debug2: KEX algorithms: curve25519-sha256,curve25519-sha256@libssh.org,ecdh-sha2-nistp256,ecdh-sha2-nistp384,ecdh-sha2-nistp521,diffie-hellman-group-exchange-sha256,diffie-hellman-group16-sha512,diffie-hellman-group18-sha512,diffie-hellman-group14-sha256
debug2: host key algorithms: rsa-sha2-512,rsa-sha2-256,ssh-rsa,ecdsa-sha2-nistp256,ssh-ed25519
debug2: ciphers ctos: chacha20-poly1305@openssh.com,aes128-ctr,aes192-ctr,aes256-ctr,aes128-gcm@openssh.com,aes256-gcm@openssh.com
debug2: ciphers stoc: chacha20-poly1305@openssh.com,aes128-ctr,aes192-ctr,aes256-ctr,aes128-gcm@openssh.com,aes256-gcm@openssh.com
debug2: MACs ctos: umac-64-etm@openssh.com,umac-128-etm@openssh.com,hmac-sha2-256-etm@openssh.com,hmac-sha2-512-etm@openssh.com,hmac-sha1-etm@openssh.com,umac-64@openssh.com,umac-128@openssh.com,hmac-sha2-256,hmac-sha2-512,hmac-sha1
debug2: MACs stoc: umac-64-etm@openssh.com,umac-128-etm@openssh.com,hmac-sha2-256-etm@openssh.com,hmac-sha2-512-etm@openssh.com,hmac-sha1-etm@openssh.com,umac-64@openssh.com,umac-128@openssh.com,hmac-sha2-256,hmac-sha2-512,hmac-sha1
debug2: compression ctos: none,zlib@openssh.com
debug2: compression stoc: none,zlib@openssh.com
debug2: languages ctos: 
debug2: languages stoc: 
debug2: first_kex_follows 0 
debug2: reserved 0 
debug1: kex: algorithm: curve25519-sha256
debug1: kex: host key algorithm: ssh-ed25519
debug1: kex: server->client cipher: chacha20-poly1305@openssh.com MAC: <implicit> compression: none
debug1: kex: client->server cipher: chacha20-poly1305@openssh.com MAC: <implicit> compression: none
debug3: send packet: type 30
debug1: expecting SSH2_MSG_KEX_ECDH_REPLY
debug3: receive packet: type 31
debug1: SSH2_MSG_KEX_ECDH_REPLY received
debug1: Server host key: ssh-ed25519 SHA256:cqcP2ox6TZSW4RC+mdKWZzjAVqUPkaCbVIRzufHiQos
debug1: load_hostkeys: fopen /etc/ssh/ssh_known_hosts: No such file or directory
debug1: load_hostkeys: fopen /etc/ssh/ssh_known_hosts2: No such file or directory
Warning: Permanently added 'fc00:1:1::2' (ED25519) to the list of known hosts.
debug3: send packet: type 21
debug2: ssh_set_newkeys: mode 1
debug1: rekey out after 134217728 blocks
debug1: SSH2_MSG_NEWKEYS sent
debug1: expecting SSH2_MSG_NEWKEYS
debug3: receive packet: type 21
debug1: SSH2_MSG_NEWKEYS received
debug2: ssh_set_newkeys: mode 0
debug1: rekey in after 134217728 blocks
debug1: Will attempt key: /root/.ssh/id_rsa 
debug1: Will attempt key: /root/.ssh/id_ecdsa 
debug1: Will attempt key: /root/.ssh/id_ecdsa_sk 
debug1: Will attempt key: /root/.ssh/id_ed25519 
debug1: Will attempt key: /root/.ssh/id_ed25519_sk 
debug1: Will attempt key: /root/.ssh/id_xmss 
debug1: Will attempt key: /root/.ssh/id_dsa 
debug2: pubkey_prepare: done
debug3: send packet: type 5
debug3: receive packet: type 7
debug1: SSH2_MSG_EXT_INFO received
debug1: kex_input_ext_info: server-sig-algs=<ssh-ed25519,sk-ssh-ed25519@openssh.com,ssh-rsa,rsa-sha2-256,rsa-sha2-512,ssh-dss,ecdsa-sha2-nistp256,ecdsa-sha2-nistp384,ecdsa-sha2-nistp521,sk-ecdsa-sha2-nistp256@openssh.com>
debug3: receive packet: type 6
debug2: service_accept: ssh-userauth
debug1: SSH2_MSG_SERVICE_ACCEPT received
debug3: send packet: type 50
debug3: receive packet: type 51
debug1: Authentications that can continue: 
debug3: start over, passed a different list gssapi-with-mic,publickey,keyboard-interactive,password
debug3: preferred gssapi-with-mic,publickey,keyboard-interactive,password
debug3: authmethod_lookup gssapi-with-mic
debug3: remaining preferred: publickey,keyboard-interactive,password
debug3: authmethod_is_enabled gssapi-with-mic
debug1: Next authentication method: gssapi-with-mic
debug1: No credentials were supplied, or the credentials were unavailable or inaccessible
No Kerberos credentials available (default cache: FILE:/tmp/krb5cc_0)


debug1: No credentials were supplied, or the credentials were unavailable or inaccessible
No Kerberos credentials available (default cache: FILE:/tmp/krb5cc_0)


debug2: we did not send a packet, disable method
debug3: authmethod_lookup publickey
debug3: remaining preferred: keyboard-interactive,password
debug3: authmethod_is_enabled publickey
debug1: Next authentication method: publickey
debug1: Trying private key: /root/.ssh/id_rsa
debug3: no such identity: /root/.ssh/id_rsa: No such file or directory
debug1: Trying private key: /root/.ssh/id_ecdsa
debug3: no such identity: /root/.ssh/id_ecdsa: No such file or directory
debug1: Trying private key: /root/.ssh/id_ecdsa_sk
debug3: no such identity: /root/.ssh/id_ecdsa_sk: No such file or directory
debug1: Trying private key: /root/.ssh/id_ed25519
debug3: no such identity: /root/.ssh/id_ed25519: No such file or directory
debug1: Trying private key: /root/.ssh/id_ed25519_sk
debug3: no such identity: /root/.ssh/id_ed25519_sk: No such file or directory
debug1: Trying private key: /root/.ssh/id_xmss
debug3: no such identity: /root/.ssh/id_xmss: No such file or directory
debug1: Trying private key: /root/.ssh/id_dsa
debug3: no such identity: /root/.ssh/id_dsa: No such file or directory
debug2: we did not send a packet, disable method
debug3: authmethod_lookup keyboard-interactive
debug3: remaining preferred: password
debug3: authmethod_is_enabled keyboard-interactive
debug1: Next authentication method: keyboard-interactive
debug2: userauth_kbdint
debug3: send packet: type 50
debug2: we sent a keyboard-interactive packet, wait for reply
debug3: receive packet: type 51
debug1: Authentications that can continue: 
debug3: userauth_kbdint: disable: no info_req_seen
debug2: we did not send a packet, disable method
debug3: authmethod_lookup password
debug3: remaining preferred: 
debug3: authmethod_is_enabled password
debug1: Next authentication method: password
debug3: send packet: type 50
debug2: we sent a password packet, wait for reply
debug3: receive packet: type 51
debug1: Authentications that can continue: 
Permission denied, please try again.
+ echo 'Flash failure'
Flash failure

The full log is here: flash-emmc-ssh-keyscan.log (197.3 KB)

Now here is the funny part.

I can verify the password for root is root on the device via the console:

bash-5.0# cat /etc/shadow
root:$6$qYzNFHlg$M4RG6AtkTS3kj1/Al2WqoRvUxWL9mFRjUadC74qSBhgWtkRjtVtiZJpJvAaG4DEHnJKSnOVwX4fHCt0A7vtXR/:18192:0:99999:7:::
sshd:*:17669:0:99999:7:::

and

$ mkpasswd -m sha-512 -S 'qYzNFHlg' root
$6$qYzNFHlg$M4RG6AtkTS3kj1/Al2WqoRvUxWL9mFRjUadC74qSBhgWtkRjtVtiZJpJvAaG4DEHnJKSnOVwX4fHCt0A7vtXR/

and I have verified that this is the same password in the ota_tools directory.
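
The check above can be scripted so that the salt is read from the shadow entry instead of being copied by hand. This is a minimal sketch, assuming the $<id>$<salt>$<digest> crypt(3) layout seen in the entry above; shadow_scheme_and_salt is a hypothetical name.

```shell
#!/usr/bin/env bash
# Print the hash scheme id and the salt of one /etc/shadow entry,
# separated by a space. Assumes the $<id>$<salt>$<digest> layout.
# Hypothetical helper for debugging; not part of the L4T tools.
shadow_scheme_and_salt() {
    local line="$1" hash_field scheme salt
    # Field 2 of a colon-separated shadow entry is the password hash.
    IFS=: read -r _ hash_field _ <<< "${line}"
    # Splitting "$6$salt$digest" on "$" yields an empty first field,
    # then the scheme id, then the salt; the digest is discarded.
    IFS='$' read -r _ scheme salt _ <<< "${hash_field}"
    printf '%s %s\n' "${scheme}" "${salt}"
}
```

For the root entry shown above this should print 6 qYzNFHlg. Passing that salt to mkpasswd -m sha-512 -S, as in the command above, should then reproduce the stored hash if the password really is root.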

I can verify that fc00:1:1:0::2 is the device, and that the host is connecting to its sshd daemon, by running kill -STOP ... and kill -CONT ... on the device.

However, when I ssh from the host to the device manually, the connection is established but password authentication fails.

On the host, using

ssh -vvv -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -v root@fc00:1:1:0::2 /usr/bin/echo hello

results in “permission denied”:

OpenSSH_8.9p1 Ubuntu-3ubuntu0.1, OpenSSL 3.0.2 15 Mar 2022
debug1: Reading configuration data /home/andrew/.ssh/config
debug1: /home/andrew/.ssh/config line 50: Applying options for *
debug1: Reading configuration data /etc/ssh/ssh_config
debug1: /etc/ssh/ssh_config line 19: include /etc/ssh/ssh_config.d/*.conf matched no files
debug1: /etc/ssh/ssh_config line 21: Applying options for *
debug2: resolve_canonicalize: hostname fc00:1:1:0::2 is address
debug2: resolve_canonicalize: canonicalised address "fc00:1:1:0::2" => "fc00:1:1::2"
debug3: ssh_connect_direct: entering
debug1: Connecting to fc00:1:1::2 [fc00:1:1::2] port 22.
debug3: set_sock_tos: set socket 3 IPV6_TCLASS 0x10
debug1: Connection established.
debug1: identity file /home/andrew/.ssh/id_rsa type 0
debug1: identity file /home/andrew/.ssh/id_rsa-cert type -1
debug1: Local version string SSH-2.0-OpenSSH_8.9p1 Ubuntu-3ubuntu0.1
debug1: Remote protocol version 2.0, remote software version OpenSSH_8.2p1 Ubuntu-4ubuntu0.7
debug1: compat_banner: match: OpenSSH_8.2p1 Ubuntu-4ubuntu0.7 pat OpenSSH* compat 0x04000000
debug2: fd 3 setting O_NONBLOCK
debug1: Authenticating to fc00:1:1::2:22 as 'root'
debug1: load_hostkeys: fopen /etc/ssh/ssh_known_hosts: No such file or directory
debug1: load_hostkeys: fopen /etc/ssh/ssh_known_hosts2: No such file or directory
debug3: order_hostkeyalgs: no algorithms matched; accept original
debug3: send packet: type 20
debug1: SSH2_MSG_KEXINIT sent
debug3: receive packet: type 20
debug1: SSH2_MSG_KEXINIT received
debug2: local client KEXINIT proposal
debug2: KEX algorithms: curve25519-sha256,curve25519-sha256@libssh.org,ecdh-sha2-nistp256,ecdh-sha2-nistp384,ecdh-sha2-nistp521,sntrup761x25519-sha512@openssh.com,diffie-hellman-group-exchange-sha256,diffie-hellman-group16-sha512,diffie-hellman-group18-sha512,diffie-hellman-group14-sha256,ext-info-c
debug2: host key algorithms: ssh-ed25519-cert-v01@openssh.com,ecdsa-sha2-nistp256-cert-v01@openssh.com,ecdsa-sha2-nistp384-cert-v01@openssh.com,ecdsa-sha2-nistp521-cert-v01@openssh.com,sk-ssh-ed25519-cert-v01@openssh.com,sk-ecdsa-sha2-nistp256-cert-v01@openssh.com,rsa-sha2-512-cert-v01@openssh.com,rsa-sha2-256-cert-v01@openssh.com,ssh-ed25519,ecdsa-sha2-nistp256,ecdsa-sha2-nistp384,ecdsa-sha2-nistp521,sk-ssh-ed25519@openssh.com,sk-ecdsa-sha2-nistp256@openssh.com,rsa-sha2-512,rsa-sha2-256
debug2: ciphers ctos: chacha20-poly1305@openssh.com,aes128-ctr,aes192-ctr,aes256-ctr,aes128-gcm@openssh.com,aes256-gcm@openssh.com
debug2: ciphers stoc: chacha20-poly1305@openssh.com,aes128-ctr,aes192-ctr,aes256-ctr,aes128-gcm@openssh.com,aes256-gcm@openssh.com
debug2: MACs ctos: umac-64-etm@openssh.com,umac-128-etm@openssh.com,hmac-sha2-256-etm@openssh.com,hmac-sha2-512-etm@openssh.com,hmac-sha1-etm@openssh.com,umac-64@openssh.com,umac-128@openssh.com,hmac-sha2-256,hmac-sha2-512,hmac-sha1
debug2: MACs stoc: umac-64-etm@openssh.com,umac-128-etm@openssh.com,hmac-sha2-256-etm@openssh.com,hmac-sha2-512-etm@openssh.com,hmac-sha1-etm@openssh.com,umac-64@openssh.com,umac-128@openssh.com,hmac-sha2-256,hmac-sha2-512,hmac-sha1
debug2: compression ctos: zlib@openssh.com,zlib,none
debug2: compression stoc: zlib@openssh.com,zlib,none
debug2: languages ctos: 
debug2: languages stoc: 
debug2: first_kex_follows 0 
debug2: reserved 0 
debug2: peer server KEXINIT proposal
debug2: KEX algorithms: curve25519-sha256,curve25519-sha256@libssh.org,ecdh-sha2-nistp256,ecdh-sha2-nistp384,ecdh-sha2-nistp521,diffie-hellman-group-exchange-sha256,diffie-hellman-group16-sha512,diffie-hellman-group18-sha512,diffie-hellman-group14-sha256
debug2: host key algorithms: rsa-sha2-512,rsa-sha2-256,ssh-rsa,ecdsa-sha2-nistp256,ssh-ed25519
debug2: ciphers ctos: chacha20-poly1305@openssh.com,aes128-ctr,aes192-ctr,aes256-ctr,aes128-gcm@openssh.com,aes256-gcm@openssh.com
debug2: ciphers stoc: chacha20-poly1305@openssh.com,aes128-ctr,aes192-ctr,aes256-ctr,aes128-gcm@openssh.com,aes256-gcm@openssh.com
debug2: MACs ctos: umac-64-etm@openssh.com,umac-128-etm@openssh.com,hmac-sha2-256-etm@openssh.com,hmac-sha2-512-etm@openssh.com,hmac-sha1-etm@openssh.com,umac-64@openssh.com,umac-128@openssh.com,hmac-sha2-256,hmac-sha2-512,hmac-sha1
debug2: MACs stoc: umac-64-etm@openssh.com,umac-128-etm@openssh.com,hmac-sha2-256-etm@openssh.com,hmac-sha2-512-etm@openssh.com,hmac-sha1-etm@openssh.com,umac-64@openssh.com,umac-128@openssh.com,hmac-sha2-256,hmac-sha2-512,hmac-sha1
debug2: compression ctos: none,zlib@openssh.com
debug2: compression stoc: none,zlib@openssh.com
debug2: languages ctos: 
debug2: languages stoc: 
debug2: first_kex_follows 0 
debug2: reserved 0 
debug1: kex: algorithm: curve25519-sha256
debug1: kex: host key algorithm: ssh-ed25519
debug1: kex: server->client cipher: chacha20-poly1305@openssh.com MAC: <implicit> compression: zlib@openssh.com
debug1: kex: client->server cipher: chacha20-poly1305@openssh.com MAC: <implicit> compression: zlib@openssh.com
debug3: send packet: type 30
debug1: expecting SSH2_MSG_KEX_ECDH_REPLY
debug3: receive packet: type 31
debug1: SSH2_MSG_KEX_ECDH_REPLY received
debug1: Server host key: ssh-ed25519 SHA256:cqcP2ox6TZSW4RC+mdKWZzjAVqUPkaCbVIRzufHiQos
debug1: load_hostkeys: fopen /etc/ssh/ssh_known_hosts: No such file or directory
debug1: load_hostkeys: fopen /etc/ssh/ssh_known_hosts2: No such file or directory
Warning: Permanently added 'fc00:1:1::2' (ED25519) to the list of known hosts.
debug3: send packet: type 21
debug2: ssh_set_newkeys: mode 1
debug1: rekey out after 134217728 blocks
debug1: SSH2_MSG_NEWKEYS sent
debug1: expecting SSH2_MSG_NEWKEYS
debug3: receive packet: type 21
debug1: SSH2_MSG_NEWKEYS received
debug2: ssh_set_newkeys: mode 0
debug1: rekey in after 134217728 blocks
debug2: get_agent_identities: ssh_agent_bind_hostkey: agent refused operation
debug1: get_agent_identities: ssh_fetch_identitylist: agent contains no identities
debug1: Will attempt key: /home/andrew/.ssh/id_rsa RSA SHA256:OLct8o8JRFiX/TDDTGP9lc1rUiw7vlS/CWw0YeQeC2g explicit
debug2: pubkey_prepare: done
debug3: send packet: type 5
debug3: receive packet: type 7
debug1: SSH2_MSG_EXT_INFO received
debug1: kex_input_ext_info: server-sig-algs=<ssh-ed25519,sk-ssh-ed25519@openssh.com,ssh-rsa,rsa-sha2-256,rsa-sha2-512,ssh-dss,ecdsa-sha2-nistp256,ecdsa-sha2-nistp384,ecdsa-sha2-nistp521,sk-ecdsa-sha2-nistp256@openssh.com>
debug3: receive packet: type 6
debug2: service_accept: ssh-userauth
debug1: SSH2_MSG_SERVICE_ACCEPT received
debug3: send packet: type 50
debug3: receive packet: type 51
debug1: Authentications that can continue: 
debug3: start over, passed a different list gssapi-with-mic,publickey,keyboard-interactive,password
debug3: preferred gssapi-with-mic,publickey,keyboard-interactive,password
debug3: authmethod_lookup gssapi-with-mic
debug3: remaining preferred: publickey,keyboard-interactive,password
debug3: authmethod_is_enabled gssapi-with-mic
debug1: Next authentication method: gssapi-with-mic
debug1: No credentials were supplied, or the credentials were unavailable or inaccessible
No Kerberos credentials available (default cache: FILE:/tmp/krb5cc_1000)


debug1: No credentials were supplied, or the credentials were unavailable or inaccessible
No Kerberos credentials available (default cache: FILE:/tmp/krb5cc_1000)


debug2: we did not send a packet, disable method
debug3: authmethod_lookup publickey
debug3: remaining preferred: keyboard-interactive,password
debug3: authmethod_is_enabled publickey
debug1: Next authentication method: publickey
debug1: Offering public key: /home/andrew/.ssh/id_rsa RSA SHA256:OLct8o8JRFiX/TDDTGP9lc1rUiw7vlS/CWw0YeQeC2g explicit
debug3: send packet: type 50
debug2: we sent a publickey packet, wait for reply
debug3: receive packet: type 51
debug1: Authentications that can continue: 
debug2: we did not send a packet, disable method
debug3: authmethod_lookup keyboard-interactive
debug3: remaining preferred: password
debug3: authmethod_is_enabled keyboard-interactive
debug1: Next authentication method: keyboard-interactive
debug2: userauth_kbdint
debug3: send packet: type 50
debug2: we sent a keyboard-interactive packet, wait for reply
debug3: receive packet: type 51
debug1: Authentications that can continue: 
debug3: userauth_kbdint: disable: no info_req_seen
debug2: we did not send a packet, disable method
debug3: authmethod_lookup password
debug3: remaining preferred: 
debug3: authmethod_is_enabled password
debug1: Next authentication method: password
root@fc00:1:1::2's password: 
debug3: send packet: type 50
debug2: we sent a password packet, wait for reply
debug3: receive packet: type 51
debug1: Authentications that can continue: 
Permission denied, please try again.
root@fc00:1:1::2's password: 
debug3: send packet: type 50
debug2: we sent a password packet, wait for reply
debug3: receive packet: type 51
debug1: Authentications that can continue: 
Permission denied, please try again.
root@fc00:1:1::2's password: 
debug3: send packet: type 50
debug2: we sent a password packet, wait for reply
debug3: receive packet: type 51
debug1: Authentications that can continue: 
debug2: we did not send a packet, disable method
debug1: No more authentication methods to try.
root@fc00:1:1::2: Permission denied ().
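
A useful clue in the client log above is that every "Authentications that can continue:" line is empty, which suggests the server is offering no authentication method at all rather than rejecting a wrong password. A minimal sketch for pulling those lines out of a saved ssh -vvv log follows; the function name offered_auth_methods and the sample log text are illustrative assumptions, not part of any NVIDIA or OpenSSH tool.

```shell
#!/usr/bin/env bash
# Print the method list from each "Authentications that can continue:" line
# of an ssh -v log read from stdin; an empty list is shown as "<none>".
offered_auth_methods() {
    sed -n 's/^debug1: Authentications that can continue: *//p' |
        while IFS= read -r methods; do
            printf '%s\n' "${methods:-<none>}"
        done
}

# Hypothetical sample: one reply from a healthy server, one like the log above.
sample_log='debug1: Authentications that can continue: publickey,password
debug1: Authentications that can continue: '

printf '%s\n' "$sample_log" | offered_auth_methods
```

If every line comes back as "<none>", the server-side sshd_config is a more likely culprit than the credentials.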

On the device, the sshd logs are not helpful:

bash-5.0# cat /etc/         
bash-5.0# cat /tmp/sshd.log
Server listening on :: port 22.
Server listening on 0.0.0.0 port 22.
Connection closed by fc00:1:1::1 port 36418 [preauth]
Connection closed by fc00:1:1::1 port 36414 [preauth]
Connection closed by fc00:1:1::1 port 36428 [preauth]
Unable to negotiate with fc00:1:1::1 port 36430: no matching host key type found. Their offer: sk-ecdsa-sha2-nistp256@openssh.com [preauth]
Unable to negotiate with fc00:1:1::1 port 36434: no matching host key type found. Their offer: sk-ssh-ed25519@openssh.com [preauth]
Connection closed by authenticating user root fc00:1:1::1 port 36446 [preauth]
Connection closed by authenticating user root fc00:1:1::1 port 43914 [preauth]

I have tried appending LogLevel DEBUG3 to /etc/ssh/sshd_config and sending sshd a SIGHUP (kill -HUP ...), but the log contains no additional useful information.

I have tried manually changing the root password on the device in the recovery initrd, but cannot:

bash-5.0# passwd
passwd: pam_start() failed, error 26

And I have verified that sshd is not using PAM:

bash-5.0# grep -i pam /etc/ssh/sshd_config
# ... comments removed
UsePAM no

I am at a loss as to how to proceed with debugging this issue at this point.

@lhoang, I have more debugging info, and it’s troubling.

First, I can confirm a bug in NVIDIA’s sshd_config setup script.

To my surprise, I found the following in the recovery initrd:

bash-5.0# grep -i password /etc/ssh/sshd_config
# To disable tunneled clear text passwords, change to no here!
PasswordAuthentication no
PermitEmptyPasswords no

Patching JetPack with the following corrects the ssh login issue:

diff --git a/tools/ota_tools/version_upgrade/build_base_recovery_image.sh b/tools/ota_tools/version_upgrade/build_base_recovery_image.sh
index 2ad2799..7d67018 100755
--- a/tools/ota_tools/version_upgrade/build_base_recovery_image.sh
+++ b/tools/ota_tools/version_upgrade/build_base_recovery_image.sh
@@ -223,7 +223,7 @@ prepare_sshd_files()
 	sed -i 's/#PermitRootLogin prohibit-password/PermitRootLogin yes/' "${initrd_dir}/${sshd_config_file}";check_error
 	sed -i 's/#StrictModes/StrictModes/' "${initrd_dir}/${sshd_config_file}";check_error
 	sed -i 's/#PubkeyAuthentication yes/PubkeyAuthentication no/' "${initrd_dir}/${sshd_config_file}";check_error
-	sed -i 's/#PasswordAuthentication/PasswordAuthentication/' "${initrd_dir}/${sshd_config_file}";check_error
+	sed -i 's/#\?PasswordAuthentication \+.*/PasswordAuthentication yes/' "${initrd_dir}/${sshd_config_file}";check_error
 	sed -i 's/#PermitEmptyPasswords/PermitEmptyPasswords/' "${initrd_dir}/${sshd_config_file}";check_error
 	sed -i 's/UsePAM yes/UsePAM no/' "${initrd_dir}/${sshd_config_file}";check_error
 
@@ -606,4 +606,3 @@ copy_recovery_files "${TOT_L4T_DIR}"
 rm -Rf "${WORKDIR}"
 
 echo "Finished"
-
diff --git a/tools/ota_tools/version_upgrade/ota_make_recovery_img_dtb.sh b/tools/ota_tools/version_upgrade/ota_make_recovery_img_dtb.sh
index 65242bc..db59cf2 100755
--- a/tools/ota_tools/version_upgrade/ota_make_recovery_img_dtb.sh
+++ b/tools/ota_tools/version_upgrade/ota_make_recovery_img_dtb.sh
@@ -102,7 +102,7 @@ prepare_sshd_files()
 	sed -i 's/#PermitRootLogin prohibit-password/PermitRootLogin yes/' "${_initrd_dir}/${sshd_config_file}";check_error
 	sed -i 's/#StrictModes/StrictModes/' "${_initrd_dir}/${sshd_config_file}";check_error
 	sed -i 's/#PubkeyAuthentication yes/PubkeyAuthentication no/' "${_initrd_dir}/${sshd_config_file}";check_error
-	sed -i 's/#PasswordAuthentication/PasswordAuthentication/' "${_initrd_dir}/${sshd_config_file}";check_error
+	sed -i 's/#\?PasswordAuthentication \+.*/PasswordAuthentication yes/' "${_initrd_dir}/${sshd_config_file}";check_error
 	sed -i 's/#PermitEmptyPasswords/PermitEmptyPasswords/' "${_initrd_dir}/${sshd_config_file}";check_error
 	sed -i 's/UsePAM yes/UsePAM no/' "${_initrd_dir}/${sshd_config_file}";check_error
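
Why the stock expression is not enough: it only strips a leading "#", so a config that already contains an uncommented "PasswordAuthentication no" passes through unchanged. The sketch below compares the two expressions from the diff on sample lines. The sample lines are assumptions modeled on the grep output above, apply_expr is a hypothetical helper, and the `\?` and `\+` operators rely on GNU sed.

```shell
#!/usr/bin/env bash
# Compare the stock and patched sed expressions on sample sshd_config lines.
# The sample lines are assumptions modeled on the grep output above.
apply_expr() {
    # $1 = sed expression, $2 = config text; prints the rewritten text
    printf '%s\n' "$2" | sed "$1"
}

stock_expr='s/#PasswordAuthentication/PasswordAuthentication/'
patched_expr='s/#\?PasswordAuthentication \+.*/PasswordAuthentication yes/'

for sample in 'PasswordAuthentication no' '#PasswordAuthentication yes'; do
    echo "input:   $sample"
    echo "stock:   $(apply_expr "$stock_expr" "$sample")"
    echo "patched: $(apply_expr "$patched_expr" "$sample")"
done
```

The patched expression forces the value to "yes" whether or not the line was commented out, which is why it also covers the case seen on the device.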

Second, the flashing still does not proceed. From the logs (edited for clarity):

Waiting for target to boot-up...
Waiting for device to expose ssh ......RTNETLINK answers: File exists
RTNETLINK answers: File exists
Waiting for device to expose ssh ...+ echo 'Run command: flash on fc00:1:1:0::2'
Run command: flash on fc00:1:1:0::2
...
+ '[' -e /home/andrew/Local/nvidia/nvidia-linux-for-tegra/nvidia-l4t-board-support-packages/tools/kernel_flash/bin/aarch64/simg2img ']'
+ cp /home/andrew/Local/nvidia/nvidia-linux-for-tegra/nvidia-l4t-board-support-packages/tools/kernel_flash/bin/aarch64/simg2img /home/andrew/Local/nvidia/nvidia-linux-for-tegra/nvidia-l4t-board-support-packages/tools/kernel_flash/images
+ sshpass -p root ssh root@fc00:1:1:0::2 -q -oServerAliveInterval=90 -oServerAliveCountMax=3 -oStrictHostKeyChecking=no -oUserKnownHostsFile=/dev/null -oUserKnownHostsFile=/tmp/tmp.yhyMPDP1KH -vvv 'NFS_ROOTFS_DIR="/home/andrew/Local/nvidia/nvidia-linux-for-tegra/nvidia-l4t-board-support-packages/rootfs" NFS_IMAGES_DIR="/home/andrew/Local/nvidia/nvidia-linux-for-tegra/nvidia-l4t-board-support-packages/tools/kernel_flash/images" /bin/nv_flash_from_network.sh '
OpenSSH_8.9p1 Ubuntu-3ubuntu0.1, OpenSSL 3.0.2 15 Mar 2022
...
debug1: Sending command: NFS_ROOTFS_DIR="/home/andrew/Local/nvidia/nvidia-linux-for-tegra/nvidia-l4t-board-support-packages/rootfs" NFS_IMAGES_DIR="/home/andrew/Local/nvidia/nvidia-linux-for-tegra/nvidia-l4t-board-support-packages/tools/kernel_flash/images" /bin/nv_flash_from_network.sh
...
[ 0]: l4t_flash_from_kernel: Starting to create gpt for emmc
...
Active index file is /mnt/internal/flash.idx
Number of lines is 68
max_index=67
writing item=43, 1:3:primary_gpt, 512, 19968, gpt_primary_1_3.bin, 16896, fixed-<reserved>-0, 4efcf66bdeefdf9cabc381ba980f716b2da90fc5
...
flock: cannot open lock file /var/lock/nvidiainitrdflashdebug2: channel 0: written 56 to efd 6
...
: No such file or directory
...
Transferred: sent 2232, received 2888 bytes, in 2.8 seconds

Bytes per second: sent 797.7, received 1032.1

debug1: Exit status 66

+ echo 'Flash failure'

Flash failure
+ exit 1

That’s a lot of text, so here is a summary:

  • the ssh login to the device works and the host can correctly send commands to the device
  • the first set of commands, which begins writing the GPT, succeeds
  • however, the remote command then fails with the following flock error (shown here without the ssh client’s debug2 output that is interleaved with it in the log above)
    • flock: cannot open lock file /var/lock/nvidiainitrdflash: No such file or directory

This command fails because /var/lock does not exist.

bash-5.0# ls -l /var
total 0
drwxr-xr-x 2 root 0 0 Jan  1 00:01 run
bash-5.0#

Summary

  • The initial ssh command fails because the initrd recovery disk does not have PasswordAuthentication yes in its sshd_config file.
  • When this is fixed, subsequent commands fail because the /var/lock directory does not exist.
  • I can confirm that, with the password patch, the NFS system correctly starts on the device.

The /var/lock folder should be at “${NFS_ROOTFS_DIR}/var/lock”.

Can you check?

If this is not an NVIDIA-provided root file system, then initrd flash might not work.

I just checked our root file system, and it does have ${NFS_ROOTFS_DIR}/var/lock correctly symlinked to /run/lock.

Of course, /run exists on the device, but because it is a tmpfs, /run/lock is apparently never created there and so does not exist.
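
The check described above can be sketched as a small script that reports whether var/lock inside a root file system tree is a real directory, a symlink with an existing target, or a dangling symlink. The helper name classify_lock_dir is hypothetical, and re-rooting absolute symlink targets under the tree is an assumption about how the tree is used on the target.

```shell
#!/usr/bin/env bash
# Classify var/lock inside a root file system tree (hypothetical helper).
# Absolute symlink targets are resolved relative to the tree, not the host.
classify_lock_dir() {
    local root=$1 path="$1/var/lock" target
    if [ -L "$path" ]; then
        target=$(readlink "$path")
        case $target in
            /*) target="$root$target" ;;       # absolute: re-root it
            *)  target="$root/var/$target" ;;  # relative to var/
        esac
        if [ -d "$target" ]; then
            echo "symlink-ok"
        else
            echo "symlink-dangling"
        fi
    elif [ -d "$path" ]; then
        echo "directory"
    else
        echo "missing"
    fi
}

# Demo on a throwaway tree that mimics the layout described above.
demo_root=$(mktemp -d)
mkdir -p "$demo_root/var" "$demo_root/run"
ln -s /run/lock "$demo_root/var/lock"
classify_lock_dir "$demo_root"   # run/lock has not been created yet
mkdir "$demo_root/run/lock"
classify_lock_dir "$demo_root"   # target is now present
rm -rf "$demo_root"
```

A "symlink-dangling" result on the real rootfs would match the symptom seen on the device: the symlink is correct, but nothing has created its target.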

Since /var/lock is only a legacy path on current Ubuntu and Debian (a symlink to /run/lock), and /run must be populated at runtime because it is a tmpfs, it would seem that the /run/lock directory needs to be created by NVIDIA’s init shell script.

If you can tell me where, I’m happy to try patching JetPack and testing!

SOLVED

The solution requires two patches for JetPack 5.1.0 and 5.1.1 (others are untested).

First, the two occurrences of the sshd_config setup must be changed as follows:

-       sed -i 's/#PasswordAuthentication/PasswordAuthentication/' "${_initrd_dir}/${sshd_config_file}";check_error
+       sed -i 's/#\?PasswordAuthentication \+.*/PasswordAuthentication yes/' "${_initrd_dir}/${sshd_config_file}";check_error

Here is a complete patchfile: 0001-Fix-sshd-PasswordAuthentication.patch (2.6 KB)

Second, the initrd-flash system incorrectly assumes that the directory /var/lock exists.

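On Ubuntu and Debian, /var/lock is a symlink to /run/lock, and /run is a tmpfs that is normally set up at boot by systemd. So on a standard setup, /run/lock exists only if something creates it at runtime; a /run/lock directory shipped in the root file system image is hidden once the tmpfs is mounted over /run.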

The solution is to either have the initrd-flash system create the directory or to prominently note in the documentation that this non-standard, non-conforming directory must exist.
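
The first option could look roughly like the sketch below: a helper that makes sure the lock directory exists before flock is called. This is not NVIDIA's code; ensure_lock_dir is a hypothetical name, and where such a call would belong in the initrd-flash scripts is not known here. The symlink is resolved first because mkdir -p fails on a dangling symlink, and the sketch assumes GNU readlink, whose -f option tolerates a missing final path component.

```shell
#!/usr/bin/env bash
# Hypothetical helper: make sure a lock directory exists before flock uses it.
# If the path is a symlink (e.g. /var/lock -> /run/lock), create the target.
ensure_lock_dir() {
    local dir=$1 resolved
    # GNU readlink -f resolves symlinks even if the final component is missing.
    resolved=$(readlink -f "$dir") || return 1
    mkdir -p "$resolved" || return 1
    chmod 1777 "$resolved"   # sticky and world-writable (drwxrwxrwt)
}

# Demo on a throwaway tree instead of the real /var/lock.
demo_root=$(mktemp -d)
mkdir -p "$demo_root/var" "$demo_root/run"
ln -s ../run/lock "$demo_root/var/lock"   # dangling until run/lock exists
ensure_lock_dir "$demo_root/var/lock" && ls -ld "$demo_root/run/lock"
rm -rf "$demo_root"
```

Creating the directory at runtime works regardless of whether /run is a freshly mounted tmpfs, which a directory baked into the image cannot guarantee.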

In our builds we did the following:

mkdir      "rootfs/run/lock"
chown 0:0  "rootfs/run/lock"
chmod 1777 "rootfs/run/lock" # drwxrwxrwt

so that things behave correctly with or without a proper tmpfs mount.
