Flashing tx2 gets stuck on post installation

Hi,

I am trying to re-flash a TX2 using a host PC which runs Ubuntu 16.04 and JetPack 3.2.1. Both devices are connected to a switch which is connected to the internet.

The actual flashing works, however it keeps getting stuck at the start of the post installation.

The xTerm window gets stuck on:

Waiting 30 seconds to make sure target is fully up
DEBUG:  echo GetValueToInfoBroker: key = Install_Dir, value = /media/paul/DATA/jetpack_install >> /tmp/jetpack_debug.log
DEBUG:  mkdir -p /home/paul/.ssh 
DEBUG:  ssh-keygen -R 10.42.0.45
Host 10.42.0.45 not found in /home/paul/.ssh/known_hosts
DEBUG:  ssh-keyscan 10.42.0.45>>/home/paul/.ssh/known_hosts
# 10.42.0.45:22 SSH-2.0-OpenSSH_7.2p2 Ubuntu-4ubuntu1
# 10.42.0.45:22 SSH-2.0-OpenSSH_7.2p2 Ubuntu-4ubuntu1
# 10.42.0.45:22 SSH-2.0-OpenSSH_7.2p2 Ubuntu-4ubuntu1
DEBUG:  echo /home/paul
/home/paul
DEBUG:  ssh-add /home/paul/.ssh/id_rsa 0<&-
Identity added: /home/paul/.ssh/id_rsa (/home/paul/.ssh/id_rsa)
scp -F /dev/null -o PubkeyAuthentication=no -o ConnectTimeout=30 -o StrictHostKeyChecking=no /home/paul/.ssh/id_rsa.pub nvidia@10.42.0.45:/home/nvidia/tmp.pub
 .0.45'
id_rsa.pub                                    100%  392     0.4KB/s   00:00    
nvidia@tegra-ubuntu:~$ b /home/nvidia/.ssh/authorized_keys 
nvidia@tegra-ubuntu:~$ exit
logout

When I manually start an SSH session, I have to log in using the password. When I inspect the SSH authentication failures in /var/log/auth.log, I see:

Sep 25 09:18:52 tegra-ubuntu sshd[3608]: Authentication refused: bad ownership or modes for directory /home/nvidia
Sep 25 09:18:52 tegra-ubuntu sshd[3608]: Authentication refused: bad ownership or modes for directory /home/nvidia
Sep 25 09:18:55 tegra-ubuntu sshd[3608]: Accepted password for nvidia from 10.42.0.1 port 57270 ssh2
Sep 25 09:18:55 tegra-ubuntu sshd[3608]: pam_unix(sshd:session): session opened for user nvidia by (uid=0)
Sep 25 09:18:55 tegra-ubuntu systemd-logind[556]: New session 3 of user nvidia.
Sep 25 09:18:55 tegra-ubuntu systemd: pam_unix(systemd-user:session): session opened for user nvidia by (uid=0)
Sep 25 09:18:55 tegra-ubuntu sshd[3608]: lastlog_openseek: Couldn't stat /var/log/lastlog: No such file or directory
Sep 25 09:18:55 tegra-ubuntu sshd[3608]: lastlog_openseek: Couldn't stat /var/log/lastlog: No such file or directory

I then inspected the permissions using ls -la, which shows:

/home/nvidia:                           drwxrwxrwx 4 1002 1003 4096 Sep 25 10:21
/home/nvidia/.ssh:                      drwxrwxr-x 2 nvidia nvidia 4096 Sep 25 09:17
/home/nvidia/.ssh/authorized_keys:      -rw-r--r-- 1 nvidia nvidia  392 Sep 25 09:17

I’ve read several websites which claim that user home directories should not be writable by others, for example see: https://superuser.com/questions/215504/permissions-on-private-key-in-ssh-folder

When I manually tried to set the user home directory’s permissions using:

sudo chmod go-w /home/nvidia

It strangely resulted in:

sudo: /usr/bin/sudo must be owned by uid 0 and have the setuid bit set

It seems to me that the user permissions of this clean flash are corrupted.

Are these user permissions indeed the problem and how would I resolve it?

Thanks in advance!

“Bad ownership or modes” means sshd found the ownership or permissions of the named directory too permissive. The same check applies to that user’s “~/.ssh” directory and to the non-public files within it. If you are logged in locally at that user’s account, what do you see from:

cd ~/.ssh
ls -ld . *

The directory should show rwx for the user, with all other permissions denied. Public keys should be readable by anyone, and only writable by you. Private keys should not be accessible in any way by anyone except you.
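
As a rough way to check the condition sshd logged, here is a minimal sketch, assuming GNU stat is available on the target. The function name is made up for illustration. It reports whether a path is writable by group or others, which is what OpenSSH's StrictModes check rejects for the home directory, ~/.ssh and authorized_keys:

```shell
# Hypothetical helper: succeeds (exit 0) if the path is writable by
# group or others, fails (exit 1) if not, exit 2 if stat fails.
writable_by_group_or_other() {
    mode=$(stat -c '%a' "$1") || return 2
    # Prefix 0 so the shell reads the mode as octal; 022 masks the
    # group-write and other-write bits.
    [ $(( 0$mode & 022 )) -ne 0 ]
}

# Example: walk the three paths sshd cares about for key logins.
for p in "$HOME" "$HOME/.ssh" "$HOME/.ssh/authorized_keys"; do
    if [ -e "$p" ] && writable_by_group_or_other "$p"; then
        echo "too permissive: $p"
    fi
done
```

On the listing in the question, /home/nvidia (drwxrwxrwx) and /home/nvidia/.ssh (drwxrwxr-x) would both be flagged. The numeric owner 1002:1003 on the home directory is a separate ownership problem that this check does not cover.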

One of the concerns is how it ended up that way. If the root filesystem being flashed was unpacked on a non-ext4 filesystem (e.g., NTFS), the permissions could not have been preserved, and you’d have to completely reflash from scratch with the files on ext4. If there is no such reason for those permissions being wrong, then it becomes suspicious.

The failure of “sudo” pretty much confirms that the flash did not come from an ext4 filesystem. The setuid bit and Linux file ownership are not preserved by filesystems such as VFAT or NTFS. The host directory the flash comes from has to be on ext4 (reformat that disk, or move the install directory to an ext4 partition) before flashing again.
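
A quick check of the host install directory before flashing could look like the sketch below. It assumes GNU coreutils (df with --output, stat -c) on the host, and both function names are hypothetical. The first prints the filesystem type; the second probes directly whether a setuid bit survives on a file created there, which is the property sudo depends on:

```shell
# Print the filesystem type backing a path (e.g. ext4).
fs_type() {
    df --output=fstype "$1" | tail -n 1
}

# Succeed only if a file created in the directory keeps mode 4755.
# Returns 2 if the probe file cannot be created.
preserves_unix_modes() {
    probe=$(mktemp "$1/.modeprobe.XXXXXX") || return 2
    chmod 4755 "$probe" 2>/dev/null
    mode=$(stat -c '%a' "$probe")
    rm -f "$probe"
    [ "$mode" = "4755" ]
}

# Example use with the install directory from the log in the question.
install_dir="/media/paul/DATA/jetpack_install"
if [ -d "$install_dir" ]; then
    echo "filesystem: $(fs_type "$install_dir")"
    preserves_unix_modes "$install_dir" || echo "setuid bit is not preserved here"
fi
```

An NTFS disk mounted through ntfs-3g typically shows up as fuseblk rather than ntfs, so the mode probe is the more telling of the two checks. It does not test ownership (files owned by uid 0), which the root filesystem also needs.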

Thanks! This was indeed the problem. It didn’t occur to me that the assigned install folder on the host should be on an ext4 filesystem. Since I ran out of disk space on my primary disk, I assigned the install folder to my NTFS disk.

This is just a thought for NVIDIA related to the NTFS or other filesystem issue which might simplify things for everyone (and avoid any requirements on the host PC’s filesystem type of the flash content).

The reason the host filesystem has to be ext4 is that the content is unpacked and built up in the “rootfs/” directory prior to being used to populate the loopback file. It would save disk space and prevent the ext4 issues if the unpacking of files and such went directly into an ext4 formatted loopback file (“bootloader/system.img.raw” loop mounted to “rootfs/”). This would skip the extra copy operation from “rootfs/” into the loop device. After this, unmount the loopback and use mksparse on “bootloader/system.img.raw” to get “bootloader/system.img”, without the separate “rootfs/” ever being used as anything other than a mount point.
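
The idea could be sketched roughly as below. This is not how the NVIDIA flash scripts work today, only an illustration of the proposal. The function names, the argument layout, and the bare mksparse invocation are assumptions (the thread does not give mksparse's arguments), and any additional steps the real tools apply to “rootfs/” are left out. Setting DRY_RUN=1 prints the commands instead of running them:

```shell
# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical: build the rootfs directly inside an ext4 loopback file.
#   $1  Linux_for_Tegra directory (assumed layout: bootloader/, rootfs/)
#   $2  root filesystem tarball
#   $3  image size understood by truncate, e.g. 14G
build_rootfs_in_loopback() {
    l4t_dir="$1"
    rootfs_tar="$2"
    size="$3"
    img="$l4t_dir/bootloader/system.img.raw"

    run truncate -s "$size" "$img" &&
    run mkfs.ext4 -F -q "$img" &&
    # rootfs/ is only a mount point; its files live inside the image.
    run sudo mount -o loop "$img" "$l4t_dir/rootfs" &&
    run sudo tar -xpf "$rootfs_tar" -C "$l4t_dir/rootfs" &&
    run sudo umount "$l4t_dir/rootfs" &&
    # Assumed invocation; check the tool's own usage text for options.
    run "$l4t_dir/bootloader/mksparse" "$img" "$l4t_dir/bootloader/system.img"
}
```

Because the files only ever exist inside the ext4 image, the host filesystem holding “bootloader/system.img.raw” would no longer need to preserve Linux permissions itself.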