Total encryption of Xavier?

You may also like to check these related threads:

https://devtalk.nvidia.com/default/topic/1033018/jetson-tx2/help-about-using-ecryptfs-to-ecrypt-user-

https://devtalk.nvidia.com/default/topic/898029/jetson-tx1/encryption-via-ecryptfs-on-a-tk1-tx1

https://devtalk.nvidia.com/default/topic/925520/jetson-tk1/encrypting-emmc-partition/

I used ecryptfs for years for my homes and have never seen that happen. You ruled out the tarball since you can’t write anything to home after. The swap error is odd.

Perhaps you can rule out swap entirely by turning it off. You can do that with “sudo swapoff -av” and checking there is 0 in the total column for Swap with the “free” command.
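If you would rather check this from a script than eyeball the “free” output, here is a minimal sketch. It assumes a Linux “/proc/meminfo” containing a “SwapTotal:” line; the function names are only illustrative.

```python
def swap_total_kb(meminfo_text):
    """Return the SwapTotal value in kB from /proc/meminfo-style text."""
    for line in meminfo_text.splitlines():
        if line.startswith("SwapTotal:"):
            # The line looks like "SwapTotal:       2097148 kB".
            return int(line.split()[1])
    raise ValueError("no SwapTotal line found")


def swap_is_off(meminfo_path="/proc/meminfo"):
    """Return True when the kernel reports zero total swap."""
    with open(meminfo_path) as f:
        return swap_total_kb(f.read()) == 0
```

After “sudo swapoff -av”, calling swap_is_off() should return True if swap really is disabled.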

If your tarball then extracts, that moves you closer to a solution, but as others may have mentioned, ecryptfs is buggy and deprecated.

You can encrypt a block device using luks. Ubuntu provides a way to do that on install, however this doesn’t apply to Tegra. You would probably have to copy the rootfs to a luks encrypted disk/partition, rebuild your kernel with luks support, if needed, and provide an initrd with some mechanism to remotely unlock and boot your drive. As mentioned by somebody else in the thread, this is prevented by dropbear-initramfs not working on Xavier (for now).

You could store the password in the initramfs but that would defeat the point of the encryption, since if you steal the device and read the initramfs, you can unlock /. As Linuxdev mentioned, to do all this you’ll have to become somewhat an expert in initrd design, or patch dropbear-initramfs so it runs on Tegra.

I’d agree with @mdegans to start by disabling swap as a test.

This may not mean anything, but I noticed the cpython errors in the bad strace, but not the ok strace. I also noticed use of “/var/tmp” in the non-error case, but completely missing these in the failure case. This is not a complete strace (which no doubt is enormous), so I don’t know if that means anything.

Similarly, I see cpython in the bad case, but not in the working case. Once more, I don’t know whether that is only because these are excerpts of the (enormous) strace files, and the full traces might contain it in both.

As you mention, these two differing lines, which name different tmp locations, are likely very important:

unlinkat(AT_FDCWD, "/home/test/.test_file.txt.swp", 0) = 0
# versus:
unlinkat(AT_FDCWD, "/var/tmp/test_file.txt.swp", 0) = 0

Note that “= 0” is success, but what seems important is the fact that the tmp file is in different locations between success and failure.
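To compare two full traces for this without reading them line by line, a small sketch like the following could list the directories in which unlinkat() removed files. It assumes the strace lines have the same shape as the two quoted above; the function name is hypothetical.

```python
import os
import re

# Matches the quoted path argument of an unlinkat() line in strace output.
UNLINKAT_RE = re.compile(r'unlinkat\(AT_FDCWD, "([^"]+)"')


def unlinkat_dirs(strace_text):
    """Return the set of directories in which unlinkat() removed files."""
    return {os.path.dirname(m.group(1)) for m in UNLINKAT_RE.finditer(strace_text)}
```

Running it on each trace and comparing the two sets shows right away whether the temporary files end up in different places.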

You might run this test to see if you get an error:

sudo -s
touch /var/tmp/deleteme.txt
rm /var/tmp/deleteme.txt
touch /home/test/deleteme.txt
rm /home/test/deleteme.txt
exit

Basically, you are just testing if creation and deleting differ for “/var/tmp/” and “/home/test/” with “sudo”. Then, when all previous “deleteme.txt” files are gone, try again, but without sudo. Is it safe to say your vi or other operations are done without sudo?
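The same check can be scripted so that it is easy to repeat with and without sudo. This is only a sketch: the function name is made up, and unlike “touch” it also writes a few bytes, which is my own addition so that the write path gets exercised as well.

```python
import os


def can_create_and_delete(directory, name="deleteme.txt"):
    """Create, write to, and remove a small file.

    Returns None on success, or the OSError that was raised.
    """
    path = os.path.join(directory, name)
    try:
        with open(path, "w") as f:
            # touch only creates an empty file; writing a few bytes
            # also goes through the write path.
            f.write("test\n")
        os.remove(path)
        return None
    except OSError as exc:
        return exc
```

For example, print(can_create_and_delete("/var/tmp")) and print(can_create_and_delete("/home/test")) should both print None if the two locations behave the same.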

If you run “df -H -T” on your system when it runs with encryption, what shows up?

If you can find a very simple failure case, one where the strace log becomes smaller, then you may be able to post the entire log (you could run “gzip -9” on it) and the answer would become more obvious. Using direct debug methods like strace makes it important to simplify to the smallest possible failure case; then we can go straight to the answer.


Straying off from debugging for a moment, it is quite possible that you could enable ssh in the initrd (probably not an easy thing), and use that for entering a password. Or serial console to enter a password. As @mdegans mentions, having the password in the initrd is not necessarily secure, but there is more than one way to provide a password once you master initrd editing. Perhaps it would be as simple as looking for a specific USB thumb drive or SD card with a password in a file (and even the thumb drive or SD card could be encrypted and need their own passwords). In the case of password on SD card or thumb drive the system would only boot if the correct removable storage is present during boot.

Encrypted filesystems are complicated, and I would not be surprised if some simple detail is missing, in which case recreating the setup and checking the instructions each time might lead to an answer. It’s hard to say without sitting down and working for hours trying new things.

This is kind of hacky, but you could have a pi zero (or something else that’s cheap) relay the password by serial console or in USB HID mode (like a rubber ducky). You could even set it up to reboot (or possibly even flash) your Xavier if it locks up and you’re out of reach.

That’s marginally better than external storage which could just be taken along with the device. However, that has concerns since somebody could fake a lockup, listen to the serial port/USB and steal the unlock credentials. I am not sure any anti-tamper measures would be sufficient against that. You also would have to manage two devices. A custom initrd is probably the “correct” solution.

All this isn’t easy, like linuxdev said. If your organization has requirements for encryption, you should really have somebody specialized in the area advise you on this since security is really easy to mess up, even for experts. Hire someone to break into the thing as well.

OTOH if you have no requirements, it probably makes little sense to use encryption at all since it just slows things down. I don’t encrypt my workstation filesystem since if somebody has local access, I am screwed anyway. Nation states also aren’t in my threat model.

Thank you @Andrey1984, @mdegans and @linuxdev for your help :)

@linuxdev, yes the logs are not complete. As you mentioned, strace generates a huge file. I will make a smaller tar file and try to reproduce the issue so that the strace log files become smaller. Also, my post might have caused some confusion, sorry about that, but if you look at the logs, “/var/tmp” can only be seen in the failure case.

I tried these and there was no problem, and yes, I run all operations (vim, tar, etc.) without sudo.

Filesystem             Type      Size  Used Avail Use% Mounted on
/dev/mmcblk0p1         ext4       30G   14G   15G  50% /
none                   devtmpfs  8.2G     0  8.2G   0% /dev
tmpfs                  tmpfs     8.3G  8.2k  8.3G   1% /dev/shm
tmpfs                  tmpfs     8.3G   90M  8.2G   2% /run
tmpfs                  tmpfs     5.3M  4.1k  5.3M   1% /run/lock
tmpfs                  tmpfs     8.3G     0  8.3G   0% /sys/fs/cgroup
/dev/nvme0n1p1         ext4      492G   78M  467G   1% /mnt/m2ssd
tmpfs                  tmpfs     1.7G   17k  1.7G   1% /run/user/120
/home/test/.Private    ecryptfs   30G   14G   15G  50% /home/test
tmpfs                  tmpfs     1.7G     0  1.7G   0% /run/user/1000

You can download the logs from here: Gofile - Your all-in-one storage solution
You see, I suspect that it is the number of files we write into the partition that causes this issue. For example, when I copy a big file (say more than 300MB) there is no issue, and if I have a tarball that contains only one big file, I can extract it with no issue. But if I have a tar file (50KB) that contains many small files, then once I start extracting it, it’s as if after a certain number of files have been extracted, the rest cannot be extracted anymore.
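To turn that suspicion into a small, repeatable test case, something like the sketch below could generate a tarball of many small files and extract it member by member, recording where it starts to fail. The file count, file size, and function names are arbitrary choices of mine, not values taken from the failing tarball.

```python
import io
import tarfile


def make_small_files_tarball(tar_path, count=2000, size=512):
    """Write a tarball containing `count` files of `size` bytes each."""
    payload = b"x" * size
    with tarfile.open(tar_path, "w") as tar:
        for i in range(count):
            info = tarfile.TarInfo(name=f"test/{i:05d}.bin")
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))


def extract_and_count_failures(tar_path, dest_dir):
    """Extract members one by one.

    Returns (ok, failed), where ok is the number of members extracted
    and failed is a list of (member name, OSError) pairs.
    """
    ok, failed = 0, []
    with tarfile.open(tar_path) as tar:
        for member in tar.getmembers():
            try:
                tar.extract(member, path=dest_dir)
                ok += 1
            except OSError as exc:
                failed.append((member.name, exc))
    return ok, failed
```

Extracting once into the encrypted home and once into an unencrypted directory such as “/var/tmp”, then comparing the two results, would show whether the failure depends on the number of files.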

dmesg interestingly outputs the following errors:

[   51.927447] crypt_scatterlist: Error setting key; rc = [-12]
[   51.927560] crypt_extent: Error attempting to crypt page with page_index = [0], extent_offset = [0]; rc = [-22]
[   51.927742] ecryptfs_encrypt_page: Error encrypting extent; rc = [-22]
[   51.928002] ecryptfs_write_end: Error encrypting page (upper index [0x0000000000000000])
[   51.946616] tegra-se-nvhost 15820000.se: no free key slot
[   51.946759] crypt_scatterlist: Error setting key; rc = [-12]
[   51.946878] crypt_extent: Error attempting to crypt page with page_index = [0], extent_offset = [0]; rc = [-22]
[   51.947062] ecryptfs_encrypt_page: Error encrypting extent; rc = [-22]
[   51.947217] ecryptfs_write_end: Error encrypting page (upper index [0x0000000000000000])
[   51.956273] tegra-se-nvhost 15820000.se: no free key slot
[   51.956408] crypt_scatterlist: Error setting key; rc = [-12]
[   51.956516] crypt_extent: Error attempting to crypt page with page_index = [0], extent_offset = [0]; rc = [-22]
[   51.956686] ecryptfs_encrypt_page: Error encrypting extent; rc = [-22]
[   51.956825] ecryptfs_write_end: Error encrypting page (upper index [0x0000000000000000])
[   51.966119] tegra-se-nvhost 15820000.se: no free key slot
[   51.966246] crypt_scatterlist: Error setting key; rc = [-12]
[   51.966367] crypt_extent: Error attempting to crypt page with page_index = [0], extent_offset = [0]; rc = [-22]
[   51.966550] ecryptfs_encrypt_page: Error encrypting extent; rc = [-22]
[   51.972691] ecryptfs_write_end: Error encrypting page (upper index [0x0000000000000000])
[   89.117238] tegra-se-nvhost 15820000.se: no free key slot
[   89.117421] crypt_scatterlist: Error setting key; rc = [-12]
[   89.117539] crypt_extent: Error attempting to crypt page with page_index = [0], extent_offset = [0]; rc = [-22]
[   89.117725] ecryptfs_encrypt_page: Error encrypting extent; rc = [-22]
[   89.117848] ecryptfs_write_end: Error encrypting page (upper index [0x0000000000000000])
[   89.122210] tegra-se-nvhost 15820000.se: no free key slot
[   89.122394] crypt_scatterlist: Error setting key; rc = [-12]
[   89.122527] crypt_extent: Error attempting to crypt page with page_index = [0], extent_offset = [0]; rc = [-22]
[   89.122839] ecryptfs_encrypt_page: Error encrypting extent; rc = [-22]
[   89.122992] ecryptfs_write_end: Error encrypting page (upper index [0x0000000000000000])
[   89.126223] tegra-se-nvhost 15820000.se: no free key slot
[   89.126340] crypt_scatterlist: Error setting key; rc = [-12]
[   89.126470] crypt_extent: Error attempting to crypt page with page_index = [0], extent_offset = [0]; rc = [-22]
[   89.126727] ecryptfs_encrypt_page: Error encrypting extent; rc = [-22]
[   89.126871] ecryptfs_write_end: Error encrypting page (upper index [0x0000000000000000])

I don’t know much about this, but I will try to google around and see what I can find about it.

Thank you @linuxdev and @mdegans. For now I am less concerned about how to provide the password for decryption, as it has lower priority at the moment. The main issue here is the encryption/decryption itself.

@mdegans, I have also been using ecryptfs to encrypt my home directory on my PC and have never had any issue with it. The problem is with my Xavier only. I did try turning swap off and extracting the tarball, but the problem was still there. Well, yes, as you mentioned it is of course a bug related to ecryptfs, but since I have not had any issues with it on my PC, I thought maybe I should try it on Xavier too. I have thought about luks as the alternative, for example, having sensitive data on an external ssd and encrypting it with luks, so I guess if I can’t fix this problem then I probably have no choice but to try luks.

If you don’t need full disk encryption, important data and encrypted swap partitions on an nvme ssd will probably work. I have most of / on an nvme ssd and it’s a night and day performance difference compared to the internal flash.

It’s not too hard to do. Just plug in an nvme ssd, partition and format it with gnome-disks or something, and use a systemd unit or similar to arrange it to be mounted on startup.

I don’t think you can use /etc/fstab in this particular case since it’s encrypted. The script your unit calls could get the credentials to unlock from somewhere off the device fairly easily. Initrd tutorials for remote luks unlock outline the basic procedure and you could adapt that to your needs.
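As a rough sketch of what such a script could do: it assumes the standard “cryptsetup open” and “mount” commands, and every value passed in (device, mapper name, key file location, mount point) is a placeholder you would choose yourself. I have not tested this on a Jetson.

```python
import subprocess


def unlock_commands(device, mapper_name, key_file, mount_point):
    """Build the cryptsetup and mount commands without running them."""
    return [
        ["cryptsetup", "open", "--type", "luks",
         "--key-file", key_file, device, mapper_name],
        ["mount", f"/dev/mapper/{mapper_name}", mount_point],
    ]


def unlock_and_mount(device, mapper_name, key_file, mount_point):
    """Run both commands in order (needs root).

    Raises subprocess.CalledProcessError if either step fails.
    """
    for cmd in unlock_commands(device, mapper_name, key_file, mount_point):
        subprocess.run(cmd, check=True)
```

The key file is the part that matters for security: it would have to be fetched from somewhere off the device at boot, not stored next to the encrypted partition.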

Btw, the errors which appear from the encryption software are something the encryption people could probably work with. So far as I know those are not specific to the particular hardware or o/s.

To illustrate something, this strace excerpt (from near the end of the trace) shows:

write(1, "test/1/__init__.cpython-36.pyc\n", 31) = 31
openat(AT_FDCWD, "test/1/__init__.cpython-36.pyc", O_WRONLY|O_CREAT|O_EXCL|O_NOCTTY|O_NONBLOCK|O_CLOEXEC, 0664) = 4
write(4, "3\r\r\n\2E\323]|\22\0\0\343\0\0\0\0\0\0\0\0\0\0\0\0.\0\0\0@\0\0"..., 3584) = -1 EINVAL (Invalid argument)
write(2, "tar: ", 5)                    = 5
write(2, "test/1/__init__.cpython-36.pyc: "..., 44) = 44
write(2, ": Invalid argument", 18)      = 18

We know there was a request to open and examine a file, and to write, but the “invalid argument” is probably just another way of phrasing those other errors. Excerpt of the other errors which are probably the same error:

[   89.126223] tegra-se-nvhost 15820000.se: no free key slot
[   89.126340] crypt_scatterlist: Error setting key; rc = [-12]
[   89.126470] crypt_extent: Error attempting to crypt page with page_index = [0], extent_offset = [0]; rc = [-22]
[   89.126727] ecryptfs_encrypt_page: Error encrypting extent; rc = [-22]
[   89.126871] ecryptfs_write_end: Error encrypting page (upper index [0x0000000000000000])
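The rc values in those kernel messages look like negated Linux errno codes, which is the usual kernel convention, so they can be decoded with Python’s standard errno table. A minimal sketch (decode_rc is only an illustrative name):

```python
import errno
import os
import re

# Matches the "rc = [-12]" part of an ecryptfs kernel message.
RC_RE = re.compile(r"rc = \[(-?\d+)\]")


def decode_rc(line):
    """Return (symbolic name, description) for the rc value in a line.

    Returns None when the line carries no rc value.
    """
    m = RC_RE.search(line)
    if m is None:
        return None
    code = abs(int(m.group(1)))
    return errno.errorcode.get(code, "UNKNOWN"), os.strerror(code)
```

On Linux, -12 decodes to ENOMEM and -22 to EINVAL. The latter is the same EINVAL that tar gets back from write() in the strace excerpt.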

If you talk to the people who support the encryption software, then showing that same set of errors from the “crypt_” and “ecryptfs_” functions, along with the description of having large files work, but small files fail, will probably be something they’ve run into before. If they can pinpoint something more specific, then we can probably help figure out how to implement the fix on the Jetson. At this point though, all I know is that it doesn’t like to write in some circumstances, and that the encryption changes are in the middle of the reason it can’t write, but I have no clue as to why the encryption changes themselves are failing.
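When putting together a report for them, it may help to condense a long dmesg dump into counts per message source. A small sketch, assuming lines shaped like the ones quoted in this thread (timestamp in square brackets, then “source: message”); the function name is hypothetical:

```python
import re
from collections import Counter

# Strips a leading "[   89.126223] " style timestamp from a dmesg line.
TIMESTAMP_RE = re.compile(r"^\[\s*\d+\.\d+\]\s*")


def summarize_dmesg(text):
    """Count messages by source, i.e. the text before the first ': '."""
    counts = Counter()
    for line in text.splitlines():
        msg = TIMESTAMP_RE.sub("", line.strip())
        if not msg:
            continue
        source = msg.split(": ", 1)[0]
        counts[source] += 1
    return counts
```

Feeding it the saved output of “dmesg” gives one count each for sources such as “crypt_scatterlist” and “ecryptfs_write_end”, which is easier to quote than the raw log.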

Thank you so much for all this information!
On my side I chose to save some time and go for partial encryption, as the crypt module is now supported by default in the latest version of Jetpack, and it met the requirements for my project.
Like Andrey said, the only encrypted data is located on a separate disk.

Thank you again @linuxdev and @mdegans. As I have had no luck with ecryptfs, I also decided to move my home directory onto an external ssd and encrypt the disk.