Jetson TX2 + armhf

jk.t · September 28, 2020, 12:59pm

Does the Jetson TX2 support armhf? I read somewhere in the forum that it is compatible with armhf, but that armhf and arm64 software cannot run at the same time, and that the Jetson needs to be put into an armhf mode.
Is that correct? Kindly advise.

linuxdev · September 28, 2020, 7:09pm

You probably will be interested in what the TX2 architecture actually is, along with what armhf is.

In the 32-bit days this was all an ARMv7-a architecture. That architecture made several components optional, such as NEON, and more notably, floating point hardware. When the CPU had the optional floating point hardware the software used had to include a calling convention for the floating point, and this combination of hardware and software conventions is known as ARM hard float, or armhf.
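As a rough illustration of how the hard-float calling convention shows up in practice, here is a minimal Python sketch that inspects an ELF header and reports whether a binary is marked as armhf. The function name `classify_elf` and the returned labels are made up for this example. The constants are the standard ELF values as I understand them, and the sketch only handles little-endian files.

```python
import struct

EM_ARM = 40        # e_machine value for 32-bit ARM
EM_AARCH64 = 183   # e_machine value for 64-bit ARM
EF_ARM_ABI_FLOAT_HARD = 0x400  # e_flags bit set on EABI v5 hard-float binaries


def classify_elf(header: bytes) -> str:
    """Classify the first 52 or more bytes of a little-endian ELF file."""
    if len(header) < 52 or header[:4] != b"\x7fELF":
        return "not-elf"
    if header[5] != 1:  # EI_DATA: 1 means little-endian
        return "unsupported-endianness"
    is_64 = header[4] == 2  # EI_CLASS: 1 means 32-bit, 2 means 64-bit
    (machine,) = struct.unpack_from("<H", header, 18)
    # e_flags sits at offset 36 in an ELF32 header and 48 in an ELF64 header.
    (flags,) = struct.unpack_from("<I", header, 48 if is_64 else 36)
    if is_64 and machine == EM_AARCH64:
        return "arm64"
    if not is_64 and machine == EM_ARM:
        if flags & EF_ARM_ABI_FLOAT_HARD:
            return "armhf"
        return "arm-not-hardfloat"
    return "other"


if __name__ == "__main__":
    import sys
    for path in sys.argv[1:]:
        with open(path, "rb") as f:
            print(path, classify_elf(f.read(64)))
```

Running it against a binary you intend to deploy gives a quick hint as to whether it expects the armhf environment or the native 64-bit one.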

The 64-bit TX2 is an ARMv8-a architecture. When used directly as ARMv8-a, there is no support for 32-bit ARMv7-a, including no support for armhf. However, ARMv8-a has a low-performance 32-bit extension, referred to here without the “-a” as “ARMv8”. ARMv8 32-bit mode is guaranteed to support all of the hardware that was optional in ARMv7-a. In 32-bit ARMv8 compatibility mode there is no need for compiler flags to specify NEON, for example, and any attempt to supply such a switch at compile time is an error…not because the feature does not exist, but because it is not optional.

The 32-bit ARMv8 extension is a special mode. When in this mode the CPU recognizes and can use all instructions for any ARMv7-a, and includes the ability to use all optional versions of the ARMv7-a, such as NEON and hardware floating point. ARMv8 is a superset of ARMv7-a, and thus all ARMv7-a can run on a core in ARMv8 compatibility mode, and although most ARMv8 can run on an ARMv7-a with all optional hardware present, there is a small subset of ARMv8 which will not run on ARMv7-a.

The 32-bit compatibility mode of ARMv8 is low performance, and I would avoid this if you have any possibility of doing so. However, the CPU in this compatibility mode will do what you ask (armhf is supported in this mode). None of the 64-bit instruction set is accepted in ARMv8 mode (an example illustrating a case of an ARMv8 instruction not supported by ARMv7-a is the instruction to switch back to 64-bit mode…ARMv7-a has no such instruction, it is purely 32-bit).

The software support effort will be significant if the software runs in user space. When 64-bit was first introduced there were not yet any 64-bit user space applications, so the first experimental L4T release on 64-bit ran the kernel in 64-bit mode but ran all of user space in 32-bit mode. Performance was rather poor, and not long after, once 64-bit user space applications had been ported, user space itself moved to 64-bit. 32-bit was not kept in user space.

Are you dealing with armhf in kernel space? If so, then life is much easier than if you are dealing with armhf in user space. User space requires an entire set of support libraries, along with the linker tools. On anything but the very first test releases (which have no 64-bit user space support) you would have to install 32-bit linker tools and 32-bit libraries side-by-side with the 64-bit content. 32-bit would be a foreign architecture, but with the right user space tools, the kernel could understand armhf.
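To make the "support libraries and linker tools" requirement a bit more concrete, here is a small Python sketch that checks whether the dynamic loader for each architecture is present under a given root filesystem. The loader paths are the conventional glibc ones on Debian/Ubuntu-style systems, which is an assumption about your particular image. The helper name `installed_loaders` is made up for this example.

```python
from pathlib import Path

# Conventional glibc dynamic loader locations (assumed, Debian/Ubuntu style).
LOADERS = {
    "armhf": "lib/ld-linux-armhf.so.3",
    "arm64": "lib/ld-linux-aarch64.so.1",
}


def installed_loaders(root: str = "/") -> dict:
    """Report which architecture loaders exist under the given root."""
    base = Path(root)
    return {abi: (base / rel).exists() for abi, rel in LOADERS.items()}


if __name__ == "__main__":
    for abi, present in installed_loaders().items():
        print(f"{abi}: {'present' if present else 'missing'}")
```

A missing armhf loader is only the first hurdle. Every shared library the 32-bit program links against would also need a 32-bit copy.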

To emphasize, when in 32-bit mode, you are correct that armhf can operate, but no 64-bit will work. You are in for a world of painful support if you want to support 32-bit user space, but if you choose to do so, then an earlier L4T release will be a nice guide. The L4T versions are listed here (you might need to log in, and then go there a second time if forwarding does not work):
https://developer.nvidia.com/embedded/linux-tegra-archive

You would examine the R23.x series. I think R24.1 was the transition point, and this might have mixed some 32-bit and 64-bit in user space. All of those releases were 64-bit kernels, but 32-bit mode would still be used even though much of the kernel was 64-bit mode. There were lots of bugs in those days.

FYI, PCs are also intended to operate normally in 64-bit mode, and 32-bit is considered a “foreign architecture”. However, when 64-bit PCs came out, it was well known that 64-bit would gain very little acceptance if 32-bit did not “just work”. Every operating system newly adapted to 64-bit installed 32-bit linkers and 32-bit support libraries alongside the 64-bit ones by default. The embedded world had no such expectation of a widespread need for 32-bit compatibility, and the only such support you will see was designed to transition completely to 64-bit in a short time.

An interesting complication of compiling the hybrid support in the R23.x and early R24.x series was that in some cases the kernel compile itself required both a 32-bit compiler and a 64-bit compiler (if you compile such a kernel and see a vdso error, then it probably means you forgot the 32-bit compiler). In the ARM world those compilers are always separate tools, but on a PC it was known from the start that both 32-bit and 64-bit would be needed, so the tools were set up to support this. For example, you will find that a single compiler front end is set up for both i386 and x86_64/amd64.
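Before starting such a kernel build, it can help to confirm that both compilers are actually reachable. The Python sketch below looks for the two cross compilers on the PATH. The executable names are the usual Debian/Ubuntu cross toolchain names and are an assumption here; a Linaro or vendor toolchain may use different prefixes, and the helper name `missing_compilers` is made up for this example.

```python
import shutil

# Assumed Debian/Ubuntu cross-compiler names; adjust for your toolchain.
COMPILERS = {
    "64-bit": "aarch64-linux-gnu-gcc",
    "32-bit": "arm-linux-gnueabihf-gcc",
}


def missing_compilers(which=shutil.which) -> list:
    """Return the labels of compilers that cannot be found on the PATH."""
    return [label for label, exe in COMPILERS.items() if which(exe) is None]


if __name__ == "__main__":
    missing = missing_compilers()
    if missing:
        print("Missing compilers:", ", ".join(missing))
    else:
        print("Both 32-bit and 64-bit cross compilers were found.")
```

The `which` parameter exists only so the lookup can be swapped out in a test. In normal use the default `shutil.which` is what you want.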

jk.t · September 29, 2020, 5:11am

Hi linuxdev,
thanks for the details.
Just to make it clear, the Jetson TX2 supports armhf in the ARMv8 low-performance mode.
But I need to try with the R23.x version of L4T.
Even then, if the armhf code is in kernel space the job is much easier; otherwise, in user space, the supporting libraries and linker tools are needed.
Is that correct?

linuxdev · September 29, 2020, 4:40pm

Due to ARMv8 mode being able to execute armhf, the CPU cores do support armhf. The core must be in that mode, which excludes 64-bit code.
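If you want to see which mode a given process believes it is running in, the machine string reported by the kernel is a reasonable first hint. The Python sketch below maps that string to a width. It assumes the usual Linux naming, where a native 64-bit process sees `aarch64` and a 32-bit process typically sees something like `armv7l` or `armv8l`. The function name `execution_width` is made up for this example, and the result describes the calling process, not what the CPU is capable of.

```python
import platform


def execution_width(machine: str) -> str:
    """Map a uname-style machine string to the ARM execution width."""
    m = machine.lower()
    if m.startswith("aarch64") or m == "arm64":
        return "64-bit"
    if m.startswith("armv"):
        return "32-bit"
    return "unknown"


if __name__ == "__main__":
    machine = platform.machine()
    print(machine, "->", execution_width(machine))
```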

I’m not sure when, but I think even R24.1 still had 64-bit kernel mode and 32-bit user space. You’d have to research the docs more closely, or experiment to find the most recent release which used 32-bit armhf in user space. The early bugs could be rather serious, and so you will probably find it worth your sanity to get the most recent release with 32-bit user mode if you go that route. Your idea on using docker might be a better choice.

Explaining user space is simple enough, and all code there which you are interested in is armhf, and that also requires the 32-bit user space libraries (such as glibc) and linker environment.

Kernel space having 64-bit is for drivers or features using 64-bit (this is almost everything even if you plan to use 32-bit user space). A kernel function or driver has the option of using an interrupt such that it transitions to the ARMv8 32-bit compatibility mode which can execute armhf. This has to be done intentionally, but does not require any libraries or special tools. You might find that building this kernel requires naming both a 64-bit and 32-bit compiler (especially if you see a “vdso” error), but the actual support for kernel space being 64-bit while dealing with 32-bit user space code is much simpler than building an entire world for 32-bit user space if user space is to also support 64-bit for simultaneous operation.

If you can get docker to work, then I highly recommend using that along with a very recent L4T release. Do keep in mind that you might not be able to use armhf with CUDA or many hardware accelerated GPU functions. If that is needed, then it is possible you could actually require one of the older releases, e.g., R24.1. In that case you will be required to use a “highly” out of date CUDA release.

I do not use docker and thus cannot help much with it, but a lot of people here are quite good with docker. You could probably get help with a 32-bit ARMv8 docker environment.

NOTE: Do be careful to find out if your armhf code needs hardware accelerated CUDA. I doubt you could do this with any of the docker environments on a system running a newer release (the driver supports 64-bit, not 32-bit).

jk.t · September 30, 2020, 5:06am

Thanks Linuxdev.