Request for Guidance on Software Transition from Qualcomm QNX to NVIDIA Jetson AGX Orin

Hello, I am a master’s student currently engaged in a thesis project that involves transitioning critical software developed initially for the Qualcomm platform running QNX OS, to the NVIDIA Jetson AGX Orin platform, which utilizes Linux for Tegra.

Given the substantial architectural differences between the two platforms and their operating systems, I anticipate encountering significant challenges in the cross-platform compilation and optimization processes. In light of these complexities, I would greatly appreciate your insights and support on the following:

  1. Source Code Compatibility:
    Considering the software was originally developed for QNX on Qualcomm, could you confirm if this would be directly compatible with the NVIDIA Jetson AGX Orin, or are substantial modifications expected? What are the common challenges you foresee in such transitions?
  2. Cross-Platform Compilation:
    Does NVIDIA provide specific tools or support for cross-compiling software developed for architectures like Qualcomm’s ARM-based systems? If so, could you detail these tools and their usage?
  3. Compiler and Toolchain Recommendations:
    Could you recommend specific compilers and toolchains that are optimal for developing and compiling applications on the NVIDIA Jetson AGX Orin, especially those that were initially designed for different platforms such as Qualcomm with QNX?
  4. Technical Support and Resources:
    What kind of technical support and resources does NVIDIA offer to developers undertaking such migration projects? Are there forums, documentation, or direct support channels that you would recommend?

I can’t provide much on this, I’ve never used QNX. But let me ask, what language is this written in? If it is C or C++, or perhaps even something interpreted (e.g., Python 2 or 3), then you might be able to get more answers.

Also, is this using linked libraries? If the libraries don’t provide the POSIX content in common with all of the *NIX platforms, then you might be stuck coding for that as well.

Lastly, is this all a command line program? Does it use a GUI? Does it use OpenGL or OpenGLES? Does it use audio? Does it talk to cameras? What is the type of I/O this program talks to?

Hello linuxdev,

Thank you for your attention.

Programming Language:

The software in question is primarily written in C++.

Use of Libraries and Compatibility:

Our current software heavily relies on libraries that comply with POSIX standards, which generally ensures a good level of compatibility across UNIX-like systems. However, during this transition, we aim to carefully evaluate the libraries that may not be fully compatible with the Jetson platform. This assessment will guide us in finding or creating suitable replacements to ensure functionality.

Build System and Toolchains:

CMake is the primary tool we use for managing our build process, ensuring that the appropriate compilation flags and configurations are set for different environments. This ensures our code can be compiled and linked correctly for the target platform, particularly for cross-compiling to ARM architecture, which Jetson AGX Orin uses.

Application Type and Graphics:

Our application comprises both command-line and graphical user interface components. We use OpenGL for rendering, which is critical for the visualization tasks it performs. We are also evaluating how NVIDIA-specific GPU acceleration (CUDA) can further enhance these graphics operations.

I/O Operations:

The software interacts with various input and output systems, including cameras for image processing and audio processing components. Part of our current work involves determining how these I/O operations can be adapted for optimal performance on the Jetson platform.

C++ is available and quite common under Linux. Are you targeting a particular C++ standard? For example, C++11. Many of the compilers (mainly gcc/g++) support multiple standards, and a compile option selects the one you want.

List the libraries. Then list the functions linked against. Each entry on that list can then be checked off against the Linux libraries that provide the same functions. If something is not covered, then you’d have to write your own, or port the library. Being POSIX, chances are that a lot of the requirements will already be available in Linux, but perhaps file names will differ.
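A rough sketch of how that list could be started, assuming the existing binary is ELF and that binutils (readelf) is installed on the host. readelf only parses ELF headers, so it can usually inspect a binary built for another architecture. The function names and the example path are my own placeholders:

```shell
# Print the shared libraries an ELF binary declares as NEEDED.
# parse_needed reads `readelf -d` output on stdin.
parse_needed() {
    sed -n 's/.*(NEEDED).*\[\(.*\)].*/\1/p'
}

# Run readelf on a file and print its NEEDED libraries, one per line.
list_needed_libs() {
    # $1: path to an ELF executable or shared object
    readelf -d "$1" | parse_needed
}

# Example (hypothetical path):
#   list_needed_libs ./build/my_app
# Undefined symbols the binary expects its libraries to supply:
#   readelf --dyn-syms -W ./build/my_app | grep ' UND '
```

Each library name printed can then be looked up on the Jetson to see whether a package provides it.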

Note that some specialty instructions on x86_64/amd64 won’t exist on 64-bit ARM, e.g., SIMD extensions such as MMX or SSE, but ARM has a similar capability in NEON. Many programs don’t use those specialty instructions, so there is a good chance you won’t see these. If you do, then you have some porting ahead of you.
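One quick way to find out whether the code base touches such instructions is to search for the intrinsic headers. This is only a sketch: the header list below (the common x86 MMX/SSE/AVX headers plus ARM's arm_neon.h) is my assumption of what is worth looking for, and hand-written inline assembly would not be caught:

```shell
# List source files that include x86 or ARM SIMD intrinsic headers.
find_simd_sources() {
    # $1: root of the source tree
    grep -rlE '#[[:space:]]*include[[:space:]]*<(mmintrin|xmmintrin|emmintrin|immintrin|arm_neon)\.h>' "$1"
}

# Example (hypothetical directory):
#   find_simd_sources ./src
```

No output means none of those headers are included anywhere under the directory.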

CMake is commonly used in Linux. Configuring probably will be related to the above comment on listing libraries.

Instead of cross compiling, at least at first, I suggest you set up an Orin with an external disk (for space and for moving between desktop and Orin). This wouldn’t mean a disk you have to set up for booting to, it could just be something mounted in a temp location to put your code in. Then add libraries and other content to the Jetson itself until you get the compile. Once that is done, you could clone the rootfs image of the Jetson and loopback mount it on your host PC. What this would give you is the sysroot with development headers and libraries that is portable among Linux desktop PCs when cross compiling. It also gives you a backup. The actual program being worked on would not need to be part of the clone since you would be using something like a USB external disk or similar while figuring out what libraries are needed.
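Once a clone is loopback mounted, a CMake toolchain file is one common way to point a cross build at it. The sketch below just writes such a file. It assumes the Ubuntu cross compiler packages (gcc-aarch64-linux-gnu and g++-aarch64-linux-gnu) are installed on the host; the function name, image name, and mount point are placeholders, not anything NVIDIA ships:

```shell
# Write a CMake toolchain file that points cross builds at a Jetson sysroot.
write_toolchain_file() {
    # $1: sysroot directory (e.g. the loopback-mounted clone), $2: output file
    cat > "$2" <<EOF
set(CMAKE_SYSTEM_NAME Linux)
set(CMAKE_SYSTEM_PROCESSOR aarch64)
set(CMAKE_C_COMPILER aarch64-linux-gnu-gcc)
set(CMAKE_CXX_COMPILER aarch64-linux-gnu-g++)
set(CMAKE_SYSROOT "$1")
# Search the sysroot for headers and libraries, the host for build tools.
set(CMAKE_FIND_ROOT_PATH_MODE_PROGRAM NEVER)
set(CMAKE_FIND_ROOT_PATH_MODE_LIBRARY ONLY)
set(CMAKE_FIND_ROOT_PATH_MODE_INCLUDE ONLY)
EOF
}

# Hypothetical usage; the image name and mount point are placeholders:
#   sudo mount -o loop,ro /path/to/jetson-clone.img.raw /mnt/jetson-sysroot
#   write_toolchain_file /mnt/jetson-sysroot jetson-toolchain.cmake
#   cmake -S . -B build-jetson -DCMAKE_TOOLCHAIN_FILE=jetson-toolchain.cmake
```

Mounting the clone read-only keeps the backup intact while it doubles as the sysroot.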

Regarding graphics, I suggest on the Linux side you run this command on the Jetson to add a package:
sudo apt-get install mesa-utils

This adds the command “glxinfo”. Running glxinfo from the GUI will tell you a lot about the OpenGL abilities of the Jetson. On anything with an NVIDIA GPU you can usually do this to know if it is using the NVIDIA hardware acceleration:
glxinfo | egrep -i '(nvidia|version)'

I do not know if there is something similar on QNX or not. You might find yourself limited to OpenGLES.
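If you want the glxinfo check in a script, something like the following could work. It relies only on the "OpenGL vendor string" and "OpenGL renderer string" lines that glxinfo prints; the function names are my own:

```shell
# Extract the OpenGL vendor and renderer lines from glxinfo output on stdin.
gl_vendor_renderer() {
    grep -E '^OpenGL (vendor|renderer) string:'
}

# Succeed only if the vendor string names NVIDIA.
gl_is_nvidia() {
    gl_vendor_renderer | grep -qi '^OpenGL vendor string:.*nvidia'
}

# On the Jetson, from a terminal inside the GUI session:
#   glxinfo | gl_is_nvidia && echo "NVIDIA driver in use" || echo "not NVIDIA (software rendering?)"
```

A vendor of Mesa with an llvmpipe renderer would suggest software rendering rather than the GPU.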

For terms in what follows, L4T is what actually gets flashed to a Jetson, and this is in turn Ubuntu with NVIDIA drivers. JetPack/SDK Manager is just a GUI front end to the flash software, but you will find a given release of JetPack is tied to a given release of L4T (but JetPack lets you use alternate releases). To see your L4T release:
head -n 1 /etc/nv_tegra_release
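For build scripts it can be handy to reduce that line to a plain version number. This is a sketch that assumes the first line has the form "# R35 (release), REVISION: 4.1, ..." as I have seen it on Jetsons; if your release formats it differently the function prints nothing:

```shell
# Print "MAJOR.REVISION" (for example 35.4.1) from the first line of an
# nv_tegra_release file. Assumes the first line looks like:
#   # R35 (release), REVISION: 4.1, GCID: ..., BOARD: ..., EABI: aarch64, DATE: ...
l4t_version() {
    # $1: path to the release file (defaults to /etc/nv_tegra_release)
    sed -n '1s/^# R\([0-9][0-9]*\) (release), REVISION: \([0-9.][0-9.]*\).*/\1.\2/p' "${1:-/etc/nv_tegra_release}"
}

# Example, run on the Jetson:
#   l4t_version
```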

Regarding GPU acceleration via CUDA, keep in mind that most CUDA applications expect a PCI-based discrete GPU (dGPU) with its own memory (VRAM). The method of detection quite often involves a PCI bus query. In Linux, this is usually via the nvidia-smi application.

In L4T R35.x and earlier you cannot use nvidia-smi for this purpose because the GPU is integrated (iGPU) and attached directly to the memory controller rather than to a PCI bus. You also cannot use a CUDA installation obtained separately from JetPack, even if it is built for 64-bit ARM. You would need to check the docs. Go to your specific L4T release here:
https://developer.nvidia.com/linux-tegra

You will also find some sample code. Keep in mind that JetPack running on a separate host PC can have flash deselected. This isn’t always obvious, but when installing that software, one does not flash. You would not put the Jetson in recovery mode. You would fully boot the Jetson, connect the network for ssh access (either via the virtual USB wired networking or actual ethernet would work), and deselect the flash, but keep installation of things like sample software and CUDA enabled so it would install via ssh to your selected login account. Check out the sample CUDA code and how it configures for the GPU without nvidia-smi.

L4T R36.x changes a number of things. The first stable release of R36.x was just released, and this might be desirable (or it might not be). One thing this does is switch to a more recent mainline kernel, but the GPU is still an iGPU. There is a partial “surrogate” nvidia-smi which I have not used, but it should have some partial ability to provide some of the information that this function would have for a dGPU. I don’t know the details, but you are advised to compare it to what you get from a desktop Linux system with the NVIDIA drivers. This includes configuring and running a sample CUDA application.

I think cameras and audio will be much the same, but you will perhaps run into device tree (firmware) edits that would not be needed on a desktop PC. Some devices, e.g., USB, are “plug-n-play”, and will be no different on the Jetson compared to a desktop PC. Other devices cannot self-describe, and those interfaces will likely require a device tree edit for the interface.

Hello Linuxdev,

Thank you very much for your detailed guidance. Your insights are very helpful as I set up my development environment for the NVIDIA Jetson AGX Orin platform. Here are some further details and a couple of additional queries:

  1. C++ Standards and GCC Version:

    • My Ubuntu system is using GCC 9.4.0. Regarding the C++ standard, I plan to use C++14 as it ensures compatibility across all platforms I’m working with. You mentioned using external disks for building, which seems like an efficient way to manage different build environments. Could you suggest any specific filesystems or configurations that optimize performance for building on these external storage solutions?
  2. Libraries and POSIX Compliance:

    • My project heavily relies on POSIX-compliant features, especially for threading and I/O operations. Are there specific POSIX features or extensions that are particularly well-suited to NVIDIA’s architecture or that I should be mindful of in a Linux environment?
  3. Cross-Platform Development:

    • As I transition from an ARM-based QNX system to ARM-based L4T on Jetson, are there specific compiler flags or optimizations you recommend for ensuring optimal performance? Also, is there a way to streamline the build process across different platforms to maintain consistency in the development process?
  4. GPU Acceleration and Tools:

    • Regarding GPU acceleration, I am particularly interested in leveraging CUDA for data processing tasks. Do you have any recommendations for CUDA optimization strategies or tools that could enhance the efficiency of compute-intensive operations?
  5. Networking and I/O:

    • Lastly, my project involves significant data transfer between devices. Do you have any tips for optimizing network configurations on Jetson devices to minimize latency or enhance throughput?

I appreciate your support and look forward to any additional insights or resources you might recommend to help with these aspects of my project.

  1. You won’t have any issues with gcc version and standard on the Jetson. External disks avoid placing large amounts of content on the local disk, and also allow plugging in to other devices. Not mandatory, but it is quite useful. You would simply use ext4, and hopefully your user on each system has the same UID and GID (not mandatory, but it makes life easier).

If you examine any “/etc/passwd” file (the password is not in that file; it is public information), you will see each user has both a numeric user ID and a numeric group ID associated with that account. When you create a new account you can specify the numeric IDs. The group is usually chosen (for ordinary login accounts) to have the same name as the user, and you might need to create that group before creating that user. Note that this is optional, but it greatly aids simply unplugging and replugging storage (after a proper umount) such that both systems see the same person. If one system says a certain numeric ID is user “abc”, and the same ID on another system is for user “xyz”, then each system will show its own name for that ID, because the filesystem stores only the number and not the name. This is rarely a problem, but when I set up systems I specifically create base users (not system users) with specific UID and GID values so I can directly scp and rsync across systems without needing to translate IDs. Really, the external disk only cares about being ext4. IDs are just icing on the cake. Check the IDs of your developer account on the Jetson and on desktop Linux to see if they differ. If they do differ, see if some other account owns that ID. If it is not owned by anyone else, ls or similar will simply label the file ownership with the numeric UID.
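A small sketch for comparing the IDs across the two systems. It reads passwd-format text, so it works on the output of getent or on a copy of /etc/passwd. The login name "dev" and the host name "jetson" are placeholders:

```shell
# Print "UID:GID" for a login name from passwd-format text on stdin.
uid_gid_of() {
    # $1: login name
    awk -F: -v user="$1" '$1 == user { print $3 ":" $4 }'
}

# Hypothetical usage ("dev" and "jetson" are placeholders):
#   local_ids=$(getent passwd dev | uid_gid_of dev)
#   remote_ids=$(ssh dev@jetson getent passwd dev | uid_gid_of dev)
#   [ "$local_ids" = "$remote_ids" ] && echo "IDs match" || echo "IDs differ"
```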

  2. POSIX threading (pthreads) is widely available in Linux. You might need to tweak something, but mostly you might just need to find the lib required, and add the user space library. Even that might not be needed because of how prevalent it is. I can’t speak for CUDA kernels, but this is specific to NVIDIA, and to the release used. Jetsons tend to use only one release of CUDA, and not those which are released separately like on a desktop PC. However, Orin usually has more than one release available via Docker. You’d want to look up your L4T release via “head -n 1 /etc/nv_tegra_release”, and then check for CUDA releases. If you have a compatible release, then only detection of the GPU would differ. See the CUDA sample code you can get from JetPack/SDK Manager (sample code is available for both host PC and Jetson via that method; just uncheck “flash” and leave checked what you want to install; the Jetson would be fully booted, not in recovery mode, and networking would be used for that install).

  3. About the only flag I know of right now that might be useful is the one that selects your chosen C++14 standard:
    -std=c++14
    (there is also a GNU-extension variant, -std=gnu++14; see “man g++” or “man gcc” on the host PC)
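To confirm what a given compiler actually does with that flag, you can ask it for the __cplusplus macro. This is a sketch; the function name is mine, and the cross compiler name in the comment assumes the Ubuntu g++-aarch64-linux-gnu package:

```shell
# Print the value of __cplusplus a compiler reports for a given -std option.
cxx_standard_macro() {
    # $1: compiler command (e.g. g++), $2: standard (e.g. c++14)
    "$1" -std="$2" -dM -E -x c++ /dev/null | awk '$2 == "__cplusplus" { print $3 }'
}

# Example:
#   cxx_standard_macro g++ c++14
# Cross compiler, if installed:
#   cxx_standard_macro aarch64-linux-gnu-g++ c++14
```

Running it with the same arguments on the host and on the Jetson is a cheap way to check that both toolchains agree on the standard.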

I do not know of any particular method to streamline across platforms. Mostly you’d want to have a source control system, e.g., git or svn (“subversion”). Whatever you prefer. The name of a library to link to might differ, but just give it a try and see what shows up.

  4. @dusty_nv would be the one to ask for documentation for CUDA optimizations that start on QNX and end up on a Jetson (he’s the local AI guru!). Do beware that the iGPU of a Jetson uses system RAM, and does not have dedicated VRAM like a PC has, so you might have different RAM restrictions if your models are large enough.

  5. About the only simple recommendation is to use wired ethernet on a switch which in turn connects to a router. Anything Wi-Fi will be painful IMHO. On your main Linux development PC create an ssh key for your user. Export the public key to each remote Linux system (including the Jetson) so you can simply ssh or scp or sftp securely without passwords. If you need to do admin tasks, then you may want to unlock the root account of the Jetson for network access, but not local access (locking root is an "Ubuntu thing"™, and is not a Jetson topic per se); you would then do the same key export, and after that lock root from network access by any means other than key.
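A minimal sketch of the key setup, assuming OpenSSH on both ends. The login "dev" and host name "jetson" are placeholders, and this version creates the key with an empty passphrase; a passphrase together with ssh-agent is the more careful alternative:

```shell
# Create an ed25519 key pair for the developer account if none exists yet.
make_dev_key() {
    # $1: key file path (e.g. ~/.ssh/id_ed25519); no passphrase in this sketch
    [ -f "$1" ] || ssh-keygen -q -t ed25519 -N '' -f "$1"
}

# Hypothetical usage ("dev" and "jetson" are placeholders):
#   make_dev_key "$HOME/.ssh/id_ed25519"
#   ssh-copy-id -i "$HOME/.ssh/id_ed25519.pub" dev@jetson
#   ssh dev@jetson head -n 1 /etc/nv_tegra_release   # should not prompt for a password
# For key-only root logins over the network, sshd_config has the option:
#   PermitRootLogin prohibit-password
```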


Hello Linuxdev,
Thank you very much for your valuable and detailed answer, and sorry for the late reply.
Lastly I would like to ask you:

Connectivity and Remote Access:
1) Is it recommended to keep the Jetson AGX Orin connected to the network via Ethernet for the entire development process?
2) Can I perform remote development over SSH after the initial setup and flashing are complete?
3) Are there any specific network configurations or settings that I should be aware of for optimal performance?
Development Environment:
4) Are there any best practices for setting up a Docker environment for cross-compiling and development on Jetson AGX Orin?
5) Can you provide any tips for optimizing the build process and managing dependencies on the Jetson platform?
Performance Optimization:

6) What tools and methods are recommended for profiling and optimizing GPU-accelerated applications on Jetson AGX Orin?
7) Are there any specific considerations for optimizing radar processing algorithms on the Jetson platform?
8) I plan to use a Docker container for a consistent build environment. Could you provide any recommendations or best practices for setting up a Docker environment specifically for cross-compiling and development on Jetson AGX Orin?

9) I am adapting radar processing algorithms to leverage NVIDIA GPUs. Do you have any specific advice on optimizing these algorithms for Jetson AGX Orin, particularly using CUDA and TensorRT?

I added a second NIC to my host PC and a router with restricted access. I have a large number of embedded devices on that network. This is all wired. I highly recommend that, since there are a number of things you can keep constant and you will have admin access to the router. It is easy to set up the host with ssh keys for your developer account and export the public key to each embedded system. It makes life so much easier. Being on a separate LAN removes a lot of security issues and a lot of issues with routers you cannot control (like your ISP’s router). This isn’t mandatory, but it is so very convenient (especially for lots of developers, but even for one developer this is a "good thing"™).

Mostly you can indeed run remote development. You have to understand, though, that if your application is a GUI application and you are going to run it, you should not forward its display to the desktop PC without knowing that the Jetson is then no longer the one performing a number of functions. Less obvious is that if you forward to the host PC via ssh, then there are a number of CUDA based applications which will also offload to the host PC (by forwarding I’m thinking of “ssh -Y” or “ssh -X”; plain “ssh” won’t forward). All text-based functions can be forwarded to the host PC. I regularly use scp and/or sftp in addition to ssh. Once set up, many tasks have no need for a monitor.
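If you want a launch script to warn you about this, a heuristic like the following could help. It assumes the usual ssh behavior of setting DISPLAY to something like localhost:10.0 when X forwarding is active, which is not guaranteed on every setup; the function name is mine:

```shell
# Heuristic: report whether a DISPLAY value looks like an ssh-forwarded X display.
is_forwarded_display() {
    # $1: value of DISPLAY; ssh forwarding typically sets something like localhost:10.0
    case "$1" in
        localhost:*|127.0.0.1:*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example: warn before launching the GUI from an "ssh -X" or "ssh -Y" session.
#   is_forwarded_display "$DISPLAY" && echo "warning: display is forwarded to the host PC"
```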

For network configuration I require my router to assign an IP address only to known MAC addresses. Then I assign a specific IP address to that MAC address. My /etc/hosts file does not need a complication for DNS from the private router because the same system always gets the same address.

Incidentally, I always create my developer UID and GID on the Jetsons to be the same as on my developer account of the host PC. If you are using the first admin account of your host PC, then that would default to UID and GID of 1000. The same is true of the first user account created on the Jetson. One can manually set this though. It is possible that if you are in a group environment that you might want all developers to have the same GID, but different UIDs, and put multiple login accounts in on the Jetsons which share that GID. Every user in a group environment on separate systems could still all have the same UID and GID and it would be convenient, but it depends on policies (for example, if everyone has their own host PC, this is never a problem, but if a single central host PC server is used with multiple people logging in, then you will likely use a different login name/pass for every developer rather than having everyone use the same account).

You would have to ask others about a Docker environment; I don’t have the knowledge to answer. @dusty_nv is the local expert on Docker with CUDA and AI.

I think dependencies are a bit of an open book depending on situation. What I can recommend is that once things are working the way you want that you save a copy of the “/proc/config.gz” and the outputs of:

  • uname -r
  • head -n 1 /etc/nv_tegra_release
  • cat /etc/nv_boot_control.conf

Should you need to recreate something, then that provides a lot of definitions even if the kernel is customized. You should also set up either a clone process for a backup, or else an rsync over ssh process (this is fairly easy, another thread could be started if you want that information).
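The items listed above could be gathered in one go with something like this sketch. The function name and the snapshot directory layout are my own; files that do not exist on a given system are simply skipped:

```shell
# Save the kernel config and release information into a dated directory.
snapshot_system() {
    # $1: parent directory for snapshots; prints the directory it created
    out="$1/snapshot-$(date +%Y%m%d-%H%M%S)"
    mkdir -p "$out" || return 1
    uname -r > "$out/uname-r.txt"
    # These files exist on a Jetson; skip quietly elsewhere.
    for f in /proc/config.gz /etc/nv_tegra_release /etc/nv_boot_control.conf; do
        if [ -r "$f" ]; then
            cp "$f" "$out/"
        fi
    done
    echo "$out"
}

# Example, run on the Jetson (hypothetical target on the external disk):
#   snapshot_system /mnt/external/jetson-snapshots
```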

There are tools for optimizing CUDA code, but I’m the wrong person to ask. Some of the developers here can help, but once again, I think @dusty_nv is the expert on that topic. This includes specific cases like depth perception and RADAR or LIDAR.


Thank you very much.