When you write a program, you don’t flash it. It’s just a file to copy from one computer to another. Flashing is reserved for installing an operating system when there is no operating system. The ssh suite has the “scp” command to copy files from one machine to another (there are other possibilities, but this is simple, reliable, and fast).
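For example, copying a freshly built binary over to the Nano might look like this (the user name, host name, and destination path are only placeholders; substitute your own):
scp ./my_program someuser@jetson-nano.local:/home/someuser/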
As far as cross compiling CUDA code (and related topics) goes, someone else will need to answer. However, there is something related which is probably confusing: a GPU also has an architecture. For NVIDIA GPUs this is called “compute capability”. One can compile a CUDA program such that it applies to one or more GPUs, but that architecture is neither ARM (arm64/aarch64) nor Intel (amd64/x86_64). The Nano’s GPU is the Maxwell architecture. GPU code is generally compiled by the nvcc app, and compile options specify which compute capabilities to include (more than one can be specified, e.g., one for the Jetson plus one for the host PC). For the Nano it is compute capability 5.3, abbreviated as “sm_53”. Example:
nvcc -gencode arch=compute_53,code=sm_53 ...other content...
(you can have a list of architectures to support; the above is just for a Nano because it is the Maxwell architecture)
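As a hypothetical example of supporting more than one GPU at once, the following adds the Nano (sm_53) plus a Pascal desktop GPU (sm_61); the second architecture is only an assumption about what your host PC might have:
nvcc -gencode arch=compute_53,code=sm_53 -gencode arch=compute_61,code=sm_61 ...other content...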
Here is an example Makefile:
https://forums.developer.nvidia.com/t/tx2-gpu-obsolete-for-pytorch/158330/8
(note how it enables multiple architectures)
Basically, if you run a user space program, then the part which runs on the CPU must be for that CPU’s architecture, and the same is true for GPU: It is a specialty CPU with its own architecture, and one program can talk to another, or interchange data.
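To make that concrete, here is a minimal sketch of a CUDA program (the file and variable names are made up, and managed memory is just one way to share data). The main() portion is compiled for the CPU architecture (aarch64 on a Nano), while the __global__ kernel is compiled for the GPU’s compute capability (sm_53 for the Nano’s Maxwell GPU), and the two halves exchange data through memory:

#include <cstdio>
#include <cuda_runtime.h>

// GPU code: built for a compute capability, e.g. sm_53 on a Nano
__global__ void add_one(int *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        data[i] += 1;
}

int main()
{
    // CPU code: built for the CPU architecture (aarch64 on a Jetson)
    const int n = 256;
    int *data = nullptr;
    cudaMallocManaged(&data, n * sizeof(int)); // memory both CPU and GPU can touch
    for (int i = 0; i < n; ++i)
        data[i] = i;
    add_one<<<(n + 255) / 256, 256>>>(data, n); // CPU asks the GPU to run the kernel
    cudaDeviceSynchronize();                    // wait for the GPU to finish
    printf("data[0] = %d\n", data[0]);          // CPU reads the result back
    cudaFree(data);
    return 0;
}

Built with an nvcc line like the one above, both the CPU part and the GPU part end up inside the one executable.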
If you want to find software which matches your Nano, then look first at the L4T release on the Nano with “head -n 1 /etc/nv_tegra_release”. Then go to the listing of L4T releases here:
https://developer.nvidia.com/linux-tegra
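For reference, on a Nano running R32.7.2 the first line of that file looks roughly like the following (fields such as GCID and DATE vary by build and are omitted here):
# R32 (release), REVISION: 7.2, ... BOARD: t210ref, EABI: aarch64, ...
The “R32” plus “REVISION: 7.2” is what is being abbreviated as R32.7.2.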
If your release is R32.7.2 (the latest release a Nano works with), then you’d end up here:
https://developer.nvidia.com/embedded/linux-tegra-r3272
Documentation and downloads will work with that release, including the example code which JetPack/SDK Manager can install on the host PC, along with nvcc. Note that if you run another release of SDKM, then you can start it like this to see all available releases compatible with SDKM:
sdkmanager --archivedversions
The R32.7.2 “L4T Driver Package (BSP) Sources” has many files in it (you can use the command line to view this, or the Ubuntu “ark” program). One such package is “nvsample_cudaprocess_src.tbz2”. Or you can use sdkmanager to directly install the examples on the host PC (you can uncheck installation of anything to the Jetson and just pick the host PC, but if you choose to also install to the Jetson, then you can be sure a number of important items are present for developing on the Jetson and cross compiling on the host PC).
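If you go the command line route, something like this would do it (the outer archive name public_sources.tbz2 and the exact paths inside it are assumptions; adjust to whatever the R32.7.2 “BSP Sources” download is actually named):
tar tjf public_sources.tbz2                 # list what is inside the BSP sources
tar xjf public_sources.tbz2                 # unpack; nvsample_cudaprocess_src.tbz2 is somewhere in the extracted tree
tar xjf nvsample_cudaprocess_src.tbz2       # then unpack the CUDA sample itself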
So you have some extra challenges there: cross compiling the CPU code for the Jetson, cross linking to the Jetson (preferably linking against a clone of the Jetson mounted locally on the PC), and building the GPU code for the right compute capability.
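As a very rough sketch of what a cross compile line can look like (everything here is an assumption: the clone’s mount point, the file names, and that the aarch64-linux-gnu-g++ cross compiler plus the aarch64 CUDA cross support from SDK Manager are installed on the host PC):
nvcc -ccbin aarch64-linux-gnu-g++ \
     -gencode arch=compute_53,code=sm_53 \
     -Xcompiler --sysroot=/mnt/jetson_clone \
     my_program.cu -o my_program
The idea is that the host side of the build uses the ARM cross compiler, the sysroot points linking at the clone’s copies of the Jetson’s libraries, and the -gencode picks the Nano’s compute capability.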
Someone else can probably give you a better way to start on this than I can.