I have already started down this road because I took an NVIDIA course, "Scaling GPU-Accelerated Applications with the C++ Standard Library", and would like to do the same on the Jetson. The course compiles with the NVC++ compiler using the -stdpar=gpu option, which doesn't seem to be available in the NVCC compiler.
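To make the goal concrete, this is the style of code the course targets (a minimal sketch of my own, not taken from the course material; the file name and sizes are placeholders):

```cpp
// Build with, e.g.:  nvc++ -std=c++20 -stdpar=gpu saxpy.cpp -o saxpy
#include <algorithm>
#include <execution>
#include <vector>

int main() {
    std::vector<float> x(1 << 20, 1.0f), y(1 << 20, 2.0f);
    const float a = 3.0f;
    // With -stdpar=gpu, nvc++ can offload this standard parallel algorithm
    // to the GPU; capturing by value keeps the lambda safe for offload.
    std::transform(std::execution::par_unseq,
                   x.begin(), x.end(), y.begin(), y.begin(),
                   [a](float xi, float yi) { return a * xi + yi; });
    return 0;
}
```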
I had a look at the HPC SDK and noticed that the Jetson meets the minimum requirements (see below), so I pulled the Docker image and started setting it up. So far it looks good, but I'm not very far into it.
Now you are letting me know that this approach will not work?
Thanks,
Paul
"Before running the NVIDIA HPC SDK NGC container, please ensure that your system meets the following requirements.
Pascal (sm60), Volta (sm70), Turing (sm75), Ampere (sm80), or Hopper (sm90) NVIDIA GPU(s)
CUDA driver version >= 440.33 for CUDA 10.2
Docker 19.03 or later which includes support for the --gpus option, or Singularity version 3.4.1 or later
For older Docker versions, use nvidia-docker >= 2.0.3
...
Multiarch containers for Arm (aarch64) and x86_64 are available for select tags starting with version 21.7."
The Docker container mounts, but when I run the example, MPI complains about problems binding memory. I looked at hwloc and found membind:set_proc_membind = 0.
When I exit the Docker container and run hwloc on the host, I also get membind:set_proc_membind = 0.
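For reference, the same flags can also be queried programmatically through the hwloc C API (a minimal sketch of my own; I believe hwloc_topology_get_support() reports the same bits that hwloc-info --support prints):

```cpp
// Build with, e.g.:  g++ membind_check.cpp -lhwloc -o membind_check
#include <hwloc.h>
#include <cstdio>

int main() {
    hwloc_topology_t topo;
    hwloc_topology_init(&topo);
    hwloc_topology_load(topo);
    // hwloc reports which binding operations the OS actually supports.
    const hwloc_topology_support *s = hwloc_topology_get_support(topo);
    std::printf("membind:set_thisproc_membind = %d\n", (int)s->membind->set_thisproc_membind);
    std::printf("membind:set_proc_membind = %d\n", (int)s->membind->set_proc_membind);
    hwloc_topology_destroy(topo);
    return 0;
}
```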
Is there any way to alleviate this and allow binding?
Thank you for the response. It looks like this may not be the way to go. Ultimately, I would like to write ISO C++20 code focused on parallelism on both CPUs and GPUs, as in the course. G++ works for CPUs only; I found that NVC++ works with both CPUs and GPUs.
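What I mean by "both" (a sketch under my assumptions; the flags are the ones I understand each toolchain to use, and the source file name is a placeholder):

```cpp
// The same ISO C++ source; only the compiler flags decide where it runs:
//   g++   -std=c++20 -O2 reduce.cpp -ltbb         (CPU threads via TBB)
//   nvc++ -std=c++20 -stdpar=multicore reduce.cpp (CPU threads)
//   nvc++ -std=c++20 -stdpar=gpu reduce.cpp       (GPU offload)
#include <execution>
#include <numeric>
#include <vector>

int main() {
    std::vector<double> v(1 << 22, 0.5);
    // A standard parallel reduction; no CUDA-specific code anywhere.
    double sum = std::reduce(std::execution::par_unseq, v.begin(), v.end(), 0.0);
    return sum > 0.0 ? 0 : 1;
}
```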
When do you think the Jetson family will be able to support ISO C++20 and beyond?