I’m trying to profile my TX2 application to help optimize memory performance. I’ve followed the blog instructions, but I get a blank screen and warnings saying that system profiling and CPU profiling are not supported on the underlying platform. This is with the Visual Profiler on Ubuntu 16.xx. I know OpenACC is not supported on the TX2, but I’d like to see what my memory access looks like as I tune the application, to check whether the optimization routines are adding any value.
What are your references pointing to, and which BSP version are you on?
I saw this error when I upgraded to L4T 32.1 but did not switch the references in the Makefile to point to the newer compiler toolchain.
I think you need GCC 7.3.1.
You can get it here: https://developer.nvidia.com/embedded/linux-tegra
Thanks, I’ll try that when I get a chance. For now, I’m running nvprof on the device with the export options and then opening the resulting file in the profiler application on my desktop. I’m including the commands here in case anyone reading this is unfamiliar or uncomfortable with remote console work.
ssh username@tx2   # connect to the TX2
# I might have to find nvprof first, so: find / -name nvprof
sudo /usr/local/cuda-10.0/bin/nvprof -f --export-profile prof.nvvp --analysis-metrics [path to exe]
exit   # return to my desktop
scp username@tx2:/path_to_nvvp_file ~/local_path/prof.nvvp   # copy the profile back to the host
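If you end up doing this repeatedly, the steps above can be rolled into one small script run from the desktop. This is only a rough sketch, not something I have tested on every setup: the host name, the nvprof path, the output file names, and the helper names (nvprof_cmd, run, profile_remote) are all placeholders I made up, so adjust them for your system. It defaults to a dry run that only prints the ssh and scp commands; set DRY_RUN=0 to actually execute them.

```shell
#!/bin/sh
# Sketch: profile on the TX2 over ssh, then copy the result back for the Visual Profiler.
# Every value below is a placeholder (assumption); override via environment variables.
TX2_HOST="${TX2_HOST:-username@tx2}"
NVPROF="${NVPROF:-/usr/local/cuda-10.0/bin/nvprof}"
REMOTE_OUT="${REMOTE_OUT:-prof.nvvp}"            # written relative to the remote home directory
LOCAL_OUT="${LOCAL_OUT:-$HOME/local_path/prof.nvvp}"

# Build the nvprof command line (same flags as above) for a given executable path.
nvprof_cmd() {
    printf 'sudo %s -f --export-profile %s --analysis-metrics %s' \
        "$NVPROF" "$REMOTE_OUT" "$1"
}

# Print the command when DRY_RUN=1 (the default), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Run nvprof remotely, then fetch the exported profile.
profile_remote() {
    # -t allocates a terminal so sudo can prompt for a password
    run ssh -t "$TX2_HOST" "$(nvprof_cmd "$1")"
    run scp "$TX2_HOST:$REMOTE_OUT" "$LOCAL_OUT"
}

# Only act when an executable path is passed on the command line.
if [ "$#" -gt 0 ]; then
    profile_remote "$1"
fi
```

Saved as, say, tx2_profile.sh, you would check the printed commands with "sh tx2_profile.sh [path to exe]" and then rerun with DRY_RUN=0 in front once they look right. The resulting prof.nvvp can be opened in the Visual Profiler on the desktop as before.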