… and the kernel doesn’t get profiled correctly. Basically I’m guessing the “$…” pieces of the kernel name are being expanded by a shell.
If you take the command line and change the --kernel-name argument to be 'main$_omp_fn$0' (i.e. single-quoted), then running that command line by hand will correctly find and profile the kernel.
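For anyone who wants to see the effect in isolation, here is a minimal sketch of what I believe is happening. It does not involve ncu at all: it only hands the kernel name to /bin/sh, once unquoted and once single-quoted, and prints what the shell produces. The helper name `through_shell` is made up for this example, and it assumes `_omp_fn` is not set as a variable in your environment.

```python
import subprocess

def through_shell(text: str) -> str:
    """Return what /bin/sh turns `text` into when it appears as a word on a command line."""
    result = subprocess.run(
        ["/bin/sh", "-c", "printf '%s' " + text],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout

kernel = "main$_omp_fn$0"

# Unquoted: the shell treats "$_omp_fn" and "$0" as parameters and expands them.
print(through_shell(kernel))

# Single-quoted: the shell passes the name through untouched.
print(through_shell("'" + kernel + "'"))
```

On a typical setup the unquoted form should come back as something like main/bin/sh, since `$_omp_fn` is normally unset and `$0` is the name the shell was started with. The single-quoted form comes back unchanged.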
Now if I can figure out why my real application runs 10x slower using g++ vs. nvc++. ;^\
On reflection I probably should have been more precise in that this is an ncu-ui issue. ncu does seem to work as long as one quotes properly on the command line, but ncu-ui is the part that gets confused by the “$” in the format when it synthesizes the ncu command.
I think it would be nice to have this fixed.
P.S. Someone should check LLVM OpenMP offload; I don’t have a working compiler to see if it has similar problems.
Correct. The command with the “/bin/sh” is from the ncu-ui window. Whatever is creating that command within ncu-ui is getting confused by the “$” signs. When I run a profile from within ncu-ui it says no kernels were profiled and I’m betting that’s because the command line has the kernel name as “/bin/sh…” and not a valid kernel name. In my experience over the past week, I’ve never been able to profile from within ncu-ui because the g++ OpenMP kernel name always gets clobbered. As I noted in the original, if I cut and paste the command line from the ncu-ui and put in a quoted ‘main$omp_fn$0’ as the kernel-name, then it runs correctly as an ncu command.
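I don’t know how ncu-ui actually builds its command line, so take this only as a sketch of the kind of quoting that avoids the problem when a tool generates a shell command from a kernel name. It uses Python’s standard `shlex.join`; the function name `build_ncu_command` and the application `./a.out` are placeholders for illustration.

```python
import shlex

def build_ncu_command(kernel_name: str, app_argv: list[str]) -> str:
    """Build a shell-safe ncu command line (hypothetical helper, not ncu-ui's code)."""
    argv = ["ncu", "--kernel-name", kernel_name, *app_argv]
    # shlex.join quotes every argument that contains shell-special characters
    # such as "$", so the shell hands the name to ncu unchanged.
    return shlex.join(argv)

cmd = build_ncu_command("main$omp_fn$0", ["./a.out"])
print(cmd)
```

The generated string has the kernel name wrapped in single quotes, which is the same fix as editing the pasted command by hand. A tool that launches the process directly from an argument list, without going through a shell, would not need the quoting at all.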
Try to profile a kernel named “main$omp_fn$0” from within ncu-ui. If it works for you then there’s something even weirder going on, and maybe it’s in the way my shells are set up.
Interesting. When I click on a kernel in the timeline and select “profile” I get a pop-up. On the “filter” tab the wrong name (i.e. main/bin/sh) populates the kernel name field. So whatever is populating that “kernel name” field is getting confused by the “$”. I was able to type in the kernel name by hand and have things work, so there is a manual workaround.
I had a hard time reconstructing what I had done, but as best as I can tell using Version: 2024.2.0.0 (build 34181891) (public-release) the problem is fixed.
Just to further (and more precisely) document the issue: if you’re in the timeline view and you right-click on a kernel and choose “Profile Kernel”, the kernel name doesn’t always get put correctly into the synthesized ncu command line. This appeared when using gcc GPU offload, where the kernel name was of the form “main$omp_fn$0”.