I’m currently trying to profile a simple vector addition executable on the GPU using Nsight Compute 2022.2, which is installed directly on my Jetson AGX Orin (I checked the release notes, which say that Jetson hosts are now supported). When I launch the interactive profiler, I get an error saying it failed to prepare the kernel (see image). From several other topics on the forum, I have guessed that this is because the app doesn’t have root access (Can not successfully use nsight compute on Jetson AGX Xavier - #14 by 1125111494).
I would like to know how to fix this error, either by giving my localhost SSH connection root privileges or in some other way.
I finally got my setup back and tried what you shared, but it still didn’t work. I still get the error saying the kernel failed to be prepared. I also launched the app with root access.
We want to clarify your environment first.
Do you want to profile the Jetson remotely from a desktop environment, or run the tool on the Orin directly?
Also, did you get 2022.2 from JetPack 5.0.1 DP or from the public website?
We ask because the version included in JetPack should be 2021.5 rather than 2022.2 (host).
I would like to run the tool directly on the Jetson Orin. Since the 2022 update, the release notes say Jetsons are supported as a host, which is why I installed the 2022 version directly from the public website. I also do not have the 2021.5 version installed, since it was not included in the JetPack package.
I don’t know why, but I kept assuming aarch64 sbsa was the architecture for Jetson. Now that I need remote profiling, I have connected my laptop, which runs Ubuntu 18.04 and uses Nsight Compute 2022. Which arch should I select?
I tried all 3 and the only arch where I could attach a process was with aarch64 sbsa. ppc64le and x86_64 could create the process but not connect to it…
The problem is that I get the same error, since I’m using the same arch as before.
Please install the CUDA toolkit from the SDK Manager.
It includes a cross-compile package (cuda-cross-aarch64_xx.deb) and a compatible nsight compute version.
The nsight compute for Orin is located at:
/opt/nvidia/nsight-compute/2021.2.5
Then you should find the Linux(aarch64) listed on the target platform as below:
Sorry for the incorrect message before.
We checked the tool included in the JetPack 5.0.1 DP and it is 2021.2.4.
When testing 2021.2.4 with a CUDA vectorAdd sample, we can still profile it correctly.
So maybe this issue is related to the application that is profiled.
Would you mind checking if the vectorAdd sample can work in your environment?
Sorry it took me so long to test it out, I fell ill and couldn’t work on my jetson.
For the profiler, I built vectorAdd from your samples repository using the make command. I then ran the executable and everything seemed to work. But when I tried to profile it using my setup (host computer running Ubuntu 18.04 and Nsight Compute 2021.2.4), it showed me the same error: “Failed to prepare kernel for profiling”.
It seems that it might be a configuration problem since it worked on your setup.
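One way to narrow this down might be to take the GUI and the remote connection out of the picture and profile from a shell on the Jetson itself. The sketch below only builds and prints such a command line; it does not run the profiler. It assumes the `ncu` command-line tool that ships alongside the Nsight Compute GUI is on the PATH (otherwise pass its full path), that `-o` names the output report, and that `sudo` is needed for counter access, as guessed earlier in this thread. `build_ncu_command` and the report name are hypothetical, chosen for illustration.

```python
import shlex


def build_ncu_command(app, report="vectorAdd_report", ncu="ncu", use_sudo=True):
    """Return the argument list for profiling `app` with the ncu CLI.

    Assumptions (not confirmed in this thread): `ncu` is reachable under the
    given name, and `-o` sets the name of the report file.
    """
    cmd = [ncu, "-o", report, app]
    if use_sudo:
        # Root access was suspected to be required on Jetson, so prefix sudo.
        cmd = ["sudo"] + cmd
    return cmd


if __name__ == "__main__":
    # Print the command so it can be reviewed and pasted into a shell.
    print(shlex.join(build_ncu_command("./vectorAdd")))
```

If the command works in a local shell but the GUI still fails, that would point at the connection or target platform settings rather than at the application.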
We recently found some common errors that might help your use case as well.
When profiling, Nsight Compute will ask for the current CUDA context first.
If it returns CUDA_ERROR_INVALID_CONTEXT, the profiler will create a new context to work on.
So an error on the first kernel is expected. Clicking “Resume” lets the profiling continue.
The “failed to prepare kernel for profiling” error more likely means that the incorrect target platform was selected.
Please use Linux (aarch64) instead of Linux (aarch64 sbsa) and try it again.
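To double-check which entry applies, a small script run on the target board can report its architecture. This is a rough sketch, not an official check: it assumes that a Jetson (L4T) system has the file `/etc/nv_tegra_release` while a generic Arm server does not, and the `x86_64` and `ppc64le` labels are guessed from the naming pattern of the two aarch64 entries above. `guess_target_platform` is a hypothetical helper name.

```python
import os
import platform

# Assumption: L4T, the Jetson OS image, ships this release file.
TEGRA_RELEASE_FILE = "/etc/nv_tegra_release"


def guess_target_platform(machine, is_tegra):
    """Map a CPU architecture string to a likely target platform label."""
    if machine == "aarch64":
        # Jetson boards use plain aarch64; SBSA is for Arm server platforms.
        return "Linux (aarch64)" if is_tegra else "Linux (aarch64 sbsa)"
    if machine in ("x86_64", "ppc64le"):
        # Label format assumed by analogy with the aarch64 entries.
        return f"Linux ({machine})"
    return "unknown"


if __name__ == "__main__":
    # Run this on the target board, not on the host laptop.
    machine = platform.machine()
    is_tegra = os.path.exists(TEGRA_RELEASE_FILE)
    print(guess_target_platform(machine, is_tegra))
```

The mapping is kept in a pure function so it can be checked without a board at hand; only the `__main__` part touches the local system.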
Yes, I do get the first error, but I can keep going without problems until I reach a kernel.
For the “failed to prepare kernel for profiling” error, I can confirm that I selected the right platform, which is aarch64 and not aarch64 sbsa. I still get the same error…