Llama pytorch profiling


I’m trying to profile an inference run of Llama GPT, which recently added CUDA support.

The model is run through a bash script, which creates an appropriate Docker container and launches the model inside it.

However, the profiler is unable to detect any kernels, even though I have confirmed that the model is running on the GPU.

Currently my setup is as follows, where run.sh is the aforementioned bash script.
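A minimal sketch of what I’m doing (the report name and the image/script names are illustrative placeholders, not my exact paths):

```shell
# Run Nsight Compute against the launcher script from the host.
# run.sh starts the Docker container and runs the model inside it.
ncu --target-processes all -o llama_report ./run.sh

# run.sh itself looks roughly like (image and model names illustrative):
#   docker run --rm --gpus all my-llama-image ./main -m model.gguf
```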

Despite this, ncu exits with a “no kernels were profiled” warning.

Is there something I’m missing?

Hi, andreas.so.alexandrou

Thanks for using Nsight Compute. I’m afraid this is a known issue. I’ll file a ticket to track the requirement.

As a workaround, I think you can map Nsight Compute into the Docker image and profile directly inside the container. Sorry for the inconvenience!
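A sketch of this workaround, assuming the host’s Nsight Compute install lives under /opt/nvidia/nsight-compute (the version directory, image name, and application path are placeholders; CAP_SYS_ADMIN is typically needed so the container can access the GPU performance counters):

```shell
# Bind-mount the host's Nsight Compute installation into the container
# and invoke ncu from inside it. All paths and names are illustrative.
docker run --rm --gpus all \
    --cap-add=SYS_ADMIN \
    -v /opt/nvidia/nsight-compute/2023.1.0:/opt/ncu \
    my-llama-image \
    /opt/ncu/ncu --target-processes all -o /tmp/llama_report ./main
```

Copy the generated report out of the container (or mount an output directory) to open it in ncu-ui on the host.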

Hi, andreas.so.alexandrou

Further checking with our dev team: this is the expected behavior. If you want to profile inside the container, you need to start ncu inside the container. Alternatively, you can launch the application inside the container and connect to an ncu or ncu-ui host launched outside; in that case you have to set up the container’s networking appropriately.
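The second option can be sketched with ncu’s launch/attach modes (the loopback hostname assumes the container was started with --network=host; verify the flags and default port against your ncu version’s documentation):

```shell
# Inside the container: start the application under ncu in launch mode.
# The process suspends and waits for a profiler to attach.
ncu --mode=launch ./main

# On the host: attach to the suspended process over the network and
# collect a report. Hostname/port depend on your container networking.
ncu --mode=attach --hostname 127.0.0.1 -o llama_report
```

ncu-ui can attach the same way through its remote connection dialog instead of the command-line attach.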

This topic was automatically closed after 3 days. New replies are no longer allowed.