Profiling with python notebook

Hello,

I am completely new to GPU profiling and am stuck on a connection issue. I would be grateful for any help.

I wrote some kernels using Anaconda’s Python with Jupyter Notebook and Numba’s CUDA module. I want to optimize these kernels using a visual profiler.
I have installed CUDA 11.0, and since it does not support the deprecated nvprof, I have installed Nsight as recommended by NVIDIA. However, I cannot figure out how to connect Nsight to the running notebook process. I’ve tried launching cmd from Nsight, then launching the Anaconda prompt from that cmd, and then launching the notebook from the Anaconda prompt, but Nsight does not seem to be able to connect to the process. Can you suggest a correct way to attach Nsight to the Python kernel?

Regards

You are correct that you need to launch the process to be profiled from Nsight Compute; you cannot attach to an arbitrary running process.

However, I don’t think you would need to launch cmd from within Nsight Compute. Since Anaconda is basically just an environment, it should be sufficient to start the notebook under Nsight Compute from within Anaconda, similar to the commands below. Note that you need to use “--target-processes all” if the launched process is not the one using CUDA, but one of its child processes is.

$ cmd anaconda
$ (conda) which jupyter
/home/user/bin/anaconda3/envs/jupyter/bin/jupyter
$ (conda) ncu --target-processes all (other args) /home/user/bin/anaconda3/envs/jupyter/bin/jupyter notebook 
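If you would rather assemble that command line from a script, the steps above could be sketched roughly as follows. This is only an illustration: `build_ncu_command` is a hypothetical helper name, not part of Nsight Compute, and it assumes `ncu` and `jupyter` are on the PATH of the active conda environment. Only the `--target-processes all` option from the commands above is used.

```python
import shlex
import shutil


def build_ncu_command(target, target_args=(), ncu_args=(), ncu="ncu"):
    """Compose an argument list that runs `target` under Nsight Compute.

    Hypothetical helper for illustration only.
    """
    # Resolve the executable on PATH, as `which jupyter` does;
    # fall back to the name as given if it cannot be found.
    resolved = shutil.which(target) or target
    # --target-processes all makes ncu profile child processes too,
    # which matters because the notebook kernel is a child of jupyter.
    return [ncu, "--target-processes", "all", *ncu_args, resolved, *target_args]


cmd = build_ncu_command("jupyter", ["notebook"])
# shlex.join uses POSIX quoting, so the printed line is for display;
# pass the list itself to subprocess.run(cmd) to actually launch it.
print(shlex.join(cmd))
```

Passing the list to `subprocess.run` rather than a single string avoids shell quoting differences between cmd and a POSIX shell.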

However, it might be easier and faster to profile your notebooks directly on the command line, without going through the browser, e.g. using runipy:

$ (conda) ncu --target-processes all (other args) /home/user/bin/anaconda3/envs/jupyternb/bin/runipy my_notebook.ipynb
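Another way to get the notebook onto the command line is to pull its code cells into a plain Python script and profile that script instead. The sketch below is a minimal, naive version that relies only on the fact that an .ipynb file is JSON with a "cells" list. `notebook_to_script` and `convert` are hypothetical helper names, and lines starting with `%` or `!` are simply commented out on the assumption that they are IPython magics or shell escapes.

```python
import json


def notebook_to_script(nb):
    """Return the code cells of a parsed .ipynb as one Python source string."""
    chunks = []
    for cell in nb.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # skip markdown and raw cells
        source = cell.get("source", "")
        # `source` may be a single string or a list of line strings.
        if isinstance(source, list):
            source = "".join(source)
        # Comment out IPython magics and shell escapes (naive check),
        # since plain Python cannot execute them.
        lines = [
            "# " + line if line.lstrip().startswith(("%", "!")) else line
            for line in source.splitlines()
        ]
        chunks.append("\n".join(lines))
    return "\n\n".join(chunks) + "\n"


def convert(ipynb_path, script_path):
    """Read a notebook file and write its code cells to a .py file."""
    with open(ipynb_path, encoding="utf-8") as f:
        nb = json.load(f)
    with open(script_path, "w", encoding="utf-8") as f:
        f.write(notebook_to_script(nb))
```

After something like `convert("my_notebook.ipynb", "my_notebook.py")`, the resulting script can be started under ncu with `python my_notebook.py` as the target, in the same way as the commands above.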

Note that Nsight Compute is used for optimizing individual CUDA kernels. If you are looking for whole-program optimization, similar to Visual Profiler’s timeline, Nsight Systems would be the right tool to use: https://developer.nvidia.com/nsight-systems

Thank you so much for the help. With the ncu command from the Anaconda prompt, I was able to find the attachable process in the GUI. Additionally, because Pascal is unfortunately no longer supported, I had to install an older version of Nsight Compute and use the nv-nsight-cu-cli command instead of ncu. Everything is working as expected now.