"Profile CUDA Application" always fails with "No kernel launches captured!"

I’ve never been able to use any of the experiments under “Profile CUDA Application”. This applies to projects I’ve made (based on the template CUDA project) and to the sample projects that come with Nsight. When I try, the program runs, the Summary Report screen shows up, and under “CUDA Overview / Summary of captured CUDA activity.” it says “No kernel launches captured! Please verify your activity settings and make sure the kernel filter is setup correctly.”

I’ve gone through http://http.developer.nvidia.com/NsightVisualStudio/3.2/Documentation/UserGuide/HTML/Content/Profile_CUDA_Settings.htm quite a few times, and I can’t find anything in there that I haven’t already tried.

I’m using Visual Studio Pro 2012, and I’m doing this on a single machine (no remote debugging). Here’s a step-by-step list of what I’m doing:

1.) Unzip the Nsight samples. Find “Nsight CUDA Samples_vc100.sln” and double-click it to launch Visual Studio. Tell Visual Studio to upgrade the solution file to VS 2012 format when it asks.
2.) Make sure matrixMul is selected in Solution Explorer, then Build → Rebuild Solution
3.) Start up Nsight Monitor
4.) In Visual Studio, NSIGHT → Start Performance Analysis…
5.) Notice that ‘Connection Status’ has a green light, while ‘Application Control’ and ‘Capture Control’ have red lights.
6.) Change Activity type to ‘Profile CUDA Application’
7.) Notice that ‘Kernels to Profile’ is empty (which supposedly means it’ll capture all kernel launches)
8.) Change ‘Experiments to Run’ from the default ‘Overview’ to ‘All’
9.) Click the Launch button under ‘Application Control’

The application runs: I see some text in the console window for a very short moment, and then the console window disappears. The ‘Summary Report’ page pops up in Visual Studio, and the “No kernel launches captured!” error message is there.

Based on what I’ve read while trying to figure this out, I changed ‘Code Generation’ in the property pages to “compute_20,sm_20”, but this didn’t help. I also tried uninstalling Nsight 3.2 and installing Nsight 3.1, but that didn’t help either. I tried putting the kernel’s function name in the ‘Kernels to Profile’ filter box, and I also tried the regular expression .* there, but neither made a difference. I did find that if I prevent the program from finishing right away, all three lights on the activity page are green while the program is running.

So am I missing a step? Am I doing something completely unexpected or nonsensical? I’ve never been able to get this to work on any project, so I have no working baseline to compare against, and I have no idea what I’m doing wrong.

I found my problem: I had updated my graphics drivers to a version newer than the one that ships with the toolkit. I reinstalled the CUDA 5.5.20 toolkit, which removed the newer drivers and rolled me back to the driver version bundled with CUDA. Now everything works.
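For anyone else chasing this, it may help to make the installed versions visible before digging through profiler settings. The CUDA runtime API provides cudaDriverGetVersion and cudaRuntimeGetVersion, both of which return the version packed as 1000 * major + 10 * minor (for example 5050 for CUDA 5.5). The sketch below only decodes and compares such packed values; the helper names are my own, and the comparison is an assumption about what a useful sanity check looks like. It would not necessarily have flagged my case, since a newer driver normally still reports support for an older runtime, so treat it as a way to see what is installed rather than as a diagnosis.

```cpp
#include <string>

// CUDA packs versions as 1000 * major + 10 * minor, e.g. 5050 for CUDA 5.5.
// In a real program the packed values would come from cudaDriverGetVersion()
// and cudaRuntimeGetVersion() in <cuda_runtime.h>; here they are plain ints
// so the sketch compiles without the toolkit.
struct CudaVersion {
    int majorPart;
    int minorPart;
};

// Hypothetical helper: split a packed version into its two parts.
CudaVersion decodeCudaVersion(int packed) {
    return { packed / 1000, (packed % 1000) / 10 };
}

// Hypothetical helper: render a packed version as "major.minor".
std::string formatCudaVersion(int packed) {
    CudaVersion v = decodeCudaVersion(packed);
    return std::to_string(v.majorPart) + "." + std::to_string(v.minorPart);
}

// Hypothetical check: true when the driver reports support for at least the
// CUDA version the runtime was built for.
bool driverCoversRuntime(int driverPacked, int runtimePacked) {
    return driverPacked >= runtimePacked;
}
```

Printing formatCudaVersion for both values next to the Nsight version at least gives a baseline to compare against the combinations listed in the release notes.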