Nsight 3.0 much slower than previous version

I just installed Nsight 3.0 (Visual Studio 2010), and I am finding it unusable. It is running much, much slower than previous versions: a function that used to take about 1 second now takes several minutes. Uninstalling this version and reinstalling the old one fixes the problem, but I would like to use the new features.

Has anyone else experienced this? Is there some option hidden away that I might change, to bring performance back to how it was?


Which functions do you find slower? Is this with CUDA or Graphics?

All of them, really. I am writing a particle-based fluid simulator with CUDA. It really slams into a wall in the density computation step, which requires searching for neighboring fluid particles and computing a weighted sum of their masses. With the older versions of Nsight, it worked fine. Now, when debugging, it becomes unbearably slow.
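For readers unfamiliar with this kind of step, here is a minimal host-side C++ sketch of a density computation as described above: for each particle, find the neighbors within a smoothing radius and sum their masses weighted by distance. It is not the poster's code. The `Particle` struct, the function names, the choice of the poly6 weighting function, and the brute-force neighbor search are all assumptions for illustration; a real simulator would typically run this on the GPU and use a grid or similar structure for the neighbor search.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical particle layout, for illustration only.
struct Particle {
    float x, y, z;  // position
    float mass;
};

// Poly6 smoothing kernel, a common SPH choice (assumed here; the post does
// not say which weighting function is used).
// W(r, h) = 315 / (64 * pi * h^9) * (h^2 - r^2)^3 for r <= h, otherwise 0.
inline float poly6(float r2, float h) {
    const float h2 = h * h;
    if (r2 > h2) return 0.0f;
    const float kPi = 3.14159265f;
    const float diff = h2 - r2;
    return 315.0f / (64.0f * kPi * std::pow(h, 9.0f)) * diff * diff * diff;
}

// Brute-force O(n^2) density: every particle checks every other particle.
// Each particle also contributes to its own density (distance 0).
std::vector<float> computeDensities(const std::vector<Particle>& particles, float h) {
    assert(h > 0.0f);
    std::vector<float> density(particles.size(), 0.0f);
    for (std::size_t i = 0; i < particles.size(); ++i) {
        for (std::size_t j = 0; j < particles.size(); ++j) {
            const float dx = particles[i].x - particles[j].x;
            const float dy = particles[i].y - particles[j].y;
            const float dz = particles[i].z - particles[j].z;
            const float r2 = dx * dx + dy * dy + dz * dz;
            density[i] += particles[j].mass * poly6(r2, h);
        }
    }
    return density;
}
```

The inner loop over neighbors is the part that does many memory accesses per thread when ported to a kernel, which may be why this step is the first to feel a debugger's overhead.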

Hi, do you see any slowdown before you enter CUDA debugging? What I would like to know is exactly when you start seeing the slowdown and what it is that is slow (Visual Studio, the performance of the app, etc.).
If you are debugging, which tool windows do you have open when you see the slowdown?
Are you able to reproduce this slowdown with any of the CUDA samples?
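One way to pin down where the time goes is to wrap each stage of the application in a small wall-clock timer and compare the numbers with and without the debugger attached. The sketch below is only an illustration: `timeStepMs` and the stage label are hypothetical names, not part of Nsight or the CUDA toolkit. Because kernel launches are asynchronous, a stage that launches a kernel should call `cudaDeviceSynchronize()` before returning, otherwise only the launch itself is timed.

```cpp
#include <chrono>
#include <cstdio>
#include <utility>

// Runs one stage of the application and reports its wall-clock time.
// `step` is any callable. For a stage that launches a CUDA kernel, the
// callable should end with cudaDeviceSynchronize() so the measurement
// covers the kernel's execution and not just the asynchronous launch.
template <typename Step>
double timeStepMs(const char* label, Step&& step) {
    const auto start = std::chrono::steady_clock::now();
    std::forward<Step>(step)();
    const auto stop = std::chrono::steady_clock::now();
    const double ms =
        std::chrono::duration<double, std::milli>(stop - start).count();
    std::printf("%-20s %10.3f ms\n", label, ms);
    return ms;
}
```

A call might look like `timeStepMs("density", [&] { /* launch kernel, then synchronize */ });`, repeated once per stage.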

To answer your questions in order:

There is no slowdown before entering CUDA debugging. Running in release mode works completely fine.

I start seeing the slowdown exactly when the program starts to enter CUDA kernels. Other (CPU) code that runs before the CUDA kernel (initialization, etc.) works fine. However, once I hit the CUDA functions, it slows nearly to a halt, especially in the particle density computation function I mentioned before. If I wait a long time, it will eventually get past this function, but then it gets stuck at the next function as well.

When I am debugging, I’ve got Nsight Monitor open, as well as a bunch of basic debugging windows (warp watch, memory watch, etc.) open in Visual Studio.

I tried some of the CUDA samples, but they all seem to work fine. This is very weird. Why would my program work with an older version but not the newer one, while other programs work fine with both? Could there be a setting in my project files that is messed up?


I have exactly the same issue with VS 2008.

First time I’ve tried NS v3.0 with this new code, but previous versions were fast with other code.

Fairly simple kernels. I set a breakpoint in a kernel and run it, and it takes forever (many minutes, sometimes hours) to break.

Stepping is also agonizingly slow. If there are no suggestions, I will have to revert to an older version, because this is intolerably unproductive.

Sorry to hear that. Can you provide details of your setup (CUDA toolkit version, driver version, GPUs, local or remote debugging, etc.)?
What happens if you have only the stock Visual Studio debugging windows open, like variables and call stack (meaning no CUDA info or CUDA warp watch tool window open)?

robosmith, like sckulp, are you unable to reproduce the problem with the CUDA samples?

If you happen to have a sample or a project that you can share that reproduces the problem, please attach here, or let me know so we can get it.

I have CUDA v5.0 on a 520M with Optimus. Local debugging with driver v3.20

Performance with VS debugging is normal (fast).

I tried one of the samples (simpleGL) and it did not seem slow.

It will take me some time to pare down my code so I can give it to you. Don’t know when…

The kernel consists of 1 block with 1000 threads accessing shared memory.

When I first ran it and it did not break, I thought the debugger was not working at all.

Later I found that if I wait long enough, it eventually breaks.

I’ve taken the calls to the curand library out of my kernel, and now it runs quite a bit faster in Nsight.

Still not as fast as previous versions.

I installed the new NVIDIA_Nsight_Visual_Studio_Edition_No_CUDA_Toolkit_Win64_3.0.0.13123 20 minutes ago.

I’m running Windows 7 and MS VS 2010.

Display Driver 320.00-desktop-win8-win7-winvista-64bit-english-beta.

My computer has an Intel i5 CPU @ 3.1 GHz and an Nvidia GT610 GPU.

When I use Nsight in the MS VS 2010 environment, it seems quite slow. For example, if I analyse my kernels (experimental results, CUDA occupancy) or open any menu, it takes ~5-8 seconds to open a new window. Windows Task Manager shows 30-40% load. While it has not crashed and seems to be a stable application, you should fix this lag issue…

Edit: the lag issue also affects scrollbars. Very painful.

Do you see the same problem when you do not use Nsight (meaning, in plain VS)? What menus are you referring to?

If you guys have any reproducibles, please send them my way.

I’m using VS 2010 and I have the same issue. I debug my code from the menu: Nsight -> Start Graphic debugger. The application works quite well when I first start it, but when I click the “Pause and capture frame” button, CPU usage climbs to nearly 40%-50% (I have a 2.6 GHz Core i5 CPU). That is still OK, and I can continue doing other work. But when the HLSL code hits a breakpoint (anywhere, in any code), Visual Studio freezes. I thought that was a bug, but later I found I can still step over it, so it may be something else. On my computer this happens in every Direct3D project, from the simple code I wrote to the samples in the DX SDK…

No, plain MS VS 2010 is just fine.

Nsight -> Start CUDA debugging, after which the debugger generates a report. Each window in this report is very slow.


Just trying to narrow things down, can you try turning off the memory checker: Nsight -> Enable CUDA Memory Checker should not have the icon pressed.

Does the slowdown still reproduce?


Sorry, I gave up on this one because I’m very busy right now. It is still slow; I’ll wait for the next update of Nsight to show up.

I’ve seen there was an update. I installed it, but it didn’t change anything, just to let you know.

This is the one I installed: NVIDIA_Nsight_Visual_Studio_Edition_No_CUDA_Toolkit_Win64_3.0.0.13150

Still slow, as before…


Can someone tell me where I can download the latest stable Nsight 2.0 (!) version? I want to downgrade until these problems in version 3.0 are fixed.

Can no one tell me where I can get the Nsight 2.0 final release? I need it until Nsight 3.0 gets patched.


I am very sorry to hear that you are having a problem with Nsight VSE 3.0. If you can provide a reproducible it would help us identify the issue. We have not been able to reproduce the performance issue that you are experiencing.

Older versions of Nsight VSE are available at

It’s frustrating to watch possible bugs being reported only halfway, with nothing provided to help debugging.

There seems little point in hoping for an issue to be “fixed” or “patched” without making the effort to supply code and/or data to reproduce the issue.

The conditions that trigger a bug are a valuable resource, meant for sharing.