I found some time, uninstalled Nsight 5.3, and installed 5.4.
The examples shipped with Nsight 5.4 (Debugging/Matrix Multiply) seem to be broken: I got an error when loading them into VS2015.
So I tried the examples from Nsight 5.3 instead.
EDITED:
At first I made a mistake and had SimpleStreams set as the startup project in the solution, so the debugger ran SimpleStreams.
I changed it and I’m back to my previous output:
– cut –
[Matrix Multiply Using CUDA] - Starting…
GPU Device 0: “GeForce GTX 660” with compute capability 3.0
MatrixA(320,320), MatrixB(640,320)
Computing result using CUDA Kernel…
– /cut –
I changed the settings of both the project and matrixMul.cu so that “Generate GPU Debug Information” is set to “Yes (-G)” [it’s -G, not -G0]. Here is my output from the rebuild:
– cut –
1>------ Rebuild All started: Project: matrixMul, Configuration: Debug Win32 ------
1>
1> D:\Data\visual-c-wrk\NsightSamples\CUDA\Debugging\Matrix Multiply>“C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0\bin\nvcc.exe” -ccbin “C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\bin” -I…....\Common -I…....\Common\C99 -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0\include" -G --keep-dir “D:\Data\visual-c-wrk\NsightSamples\CUDA\Debugging\Matrix Multiply\obj\Win32_Debug_vc100” -maxrregcount=0 --machine 32 --compile -g -D_DEBUG -DWIN32 -D_CONSOLE -Xcompiler “/EHsc /W3 /nologo /Od /FS /Zi /RTC1 /MDd " -o “D:\Data\visual-c-wrk\NsightSamples\CUDA\Debugging\Matrix Multiply\obj\Win32_Debug_vc100\matrixMul.cu.obj” “D:\Data\visual-c-wrk\NsightSamples\CUDA\Debugging\Matrix Multiply\matrixMul.cu” -clean
1>CUDACOMPILE : nvcc warning : The ‘compute_20’, ‘sm_20’, and ‘sm_21’ architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
1> matrixMul.cu
1> Compiling CUDA source file matrixMul.cu…
1>
1> D:\Data\visual-c-wrk\NsightSamples\CUDA\Debugging\Matrix Multiply>“C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0\bin\nvcc.exe” -gencode=arch=compute_20,code="sm_20,compute_20" -gencode=arch=compute_30,code="sm_30,compute_30" -gencode=arch=compute_35,code="sm_35,compute_35" --use-local-env --cl-version 2015 -ccbin “C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\bin” -I…....\Common -I…....\Common\C99 -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0\include” -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0\include" -G --keep-dir “D:\Data\visual-c-wrk\NsightSamples\CUDA\Debugging\Matrix Multiply\obj\Win32_Debug_vc100” -maxrregcount=0 --machine 32 --compile -cudart static -g -D_DEBUG -DWIN32 -D_CONSOLE -Xcompiler "/EHsc /W3 /nologo /Od /FS /Zi /RTC1 /MDd " -o “D:\Data\visual-c-wrk\NsightSamples\CUDA\Debugging\Matrix Multiply\obj\Win32_Debug_vc100\matrixMul.cu.obj” “D:\Data\visual-c-wrk\NsightSamples\CUDA\Debugging\Matrix Multiply\matrixMul.cu”
1>CUDACOMPILE : nvcc warning : The ‘compute_20’, ‘sm_20’, and ‘sm_21’ architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
1> matrixMul.cu
1> matrixMul_vc100.vcxproj → D:\Data\visual-c-wrk\NsightSamples\CUDA\Debugging\Matrix Multiply\bin\Win32_Debug_vc100\matrixMul.exe
1> matrixMul_vc100.vcxproj → D:\Data\visual-c-wrk\NsightSamples\CUDA\Debugging\Matrix Multiply\bin\Win32_Debug_vc100\matrixMul.pdb (Full PDB)
========== Rebuild All: 1 succeeded, 0 failed, 0 skipped ==========
– /cut –
Note that it’s compiled as “Debug” / “Win32”. I’m not sure whether “x64” would have any benefit.
I have breakpoints on lines 78, 84, and 115. All are shown as solid red dots, so they are active, but the debugger never reaches them.