I am using the matrixMul.cu sample from the NVIDIA Corporation/CUDA Samples/v8.0/0_Simple/ directory. I compiled it on my Windows 10 Enterprise 64-bit laptop, which has a Quadro K2000M GPU. The IDE I am using is MS Visual Studio Community 2015, v. 14.0.2543.01 Update 3.
The nvcc command line reads:
Driver API (NVCC Compilation Type is .cubin, .gpu, or .ptx)
set CUDAFE_FLAGS=--sdk_dir "C:\Program Files (x86)\Windows Kits\8.1"
"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0\bin\nvcc.exe" --use-local-env --cl-version 2015 -ccbin "C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\bin\x86_amd64" -I./ -I…/…/common/inc --keep-dir x64\Release -maxrregcount=0 --machine 64 --compile -cudart static -o x64/Release/%(Filename)%(Extension).obj "%(FullPath)"
Runtime API (NVCC Compilation Type is hybrid object or .c file)
set CUDAFE_FLAGS=--sdk_dir "C:\Program Files (x86)\Windows Kits\8.1"
"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0\bin\nvcc.exe" --use-local-env --cl-version 2015 -ccbin "C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\bin\x86_amd64" -I./ -I…/…/common/inc --keep-dir x64\Release -maxrregcount=0 --machine 64 --compile -cudart static -DWIN32 -Xcompiler "/EHsc /nologo /FS /Zi /MT " -o x64/Release/%(Filename)%(Extension).obj "%(FullPath)"
I then run the matrixMul project (Release, x64) with Debug > Start Without Debugging and get the following output:
[Matrix Multiply Using CUDA] - Starting…
GPU Device 0: "Quadro K2000M" with compute capability 3.0
MatrixA(320,320), MatrixB(640,320)
Computing result using CUDA Kernel…
done
Performance= 46.60 GFlop/s, Time= 2.813 msec, Size= 131072000 Ops, WorkgroupSize= 1024 threads/block
Checking computed result for correctness: Result = PASS
NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.
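For anyone comparing numbers, the reported figures appear to be consistent with each other: Size matches 2 * 320 * 320 * 640 (one multiply and one add per inner-product term), and the GFlop/s value is that count divided by the elapsed time. The sketch below reproduces this arithmetic. The function names are my own and hypothetical, and I am assuming this is how the sample derives its figures rather than quoting its source.

```cpp
#include <cassert>
#include <cstdio>

// Operation count for C = A * B, counting one multiply and one add per
// inner-product term. A is hA rows by wA columns; B is wA rows by wB columns.
// (Hypothetical helper, not taken from the sample.)
double matMulFlops(int wA, int hA, int wB) {
    assert(wA > 0 && hA > 0 && wB > 0);
    return 2.0 * wA * hA * wB;
}

// Throughput in GFlop/s for a given operation count and time in milliseconds.
double gflopsPerSecond(double flops, double msec) {
    assert(msec > 0.0);
    return (flops * 1.0e-9) / (msec / 1000.0);
}

// Prints the two derived figures for the run quoted above
// (MatrixA(320,320), MatrixB(640,320), Time= 2.813 msec).
void printReportedFigures() {
    const double flops = matMulFlops(320, 320, 640);
    std::printf("Size= %.0f Ops\n", flops);
    std::printf("Performance= %.2f GFlop/s\n", gflopsPerSecond(flops, 2.813));
}
```

The same two functions could be applied to a CPU timing, so that both runs are reported in the same units.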
My questions are:
- Can I compile and run the same code so that it runs on the CPU only, and if so, how? I would like to compare the performance.
- If this is not possible, how do I modify the source code so that the GPU is not used?
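To make clear what I mean by a CPU-only comparison, here is a minimal sketch of the kind of host-side baseline I have in mind. The names cpuMatMul and timeCpuMatMulMsec are hypothetical and not part of the sample, the fill values are arbitrary, and this is a plain single-threaded triple loop, so it is certainly not an optimized CPU implementation.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference: C = A * B with row-major storage.
// A is hA x wA, B is wA x wB, and C is resized to hA x wB.
void cpuMatMul(std::vector<float>& C, const std::vector<float>& A,
               const std::vector<float>& B, int hA, int wA, int wB) {
    assert(A.size() == static_cast<std::size_t>(hA) * wA);
    assert(B.size() == static_cast<std::size_t>(wA) * wB);
    C.assign(static_cast<std::size_t>(hA) * wB, 0.0f);
    for (int i = 0; i < hA; ++i) {
        for (int k = 0; k < wA; ++k) {
            const float a = A[static_cast<std::size_t>(i) * wA + k];
            // Innermost loop walks a row of B and a row of C contiguously.
            for (int j = 0; j < wB; ++j) {
                C[static_cast<std::size_t>(i) * wB + j] +=
                    a * B[static_cast<std::size_t>(k) * wB + j];
            }
        }
    }
}

// Times a single CPU multiplication and returns the elapsed milliseconds.
double timeCpuMatMulMsec(int hA, int wA, int wB) {
    std::vector<float> A(static_cast<std::size_t>(hA) * wA, 1.0f);
    std::vector<float> B(static_cast<std::size_t>(wA) * wB, 0.01f);
    std::vector<float> C;
    const auto start = std::chrono::steady_clock::now();
    cpuMatMul(C, A, B, hA, wA, wB);
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Calling timeCpuMatMulMsec(320, 320, 640) would use the same matrix sizes as the GPU run above, although a single un-averaged timing is only a rough indication.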