Hi
I am running VS 2015 SP3, Win7/64 SP1, CUDA 9.0, and dev driver 388.59.
I have a Quadro K620 and Titan XP in my machine.
I have a matrix multiplication kernel that I am running. When the kernel finishes, the results are validated against a matrix multiplication performed on the host CPU. The results are nearly identical.
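To make the validation step concrete, here is a minimal sketch of that kind of host-side check. It is not the actual code from DataGen.exe: the names hostMatMul and countMismatches, the square row-major layout, and the tolerance values are all assumptions made for illustration. A mixed relative/absolute tolerance is used because GPU and CPU float results usually differ in the last few bits (different summation order, fused multiply-add), so an exact comparison would report false failures.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical reference: C = A * B on the host for square,
// row-major n x n matrices (layout is an assumption).
std::vector<float> hostMatMul(const std::vector<float>& a,
                              const std::vector<float>& b,
                              std::size_t n) {
    assert(a.size() == n * n && b.size() == n * n);
    std::vector<float> c(n * n, 0.0f);
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t k = 0; k < n; ++k) {
            const float aik = a[i * n + k];
            for (std::size_t j = 0; j < n; ++j) {
                c[i * n + j] += aik * b[k * n + j];
            }
        }
    }
    return c;
}

// Hypothetical check: counts elements where
// |gpu - ref| > absTol + relTol * |ref|.
// A NaN in either input also counts as a mismatch.
std::size_t countMismatches(const std::vector<float>& gpu,
                            const std::vector<float>& ref,
                            float relTol, float absTol) {
    if (gpu.size() != ref.size()) {
        return std::max(gpu.size(), ref.size());
    }
    std::size_t bad = 0;
    for (std::size_t i = 0; i < ref.size(); ++i) {
        const float diff = std::fabs(gpu[i] - ref[i]);
        if (!(diff <= absTol + relTol * std::fabs(ref[i]))) {
            ++bad;
        }
    }
    return bad;
}
```

In use, the buffer copied back from the device would be passed as gpu and the output of hostMatMul as ref; a return value of 0 means every element agrees within the chosen tolerances (for example relTol = 1e-4f, absTol = 1e-5f, which are illustrative values only).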
When I run the app under VS 2015 in release or debug mode it works fine. All of my code has been compiled for SM61 and is running on the Titan XP.
When I go to a DOS box (command prompt) and run
nvprof --devices 0 --kernels kernelMatMult --metrics all --log-file foo.txt --csv --profile-api-trace none DataGen.exe -x KernelMatMult -bs 32,32,1 -gs 64,64,1 --numele 2048
my machine hangs and then reboots! If I request a single metric instead, for example
nvprof --devices 0 --kernels kernelMatMult --metrics ipc --log-file foo.txt --csv --profile-api-trace none DataGen.exe -x KernelMatMult -bs 32,32,1 -gs 64,64,1 --numele 2048
nvprof --devices 0 --kernels kernelMatMult --metrics flop_count_sp --log-file foo.txt --csv --profile-api-trace none DataGen.exe -x KernelMatMult -bs 32,32,1 -gs 64,64,1 --numele 2048
it works fine! Any ideas? I’ve been chasing this bug for 4 days.