I have a CUDA program that uses the OpenCV 3.2 library. It is built with Visual Studio 2015 and runs on a GTX 1080.
The program runs a single operation one thousand times and reports the average time per run.
But across a great many tests I ran into a strange problem:
- When I profile the program with “Visual Studio 2015 Nsight Performance Analysis”, the average time is about 650 ms.
- But when I run the program's .exe directly, the average time is about 720 ms.
That is to say, the .exe run directly is about 10% slower than the same program run under “Visual Studio 2015 Nsight Performance Analysis”.
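For reference, the kind of averaging measurement described above can be sketched roughly as follows. This is only a minimal host-side sketch, not my actual code: `run_operation`, `average_ms`, and `g_calls` are hypothetical names, and the busy loop merely stands in for the real CUDA/OpenCV work. It runs a few untimed warm-up iterations first, so that one-time startup costs are not counted in the average.

```cpp
#include <chrono>

// Counts how often the operation ran (only for checking the harness).
static int g_calls = 0;

// Hypothetical stand-in for the real CUDA/OpenCV operation.
// For real GPU work, this function would have to wait for the GPU to
// finish (for example with cudaDeviceSynchronize()) before returning,
// because kernel launches are asynchronous.
static void run_operation() {
    volatile double sink = 0.0;
    for (int i = 0; i < 100000; ++i) {
        sink = sink + i * 0.5;
    }
    ++g_calls;
}

// Returns the average wall-clock time per call in milliseconds.
// The first `warmup` calls are executed but not timed.
static double average_ms(void (*op)(), int warmup, int iterations) {
    for (int i = 0; i < warmup; ++i) {
        op();
    }
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) {
        op();
    }
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::milli> elapsed = stop - start;
    return elapsed.count() / iterations;
}

// Example call: double ms = average_ms(run_operation, 10, 1000);
```

Using the same loop both under the profiler and in the standalone .exe at least rules out differences in how the average itself is computed.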
Following the thread at https://devtalk.nvidia.com/default/topic/766013/performance-is-much-better-when-profling-with-nsight-than-when-running-production-code/, I checked the environment variables on my computer. I have not set the environment variables below:
So I want to know whether “Visual Studio 2015 Nsight Performance Analysis” sets some flags on the GPU that improve performance, or, if not, what causes the difference described above.
Because my program will run on the customer’s computer, I want my .exe to run as fast as it does under “Visual Studio 2015 Nsight Performance Analysis”.
Thank you very much!