I am using OpenCL on 64-bit Windows 7, compiling with Visual Studio 2008. When I run a program with the debugger attached (i.e., launched by pressing F5), it takes around 15 seconds to compile a very simple OpenCL kernel from source using clBuildProgram(). However, running the same program without the debugger attached (i.e., launched by pressing Ctrl+F5) results in a compile time of under half a second.
I am using the CUDA Toolkit v3.0 and GPU Computing SDK v3.0 with driver version 197.16. The target platform has two devices, both of which are GeForce GTX 260M cards.
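One way to compare the build time in the two launch modes is to wrap the build call in a small wall-clock timer. The following is a minimal sketch, not the code from the program described above: `time_call_ms` and `fake_build` are hypothetical names, and `fake_build` is only a stand-in so that the sketch compiles without the OpenCL SDK. It also assumes a C++11 compiler for `std::chrono`, which is newer than Visual Studio 2008.

```cpp
#include <cassert>
#include <chrono>
#include <thread>

// Hypothetical helper: runs `work` once and returns the elapsed wall-clock
// time in milliseconds, measured with a monotonic clock.
template <typename F>
double time_call_ms(F&& work) {
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    assert(stop >= start);  // steady_clock never goes backwards
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Stand-in for the real build step (assumption, for illustration only).
// In the actual program the body would be the OpenCL call, for example:
//   clBuildProgram(program, 0, NULL, NULL, NULL, NULL);
void fake_build() {
    std::this_thread::sleep_for(std::chrono::milliseconds(20));
}
```

Calling `time_call_ms(fake_build)` once under F5 and once under Ctrl+F5, with the real `clBuildProgram()` call substituted in, would give two directly comparable numbers.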
Has anyone else seen this behavior, and if so, is there a workaround to make OpenCL kernels build faster with the debugger attached?