Hi, on Linux we got some very odd results. Two separate programs that initially did not specify a workgroup size appeared, when checked with the Visual Profiler, to run with workgroups of size 1 (not even 32) and an occupancy of 0.25. Just by specifying the workgroup size explicitly, the kernels started to run over 10 times faster. What is going on here? I must say I don't really trust the OpenCL Visual Profiler on Linux, as it sometimes gives very strange results.
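For context, here is a minimal sketch of what "specifying the workgroup size" means on the host side. It is not our actual code. The helper name `round_up_global_size` and the sizes 1000 and 64 are made up for illustration. The only OpenCL call involved is `clEnqueueNDRangeKernel`: passing `NULL` as its `local_work_size` argument leaves the choice to the implementation, while passing a pointer fixes it. In OpenCL 1.x the global size must then be a multiple of the local size, hence the rounding helper.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: round the number of work-items up to the next
 * multiple of the chosen workgroup (local) size, so that the global
 * size divides evenly by the local size. */
size_t round_up_global_size(size_t n_items, size_t local_size)
{
    assert(local_size != 0);
    size_t remainder = n_items % local_size;
    return remainder == 0 ? n_items : n_items + (local_size - remainder);
}

/* Usage sketch, assuming an existing queue and kernel (not shown here):
 *
 *   size_t local_size  = 64;   // illustrative value only
 *   size_t global_size = round_up_global_size(1000, local_size);
 *
 *   // Before: local_work_size left to the implementation
 *   // clEnqueueNDRangeKernel(queue, kernel, 1, NULL,
 *   //                        &global_size, NULL, 0, NULL, NULL);
 *
 *   // After: local_work_size given explicitly
 *   // clEnqueueNDRangeKernel(queue, kernel, 1, NULL,
 *   //                        &global_size, &local_size, 0, NULL, NULL);
 */
```

Because the global size is padded, the kernel needs a bounds check on `get_global_id(0)` so the extra work-items do nothing.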
So has no one else run into this problem?