I have an ASUS ENGTX285 board. While optimizing some code, I ran openclprof (the Visual Profiler for OpenCL) and used its Analyze Occupancy function, which gave the following output:
Kernel details : Grid size: 360 x 1, Block size: 3 x 36 x 1
Register Ratio = 0.75 ( 12288 / 16384 ) [14 registers per thread]
Shared Memory Ratio = 0.1875 ( 3072 / 16384 ) [468 bytes per Block]
Active Blocks per SM = 6 : 8
Active threads per SM = 648 : 768
Occupancy = 1 ( 24 / 24 )
Achieved occupancy = 1 (on 30 SMs)
Occupancy limiting factor = None
But since the GTX285 has compute capability 1.3, shouldn’t the Visual Profiler output contain the following lines instead of the corresponding ones above?
Active threads per SM = x : 1024
Occupancy = 1 ( 32 / 32 )
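To show where my expectation comes from, here is a rough Python sketch of the occupancy arithmetic. It is only an approximation under stated assumptions: the per-block register and shared memory costs are taken from the profiler totals above divided by the 6 active blocks, the per-SM limits (8 blocks, 16384 registers, 16384 bytes of shared memory) are the ones the profiler reports, and the helper name active_blocks is my own. It does not model the hardware's allocation granularity rules.

```python
import math

WARP_SIZE = 32  # threads per warp on NVIDIA hardware


def active_blocks(threads_per_block, regs_per_block, smem_per_block,
                  max_warps, max_threads, max_blocks=8,
                  regs_per_sm=16384, smem_per_sm=16384):
    """Return (active blocks per SM, active warps per SM).

    Hypothetical helper: the block count is the tightest of the
    per-SM limits on blocks, warps, threads, registers and shared memory.
    """
    warps_per_block = math.ceil(threads_per_block / WARP_SIZE)
    blocks = min(
        max_blocks,
        max_warps // warps_per_block,
        max_threads // threads_per_block,
        regs_per_sm // regs_per_block,
        smem_per_sm // smem_per_block,
    )
    return blocks, blocks * warps_per_block


# Figures taken from the profiler output above.
threads = 3 * 36 * 1          # block size 3 x 36 x 1 = 108 threads
regs_per_block = 12288 // 6   # assumption: profiler total / 6 active blocks
smem_per_block = 3072 // 6    # assumption: profiler total / 6 active blocks

# Compare the limit the profiler reports with the one I expected for 1.3.
for label, max_warps, max_threads in [("reported", 24, 768),
                                      ("expected", 32, 1024)]:
    blocks, warps = active_blocks(threads, regs_per_block, smem_per_block,
                                  max_warps, max_threads)
    print(f"{label}: {blocks} blocks, {blocks * threads} : {max_threads} "
          f"threads, occupancy {warps} / {max_warps}")
```

Under these assumptions, the 24-warp limit reproduces the profiler's 6 blocks and 648 threads, while a 32-warp limit would allow 8 blocks (864 threads) per SM.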
Am I wrong, or is the Visual Profiler not correctly detecting the GTX285 card?
Or is my code, for some reason, not actually using the full capabilities of the graphics card?