Hey guys, I managed a full 60x speedup on the GPU for a filter I’m developing. However, upon using the Visual Profiler, it reports that the occupancy is always 0.75 in all my tests. Does this mean that at most 75% of the GPU is used? In the occupancy calculator I get 100% for my calculations…
Any help is appreciated!
Occupancy doesn’t equal performance. It’s not even a very good estimate, especially when you’re using a lot of shared memory.
An occupancy of 75% means that each multiprocessor has 75% of its maximum number of resident warps active. (The maximum is 24 warps per multiprocessor for compute capability 1.0 and 1.1 devices, and 32 for 1.2 and 1.3.) More warps means more opportunities to hide memory latency, but once you hit 50%, there is usually not much benefit to increasing it. A compute-bound kernel can easily keep a multiprocessor busy 100% of the time with only 6 warps, possibly less in some cases.
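To put numbers on that, here is a rough sketch of the arithmetic in Python. It only uses the per-multiprocessor warp limits quoted above; the function names `occupancy` and `warps_for_occupancy` are made up for illustration and are not part of any CUDA tool.

```python
# Maximum resident warps per multiprocessor, as quoted in the reply above.
MAX_WARPS_PER_SM = {"1.0": 24, "1.1": 24, "1.2": 32, "1.3": 32}


def occupancy(resident_warps, compute_capability):
    """Occupancy = resident warps / maximum resident warps per multiprocessor."""
    return resident_warps / MAX_WARPS_PER_SM[compute_capability]


def warps_for_occupancy(fraction, compute_capability):
    """Number of resident warps that corresponds to a given occupancy."""
    return fraction * MAX_WARPS_PER_SM[compute_capability]


if __name__ == "__main__":
    for cc in ("1.1", "1.3"):
        # What the profiler's 0.75 means in warps on each device class.
        print(cc, "occupancy 0.75 ->", warps_for_occupancy(0.75, cc), "warps")
        # How low the occupancy is for the 6-warp compute-bound case.
        print(cc, "6 warps ->", occupancy(6, cc), "occupancy")
```

So a reported 0.75 works out to 18 resident warps on a 1.0/1.1 device or 24 on a 1.2/1.3 device, and the 6-warp case mentioned above is only 25% or under 19% occupancy respectively.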