Kernel Occupancy: Could someone explain this?

I read and re-read the definition of occupancy in the Programming Guide, and I still don’t get it.
Could someone explain this concept to me and how it's related to overall performance?
An example would help.
Thanks!

Occupancy is the ratio of active warps to the maximum number of warps a multiprocessor can have resident at once. A nice overview is contained in this post. It is usually correlated with overall kernel performance up to the point where there are enough active warps per multiprocessor to hide as much of the latency as is possible in the execution model (which is 6 warps in current hardware). Beyond that, it rather depends on the kernel. For “light” kernels, performance will usually keep increasing with occupancy right up to an occupancy of 1. For very computationally intensive kernels with a lot of work per thread, performance and occupancy can be largely independent (V. Volkov’s matrix multiply kernels are a great example of this - they hit peak performance at very low occupancy).
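
Since an example was asked for, here is a rough sketch in Python of how the ratio comes out of a kernel's per-block resource use. The function name and the default hardware limits (32 warps, 8 blocks, 16384 registers and 16 KB of shared memory per multiprocessor) are illustrative assumptions rather than figures from this thread, and the sketch ignores register and shared memory allocation granularity, so treat the numbers as approximate and check the limits for your own device.

```python
def occupancy(threads_per_block, regs_per_thread, smem_per_block,
              max_warps_per_sm=32, max_blocks_per_sm=8,
              regs_per_sm=16384, smem_per_sm=16384, warp_size=32):
    """Estimate occupancy = active warps / maximum warps per multiprocessor.

    All hardware limits are assumed example values; pass your device's
    real limits to override them. Allocation granularity is ignored.
    """
    # A partially filled warp still occupies a whole warp slot.
    warps_per_block = -(-threads_per_block // warp_size)  # ceiling division

    # Number of blocks that fit under each separate resource limit.
    limit_warps = max_warps_per_sm // warps_per_block
    regs_per_block = regs_per_thread * warps_per_block * warp_size
    limit_regs = regs_per_sm // regs_per_block if regs_per_block else max_blocks_per_sm
    limit_smem = smem_per_sm // smem_per_block if smem_per_block else max_blocks_per_sm

    # The tightest limit decides how many blocks are resident at once.
    blocks = min(max_blocks_per_sm, limit_warps, limit_regs, limit_smem)
    return blocks * warps_per_block / max_warps_per_sm


if __name__ == "__main__":
    print(occupancy(256, 16, 4096))  # 4 blocks of 8 warps fit: prints 1.0
    print(occupancy(256, 32, 4096))  # registers allow only 2 blocks: prints 0.5
    print(occupancy(64, 10, 0))      # block limit caps it at 8 blocks of 2 warps: prints 0.5
```

The second and third calls show two different ways to lose occupancy: using more registers per thread, or launching blocks so small that the per-multiprocessor block limit is reached before the warp limit.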