A wave is the set of thread blocks that can be resident on the GPU at the same time. The number of waves per SM is calculated as the number of blocks / (max blocks per SM * the number of SMs).
In your case, with 2640 blocks and 132 SMs, each SM has to process 2640 / 132 = 20 blocks in total. A waves per SM value of 10 then implies that max blocks per SM is 20 / 10 = 2, i.e. two blocks can run in parallel on one SM.
The number of blocks that can run in parallel on an SM is determined by the resources available in the SM, mainly registers, shared memory, and warp slots, relative to what each block of your kernel needs. This value can be queried with the CUDA runtime API cudaOccupancyMaxActiveBlocksPerMultiprocessor, which takes the kernel, the block size, and the dynamic shared memory size per block.
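As a rough sketch of how the query and the wave calculation fit together, something like the following could be used. The kernel scaleKernel, the block size of 256, and the helper wavesPerSm are placeholders invented for illustration; substitute your own kernel and launch configuration. Only the grid size of 2640 comes from the question.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: it exists only so the occupancy query has a
// real function to inspect. Replace it with your own kernel.
__global__ void scaleKernel(float* data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2.0f;
}

// Hypothetical helper implementing the formula from the text:
// waves per SM = blocks / (max blocks per SM * number of SMs).
double wavesPerSm(int numBlocks, int blocksPerSm, int numSms) {
    return static_cast<double>(numBlocks) /
           (static_cast<double>(blocksPerSm) * numSms);
}

int main() {
    const int numBlocks = 2640;    // grid size from the question
    const int blockSize = 256;     // assumed threads per block
    const size_t dynamicSmem = 0;  // assumed: no dynamic shared memory

    // Look up the SM count of the current device.
    int device = 0;
    cudaDeviceProp prop;
    if (cudaGetDevice(&device) != cudaSuccess ||
        cudaGetDeviceProperties(&prop, device) != cudaSuccess) {
        fprintf(stderr, "No usable CUDA device found\n");
        return 1;
    }

    // Ask the runtime how many blocks of this kernel fit on one SM
    // for the given block size and dynamic shared memory size.
    int blocksPerSm = 0;
    cudaError_t err = cudaOccupancyMaxActiveBlocksPerMultiprocessor(
        &blocksPerSm, scaleKernel, blockSize, dynamicSmem);
    if (err != cudaSuccess) {
        fprintf(stderr, "Occupancy query failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    if (blocksPerSm == 0) {
        fprintf(stderr, "Kernel cannot be launched with this configuration\n");
        return 1;
    }

    printf("SMs: %d, max blocks per SM: %d, waves per SM: %.2f\n",
           prop.multiProcessorCount, blocksPerSm,
           wavesPerSm(numBlocks, blocksPerSm, prop.multiProcessorCount));
    return 0;
}
```

The printed numbers depend on the kernel and the device, so this placeholder kernel will most likely not report 2 blocks per SM. With your own kernel and launch parameters on a 132-SM GPU, a result of 2 blocks per SM would give 2640 / (2 * 132) = 10 waves.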