How to specify the number of SMs used in my program?

Dear all,
If I want to launch a kernel with N threads, how can I make the GPU scheduler allocate as many SMs as possible to my program?

Thanks for your reply

Other than via CUDA stream priorities, you have no control over the block scheduler in a GPU.

The heuristics of block scheduling are not published.

The GPU block scheduler will generally attempt to deliver blocks to SMs in such a way as to maximize throughput of your kernel. This generally means delivering blocks evenly to all available SMs.

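You should strive for full occupancy of the GPU. As a minimum target, this means launching kernels whose total thread count is at least 2048 * (number of SMs in your GPU). The 2048 figure is the maximum number of resident threads per SM on many GPUs; it is lower on some architectures, so check the limit for your device.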
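To make the arithmetic above concrete, here is a minimal host-side sketch. The function names `minThreadsForFullOccupancy` and `blocksNeeded` are hypothetical helpers, not part of CUDA. The SM count and per-SM thread limit are passed in as plain integers so the code runs without a GPU; in a real program you would presumably read them from the `multiProcessorCount` and `maxThreadsPerMultiProcessor` fields of `cudaDeviceProp` after calling `cudaGetDeviceProperties`.

```cpp
#include <cassert>
#include <cstdint>

// Minimum total thread count that can fill every SM to its resident-thread
// limit. Both inputs are assumptions supplied by the caller; on a real device
// they would come from cudaDeviceProp::multiProcessorCount and
// cudaDeviceProp::maxThreadsPerMultiProcessor.
std::int64_t minThreadsForFullOccupancy(int smCount, int maxThreadsPerSm) {
    assert(smCount > 0 && maxThreadsPerSm > 0);
    return static_cast<std::int64_t>(smCount) * maxThreadsPerSm;
}

// Number of blocks of blockSize threads needed to cover totalThreads,
// rounding up so a partial final block is still counted.
std::int64_t blocksNeeded(std::int64_t totalThreads, int blockSize) {
    assert(totalThreads >= 0 && blockSize > 0);
    return (totalThreads + blockSize - 1) / blockSize;
}
```

For an illustrative GPU with 80 SMs and a limit of 2048 resident threads per SM, `minThreadsForFullOccupancy(80, 2048)` gives 163840 threads, and `blocksNeeded(163840, 256)` gives 640 blocks of 256 threads. This is only a lower bound on launch size: whether those threads can all be resident at once also depends on the kernel's register and shared memory usage.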