How to use cudaOccupancyMaxActiveBlocksPerMultiprocessor from cuda-python

Using cudaOccupancyMaxActiveBlocksPerMultiprocessor in C/C++ is simple enough: one passes a pointer that receives the result, the kernel function, the block size, and the intended dynamic shared memory size per block. The kernel is passed as a function pointer.
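For reference, here is a minimal sketch of the C++ usage I have in mind. The kernel `scale`, the block size of 256, and the zero dynamic shared memory size are placeholders I made up for illustration; only the occupancy call itself matters.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel, present only so there is something to query.
__global__ void scale(float *data, float factor, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        data[i] *= factor;
    }
}

int main()
{
    int numBlocks = 0;             // output: max resident blocks per SM
    const int blockSize = 256;     // threads per block (arbitrary choice)
    const size_t dynamicSMem = 0;  // dynamic shared memory per block, in bytes

    // In C++ the kernel is passed directly as a function pointer.
    cudaError_t err = cudaOccupancyMaxActiveBlocksPerMultiprocessor(
        &numBlocks, scale, blockSize, dynamicSMem);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "occupancy query failed: %s\n",
                     cudaGetErrorString(err));
        return 1;
    }

    std::printf("max active blocks per SM: %d\n", numBlocks);
    return 0;
}
```

This builds with nvcc and needs a CUDA-capable device at run time; the value printed depends on the GPU.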

I am wondering how to do the same with the cuda-python library. Its binding for this function does not specify the type of the function argument, and I could not find any examples online. I would be glad if someone could provide an example of how to call it.