I want to write a Python extension in C++ (see "Extending Python with C or C++" in the Python 3.9.5 documentation), and I want the C++ extension to use CUDA. There aren't many how-tos on this online, and the ones I've found are fragmented and very dated. I'm not sure whether this route to using CUDA is fully supported, or what the implications are. Generic Python bindings for CUDA, such as Numba, are not an option.
I would greatly appreciate it if someone could give a breakdown of how to use CUDA in Python C/C++ extensions.