I am learning to use TensorRT. Right now I am exploring how to run inference with a .trt/.engine/.plan file, and that is how I came across Numba, PyCUDA, and the CUDA Python API.
Do I need all of them? Or when should I use each one? What are the pros and cons of each one?
I assume you are using the Python API, is that correct?
Please note that we don't officially have a CUDA Python API.
So it's recommended to use PyCUDA to explore CUDA with Python.
Numba is a compiler, so it is not directly related to CUDA usage.
In general, only PyCUDA is required when running inference with TensorRT.
Please find this sample for more information:
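As a rough illustration of that workflow, here is a minimal sketch of TensorRT inference using PyCUDA for the device buffers and copies. It is not a definitive implementation: the engine filename `model.trt`, the variable `input_data`, and the use of the pre-8.5 binding API (`get_binding_shape`, `execute_async_v2`) are assumptions that you would adapt to your engine and TensorRT version.

```python
import numpy as np
import tensorrt as trt
import pycuda.autoinit  # noqa: F401 -- creates and activates a CUDA context
import pycuda.driver as cuda

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

# Deserialize the serialized engine (.trt/.engine/.plan are the same format).
with open("model.trt", "rb") as f, trt.Runtime(TRT_LOGGER) as runtime:
    engine = runtime.deserialize_cuda_engine(f.read())

context = engine.create_execution_context()

# Allocate a pagelocked host buffer and a device buffer for every binding.
inputs, outputs, bindings = [], [], []
stream = cuda.Stream()
for binding in engine:
    size = trt.volume(engine.get_binding_shape(binding))
    dtype = trt.nptype(engine.get_binding_dtype(binding))
    host_mem = cuda.pagelocked_empty(size, dtype)
    dev_mem = cuda.mem_alloc(host_mem.nbytes)
    bindings.append(int(dev_mem))
    if engine.binding_is_input(binding):
        inputs.append((host_mem, dev_mem))
    else:
        outputs.append((host_mem, dev_mem))

# Copy the input to the device, run inference, copy the outputs back.
np.copyto(inputs[0][0], input_data.ravel())  # input_data: your preprocessed NumPy array
cuda.memcpy_htod_async(inputs[0][1], inputs[0][0], stream)
context.execute_async_v2(bindings=bindings, stream_handle=stream.handle)
for host_mem, dev_mem in outputs:
    cuda.memcpy_dtoh_async(host_mem, dev_mem, stream)
stream.synchronize()
# Each host_mem in outputs now holds a flat result array; reshape as needed.
```

Note that Numba does not appear anywhere in this flow: PyCUDA handles the context, memory, and stream, while TensorRT runs the engine.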