I would like to create multiple model instances on multiple GPUs with Python.
When using a runtime to deserialize a TRT engine, how do I assign that engine to a specific GPU?
I know I could do this with the C++ API through cudaSetDevice(), but that call does not seem to be exposed in the Python API.
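For reference, here is roughly what I am trying to do, sketched with pycuda (an assumption on my part that pycuda is the right way to pick a device from Python; `"model.engine"` and the helper name are just placeholders):

```python
def load_engine_on_device(engine_path, device_id):
    """Deserialize a TRT engine on a specific GPU by making that device's
    CUDA context current first (my guess at a Python analog of cudaSetDevice)."""
    import pycuda.driver as cuda
    import tensorrt as trt

    cuda.init()
    # Push a context for the chosen GPU so the deserialized engine lives there
    ctx = cuda.Device(device_id).make_context()
    try:
        logger = trt.Logger(trt.Logger.WARNING)
        runtime = trt.Runtime(logger)
        with open(engine_path, "rb") as f:
            engine = runtime.deserialize_cuda_engine(f.read())
        return engine, ctx  # caller must eventually ctx.pop() / ctx.detach()
    except Exception:
        ctx.pop()
        raise

# Intended usage (placeholder path):
# engine0, ctx0 = load_engine_on_device("model.engine", 0)
# engine1, ctx1 = load_engine_on_device("model.engine", 1)
```

Is something like this the recommended pattern, or is there a more direct equivalent of cudaSetDevice()?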
Any pointers would be very helpful.
TensorRT Version : 7.0
GPU Type : 1080 Ti
Nvidia Driver Version : 418.56
CUDA Version : 10.0
CUDNN Version : 7.6
Operating System + Version : Ubuntu 18.04
Python Version (if applicable) : 3.6.9