I am calibrating my model using the nvidia/tensorflow:19.02-py3 docker image, which has TRT integration in TensorFlow, on a T4 GPU. Is it possible to speed up calibration by adding more T4 GPUs to my machine, or is calibration only supported on a single GPU at this time?
By “calibration”, do you mean the actual int8 calibration process described here: https://docs.nvidia.com/deeplearning/sdk/tensorrt-developer-guide/index.html#optimizing_int8_c? If so, multi-GPU calibration is not yet supported. However, calibration is typically a one-time cost: you can save the calibration cache and reuse it for future engine creation on the same model if necessary.
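To illustrate what calibration actually computes (and why it only needs to run once), here is a minimal sketch of the core idea: derive a per-tensor dynamic range from representative activations and use it to quantize to int8. This is a simplified illustration, not TensorRT's actual entropy-calibration algorithm; the function names and the percentile-clipping heuristic are my own.

```python
import numpy as np

def int8_scale_from_calibration(activations, percentile=99.99):
    """Derive a symmetric int8 scale from calibration-set activations.

    Clipping to a high percentile (rather than the absolute max) mimics
    the outlier-rejection idea behind TensorRT's calibrators, in a very
    simplified form.
    """
    amax = np.percentile(np.abs(activations), percentile)
    return amax / 127.0

def quantize_int8(x, scale):
    """Quantize float values to int8 using the calibrated scale."""
    return np.clip(np.round(x / scale), -127, 127).astype(np.int8)
```

Once the scale (dynamic range) for each tensor is known, it is just a set of numbers, which is why the calibration cache can be saved and reused to rebuild engines without re-running calibration.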
Or are you asking whether you can run inference on your TF-TRT model across multiple GPUs? If so, I don’t believe that is possible in native TensorRT. However, you can do multi-GPU inference using TensorRT Inference Server, as mentioned in this post: https://devtalk.nvidia.com/default/topic/1043118/tensorrt/tf-trt5-how-to-run-tensorflow-tensorrt-inferences-with-multiple-gpus/
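For reference, with TensorRT Inference Server the GPUs a model runs on are set in its `config.pbtxt` via `instance_group`. A sketch (the model name and shapes are placeholders for your own model):

```
# config.pbtxt -- hypothetical model; adjust name/platform/shapes to yours
name: "my_trt_model"
platform: "tensorrt_plan"
max_batch_size: 8
instance_group [
  {
    # One instance on each listed GPU; the server load-balances
    # incoming requests across them.
    count: 1
    kind: KIND_GPU
    gpus: [ 0, 1 ]
  }
]
```

Note this replicates the model per GPU and distributes requests; it does not split a single inference across GPUs.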