Triton Inference Server, Model Analyzer


I am using the NVIDIA Triton Inference Server, and in particular its Model Analyzer, to estimate the approximate throughput of models when running on GPUs.

I find it very convenient because it produces different results and plots depending on which GPU it is run on. However, when sweeping many configuration options for a specific model, a run can sometimes take anywhere from several hours to 1-2 days.
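For context, the kind of configuration sweep I mean is launched roughly like this (the repository paths and model name below are placeholders, not my actual setup):

```shell
# Sketch of a Model Analyzer profiling run; paths and model name are examples.
model-analyzer profile \
    --model-repository /path/to/model_repository \
    --profile-models my_model \
    --output-model-repository-path /path/to/output_repository
```

With the default automatic search over instance counts, dynamic batching settings, and request concurrencies, the number of measured configurations grows quickly, which is what makes the run so long.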

Therefore, I was wondering whether there is a way to obtain these numerical results without actually running the analysis on a GPU. Is running it on the target GPU absolutely necessary for this kind of task?