I’ve been using the TensorRT Inference Server for a while now and I love it!
I’m trying to use the “model control” mode so I can schedule the loading and unloading of models myself. My models are quite large, and I don’t have enough GPU memory to hold them all at once (even combining the memory of all my GPUs); I only have room for 2 or 3 models at a time. However, I can’t know in advance which models a given task will need, so I’d like to load the models dynamically and, if possible, choose model/GPU pairings.
Is there a way to get the list of available models even if they are not loaded?
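For context, here is roughly what I had in mind, sketched with just the Python standard library. I’m assuming the HTTP/REST repository endpoints (`/v2/repository/index` and `/v2/repository/models/<name>/load`) and a server listening on `localhost:8000`; please correct me if those aren’t the right endpoints for model control mode.

```python
# Minimal sketch of driving the server's model-control endpoints.
# Assumptions: the HTTP/REST v2 repository API, server at localhost:8000.
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # assumed HTTP endpoint of the server

def model_control_path(action: str, name: str) -> str:
    # Build the load/unload path for a given model name.
    return f"/v2/repository/models/{name}/{action}"

def _post(path: str):
    # POST with an empty body; return the parsed JSON response, if any.
    req = urllib.request.Request(BASE_URL + path, data=b"", method="POST")
    with urllib.request.urlopen(req) as resp:
        body = resp.read()
        return json.loads(body) if body else None

def repository_index():
    # /v2/repository/index should list every model in the repository,
    # loaded or not, each entry carrying a "name" and a "state" field,
    # which is exactly the listing I'm after.
    return _post("/v2/repository/index")

def load_model(name: str):
    _post(model_control_path("load", name))

def unload_model(name: str):
    _post(model_control_path("unload", name))
```

With something like this I could call `repository_index()` to see what exists, then `load_model(...)`/`unload_model(...)` to juggle the 2–3 models that fit in memory.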
Is there a way to dynamically assign a GPU to a model at the moment we request to load it?
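On the GPU question: the only mechanism I’ve found so far is the static `instance_group` section of the model’s `config.pbtxt`, which pins a model’s instances to specific GPUs, but it is fixed before the load rather than chosen per load request. A fragment like the following is my current workaround (GPU index 1 is just an example):

```
# Pin this model's instances to GPU 1 via the model configuration.
instance_group [
  {
    count: 1
    kind: KIND_GPU
    gpus: [ 1 ]
  }
]
```

What I’d really like is to pick the GPU(s) dynamically when issuing the load, instead of editing the configuration beforehand.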