Description
(Disclaimer: this is a duplicate of TIS GitHub issue #8153, which has had no response since it was posted 2 weeks ago.)
In my experience with the Triton Server Python client, I haven’t found a way to get a model’s configuration without loading it first when the server is running in EXPLICIT mode. I need this because my logic for loading a specific version of a model updates the runtime model config at load time. As a workaround, I use a context that loads and unloads the model just to read its config:
with self._model_config_context(model_name) as model_cfg:
    # Pin either the latest version or one specific version.
    version_policy = (
        {"latest": {"num_versions": 1}}
        if model_version == "latest"
        else {"specific": {"versions": [model_version]}}
    )
    model_cfg.update({"version_policy": version_policy})
    if init_params:
        # Triton config parameters are passed as string values.
        model_cfg["parameters"].update(
            {k: {"string_value": str(v)} for k, v in init_params.items()}
        )
    self.logger.debug(f"Model config: {model_cfg}")
    new_model_cfg = model_cfg.copy()
# Reload the model with the overridden config.
self.client.load_model(model_name, config=json.dumps(new_model_cfg))
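For reference, the load-and-unload context mentioned above works roughly like the sketch below. This is a simplified, standalone version, not my exact code: `model_config_context` is a hypothetical name, and it assumes `client` is a `tritonclient.http.InferenceServerClient`, whose `get_model_config` returns a plain dict (the gRPC client returns a protobuf message by default).

```python
from contextlib import contextmanager


@contextmanager
def model_config_context(client, model_name):
    """Temporarily load a model so its configuration can be read.

    Assumes `client` exposes load_model, get_model_config and
    unload_model, as tritonclient.http.InferenceServerClient does.
    """
    # In EXPLICIT mode the config is only queryable once the model is loaded.
    client.load_model(model_name)
    try:
        # With the HTTP client this is a plain dict.
        yield client.get_model_config(model_name)
    finally:
        # Always unload again, even if the caller raised inside the block.
        client.unload_model(model_name)
```

The extra load/unload round trip in this helper is exactly the cost I would like to avoid.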
While loading and unloading an ONNX model is fairly fast, doing the same for Python backend models adds a lot of latency. If there is an API or method for simply reading the model configuration in EXPLICIT mode that I don’t know about, I’d be happy to hear the details. Any other opinions or suggestions are welcome too.
Thank you in advance.
Scot