How to get model configuration from HTTP API without first loading the model in EXPLICIT mode?

Description

(Disclaimer: this is a duplicate of issue #8153 on the TIS GitHub, reposted here because it has had no response in the 2 weeks since it was posted.)

In my experience using the Python client with Triton Server, I haven’t found a way to get a model’s configuration without loading it first when the server is running in EXPLICIT mode. I need the configuration because my logic for loading a specific version of a model updates the runtime model config at load time. As a workaround, I use a context manager that loads and unloads the model just to read its config.

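import json

# Fragment from inside a client wrapper method. _model_config_context is my own
# helper that loads and unloads the model in order to obtain its config.
with self._model_config_context(model_name) as model_cfg:
    # Pin either the latest version or one specific version.
    version_policy = (
        {"latest": {"num_versions": 1}}
        if model_version == "latest"
        else {"specific": {"versions": [model_version]}}
    )

    model_cfg.update({"version_policy": version_policy})
    if init_params:
        # Parameters are passed as string values; create the key if it is absent.
        model_cfg.setdefault("parameters", {}).update(
            {k: {"string_value": str(v)} for k, v in init_params.items()}
        )
    self.logger.debug(f"Model config: {model_cfg}")
    new_model_cfg = model_cfg.copy()

    # Load the model with the overridden config.
    self.client.load_model(model_name, config=json.dumps(new_model_cfg))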
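
For anyone who wants to reproduce the override step without a running server, it can be isolated into a pure function. This is only a sketch under assumptions: `build_override_config` is a hypothetical helper name (not part of tritonclient), and the base config is assumed to be a plain dict. It deep-copies the input so the caller's dict is not mutated, and it creates the `parameters` key when it is missing.

```python
import copy
import json


def build_override_config(base_cfg, model_version="latest", init_params=None):
    """Return a JSON string with version_policy and parameters overridden.

    Hypothetical helper: base_cfg is assumed to be a dict-shaped model config.
    """
    # Work on a deep copy so nested dicts in base_cfg stay untouched.
    cfg = copy.deepcopy(base_cfg)

    if model_version == "latest":
        cfg["version_policy"] = {"latest": {"num_versions": 1}}
    else:
        cfg["version_policy"] = {"specific": {"versions": [model_version]}}

    if init_params:
        # Create "parameters" if absent, then add each value as a string.
        params = cfg.setdefault("parameters", {})
        for key, value in init_params.items():
            params[key] = {"string_value": str(value)}

    return json.dumps(cfg)


if __name__ == "__main__":
    base = {"name": "my_model", "backend": "python"}
    print(build_override_config(base, model_version=3, init_params={"threshold": 0.5}))
```

The returned string is what would be passed as the `config` argument of `load_model`, as in the snippet above. It does not remove the need to obtain the base config in the first place, which is still the open question.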

While loading and unloading an ONNX model can be quite fast, doing the same for Python backend models adds a lot of latency. If there is an API or method for reading the model configuration in EXPLICIT mode that I don’t know about, I’d be happy to hear the details. Any other opinions or suggestions are welcome too.

Thank you in advance.
Scot

Hi @scotrraaj.gopal ,
Thank you for highlighting this.
I will escalate this to the respective team to get proper attention.
