I ran into an issue while following this quick-start guide for Morpheus:
https://docs.nvidia.com/morpheus/morpheus_quickstart_guide.html
It happened when running the following step:
(mlflow) root@mlflow-6d98:/mlflow# mlflow deployments create -t triton \
--flavor triton \
--name sid-minibert-onnx \
-m models:/sid-minibert-onnx/1 \
-C "version=1"
The error I got was:
Copied /mlflow/artifacts/0/559ba3818c4a418dacce5c425fb74f25/artifacts/triton/sid-minibert-onnx to /common/triton-model-repo/sid-minibert-onnx
Saved mlflow-meta.json to /common/triton-model-repo/sid-minibert-onnx
Traceback (most recent call last):
File "/opt/conda/envs/mlflow/lib/python3.8/site-packages/mlflow_triton/deployments.py", line 109, in create_deployment
self.triton_client.load_model(name)
File "/opt/conda/envs/mlflow/lib/python3.8/site-packages/tritonclient/http/__init__.py", line 622, in load_model
_raise_if_error(response)
File "/opt/conda/envs/mlflow/lib/python3.8/site-packages/tritonclient/http/__init__.py", line 64, in _raise_if_error
raise error
tritonclient.utils.InferenceServerException: failed to load 'sid-minibert-onnx', no version is available
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/conda/envs/mlflow/bin/mlflow", line 8, in <module>
sys.exit(cli())
File "/opt/conda/envs/mlflow/lib/python3.8/site-packages/click/core.py", line 1128, in __call__
return self.main(*args, **kwargs)
File "/opt/conda/envs/mlflow/lib/python3.8/site-packages/click/core.py", line 1053, in main
rv = self.invoke(ctx)
File "/opt/conda/envs/mlflow/lib/python3.8/site-packages/click/core.py", line 1659, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/opt/conda/envs/mlflow/lib/python3.8/site-packages/click/core.py", line 1659, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/opt/conda/envs/mlflow/lib/python3.8/site-packages/click/core.py", line 1395, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/opt/conda/envs/mlflow/lib/python3.8/site-packages/click/core.py", line 754, in invoke
return __callback(*args, **kwargs)
File "/opt/conda/envs/mlflow/lib/python3.8/site-packages/mlflow/deployments/cli.py", line 133, in create_deployment
deployment = client.create_deployment(name, model_uri, flavor, config=config_dict)
File "/opt/conda/envs/mlflow/lib/python3.8/site-packages/mlflow_triton/deployments.py", line 111, in create_deployment
raise MlflowException(str(ex))
mlflow.exceptions.MlflowException: failed to load 'sid-minibert-onnx', no version is available
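For anyone debugging the same message: Triton expects each model directory in its repository to contain at least one numerically named version subdirectory (for example sid-minibert-onnx/1), and "no version is available" appears to mean it did not find one. The sketch below is only a rough check under that assumption. The helper name has_model_version is made up for illustration, and the default path is the one from the "Copied ... to ..." line above. It only tells you what the pod you run it in can see; the Triton pod may be looking at a different volume.

```shell
# Hypothetical helper: report whether a Triton model directory contains
# at least one numerically named version subdirectory (e.g. .../1).
has_model_version() {
    local model_dir="$1"
    local entry
    for entry in "$model_dir"/*/; do
        # Strip the trailing slash and the leading path to get the bare name.
        entry="${entry%/}"
        entry="${entry##*/}"
        case "$entry" in
            # Empty, or contains a non-digit: not a version directory.
            ''|*[!0-9]*) continue ;;
            # Digits only: treat it as a version directory.
            *) return 0 ;;
        esac
    done
    return 1
}

# Path taken from the "Copied ... to ..." line in the output above.
MODEL_DIR="${MODEL_DIR:-/common/triton-model-repo/sid-minibert-onnx}"
if has_model_version "$MODEL_DIR"; then
    echo "version directory found in $MODEL_DIR"
else
    echo "no version directory found in $MODEL_DIR"
fi
```

Running the same check inside the ai-engine pod would be more telling, but the mount path there is an assumption on my part and may differ in your install.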
The fix was to run helm uninstall on all of the Morpheus-related releases and then reinstall them. After that, the deployment succeeded.
To uninstall, list the releases in the namespace and then remove each one:
root@k8master01:~# helm list --namespace $NAMESPACE
WARNING: Kubernetes configuration file is group-readable. This is insecure. Location: /root/.kube/config
WARNING: Kubernetes configuration file is world-readable. This is insecure. Location: /root/.kube/config
NAME             NAMESPACE         REVISION  UPDATED                                  STATUS    CHART                      APP VERSION
helper           davindermorpheus  1         2022-09-18 22:19:50.155588261 +0000 UTC  deployed  morpheus-sdk-client-22.06  22.06
morpheus-mlflow  davindermorpheus  1         2022-09-18 04:32:29.413680882 +0000 UTC  deployed  morpheus-mlflow-22.06      22.06
morpheus1        davindermorpheus  1         2022-09-17 04:59:47.852987934 +0000 UTC  deployed  morpheus-ai-engine-22.06   22.06
helm uninstall morpheus1 --namespace $NAMESPACE
helm uninstall morpheus-mlflow --namespace $NAMESPACE
helm uninstall helper --namespace $NAMESPACE
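If there are more releases than you want to type out, the uninstall commands can be generated from a list of names. This is just a sketch: print_uninstall_commands is a made-up helper name, and it only prints the commands so they can be reviewed before anything is removed. The fallback namespace is the one shown in the listing above.

```shell
# Hypothetical helper: read Helm release names on stdin (one per line) and
# print the matching uninstall command for each, without running anything.
print_uninstall_commands() {
    local namespace="$1"
    local release
    while IFS= read -r release; do
        # Skip blank lines.
        [ -n "$release" ] || continue
        printf 'helm uninstall %s --namespace %s\n' "$release" "$namespace"
    done
}

# Example with the three release names from the listing above.
printf '%s\n' morpheus1 morpheus-mlflow helper \
    | print_uninstall_commands "${NAMESPACE:-davindermorpheus}"
```

On a live cluster, helm list --short --namespace $NAMESPACE prints only the release names, so its output could be piped into the helper instead of the hard-coded names. Keep in mind that this covers every release in the namespace, not only the Morpheus ones.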
After reinstalling the charts, the same step succeeded:
(mlflow) root@mlflow-7889bfd95f-jw4s9:/mlflow# mlflow deployments create -t triton \
> --flavor triton \
> --name sid-minibert-onnx \
> -m models:/sid-minibert-onnx/1 \
> -C "version=1"
Copied /mlflow/artifacts/0/254ae8198afd4452b9e366a116c1c694/artifacts/triton/sid-minibert-onnx to /common/triton-model-repo/sid-minibert-onnx
Saved mlflow-meta.json to /common/triton-model-repo/sid-minibert-onnx
triton deployment sid-minibert-onnx is created
(mlflow) root@mlflow-7889bfd95f-jw4s9:/mlflow#
One more note: while I had this issue, the output of kubectl -n $NAMESPACE logs deploy/ai-engine
did not contain any of the TRITONBACKEND-related log lines.
Example from the working deployment, where those lines do appear:
I0918 23:23:05.136036 1 grpc_server.cc:4587] Started GRPCInferenceService at 0.0.0.0:8001
I0918 23:23:05.136197 1 http_server.cc:3303] Started HTTPService at 0.0.0.0:8000
I0918 23:23:05.177142 1 http_server.cc:178] Started Metrics Service at 0.0.0.0:8002
I0918 23:25:48.075752 1 model_repository_manager.cc:1191] loading: sid-minibert-onnx:1
I0918 23:25:48.180726 1 onnxruntime.cc:2466] TRITONBACKEND_Initialize: onnxruntime
I0918 23:25:48.180744 1 onnxruntime.cc:2476] Triton TRITONBACKEND API version: 1.10
I0918 23:25:48.180814 1 onnxruntime.cc:2482] 'onnxruntime' TRITONBACKEND API version: 1.10
I0918 23:25:48.180818 1 onnxruntime.cc:2512] backend configuration:
{"cmdline":{"auto-complete-config":"false","min-compute-capability":"6.000000","backend-directory":"/opt/tritonserver/backends","default-max-batch-size":"4"}}
I0918 23:25:48.191147 1 onnxruntime.cc:2568] TRITONBACKEND_ModelInitialize: sid-minibert-onnx (version 1)
I0918 23:25:48.193377 1 onnxruntime.cc:2611] TRITONBACKEND_ModelInstanceInitialize: sid-minibert-onnx (GPU device 0)
2022-09-18 23:25:49.338904987 [W:onnxruntime:log, tensorrt_execution_provider.h:59 log] [2022-09-18 23:25:49 WARNING] external/onnx-tensorrt/onnx2trt_utils.cpp:364: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
2022-09-18 23:25:49.470422138 [W:onnxruntime:log, tensorrt_execution_provider.h:59 log] [2022-09-18 23:25:49 WARNING] Output type must be INT32 for shape outputs
2022-09-18 23:25:49.967791996 [W:onnxruntime:log, tensorrt_execution_provider.h:59 log] [2022-09-18 23:25:49 WARNING] Output type must be INT32 for shape outputs
I0918 23:25:49.972195 1 model_repository_manager.cc:1345] successfully loaded 'sid-minibert-onnx' version 1
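A quick way to look for these lines is to filter the log text. The sketch below assumes the log looks like the example above. The helper name check_triton_log is made up, and the two strings it searches for are copied from that example rather than from any Triton documentation.

```shell
# Hypothetical helper: read Triton server log text on stdin and report whether
# the backend initialization and model load lines are present.
check_triton_log() {
    local log
    log="$(cat)"
    if printf '%s\n' "$log" | grep -q 'TRITONBACKEND_Initialize'; then
        echo "backend initialized"
    else
        echo "no TRITONBACKEND_Initialize line found"
    fi
    if printf '%s\n' "$log" | grep -q 'successfully loaded'; then
        echo "model loaded"
    else
        echo "no 'successfully loaded' line found"
    fi
}

# Example with two lines copied from the log above.
printf '%s\n' \
    "I0918 23:25:48.180726 1 onnxruntime.cc:2466] TRITONBACKEND_Initialize: onnxruntime" \
    "I0918 23:25:49.972195 1 model_repository_manager.cc:1345] successfully loaded 'sid-minibert-onnx' version 1" \
    | check_triton_log
```

Against a real cluster, the input would come from kubectl -n $NAMESPACE logs deploy/ai-engine piped into check_triton_log.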
Thanks