NVIDIA Morpheus runtime error: Model is not ready

An exception occurs in the pipeline, in morpheus/pipeline/pipeline.py:

[2025-05-12 15:58:02,163] {morpheus.pipeline.pipeline:407} ERROR - Exception occurred in pipeline. Rethrowing
Traceback (most recent call last):
File "/opt/conda/envs/morpheus/lib/python3.10/site-packages/morpheus/pipeline/pipeline.py", line 405, in post_start
await executor.join_async()
File "/opt/conda/envs/morpheus/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
return future.result()
File "/opt/conda/envs/morpheus/lib/python3.10/asyncio/tasks.py", line 650, in _wrap_awaitable
return (yield from awaitable.__await__())
RuntimeError: Model is not ready

How do I ensure that the model on ai-engine is ready?
I am using Morpheus 24.03.02.

Does the Morpheus API require models to be stored in /common/models/?

Currently my models sit (on the Triton server) at /common/triton-model-repo/.

I have verified that the models are loaded and ready. Starting the server with tritonserver --model-repository=/common/triton-model-repo prints the model status table; one gets:

+--------------------------------+---------+--------+
| Model                          | Version | Status |
+--------------------------------+---------+--------+
| mymodel1                       | 1       | READY  |
| mymodel2                       | 1       | READY  |
| ... etc                        | 1       | READY  |
+--------------------------------+---------+--------+

Hi,

Thank you for reaching out and inquiring about the issue you’re observing.

I looked into the issue and discussed it with one of our developers. Could you please help us understand a few things:

  • Is Triton running in a Docker container and Morpheus in another container?
  • Have you checked if Morpheus can communicate with Triton? If not, you can verify their communication by running the following commands (replacing localhost:8000 with the URL of the Triton server), or programmatically as in the sketch after this list:
    • curl -v "localhost:8000/v2/health/live"
    • curl "localhost:8000/v2/models/<model name>/config" (replace <model name> with the name of your model)
  • Is the configured URL for Triton correct, and does the model name match?
  • Does the pipeline you’re using include the Triton stage? The traceback doesn’t seem to be Triton-specific.
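
If it helps, the same checks can be done programmatically from inside the Morpheus container. This is only a minimal sketch: tritonclient is assumed to be available (it is pulled in as a Morpheus dependency), and the URL and model name are placeholders to replace with your own.

# Minimal Triton readiness probe (URL and model name are placeholders).
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="ai-engine:8000")  # same URL the pipeline uses

print("server live: ", client.is_server_live())
print("server ready:", client.is_server_ready())
print("model ready: ", client.is_model_ready("mymodel1"))

# The model config Morpheus has to match (max_batch_size, input/output names and dtypes).
print(client.get_model_config("mymodel1"))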

Triton is running on an EKS cluster. Triton and Morpheus each run in their own container.
Yes. Consistent with the output I posted above showing that the model is ready: I get "< HTTP/1.1 200 OK", and I get output like this (truncated):
eshold":0,“eager_batching”:false},“instance_group”:[{“name”:“nd_rf_severity_regressor_2.0.0”,“kind”:“KIND_GPU”,“count”:1,“gpus”:[0,1,2,3],“secondary_devices”:,“profile”:,“passive”:false,“host_policy”:“”}],“default_model_filename”:“”,“cc_model_filenames”:{},“metric_tags”:{},“parameters”:{“model_type”:{“string_value”:“treelite_checkpoint”},“output_class”:{“string_value”:“false”}},“model_warmup”:}

after running curl ai-engine:8000/v2/models/mymodelname/config.

I will check if the Triton Stage is included.

Here are the lines in our code base that reference the Triton stage:

./morpheus_pipeline/morpheus_pipeline_builder.py:from morpheus.stages.inference.triton_inference_stage import TritonInferenceStage
./morpheus_pipeline/morpheus_pipeline_builder.py: inference_stage = TritonInferenceStage(

Followed by:

./apps/morpheus_cybersphere/run-morpheus-cybersphere.py: ).add_inference_stage(
./apps/morpheus_cybersphere/run-morpheus-cybersphere.py: pipeline_builder.add_inference_stage(

In my log file, I see this, which might indicate that we are loading the models:

2025-05-14 15:39:36,663 [INFO] - Added stage: <inference-18; TritonInferenceStage(model_name=ourmodelname, server_url=ai-engine:8000, force_convert_inputs=True, use_shared_memory=True, needs_logits=None, inout_mapping=None, input_mapping=None, output_mapping=None)>
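
For reference, the stage is wired up roughly like this. This is a simplified sketch rather than our exact builder code; the model name and URL are the ones from the log line above.

# Simplified sketch of how the inference stage is added (not the exact builder code).
from morpheus.config import Config
from morpheus.pipeline import LinearPipeline
from morpheus.stages.inference.triton_inference_stage import TritonInferenceStage

config = Config()
# feature_length, model_max_batch_size, etc. are set elsewhere in the builder.

pipeline = LinearPipeline(config)
# ... source and preprocessing stages are added here ...
pipeline.add_stage(
    TritonInferenceStage(
        config,
        model_name="ourmodelname",    # as reported in the log line above
        server_url="ai-engine:8000",
        force_convert_inputs=True,
        use_shared_memory=True,
    )
)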

Hi! Notice below that the issue stems from the InferenceClientStage while processing a message, after loading a batch of data.

Pipeline Throughput: 0 events [00:12, ? events/s][2025-05-13 21:28:25,123] {morpheus_pipeline.stages.ad_data_loading_left_shift_stage:119} INFO - AD data loading runtime: 0 minutes, 1 seconds
WARNING: Logging before InitGoogleLogging() is written to STDERR
W20250513 21:28:25.165555 13789 inference_client_stage.cpp:255] Exception while processing message for InferenceClientStage, attempting retry.
W20250513 21:28:25.167411 13788 inference_client_stage.cpp:255] Exception while processing message for InferenceClientStage, attempting retry.
W20250513 21:28:25.271406 13788 inference_client_stage.cpp:255] Exception while processing message for InferenceClientStage, attempting retry.
W20250513 21:28:25.271649 13789 inference_client_stage.cpp:255] Exception while processing message for InferenceClientStage, attempting retry.
W20250513 21:28:25.486009 13788 inference_client_stage.cpp:255] Exception while processing message for InferenceClientStage, attempting retry.
W20250513 21:28:25.486114 13789 inference_client_stage.cpp:255] Exception while processing message for InferenceClientStage, attempting retry.
Pipeline Throughput: 0 events [00:13, ? events/s]W20250513 21:28:25.890973 13788 inference_client_stage.cpp:255] Exception while processing message for InferenceClientStage, attempting retry.
W20250513 21:28:25.891028 13789 inference_client_stage.cpp:255] Exception while processing message for InferenceClientStage, attempting retry.
Pipeline Throughput: 0 events [00:14, ? events/s]W20250513 21:28:26.694842 13788 inference_client_stage.cpp:255] Exception while processing message for InferenceClientStage, attempting retry.
W20250513 21:28:26.694903 13789 inference_client_stage.cpp:255] Exception while processing message for InferenceClientStage, attempting retry.
Pipeline Throughput: 0 events [00:15, ? events/s]W20250513 21:28:28.298821 13788 inference_client_stage.cpp:255] Exception while processing message for InferenceClientStage, attempting retry.
W20250513 21:28:28.298847 13789 inference_client_stage.cpp:255] Exception while processing message for InferenceClientStage, attempting retry.
Pipeline Throughput: 0 events [00:18, ? events/s]W20250513 21:28:31.503276 13788 inference_client_stage.cpp:255] Exception while processing message for InferenceClientStage, attempting retry.
W20250513 21:28:31.503441 13789 inference_client_stage.cpp:255] Exception while processing message for InferenceClientStage, attempting retry.
Pipeline Throughput: 0 events [00:22, ? events/s]W20250513 21:28:35.507351 13789 inference_client_stage.cpp:255] Exception while processing message for InferenceClientStage, attempting retry.
W20250513 21:28:35.507361 13788 inference_client_stage.cpp:255] Exception while processing message for InferenceClientStage, attempting retry.
Pipeline Throughput: 0 events [00:26, ? events/s]W20250513 21:28:39.511365 13789 inference_client_stage.cpp:255] Exception while processing message for InferenceClientStage, attempting retry.
W20250513 21:28:39.511382 13788 inference_client_stage.cpp:255] Exception while processing message for InferenceClientStage, attempting retry.
Pipeline Throughput: 0 events [00:30, ? events/s]W20250513 21:28:43.515393 13788 inference_client_stage.cpp:255] Exception while processing message for InferenceClientStage, attempting retry.
W20250513 21:28:43.515478 13789 inference_client_stage.cpp:255] Exception while processing message for InferenceClientStage, attempting retry.
Pipeline Throughput: 0 events [00:34, ? events/s]E20250513 21:28:47.522738 13788 runnable.hpp:112] /main/inference-5; rank: 0; size: 1; tid: 140020180534848 Unhandled exception occurred. Rethrowing
E20250513 21:28:47.522755 13789 runnable.hpp:112] /main/inference-6; rank: 0; size: 1; tid: 140020170044992 Unhandled exception occurred. Rethrowing
E20250513 21:28:47.522769 13788 context.cpp:124] /main/inference-5; rank: 0; size: 1; tid: 140020180534848: set_exception issued; issuing kill to current runnable. Exception msg: RuntimeError: Model is not ready

Thank you for the updates. Could you please check the following as well:

  • Double-check that you are running 24.03.02; this point release included a bug fix for the TritonInferenceStage.
  • Ensure that the config parameters being passed to the TritonInferenceStage match the model config, specifically the values of Config.model_max_batch_size and Config.feature_length.
  • Check whether the expected input type matches that of the input tensor (try setting force_convert_inputs=True for the TritonInferenceStage). Also double-check the inout_mapping to ensure the input and output tensors match what the model is expecting.
  • Once you see the errors being reported in the TritonInferenceStage, are there associated errors being emitted from the Triton Inference Server?
  • Try running with Config.num_threads=1; does this avoid the issue and/or result in better error messages?
  • If all else fails, try running in Python mode by setting morpheus.config.CppConfig.set_should_use_cpp(False) and/or setting the environment variable MORPHEUS_NO_CPP=1 (see the sketch after this list). This probably won't fix the problem, but it could result in a better error message.
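
A minimal sketch of the settings from the last three bullets; the model_max_batch_size and feature_length values below are placeholders that must be taken from your Triton model config:

# Sketch: single-threaded run, config values matched to the model, Python mode.
from morpheus.config import Config, CppConfig

config = Config()
config.num_threads = 1             # simplify scheduling and error output
config.model_max_batch_size = 32   # placeholder: must match max_batch_size in the Triton model config
config.feature_length = 256        # placeholder: must match the model's expected input length

# Fall back to the Python implementation for clearer error messages
# (or export MORPHEUS_NO_CPP=1 before launching the pipeline).
CppConfig.set_should_use_cpp(False)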