TRITON error when uploading a custom PyTorch inference to Clara v4.0

Hello,

I am trying to upload my custom model to Clara v4.0, following the “Bring your own Inference” example in the Clara Train SDK v4.0 documentation, and I am getting the following error:

[2022-01-24 18:09:03] [INFO] (aiaa.actions.model_import) - Model Config: {
… my model config …
[2022-01-24 18:09:03] },
[2022-01-24 18:09:03] "format": "PT",
[2022-01-24 18:09:03] "path": "/workspace/models/my_model.ts"
[2022-01-24 18:09:03] }
[2022-01-24 18:09:03] [ERROR] (aiaa.www.api.api_admin) - 'TRITON'
[2022-01-24 18:09:03] Traceback (most recent call last):
[2022-01-24 18:09:03] File "/opt/conda/lib/python3.8/site-packages/flask/app.py", line 1950, in full_dispatch_request
[2022-01-24 18:09:03] rv = self.dispatch_request()
[2022-01-24 18:09:03] File "/opt/conda/lib/python3.8/site-packages/flask/app.py", line 1936, in dispatch_request
[2022-01-24 18:09:03] return self.view_functions[rule.endpoint](**req.view_args)
[2022-01-24 18:09:03] File "www/api/api_admin.py", line 201, in admin_model_load
[2022-01-24 18:09:03] File "www/api/server_context.py", line 32, in put_model
[2022-01-24 18:09:03] File "actions/inference_engine.py", line 32, in __init__
[2022-01-24 18:09:03] File "inference/inference_utils.py", line 102, in init_inference
[2022-01-24 18:09:03] File "configs/modelconfig.py", line 118, in get_inference_name
[2022-01-24 18:09:03] File "configs/modelconfig.py", line 115, in get_inference
[2022-01-24 18:09:03] KeyError: 'TRITON'

It looks like I need to add TRITON somewhere, which I didn’t have to do before (I uploaded successfully to v3 and to a pre-release of v4 that didn’t use Docker Compose). How should I do this?

Also, is there any way to upload a new model without a TS file, as was possible in the past? Can you give an example of how to do it?

Many thanks,
Lorena

Hi

There are multiple things going on here, so let me try to break it down.

  • Clara v4 uses MONAI-based PyTorch, whereas v3 used TensorFlow. That is why we now need a TS file, which can easily be generated using export.sh.
  • Clara v3 had Triton inside our Docker image. In v4 this dependency was moved outside, which is why you need to use Docker Compose to have both containers up and connected to each other. Please refer to our notebooks to launch Clara Train with Triton: startClaraTrainNoteBooks.sh on the master branch of the NVIDIA/clara-train-examples repository on GitHub.
  • You can launch AIAA without Triton by setting the engine to AIAA.

Hope that helps

Hello,

Thank you for your answer, but it doesn’t help to solve the problem.

1 - I know about the change to TS in v4. I am asking whether it’s still possible to upload a model without the TS file in v4; I recall reading in the documentation that it was possible in the past.
2 - I know v4 uses Docker Compose with Triton in a separate container, unlike v3. Both containers are working, my AIAA is up and running, and other NVIDIA models from NGC have been uploaded and are running normally.
3 - But do I actually need to launch AIAA without Triton to run a custom model? That would not be desirable, because I also want to be able to run NGC models using Triton at the same time.

Any ideas what is producing the error and how to fix it?

Thanks,
Lorena

Hi Lorena,

Thanks for trying it out and for the detailed information.

In the 4.0 release, we moved the Triton inference server out of the Clara Train base image, which is why we suggest the docker-compose setup if users want to run the Triton backend.

In your use case, we would recommend you just run the AIAA backend.

To run an AIAA server using the AIAA backend, you can use steps similar to what you did before (so there is no need for docker-compose).

The steps would be:

  • First, start the Docker container (remember to pass resource flags such as --gpus).
  • Then, inside the container, run “start_aiaa.sh --workspace [your AIAA workspace] --engine AIAA”.
  • After that, you should be able to follow the example: https://docs.nvidia.com/clara/clara-train-sdk/aiaa/byom/byoi.html
  • You can also include your own segmentation model.
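
Before uploading, it may help to check which engines the model config actually covers. The following is only a minimal sketch: it assumes, based solely on the traceback above (where get_inference fails with KeyError: 'TRITON'), that the config keeps one entry per engine under an "inference" key. The helper name missing_engines and the sample config are hypothetical and are not part of the AIAA API.

```python
import json


def missing_engines(config, engines=("AIAA", "TRITON")):
    """Return the engine names that have no entry under config["inference"].

    Assumption: the model config stores per-engine settings as
    config["inference"][<engine name>]. This layout is inferred from the
    KeyError in the traceback, not taken from the AIAA source.
    """
    inference = config.get("inference", {})
    return [name for name in engines if name not in inference]


# Hypothetical config fragment that only defines an AIAA entry.
sample = json.loads("""
{
  "inference": {
    "AIAA": {"name": "my_package.MyInference", "args": {}}
  }
}
""")

print(missing_engines(sample))
```

If 'TRITON' shows up as missing while the server was started with the Triton engine, a lookup like the one in the traceback would be expected to fail; starting with --engine AIAA sidesteps that lookup.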

Thanks