Hi there,
I have been following the tutorial for the brain segmentation PyTorch model.
All went well until step 5, which ends with a timeout message:
curl -X PUT "http://0.0.0.0:5000/admin/model/segmentation_2d_brain" -F "config=@segmentation_2d_brain.json;type=application/json" -F "data=@unet.pt"
504 Gateway Timeout
The gateway did not receive a timely response from the upstream server or application.
Apache/2.4.29 (Ubuntu) Server at 0.0.0.0 Port 5000
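In case the quoting matters: the `;type=application/json` part has to reach curl as part of the `config` form field, so the shell must not treat the semicolon as a command separator. Below is a minimal sketch of how the same request could be built as an argument list in Python. The `build_upload_command` helper is a hypothetical name of my own, not part of AIAA, and it only constructs and prints the command without running it:

```python
import shlex


def build_upload_command(host, model_name, config_path, model_path):
    """Build the curl argument list for the model upload (hypothetical helper)."""
    # The MIME type is attached to the config field with ';', so the whole
    # string must stay one argument.
    config_field = f"config=@{config_path};type=application/json"
    data_field = f"data=@{model_path}"
    return [
        "curl", "-X", "PUT",
        f"{host}/admin/model/{model_name}",
        "-F", config_field,
        "-F", data_field,
    ]


command = build_upload_command(
    "http://0.0.0.0:5000",
    "segmentation_2d_brain",
    "segmentation_2d_brain.json",
    "unet.pt",
)

# shlex.join quotes any argument containing shell-special characters,
# so the printed line can be pasted into a shell safely.
print(shlex.join(command))
```

Passing the list to `subprocess.run(command)` would avoid the shell entirely, which sidesteps the quoting question. This only covers how the request is formed; it does not by itself explain the server-side timeout.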
And in the AIAA server logs:
[2020-12-10 18:09:38] [INFO] (nvmidl.apps.aas.actions.model_import_trtis) - Trying to convert to ProtoBuf:
[2020-12-10 18:09:38] {"platform": "pytorch_libtorch", "max_batch_size": 1, "input": [{"name": "INPUT__0", "data_type": "TYPE_FP32", "dims": [3, 256, 256]}], "output": [{"name": "OUTPUT__0", "data_type": "TYPE_FP32", "dims": [1, 256, 256]}], "instance_group": [{"count": 1, "gpus": [0], "kind": "KIND_AUTO"}]}
[2020-12-10 18:09:38]
[2020-12-10 18:09:38] [INFO] (nvmidl.apps.aas.actions.model_import_trtis) - Result: platform: "pytorch_libtorch"
[2020-12-10 18:09:38] max_batch_size: 1
[2020-12-10 18:09:38] input {
[2020-12-10 18:09:38] name: "INPUT__0"
[2020-12-10 18:09:38] data_type: TYPE_FP32
[2020-12-10 18:09:38] dims: 3
[2020-12-10 18:09:38] dims: 256
[2020-12-10 18:09:38] dims: 256
[2020-12-10 18:09:38] }
[2020-12-10 18:09:38] output {
[2020-12-10 18:09:38] name: "OUTPUT__0"
[2020-12-10 18:09:38] data_type: TYPE_FP32
[2020-12-10 18:09:38] dims: 1
[2020-12-10 18:09:38] dims: 256
[2020-12-10 18:09:38] dims: 256
[2020-12-10 18:09:38] }
[2020-12-10 18:09:38] instance_group {
[2020-12-10 18:09:38] count: 1
[2020-12-10 18:09:38] gpus: 0
[2020-12-10 18:09:38] }
[2020-12-10 18:09:38]
[2020-12-10 18:10:57] AH01382: Request header read timeout
[2020-12-10 18:13:04] AH01382: Request header read timeout
[2020-12-10 18:14:02] AH01382: Request header read timeout
[2020-12-10 18:14:38] Timeout when reading response headers from daemon process 'AIAA_Admin': /opt/nvidia/medical/nvmidl/apps/aas/www/api_admin.wsgi
My AIAA server uses the Docker v3.1 image (nvcr.io/nvidia/clara-train-sdk:v3.1.01), and I can upload other models from the catalogue without problems, for example clara_seg_liver_amp. The server is started with /var/lib/aiaa/ as /workspace/, and transforms.py is copied to /var/lib/aiaa/lib/.
Any idea what is happening?
Thanks for your help!
Lorena