PyTorch Tutorial Timeout

Hi there,

I have been following the tutorial for the brain segmentation PyTorch model.

All good until step 5, which ends with a timeout message:

curl -X PUT -F config=@segmentation_2d_brain.json;type=application/json -F

504 Gateway Timeout

The gateway did not receive a timely response from the upstream server or application.

Apache/2.4.29 (Ubuntu) Server at Port 5000

And in the AIAA server logs:
[2020-12-10 18:09:38] [INFO] (nvmidl.apps.aas.actions.model_import_trtis) - Trying to convert to ProtoBuf:
[2020-12-10 18:09:38] {"platform": "pytorch_libtorch", "max_batch_size": 1, "input": [{"name": "INPUT__0", "data_type": "TYPE_FP32", "dims": [3, 256, 256]}], "output": [{"name": "OUTPUT__0", "data_type": "TYPE_FP32", "dims": [1, 256, 256]}], "instance_group": [{"count": 1, "gpus": [0], "kind": "KIND_AUTO"}]}
[2020-12-10 18:09:38]
[2020-12-10 18:09:38] [INFO] (nvmidl.apps.aas.actions.model_import_trtis) - Result: platform: "pytorch_libtorch"
[2020-12-10 18:09:38] max_batch_size: 1
[2020-12-10 18:09:38] input {
[2020-12-10 18:09:38]   name: "INPUT__0"
[2020-12-10 18:09:38]   data_type: TYPE_FP32
[2020-12-10 18:09:38]   dims: 3
[2020-12-10 18:09:38]   dims: 256
[2020-12-10 18:09:38]   dims: 256
[2020-12-10 18:09:38] }
[2020-12-10 18:09:38] output {
[2020-12-10 18:09:38]   name: "OUTPUT__0"
[2020-12-10 18:09:38]   data_type: TYPE_FP32
[2020-12-10 18:09:38]   dims: 1
[2020-12-10 18:09:38]   dims: 256
[2020-12-10 18:09:38]   dims: 256
[2020-12-10 18:09:38] }
[2020-12-10 18:09:38] instance_group {
[2020-12-10 18:09:38]   count: 1
[2020-12-10 18:09:38]   gpus: 0
[2020-12-10 18:09:38] }
[2020-12-10 18:09:38]
[2020-12-10 18:10:57] AH01382: Request header read timeout
[2020-12-10 18:13:04] AH01382: Request header read timeout
[2020-12-10 18:14:02] AH01382: Request header read timeout
[2020-12-10 18:14:38] Timeout when reading response headers from daemon process 'AIAA_Admin': /opt/nvidia/medical/nvmidl/apps/aas/www/api_admin.wsgi

My AIAA server is using the docker v3.1 image, and I can successfully upload other models from the catalogue without problems, for example clara_seg_liver_amp. It is started with /var/lib/aiaa/ mounted as /workspace/, and the model file is copied to /var/lib/aiaa/lib/.

Any idea what is happening?
Thanks for your help!

Hi Lorena,

So I just tried the tutorial. One thing that we forgot to mention there is that whenever you put something in that lib folder, you need to restart the AIAA server for it to pick those files up.

So if you just restart your AIAA server and retry step 5, you should be good.


Hi Yuan-Ting,

Thanks for your reply!

Unfortunately restarting the AIAA server didn't work. Looking at the triton.log, it seemed there was a PyTorch version mismatch. So I recreated the model using the 20.08 PyTorch container as recommended and managed to complete the tutorial :)

I was a bit surprised there is so much difference, as my local version of PyTorch is 1.7.0 and the one in the container is 1.7.0a0+8deb4fe (the error message from using my local version is below). Does this mean we can only bring models developed with PyTorch 1.7.0a0+8deb4fe? Otherwise, is there a way of changing the Triton version while still using the latest AIAA container?


E1218 14:08:30.076874 70] failed to load 'segmentation_2d_brain' version 3: Internal: load failed for libtorch model -> 'segmentation_2d_brain':

aten::_convolution(Tensor input, Tensor weight, Tensor? bias, int stride, int padding, int dilation, bool transposed, int output_padding, int groups, bool benchmark, bool deterministic, bool cudnn_enabled) -> (Tensor):

Expected at most 12 arguments but found 13 positional arguments.
Serialized File "code/torch/torch/nn/modules/", line 8
def forward(self: torch.torch.nn.modules.conv.Conv2d,
    input: Tensor) -> Tensor:
  input0 = torch._convolution(input, self.weight, None, [1, 1], [1, 1], [1, 1], False, [0, 0], 1, False, False, True, True)
           ~~~~~~~~~~~~~~~~~~ <--- HERE
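
For anyone hitting the same error: "Expected at most 12 arguments but found 13 positional arguments" on `aten::_convolution` is the typical sign that the TorchScript file was serialized with a newer PyTorch than the libtorch build inside the serving container (the op gained an extra argument between versions). A minimal sketch of re-exporting inside the matching NGC container; the placeholder network and the exact image tag are my assumptions, not part of the tutorial:

```python
import torch
import torch.nn as nn

# Run this *inside* the NGC PyTorch container that matches the AIAA/Triton
# build (e.g. nvcr.io/nvidia/pytorch:20.08-py3), so the serialized ops match.
print(torch.__version__)  # should report the container's build

# Placeholder network standing in for the real 2D brain segmentation model.
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv2d(8, 1, kernel_size=1),
).eval()

# Trace with a dummy input matching the config dims [3, 256, 256]
# (plus the batch dimension, since max_batch_size is 1).
example = torch.randn(1, 3, 256, 256)
traced = torch.jit.trace(model, example)
torch.jit.save(traced, "segmentation_2d_brain.pt")

# Sanity check: reload the serialized file and compare against the eager model.
reloaded = torch.jit.load("segmentation_2d_brain.pt")
assert torch.allclose(model(example), reloaded(example), atol=1e-6)
```

The key point is simply that `torch.jit.save` must run under the same (or a compatible) PyTorch version as the libtorch runtime that later calls `torch.jit.load`.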

Hi Lorena,
Thanks for debugging and finding the root cause of the problem you faced above.

Clara Train 4.x will bring complete PyTorch support soon. Currently it's in the testing/early-access stage; hopefully a release candidate will be available to the general public in a couple of weeks. Otherwise, the Clara 3.x version has very limited support for PyTorch, as you are currently experiencing.

However, in the 3.x version you can bring your own inference logic and not depend on Triton.
And when you load the model you can set the native flag to true (try the APIs through

And you can upgrade the version of PyTorch in a docker of your choice and not rely on Triton for the time being.
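
Bringing your own inference logic essentially means loading the TorchScript file yourself instead of letting Triton serve it. A minimal sketch of such native inference; the file name follows the tutorial, but the sigmoid/threshold post-processing is my assumption about this model's output head, not something stated in the thread:

```python
import torch

def segment(image: torch.Tensor,
            model_path: str = "segmentation_2d_brain.pt") -> torch.Tensor:
    """Run native-PyTorch inference on one image, bypassing Triton.

    `image` is expected to be (1, 3, 256, 256) per the model config above;
    the sigmoid + 0.5 threshold is an assumed post-processing step.
    """
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = torch.jit.load(model_path, map_location=device).eval()
    with torch.no_grad():
        logits = model(image.to(device))           # (1, 1, 256, 256) per the config
        mask = (torch.sigmoid(logits) > 0.5).to(torch.uint8)
    return mask.cpu()
```

Because this only needs `torch.jit.load`, the PyTorch version in your own container just has to be able to read the serialized file, with no dependency on the libtorch build bundled with Triton.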