UNAVAILABLE: Invalid argument: model 'main_model_0_0_gpu0', tensor 'confs': the model expects 3 dimensions (shape [1,16128,80]) but the model configuration specifies 3 dimensions (an initial batch dimension because max_batch_size > 0 followed by the explicit tensor shape, making complete shape [-1,16128,2])
If I change
dims: [16128, 2]
to
dims: [16128, 80]
the model-loading failure goes away, but I hit another issue.
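For reference, the output block in question in the model's `config.pbtxt` would then look something like this (tensor name and shape taken from the error message above; the data type is an assumption, adjust to your model):

```protobuf
output [
  {
    name: "confs"
    data_type: TYPE_FP32
    dims: [ 16128, 80 ]
  }
]
```

Since max_batch_size > 0, Triton prepends an implicit batch dimension, so the complete shape becomes [-1, 16128, 80], matching what the engine reports.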
features= <Gst.CapsFeatures object at 0x7fb59f4af5e0 (GstCapsFeatures at 0x7fb4f802e9a0)>
scores.inferDims.d[1]:80
python3: nvdsparsebbox_Yolo.cpp:145: bool NvDsInferParseCustomYoloV4(const std::vector&, const NvDsInferNetworkInfo&, const NvDsInferParseDetectionParams&, std::vector&): Assertion `detectionParams.numClassesConfigured == scores.inferDims.d[1]' failed.
Aborted (core dumped)
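The assertion at nvdsparsebbox_Yolo.cpp:145 compares the class count configured on the DeepStream side (detectionParams.numClassesConfigured) against the last dimension of the scores tensor, so the postprocess configuration has to be updated to 80 classes as well. A minimal sketch of the relevant part of an nvinferserver config (field names assumed from the nvdsinferserver proto; adjust to your actual config file):

```protobuf
infer_config {
  postprocess {
    detection {
      num_detected_classes: 80
      custom_parse_bbox_func: "NvDsInferParseCustomYoloV4"
    }
  }
}
```

If the pipeline uses gst-nvinfer instead of nvinferserver, the equivalent setting would be num-detected-classes=80 in the [property] group of its config file.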
Starting Triton…
I0524 07:48:22.406852 19259 metrics.cc:290] Collecting metrics for GPU 0: Tesla T4
I0524 07:48:22.770141 19259 libtorch.cc:1029] TRITONBACKEND_Initialize: pytorch
I0524 07:48:22.770183 19259 libtorch.cc:1039] Triton TRITONBACKEND API version: 1.4
I0524 07:48:22.770189 19259 libtorch.cc:1045] 'pytorch' TRITONBACKEND API version: 1.4
HTTPConnectionPool(host='localhost', port=8000): Max retries exceeded with url: /v2/models/main_model/ready (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7eff4df86100>: Failed to establish a new connection: [Errno 111] Connection refused'))
2022-05-24 16:48:22.950463: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
I0524 07:48:23.003441 19259 tensorflow.cc:2169] TRITONBACKEND_Initialize: tensorflow
I0524 07:48:23.003476 19259 tensorflow.cc:2179] Triton TRITONBACKEND API version: 1.4
I0524 07:48:23.003484 19259 tensorflow.cc:2185] 'tensorflow' TRITONBACKEND API version: 1.4
I0524 07:48:23.003495 19259 tensorflow.cc:2209] backend configuration:
{}
I0524 07:48:23.005835 19259 onnxruntime.cc:1970] TRITONBACKEND_Initialize: onnxruntime
I0524 07:48:23.005868 19259 onnxruntime.cc:1980] Triton TRITONBACKEND API version: 1.4
I0524 07:48:23.005876 19259 onnxruntime.cc:1986] 'onnxruntime' TRITONBACKEND API version: 1.4
I0524 07:48:23.029609 19259 openvino.cc:1193] TRITONBACKEND_Initialize: openvino
I0524 07:48:23.029637 19259 openvino.cc:1203] Triton TRITONBACKEND API version: 1.4
I0524 07:48:23.029643 19259 openvino.cc:1209] 'openvino' TRITONBACKEND API version: 1.4
I0524 07:48:23.749998 19259 pinned_memory_manager.cc:240] Pinned memory pool is created at '0x7f0eb0000000' with size 268435456
I0524 07:48:23.752780 19259 cuda_memory_manager.cc:105] CUDA memory pool is created on device 0 with size 67108864
I0524 07:48:23.759485 19259 model_repository_manager.cc:1045] loading: main_model:1
HTTPConnectionPool(host='localhost', port=8000): Max retries exceeded with url: /v2/models/main_model/ready (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7eff4df86910>: Failed to establish a new connection: [Errno 111] Connection refused'))
I0524 07:48:24.910203 19259 logging.cc:49] [MemUsageChange] Init CUDA: CPU +320, GPU +0, now: CPU 577, GPU 5768 (MiB)
I0524 07:48:24.911742 19259 logging.cc:49] Loaded engine size: 137 MB
I0524 07:48:24.911829 19259 logging.cc:49] [MemUsageSnapshot] deserializeCudaEngine begin: CPU 577 MiB, GPU 5768 MiB
HTTPConnectionPool(host='localhost', port=8000): Max retries exceeded with url: /v2/models/main_model/ready (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7eff4dee7100>: Failed to establish a new connection: [Errno 111] Connection refused'))
W0524 07:48:25.369252 19259 logging.cc:46] Using an engine plan file across different models of devices is not recommended and is likely to affect performance or even cause errors.
HTTPConnectionPool(host='localhost', port=8000): Max retries exceeded with url: /v2/models/main_model/ready (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7eff4dee78b0>: Failed to establish a new connection: [Errno 111] Connection refused'))
I0524 07:48:26.625039 19259 logging.cc:49] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +491, GPU +212, now: CPU 1086, GPU 6108 (MiB)
HTTPConnectionPool(host='localhost', port=8000): Max retries exceeded with url: /v2/models/main_model/ready (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7eff4def10a0>: Failed to establish a new connection: [Errno 111] Connection refused'))
I0524 07:48:27.590636 19259 logging.cc:49] [MemUsageChange] Init cuDNN: CPU +287, GPU +200, now: CPU 1373, GPU 6308 (MiB)
I0524 07:48:27.592647 19259 logging.cc:49] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +0, now: CPU 1373, GPU 6290 (MiB)
I0524 07:48:27.592764 19259 logging.cc:49] [MemUsageSnapshot] deserializeCudaEngine end: CPU 1373 MiB, GPU 6290 MiB
I0524 07:48:27.592777 19259 plan_backend.cc:456] Creating instance main_model_0_0_gpu0 on GPU 0 (7.5) using model.plan
I0524 07:48:27.602969 19259 logging.cc:49] [MemUsageSnapshot] ExecutionContext creation begin: CPU 1373 MiB, GPU 6290 MiB
I0524 07:48:27.605845 19259 logging.cc:49] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +10, now: CPU 1373, GPU 6300 (MiB)
I0524 07:48:27.608389 19259 logging.cc:49] [MemUsageChange] Init cuDNN: CPU +0, GPU +8, now: CPU 1373, GPU 6308 (MiB)
I0524 07:48:27.611137 19259 logging.cc:49] [MemUsageSnapshot] ExecutionContext creation end: CPU 1374 MiB, GPU 6484 MiB
I0524 07:48:27.615528 19259 logging.cc:49] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +0, now: CPU 1373, GPU 6340 (MiB)
E0524 07:48:27.650387 19259 model_repository_manager.cc:1215] failed to load 'main_model' version 1: Invalid argument: model 'main_model_0_0_gpu0', tensor 'confs': the model expects 3 dimensions (shape [1,16128,80]) but the model configuration specifies 3 dimensions (an initial batch dimension because max_batch_size > 0 followed by the explicit tensor shape, making complete shape [-1,16128,2])
I0524 07:48:27.650732 19259 server.cc:504]
+------------------+------+
| Repository Agent | Path |
+------------------+------+
+------------------+------+
I0524 07:48:27.650954 19259 server.cc:543]
+-------------+-----------------------------------------------------------------+--------+
| Backend | Path | Config |
+-------------+-----------------------------------------------------------------+--------+
| tensorrt | | {} |
| pytorch | /opt/tritonserver/backends/pytorch/libtriton_pytorch.so | {} |
| tensorflow | /opt/tritonserver/backends/tensorflow1/libtriton_tensorflow1.so | {} |
| onnxruntime | /opt/tritonserver/backends/onnxruntime/libtriton_onnxruntime.so | {} |
| openvino | /opt/tritonserver/backends/openvino/libtriton_openvino.so | {} |
+-------------+-----------------------------------------------------------------+--------+
I0524 07:48:27.651088 19259 server.cc:586]
+------------+---------+---------------------------------------------------------------------------------------------------------------------------------------------------+
| Model | Version | Status |
+------------+---------+---------------------------------------------------------------------------------------------------------------------------------------------------+
| main_model | 1 | UNAVAILABLE: Invalid argument: model 'main_model_0_0_gpu0', tensor 'confs': the model expects 3 dimensions (shape [1,16128,80]) but the model con |
| | | figuration specifies 3 dimensions (an initial batch dimension because max_batch_size > 0 followed by the explicit tensor shape, making complete s |
| | | hape [-1,16128,2]) |
+------------+---------+---------------------------------------------------------------------------------------------------------------------------------------------------+
I0524 07:48:27.651358 19259 tritonserver.cc:1718]
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------+
| Option | Value |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------+
| server_id | triton |
| server_version | 2.13.0 |
| server_extensions | classification sequence model_repository model_repository(unload_dependents) schedule_policy model_configuration system_shared_memory |
| | cuda_shared_memory binary_tensor_data statistics |
| model_repository_path[0] | /workspace/triton_server/ |
| model_control_mode | MODE_NONE |
| strict_model_config | 1 |
| pinned_memory_pool_byte_size | 268435456 |
| cuda_memory_pool_byte_size{0} | 67108864 |
| min_supported_compute_capability | 6.0 |
| strict_readiness | 1 |
| exit_timeout | 30 |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------+
I0524 07:48:27.651426 19259 server.cc:234] Waiting for in-flight requests to complete.
I0524 07:48:27.651454 19259 server.cc:249] Timeout 30: Found 0 live models and 0 in-flight non-inference requests
error: creating server: Internal - failed to load all models