cudaMalloc failed, cuda err_no:2, err_str:cudaErrorMemoryAllocation and INVALID_CONFIG: Deserialize the cuda engine failed.

I am testing the classification application with some pretrained models (resnet18, cspdarknet19, squeezenet) in the docker image nvcr.io/nvidia/deepstream:5.1-21.02-triton. I tested two kinds of model files, .etlt and .engineer. I set up the config.pbtxt file like this:


name: "resnet18"
platform: "tensorrt_plan"
max_batch_size: 256
default_model_filename: "resnet18.etlt"  # or resnet18.engineer
input [
  {
    name: "input_1"
    data_type: TYPE_FP32
    format: FORMAT_NCHW
    dims: [ 3, 224, 224 ]
  }
]
output [
  {
    name: "predictions/Softmax"
    data_type: TYPE_FP32
    dims: [ 4, 1, 1 ]
    label_filename: "labels.txt"
  }
]
# Specify GPU instance.
instance_group {
  count: 1
  gpus: 0
  kind: KIND_GPU
}

But I encountered the following error when testing resnet18.engineer:


I0808 12:10:50.618199 4489 model_repository_manager.cc:810] loading: resnet18:1
I0808 12:11:03.952722 4489 plan_backend.cc:333] Creating instance resnet18_0_0_gpu0 on GPU 0 (6.1) using resnet18_fp32.engineer
I0808 12:11:04.173020 4489 plan_backend.cc:670] Created instance resnet18_0_0_gpu0 on GPU 0 with stream priority 0
I0808 12:11:04.180962 4489 model_repository_manager.cc:983] successfully loaded 'resnet18' version 1
INFO: infer_trtis_backend.cpp:206 TrtISBackend id:1 initialized model: resnet18
ERROR: infer_cuda_utils.cpp:53 cudaMalloc failed, cuda err_no:2, err_str:cudaErrorMemoryAllocation
ERROR: infer_cuda_utils.cpp:162 create cuda tensor buf failed, dt:kFp32, dims:3x224x224, name:input_1
ERROR: infer_trtis_backend.cpp:321 failed to create gpu tensor buf
0:00:17.595958787 4489 0x55f67426b190 ERROR nvinferserver gstnvinferserver.cpp:362:gst_nvinfer_server_logger:<primary_gie> nvinferserver[UID 1]: Error in specifyBackendDims() <infer_trtis_context.cpp:174> [UID = 1]: failed to specify input dims trtis backend for model:resnet18, nvinfer error:NVDSINFER_CUDA_ERROR
0:00:17.595979223 4489 0x55f67426b190 ERROR nvinferserver gstnvinferserver.cpp:362:gst_nvinfer_server_logger:<primary_gie> nvinferserver[UID 1]: Error in createNNBackend() <infer_trtis_context.cpp:251> [UID = 1]: failed to specify trtis backend input dims for model:resnet18, nvinfer error:NVDSINFER_CUDA_ERROR
I0808 12:11:04.186713 4489 model_repository_manager.cc:837] unloading: resnet18:1
I0808 12:11:04.188896 4489 server.cc:280] Waiting for in-flight requests to complete.
I0808 12:11:04.188912 4489 server.cc:295] Timeout 30: Found 1 live models and 0 in-flight non-inference requests
I0808 12:11:04.204752 4489 model_repository_manager.cc:966] successfully unloaded 'resnet18' version 1
I0808 12:11:05.188988 4489 server.cc:295] Timeout 29: Found 0 live models and 0 in-flight non-inference requests
0:00:18.598405210 4489 0x55f67426b190 ERROR nvinferserver gstnvinferserver.cpp:362:gst_nvinfer_server_logger:<primary_gie> nvinferserver[UID 1]: Error in initialize() <infer_base_context.cpp:81> [UID = 1]: create nn-backend failed, check config file settings, nvinfer error:NVDSINFER_CUDA_ERROR
0:00:18.598431025 4489 0x55f67426b190 WARN nvinferserver gstnvinferserver_impl.cpp:439:start:<primary_gie> error: Failed to initialize InferTrtIsContext
0:00:18.598440776 4489 0x55f67426b190 WARN nvinferserver gstnvinferserver_impl.cpp:439:start:<primary_gie> error: Config file path: /opt/nvidia/deepstream/deepstream-5.1/samples/configs/deepstream-app-trtis/config_infer_ls_sf_20210804.txt
0:00:18.598530391 4489 0x55f67426b190 WARN nvinferserver gstnvinferserver.cpp:460:gst_nvinfer_server_start:<primary_gie> error: gstnvinferserver_impl start failed
** ERROR: main:655: Failed to set pipeline to PAUSED
Quitting
ERROR from primary_gie: Failed to initialize InferTrtIsContext
Debug info: gstnvinferserver_impl.cpp(439): start (): /GstPipeline:pipeline/GstBin:primary_gie_bin/GstNvInferServer:primary_gie:
Config file path: /opt/nvidia/deepstream/deepstream-5.1/samples/configs/deepstream-app-trtis/config_infer_ls_sf_20210804.txt
ERROR from primary_gie: gstnvinferserver_impl start failed
Debug info: gstnvinferserver.cpp(460): gst_nvinfer_server_start (): /GstPipeline:pipeline/GstBin:primary_gie_bin/GstNvInferServer:primary_gie
App run failed


I encountered the following error when testing resnet18.etlt:


I0808 12:28:26.947133 4800 model_repository_manager.cc:810] loading: resnet18:1
E0808 12:28:39.324580 4800 logging.cc:43] coreReadArchive.cpp (32) - Serialization Error in verifyHeader: 0 (Magic tag does not match)
E0808 12:28:39.324718 4800 logging.cc:43] INVALID_STATE: std::exception
E0808 12:28:39.324740 4800 logging.cc:43] INVALID_CONFIG: Deserialize the cuda engine failed.
E0808 12:28:39.331900 4800 model_repository_manager.cc:986] failed to load 'resnet18' version 1: Internal: unable to create TensorRT engine
ERROR: infer_trtis_server.cpp:1044 Triton: failed to load model resnet18, triton_err_str:Invalid argument, err_msg:load failed for model 'resnet18': version 1: Internal: unable to create TensorRT engine;

ERROR: infer_trtis_backend.cpp:45 failed to load model: resnet18, nvinfer error:NVDSINFER_TRTIS_ERROR
ERROR: infer_trtis_backend.cpp:184 failed to initialize backend while ensuring model:resnet18 ready, nvinfer error:NVDSINFER_TRTIS_ERROR
0:00:16.278514336 4800 0x55f0f3ad4f90 ERROR nvinferserver gstnvinferserver.cpp:362:gst_nvinfer_server_logger:<primary_gie> nvinferserver[UID 1]: Error in createNNBackend() <infer_trtis_context.cpp:246> [UID = 1]: failed to initialize trtis backend for model:resnet18, nvinfer error:NVDSINFER_TRTIS_ERROR
I0808 12:28:39.332112 4800 server.cc:280] Waiting for in-flight requests to complete.
I0808 12:28:39.332138 4800 server.cc:295] Timeout 30: Found 0 live models and 0 in-flight non-inference requests
0:00:16.278673426 4800 0x55f0f3ad4f90 ERROR nvinferserver gstnvinferserver.cpp:362:gst_nvinfer_server_logger:<primary_gie> nvinferserver[UID 1]: Error in initialize() <infer_base_context.cpp:81> [UID = 1]: create nn-backend failed, check config file settings, nvinfer error:NVDSINFER_TRTIS_ERROR
0:00:16.278693530 4800 0x55f0f3ad4f90 WARN nvinferserver gstnvinferserver_impl.cpp:439:start:<primary_gie> error: Failed to initialize InferTrtIsContext
0:00:16.278702507 4800 0x55f0f3ad4f90 WARN nvinferserver gstnvinferserver_impl.cpp:439:start:<primary_gie> error: Config file path: /opt/nvidia/deepstream/deepstream-5.1/samples/configs/deepstream-app-trtis/config_infer_ls_sf_20210804.txt
0:00:16.278788182 4800 0x55f0f3ad4f90 WARN nvinferserver gstnvinferserver.cpp:460:gst_nvinfer_server_start:<primary_gie> error: gstnvinferserver_impl start failed
** ERROR: main:655: Failed to set pipeline to PAUSED
Quitting
ERROR from primary_gie: Failed to initialize InferTrtIsContext
Debug info: gstnvinferserver_impl.cpp(439): start (): /GstPipeline:pipeline/GstBin:primary_gie_bin/GstNvInferServer:primary_gie:
Config file path: /opt/nvidia/deepstream/deepstream-5.1/samples/configs/deepstream-app-trtis/config_infer_ls_sf_20210804.txt
ERROR from primary_gie: gstnvinferserver_impl start failed
Debug info: gstnvinferserver.cpp(460): gst_nvinfer_server_start (): /GstPipeline:pipeline/GstBin:primary_gie_bin/GstNvInferServer:primary_gie
App run failed


What is wrong with it, and how can I correct it? Any help is appreciated.

Do you mean that either network runs into the CUDA error: ERROR: infer_cuda_utils.cpp:53 cudaMalloc failed, cuda err_no:2, err_str:cudaErrorMemoryAllocation ?

The error above occurs because the engine was built on a different GPU or with a different TensorRT/Triton version. TensorRT engines are not portable across GPU architectures or TensorRT versions, so the engine must be rebuilt on the deployment machine.
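As one way to do that, here is a sketch of converting the .etlt into a device-specific engine on the target machine with the TLT/TAO tlt-converter tool. It assumes tlt-converter is installed there and that $NGC_KEY (a hypothetical variable name) holds the encryption key used when the .etlt was exported; the dims, output node, and max batch size mirror the config.pbtxt above, so adjust them to your model:

```shell
# Rebuild the engine on the deployment GPU so the serialized plan matches
# the local GPU architecture and TensorRT version.
# -d: input dims (CHW); -o: output node name; -m: max batch size,
# all matching the values in config.pbtxt above.
tlt-converter -k $NGC_KEY \
              -d 3,224,224 \
              -o predictions/Softmax \
              -m 256 \
              -e resnet18.engine \
              resnet18.etlt
```

The resulting resnet18.engine can then be referenced via default_model_filename in config.pbtxt, since the tensorrt_plan platform can only deserialize plan files built on a compatible GPU and TensorRT version.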