Inference time and inference on multiple images with nvOCDR

Hello @Fiona.Chen
Thank you for your quick answer. I tried the nvocdr sample you mentioned, following all the steps. I ran into problems with the ViT versions of the models on my Orin NX and tried to resolve them (Process Killed when Generating a TensorRT Engine for the ViT models - #8 by junshengy and Process killed when generating a TensorRT Engine for a ViT model).

In the meantime, I tried to use the sample with the v1.0 versions of the OCDNet and OCRNet models from Optical Character Detection | NVIDIA NGC and Optical Character Recognition | NVIDIA NGC.
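
I pulled the deployable v1.0 models with the NGC CLI, roughly like this (the exact version tags may differ on your side):

ngc registry model download-version "nvidia/tao/ocdnet:deployable_v1.0"
ngc registry model download-version "nvidia/tao/ocrnet:deployable_v1.0"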

I then generated the engines with their respective commands (inspired by GitHub - NVIDIA-AI-IOT/NVIDIA-Optical-Character-Detection-and-Recognition-Solution: This repository provides optical character detection and recognition solution optimized on Nvidia devices).
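
For reference, the trtexec commands looked roughly like this (a sketch based on the repository's README; the input shapes match the ones in my config below, and paths are adjusted to my setup):

/usr/src/tensorrt/bin/trtexec --onnx=./ocdnet.onnx --minShapes=input:1x3x736x1280 --optShapes=input:1x3x736x1280 --maxShapes=input:1x3x736x1280 --fp16 --saveEngine=./ocdnet.fp16.engine

/usr/src/tensorrt/bin/trtexec --onnx=./ocrnet.onnx --minShapes=input:1x1x32x100 --optShapes=input:32x1x32x100 --maxShapes=input:32x1x32x100 --fp16 --saveEngine=./ocrnet.fp16.engine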

Here is how I modified the nvocdr_app_config.yml file:

source-list:
  #list: file:///workspace/nvocdr/img_0.jpg;file:///workspace/nvocdr/img_1.jpg
  list: file:///~/input/test2-crop.mp4

output:
  ## 1:file output  2:fake output  3:eglsink output
  type: 1
  ## 0: H264 encoder  1:H265 encoder
  codec: 0
  #encoder type 0=Hardware 1=Software
  enc-type: 0
  bitrate: 2000000
  ##The file name without suffix
  filename: test

streammux:
  width: 1280
  height: 720
  batched-push-timeout: 40000

video-template:
  customlib-name: nvocdr_libs/aarch64/libnvocdr_impl.so
  customlib-props:
    - ocdnet-engine-path:../../../../../models/nvocdr/ocdnet.fp16.engine
    - ocdnet-input-shape:3,736,1280
    - ocdnet-binarize-threshold:0.1
    - ocdnet-polygon-threshold:0.3
    - ocdnet-max-candidate:200
    - ocrnet-engine-path:../../../../../models/nvocdr/ocrnet.fp16.engine
    - ocrnet-dict-path:../../../../../models/nvocdr/character_list
    - ocrnet-input-shape:1,32,100
    # - ocrnet-decode:Attention
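
Note that the engine paths above are relative, so they have to resolve from the directory the app is launched from. A quick way to sanity-check them from the deepstream-nvocdr-app directory before running:

ls -lh ../../../../../models/nvocdr/ocdnet.fp16.engine
ls -lh ../../../../../models/nvocdr/ocrnet.fp16.engine
ls -lh ../../../../../models/nvocdr/character_list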

With this configuration, when I run the sample with debug level 3, I get the following errors:

root@************:/~/input/deepstream-app/deepstream_tao_apps/apps/tao_others/deepstream-nvocdr-app# ./deepstream-nvocdr-app nvocdr_app_config.yml
in create_video_encoder, isH264:1, enc_type:0
/bin/bash: line 1: lsmod: command not found
/bin/bash: line 1: modprobe: command not found
!! [WARNING] Unknown param found : type
!! [WARNING] Unknown param found : codec
!! [WARNING] Unknown param found : enc-type
!! [WARNING] Unknown param found : filename
 Now playing! 
Opening in BLOCKING MODE 
0:00:00.182777563 70800 0xaaaaef409980 WARN                    v4l2 gstv4l2object.c:4671:gst_v4l2_object_probe_caps:<nvvideo-h264enc:src> Failed to probe pixel aspect ratio with VIDIOC_CROPCAP: Unknown error -1
0:00:00.228694223 70800 0xffff64000d80 WARN              aggregator gstaggregator.c:2099:gst_aggregator_query_latency_unlocked:<mp4-mux> Latency query failed
Inside Custom Lib : Setting Prop Key=ocdnet-engine-path Value=../../../../../models/nvocdr/ocdnet.fp16.engine
Inside Custom Lib : Setting Prop Key=ocdnet-input-shape Value=3,736,1280
Inside Custom Lib : Setting Prop Key=ocdnet-binarize-threshold Value=0.1
Inside Custom Lib : Setting Prop Key=ocdnet-polygon-threshold Value=0.3
Inside Custom Lib : Setting Prop Key=ocdnet-max-candidate Value=200
Inside Custom Lib : Setting Prop Key=ocrnet-engine-path Value=../../../../../models/nvocdr/ocrnet.fp16.engine
Inside Custom Lib : Setting Prop Key=ocrnet-dict-path Value=../../../../../models/nvocdr/character_list
Inside Custom Lib : Setting Prop Key=ocrnet-input-shape Value=1,32,100
Inside Custom Lib : Setting Prop Key=ocrnet-decode Value=Attention
0:00:00.263201560 70800 0xaaaaef409980 WARN                 basesrc gstbasesrc.c:3688:gst_base_src_start_complete:<source> pad not activated yet
Decodebin child added: source
Decodebin child added: decodebin0
0:00:00.263678467 70800 0xaaaaef409980 WARN                 basesrc gstbasesrc.c:3688:gst_base_src_start_complete:<source> pad not activated yet
Running...
Decodebin child added: qtdemux0
0:00:00.268566134 70800 0xffff64001440 WARN                 qtdemux qtdemux_types.c:249:qtdemux_type_get: unknown QuickTime node type sgpd
0:00:00.268586486 70800 0xffff64001440 WARN                 qtdemux qtdemux_types.c:249:qtdemux_type_get: unknown QuickTime node type sbgp
0:00:00.268600342 70800 0xffff64001440 WARN                 qtdemux qtdemux_types.c:249:qtdemux_type_get: unknown QuickTime node type ldes
0:00:00.268637079 70800 0xffff64001440 WARN                 qtdemux qtdemux.c:3121:qtdemux_parse_trex:<qtdemux0> failed to find fragment defaults for stream 1
0:00:00.268760602 70800 0xffff64001440 WARN                 qtdemux qtdemux.c:3121:qtdemux_parse_trex:<qtdemux0> failed to find fragment defaults for stream 2
Decodebin child added: multiqueue0
Decodebin child added: h264parse0
Decodebin child added: capsfilter0
0:00:00.270216988 70800 0xffff64001440 WARN            uridecodebin gsturidecodebin.c:960:unknown_type_cb:<uri-decode-bin> warning: No decoder available for type 'audio/mpeg, mpegversion=(int)4, framed=(boolean)true, stream-format=(string)raw, level=(string)2, base-profile=(string)lc, profile=(string)lc, codec_data=(buffer)119056e500, rate=(int)48000, channels=(int)2'.
Decodebin child added: nvv4l2decoder0
Opening in BLOCKING MODE 
0:00:00.296631207 70800 0xffff640017c0 WARN                    v4l2 gstv4l2object.c:4671:gst_v4l2_object_probe_caps:<nvv4l2decoder0:src> Failed to probe pixel aspect ratio with VIDIOC_CROPCAP: Unknown error -1
NvMMLiteOpen : Block : BlockType = 261 
NvMMLiteBlockCreate : Block : BlockType = 261 
0:00:00.399059400 70800 0xffff640017c0 WARN                    v4l2 gstv4l2object.c:4671:gst_v4l2_object_probe_caps:<nvv4l2decoder0:src> Failed to probe pixel aspect ratio with VIDIOC_CROPCAP: Unknown error -1
In cb_newpad
###Decodebin pick nvidia decoder plugin.
Error reading serialized TensorRT engine: ../../../../../models/nvocdr/ocdnet.fp16.engine
terminate called after throwing an instance of 'std::length_error'
  what():  cannot create std::vector larger than max_size()
Aborted (core dumped)

Do you think the sample works with the v1.0 versions of the models? If so, is there anything else I need to modify to make it work? Otherwise, is there an alternative way to run inference on multiple sources with the v1.0 models?

Thank you in advance for your help!