OCR Model as PGIE

Hardware Platform: GPU
DeepStream: 7.1
Docker Image: 7.1-triton-multiarch
GPU Type: A4000

I have created a custom object detection pipeline based on deepstream-test5. Now I want to add OCR capability to this pipeline. I need the nvmultiurisrcbin and message broker features for the OCR model, so it looks like I need to use the OCR model as a PGIE. My streams have fixed text areas. Is there any example?

Can you tell us what your model's input and output are?

For example, the TAO model BodyPoseNet | NVIDIA NGC takes an image with one or more persons as input and outputs the body key points for the persons in the image. We can generate body bboxes from the key points, so we can use this model as a PGIE.
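
As a rough sketch of that mechanism (not from this thread; the function and library names are hypothetical placeholders): with nvinfer, a model whose raw output is not bounding boxes is typically hooked in as a detector through a custom output parser declared in its config, e.g.:

#Hypothetical nvinfer config fragment: a keypoint model can serve as a PGIE
#once a custom parser converts its raw output into bboxes.
#The function and library names below are placeholders.
[property]
#0: detector
network-type=0
parse-bbox-func-name=NvDsInferParseCustomBodyPose
custom-lib-path=/path/to/libcustom_bodypose_parser.so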

I am taking an HLS stream as input. My stream has multiple regions with text, and the text locations are fixed. I want to detect all the text and send it to RabbitMQ as a single payload per frame. I also need REST support to add/remove streams to/from this OCR pipeline.
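
(For reference, the runtime add/remove requirement maps to the REST server that nvmultiurisrcbin starts on the http-ip/http-port configured below. A minimal sketch, assuming the default DeepStream endpoints and the documented payload schema; the sensor ID and HLS URL are placeholders, and the field names should be checked against the REST API docs for your release:

#Add a stream at runtime
curl -X POST http://localhost:9000/api/v1/stream/add -d '{
  "key": "sensor",
  "value": {
    "camera_id": "sensor-1",
    "camera_name": "hls-text-cam",
    "camera_url": "https://example.com/stream/playlist.m3u8",
    "change": "camera_add"
  }
}'

#Remove the same stream
curl -X POST http://localhost:9000/api/v1/stream/remove -d '{
  "key": "sensor",
  "value": {
    "camera_id": "sensor-1",
    "camera_url": "https://example.com/stream/playlist.m3u8",
    "change": "camera_remove"
  }
}')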

The License Plate Recognition example uses an OCR model as an SGIE, but it doesn't fit my case.

This is my object detection pipeline config. I want to create an OCR pipeline with the same features. The documentation says the OCR model uses nvdsvideotemplate, but I'm using nvinferserver. Will it work if I replace config-file with an OCR model config file?

[application]
enable-perf-measurement=1
perf-measurement-interval-sec=5
#gie-kitti-output-dir=streamscl

[tiled-display]
enable=0
rows=2
columns=5
width=1280
height=720
gpu-id=0
nvbuf-memory-type=0
#Set to 1 to automatically tile in Square Grid
square-seq-grid=1

[source-list]
use-nvmultiurisrcbin=1
#To display stream name in FPS log, set stream-name-display=1
stream-name-display=1
#Maximum number of streams that can be added to the pipeline
max-batch-size=10
http-ip=localhost
http-port=9000
#Set low latency mode for bitstreams having I and IPPP frames on decoder
low-latency-mode=1
#sgie batch size is number of sources * fair fraction of number of objects detected per frame per source
#the fair fraction of number of object detected is assumed to be 4
sgie-batch-size=40
#Set the below key to keep the application running at all times

[source-attr-all]
enable=1
#Type - 1: Camera (V4L2) 2: URI 3: MultiURI 4: RTSP 5: Camera (CSI) (Jetson only)
type=3
gpu-id=0
cudadec-memtype=0
#drop-frame-interval=5
#latency=100
#rtsp-reconnect-interval-sec=10
#Limit the rtsp reconnection attempts
#rtsp-reconnect-attempts=4

[streammux]
gpu-id=0
live-source=1
batch-size=6
batched-push-timeout=40000
width=1920
height=1080
#Set to 1 to maintain aspect ratio
enable-padding=1
nvbuf-memory-type=0
drop-pipeline-eos=0

[sink0]
enable=1
#Type - 1=FakeSink 2=EglSink 3=File 4=UDPSink 5=nvdrmvideosink 6=MsgConvBroker
type=6
msg-conv-payload-type=1
msg-conv-msg2p-new-api=0
msg-broker-proto-lib=/opt/nvidia/deepstream/deepstream/lib/libnvds_amqp_proto.so
#Provide your msg-broker-conn-str here
msg-broker-conn-str=rabbit.app_network;5672;guest;guest
topic=deepstream1
msg-broker-comp-id=1
msg-conv-comp-id=1
#Optional:
msg-broker-config=/opt/nvidia/deepstream/deepstream/sources/libs/amqp_protocol_adaptor/cfg_amqp.txt
msg-conv-msg2p-lib=/opt/nvidia/deepstream/deepstream/lib/libnvds_msgconv.so

[sink1]
enable=1
#Type - 1=FakeSink 2=EglSink 3=File 4=UDPSink 5=nvdrmvideosink 6=MsgConvBroker
type=6
msg-conv-payload-type=1
msg-broker-proto-lib=/opt/nvidia/deepstream/deepstream/lib/libnvds_amqp_proto.so
#Provide your msg-broker-conn-str here
msg-broker-conn-str=rabbit.app_network;5672;guest;guest
topic=deepstream2
msg-broker-comp-id=2
msg-conv-comp-id=2
#Optional:
msg-broker-config=/opt/nvidia/deepstream/deepstream/sources/libs/amqp_protocol_adaptor/cfg_amqp.txt
msg-conv-msg2p-lib=/opt/nvidia/deepstream/deepstream/lib/libnvds_msgconv.so

[sink2]
enable=1
#Type - 1=FakeSink 2=EglSink 3=File 4=UDPSink 5=nvdrmvideosink 6=MsgConvBroker
type=6
msg-conv-payload-type=1
msg-broker-proto-lib=/opt/nvidia/deepstream/deepstream/lib/libnvds_amqp_proto.so
#Provide your msg-broker-conn-str here
msg-broker-conn-str=rabbit.app_network;5672;guest;guest
topic=deepstream3
msg-broker-comp-id=3
msg-conv-comp-id=3
#Optional:
msg-broker-config=/opt/nvidia/deepstream/deepstream/sources/libs/amqp_protocol_adaptor/cfg_amqp.txt
msg-conv-msg2p-lib=/opt/nvidia/deepstream/deepstream/lib/libnvds_msgconv.so

[primary-gie]
enable=1
#interval=5
gpu-id=0
gie-unique-id=1
nvbuf-memory-type=0
config-file=/opt/nvidia/deepstream/deepstream/DeepStream-Yolo/config_infer_vehicle.txt
#infer-raw-output-dir=../../../../../samples/primary_detector_raw_output/

Are you talking about the Optical Character Recognition | NVIDIA NGC model provided by the TAO Toolkit?

Are there multiple text areas in a single frame?

Yes

Yes. OCRNet and OCDNet

We have provided a sample for the TAO OCD+OCR models: deepstream_tao_apps/apps/tao_others/deepstream-nvocdr-app at master · NVIDIA-AI-IOT/deepstream_tao_apps

The OCD+OCR models can't be used directly in a deepstream-app configuration.
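
For orientation, that sample wires OCDNet+OCRNet in through nvdsvideotemplate rather than nvinfer/nvinferserver, roughly along these lines. This is a sketch only: the customlib-props keys and library name are illustrative assumptions, and the engine paths are placeholders; see the sample's README for the exact values.

#Sketch of an nvOCDR-style pipeline; property names are assumptions,
#check the deepstream-nvocdr-app README for the exact keys
gst-launch-1.0 \
  uridecodebin uri=file:///path/to/video.mp4 ! mux.sink_0 \
  nvstreammux name=mux batch-size=1 width=1920 height=1080 ! \
  nvdsvideotemplate customlib-name=libnvocdr_impl.so \
    customlib-props="ocdnet-engine-path:/path/to/ocdnet.engine" \
    customlib-props="ocrnet-engine-path:/path/to/ocrnet.engine" ! \
  nvvideoconvert ! nvdsosd ! fakesink

Note that to still publish per-frame OCR results to RabbitMQ as in the config above, the text metadata attached by the custom library would most likely need a custom msgconv payload, since the stock libnvds_msgconv serializes the standard object metadata.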

There has been no update from you for a while, so we assume this is no longer an issue and are closing this topic. If you need further support, please open a new one. Thanks
