Training OCRNet for use in LPD/LPR

dGPU
DS 7

As a follow-up to the long thread here: Tao toolkit observations - #63 by foreverneilyoung
I was following the notebook ocrnet/ocrnet-vit.ipynb in order to train OCRNet for German number plate recognition.

I first ran the notebook “as is” to see what it produces. In the end I got this:

.
├── best_accuracy.onnx
├── status.json
└── trt.engine

I then used this ONNX model as a replacement for my original LPR ONNX model (trained this morning from lprnet/lprnet.ipynb), with the following configuration:

[property]
gpu-id=0
# This model works. Trained from LPRNet
#onnx-file=models/LP/LPR/lprnet_epoch-024.onnx
onnx-file=models/LP/LPR/best_accuracy.onnx
labelfile-path=models/LP/LPR/labels_us.txt
batch-size=16
## 0=FP32, 1=INT8, 2=FP16 mode
network-mode=2
gie-unique-id=3
# This line is causing problems
output-blob-names=output_bbox/BiasAdd;output_cov/Sigmoid
#0=Detection 1=Classifier 2=Segmentation
network-type=1
parse-classifier-func-name=NvDsInferParseCustomNVPlate
custom-lib-path=nvinfer/libnvdsinfer_custom_impl_lpr.so
process-mode=2
operate-on-gie-id=2
net-scale-factor=0.00392156862745098
#net-scale-factor=1.0
#0=RGB 1=BGR 2=GRAY
model-color-format=0

[class-attrs-all]
threshold=0.5

But all I got was this:

0:00:02.233559654 24474 0x7719e4009c70 INFO nvinfer gstnvinfer.cpp:682:gst_nvinfer_logger:<sgie2-lpr> NvDsInferContext[UID 3]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:2109> [UID = 3]: Trying to create engine from model files
WARNING: [TRT]: onnx2trt_utils.cpp:374: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
WARNING: [TRT]: onnx2trt_utils.cpp:400: One or more weights outside the range of INT32 was clamped
WARNING: [TRT]: Tensor DataType is determined at build time for tensors not marked as input or output.
// and many more of these warnings

Any idea what’s going wrong?

Disregard. It just took very long to build the engine.

However, I now get this error:

0:00:10.346122630 24982      0x13cc760 ERROR                nvinfer gstnvinfer.cpp:676:gst_nvinfer_logger:<sgie2-lpr> NvDsInferContext[UID 3]: Error in NvDsInferContextImpl::preparePreprocess() <nvdsinfer_context_impl.cpp:1035> [UID = 3]: RGB/BGR input format specified but network input channels is not 3

The input is an RTSP stream that has been working fine for weeks with other models.

Hi @foreverneilyoung ,
As we synced in Tao toolkit observations - #61 by Morganh, you are using https://github.com/NVIDIA-AI-IOT/deepstream_lpr_app/blob/master/deepstream-lpr-app/lpr_config_pgie.txt, but there is obviously a problem with output-blob-names.

I am moving this topic to the DeepStream forum for further checking.

Since it is an ONNX model, please get the input layer dimensions and names with netron.

@Fiona.Chen I already consulted netron; it produced an image that was too big for PNG export, and the SVG export was even too big to attach here. The top of it looks like this:

Could you please clarify what you mean by “please get the input layer dimensions and names” and what I should do with this information?

DeepStream does not care about the internals of the network; only the input and output layers are meaningful to it. Please click the top layer of the network; the input and output layers’ info will appear on the right side.

Clicking on the input gives this:

Since the model was trained by you, please confirm that it accepts a gray image with 200x64 resolution as input and that the input tensor layout is NCHW.

The preprocessing algorithm is defined by the training script; please make sure the preprocessing parameters are aligned with your training script. DeepStream SDK FAQ - Intelligent Video Analytics / DeepStream SDK - NVIDIA Developer Forums
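For context, nvinfer’s per-pixel preprocessing boils down to y = net-scale-factor · (x − offset), with the offset taken from the optional offsets key. This sketch (nothing beyond the documented formula is assumed) shows why net-scale-factor=0.00392156862745098 (i.e. 1/255) maps pixels to [0, 1], and what a training pipeline that normalizes to [−1, 1] would require instead:

```python
def nvinfer_preprocess(pixel: float, net_scale_factor: float, offset: float = 0.0) -> float:
    """DeepStream nvinfer per-pixel preprocessing: y = net-scale-factor * (x - offset)."""
    return net_scale_factor * (pixel - offset)

# net-scale-factor = 1/255, no offset: [0, 255] -> [0, 1]
lo, hi = nvinfer_preprocess(0, 1 / 255.0), nvinfer_preprocess(255, 1 / 255.0)

# If training normalized to [-1, 1], the config would instead need
# net-scale-factor = 1/127.5 and offsets = 127.5.
lo2, hi2 = nvinfer_preprocess(0, 1 / 127.5, 127.5), nvinfer_preprocess(255, 1 / 127.5, 127.5)
```

If the training script normalized differently (e.g. per-channel means), the config values must be changed to match, otherwise accuracy degrades silently.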

Well, I guess you can tell this better than me. The model is the result of a run of the unchanged notebook.

If you don’t understand the training script, please consult in the TAO forum.

Oh thanks, perfect circle

@Morganh

Can you tell this user about the preprocessing parameters for Optical Character Recognition | NVIDIA NGC? It seems he is working with the trainable_v2.x version.

@foreverneilyoung

Please set “model-color-format=2” in the nvinfer configuration file for the “the gray image with 200x64 resolution as input and the input tensor dimension is NCHW” ONNX model.
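For a single-channel 200x64 input, the relevant fragment could look like this (model-color-format=2 is from the advice above; infer-dims is an assumption derived from the “gray, 200x64, NCHW” description, in C;H;W order — verify against what netron shows):

```ini
#0=RGB 1=BGR 2=GRAY
model-color-format=2
# Assumed from the netron inspection: 1 channel, height 64, width 200 (C;H;W)
infer-dims=1;64;200
```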

And please customize your own postprocessing function “NvDsInferParseCustomNVPlate” (deepstream_lpr_app/nvinfer_custom_lpr_parser/nvinfer_custom_lpr_parser.cpp at master · NVIDIA-AI-IOT/deepstream_lpr_app (github.com)) to process OCRNet’s output tensors.
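For orientation only: the linked parser decodes LPRNet’s output layers, so it cannot interpret OCRNet’s output as-is. Assuming the exported OCRNet emits a sequence of per-timestep character ids with a CTC blank (an assumption — check the actual output layer names and shapes in netron first), the core of a replacement parser would be a greedy CTC decode like this Python sketch (the real replacement for NvDsInferParseCustomNVPlate has to be C++ inside the custom lib):

```python
def ctc_greedy_decode(ids, id_to_char, blank_id=0):
    """Collapse repeated ids and drop blanks -- standard greedy CTC decoding."""
    decoded, prev = [], None
    for i in ids:
        if i != blank_id and i != prev:
            decoded.append(id_to_char[i])
        prev = i
    return "".join(decoded)

# The id-to-character mapping below is purely illustrative; the real one
# comes from the character list used during training.
# ctc_greedy_decode([1, 1, 0, 2], {1: "A", 2: "B"})  # "AB"
```

The character list used for decoding must be exactly the one the notebook trained with, including any space or minus-sign characters added for German plates.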

Tried that. It gives a segmentation fault.

EDIT: I meant “model-color-format=2” gives that

What am I supposed to customize here?

Maybe one step back in order to provide the current state:

  • I have a running LPD/LPR setup using LPRNet with this configuration:
[property]
gpu-id=0

onnx-file=models/LP/LPR/lprnet_epoch-024.onnx
model-engine-file=models/LP/LPR/lprnet_epoch-024.onnx_b16_gpu0_fp16.engine

#onnx-file=models/LP/LPR/best_accuracy.onnx
#model-engine-file=models/LP/LPR/best_accuracy.onnx_b16_gpu0_fp16.engine

labelfile-path=models/LP/LPR/labels_us.txt
batch-size=16
## 0=FP32, 1=INT8, 2=FP16 mode
network-mode=2
gie-unique-id=3
#output-blob-names=output_bbox/BiasAdd;output_cov/Sigmoid
#0=Detection 1=Classifier 2=Segmentation
network-type=1
parse-classifier-func-name=NvDsInferParseCustomNVPlate
custom-lib-path=nvinfer/libnvdsinfer_custom_impl_lpr.so
process-mode=2
operate-on-gie-id=2
net-scale-factor=0.00392156862745098
#net-scale-factor=1.0
#0=RGB 1=BGR 2=GRAY
model-color-format=0

[class-attrs-all]
threshold=0.5
  • The model used is the result of a training run with this notebook: tao_tutorials/notebooks/tao_launcher_starter_kit/lprnet/lprnet.ipynb at main · NVIDIA/tao_tutorials · GitHub

  • Unfortunately this model is unable to detect spaces or other delimiters (which are important for number plates in parts of the world other than China or the USA)

  • I tried to train LPRNet with a character set containing spaces and a minus sign, to no avail

  • I got a hint to try OCRNet, so I ran the above-mentioned notebook for OCRNet training unchanged and got the “best_accuracy.onnx” network

  • I just replaced the LPR network with the OCR network (by commenting out the LPR model lines above and uncommenting the others), and this doesn’t work

  • It fails with the above-mentioned error regarding colour channels. Setting the colour mode to GRAY is accepted initially, but it crashes at runtime

Sure it will fail. The postprocessing is for License Plate Recognition | NVIDIA NGC but not for Optical Character Recognition | NVIDIA NGC.

And you still don’t have the correct preprocessing parameters yet.

You need to consult the model engineer for the postprocessing algorithm of the model you want to use.

And this is who? Mr Who?

Thanks, but this is becoming funny now. No further questions

Actually this is a new feature regarding LPDNet + OCRNet in DeepStream.
For OCDNet + OCRNet in DeepStream, we have deepstream_tao_apps/apps/tao_others/deepstream-nvocdr-app at master · NVIDIA-AI-IOT/deepstream_tao_apps · GitHub.
For LPDNet + LPRNet in DeepStream, we have deepstream_lpr_app/deepstream-lpr-app at master · NVIDIA-AI-IOT/deepstream_lpr_app · GitHub.

Ah thanks, this looks helpful