TAO Toolkit observations

This config is working now, even though it still produces an error:

ERROR: [TRT]: 3: Cannot find binding of given name: output_cov/Sigmoid
[property]
gpu-id=0
net-scale-factor=0.00392156862745098
offsets=0;0;0
infer-dims=3;384;1248
tlt-model-key=tlt_encode
network-type=0
network-mode=2
labelfile-path=models/primary-detector/resnet18-detector/labels.txt
onnx-file=models/primary-detector/resnet18-detector/resnet18_detector.onnx
#model-engine-file=models/primary-detector/resnet18-detector/resnet18_detector.onnx.b1_gpu0_int8.engine
int8-calib-file=models/primary-detector/resnet18-detector/calibration.bin
batch-size=1
num-detected-classes=3
model-color-format=0
maintain-aspect-ratio=0
output-tensor-meta=0
cluster-mode=2
gie-unique-id=1
uff-input-order=0
output-blob-names=output_cov/Sigmoid;output_bbox/BiasAdd
uff-input-blob-name=input_1


[class-attrs-all]
pre-cluster-threshold=0.2
eps=0.4
group-threshold=1

Any suggestions for output-blob-names?
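
For reference, a quick way to verify what the exported ONNX actually exposes, since output-blob-names must match the real tensor names exactly (a minimal sketch using the onnx Python package; the path is taken from the config above):

import onnx

# Load the exported detector and print its true input/output tensor names.
model = onnx.load("models/primary-detector/resnet18-detector/resnet18_detector.onnx")
print("inputs: ", [i.name for i in model.graph.input])
print("outputs:", [o.name for o in model.graph.output])

TAO's detectnet_v2 ONNX export typically appends ":0" to the old UFF-era names, so output-blob-names=output_cov/Sigmoid:0;output_bbox/BiasAdd:0 may be what nvinfer expects here, but verify against the printout. As far as I can tell, the uff-* keys in the config are leftovers from the TLT/UFF path and are ignored for an ONNX model.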

And the results are way worse than with resnet18_trafficcamnet… for whatever reason…

Even yolov7-tiny is better

If you would like to judge and help me improve the results, I could upload 3 videos showing one and the same Berlin street scene processed with three different models…

I suggest you create a new topic, since the original issues are gone on your side.
In your new topic, please describe:

  • Which models have you trained? Which dataset? Which spec file? Pruned or unpruned?
  • What is the new issue?

OK, here or in the DeepStream forum?

If you are running into an error with the DeepStream application, you can create a topic in the DeepStream forum. If you believe it is an issue with the model itself, you can create a topic here.

Having no issue with DS.

Here’s the post:

These links

https://registry.ngc.nvidia.com/orgs/nvidia/models/tao_lpdnet
https://registry.ngc.nvidia.com/orgs/nvidia/models/tao_lprnet

from the blog https://developer.nvidia.com/blog/creating-a-real-time-license-plate-detection-and-recognition-app/ end up in a 404.

Thanks for this info. Would you have ONE sample for this?

The dataset for LPRNet contains cropped license plate images and corresponding label files.

And what does “characters_list.txt” contain? Let’s be more specific with a sample.

Say my image is this:

[image]

It is 640×385 at 72 pixels/inch.

Would the “cropped license plate image” then be the part in the magenta box?

[image]

What would label_0000.txt then have to contain? Left, top, width, and height within the cropped image? And the text as “BJT304E”? That would be rubbish.

Please note: your current LPR is able to detect that as “BJT304E”, but this is WRONG in Germany, since the number is “B JT 304 E”. Spaces have a BIG meaning over here :)

So what would I have to put into characters_list.txt in order to be able to recognize this as “B JT 304 E” instead of just “BJT304E”?

I’m sorry if my questions are stupid, but it is really not easy to start from zero with all this…

Sorry for the inconvenience. Please use the links below.
https://catalog.ngc.nvidia.com/orgs/nvidia/teams/tao/models/lpdnet
https://catalog.ngc.nvidia.com/orgs/nvidia/teams/tao/models/lprnet

The current LPRNet does not support recognizing spaces. You can use another network (OCRNet, see OCRNet - NVIDIA Docs) and retrain it on your German dataset.
Please generate ground truth in the training dataset accordingly; for example, a “B JT 304 E” image → B[space]JT[space]304[space]E, then use the attention decoding method to retrain the OCRNet model. There is a characters_list.txt file that contains all the characters found in the dataset, for example A,B,C,D,E,...,a,b,c,d,...,[space].
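
To make that concrete, here is a sketch of what a characters_list.txt for German plates could look like, assuming the common one-character-per-line layout (the “…” lines are abbreviations; verify the exact file format against the OCRNet docs):

0
1
…
9
A
B
…
Z

plus Ä, Ö, Ü if they occur in your plates, and, crucially for your case, one line containing just a single space character.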

The current LPRNet does not support recognizing spaces.

Knew that already. Thanks for confirmation.

You can use another network (OCRNet, see OCRNet - NVIDIA Docs) and retrain it on your German dataset.

That’s interesting. In the case of the sample above, what would, for instance, “Dataset/images/0000.jpg” be? Would it be the magenta-framed part of the entire image? The number plate in different angles, resolutions, colours, light conditions?

Yes.
[image]

Yes, they can be.
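
Spelled out as a directory sketch, then (the gt-list file name and its separator are my assumptions; check the OCRNet dataset docs, which also describe converting such a list to LMDB before training):

Dataset/
├── images/
│   ├── 0000.jpg      ← cropped tightly around the plate, e.g. “B JT 304 E”
│   ├── 0001.jpg      ← same kind of crop, other angle/resolution/lighting
│   └── ...
└── gt_list.txt       ← one line per image: <image path> <transcription>

Since the German transcriptions themselves contain spaces, the separator between path and label would need to be unambiguous (e.g. a tab), which is exactly the kind of detail to confirm in the docs.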

Thanks again. Have a nice day

I would like to know about the training requirements for this model.

Last night I was fighting like hell to run the very first steps of the OCRNet training, to no avail. The attempt to pull nvcr.io/nvidia/tao/tao-toolkit 5.3.0-pyt always failed for obscure reasons. I finally figured out that the reason was a shortage of disk space. So I removed all old containers (e.g. all that were loaded for the detectnet_v2 training) and it worked.

The nvcr.io/nvidia/tao/tao-toolkit 5.3.0-pyt image alone occupied 25 GB.
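
For anyone hitting the same wall, these standard Docker commands show and reclaim that space (assuming the TAO launcher runs on plain Docker):

docker system df                  # how much space images, containers and volumes occupy
docker image ls                   # list images with their sizes
docker image prune -a             # remove all images not referenced by a container
docker system prune -a --volumes  # more radical: also stopped containers, networks, volumes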

Now, in the course of the OCRNet training, I’m seemingly in that trap again, because I’m trying to export the model in step 10.

With horror I notice that the notebook tries to pull yet another container, now nvcr.io/nvidia/tao/tao-toolkit 5.3.0-deploy. Presumably again > 20 GB, presumably again pushing my machine to the edge.

My current training device is an AWS g4dn.xlarge fitted with a T4, 16 GB RAM and a 125 GB SSD. That seems to be insufficient just to follow a tutorial for this model.

Quick question: Are you kidding me?

Oh, phew… Just 10 GB now. That went through…

OK, finally the ocrnet/ocrnet-vit.ipynb notebook produced these results:

.
├── best_accuracy.onnx
├── status.json
└── trt.engine

Now I’m unsure how to make use of this in my app, especially because my current LPR config looks like the following and would, just a guess, surely not work with the output of this training, would it?


[property]
gpu-id=0
model-engine-file=models/LP/LPR/us_lprnet_baseline18_deployable.etlt_b16_gpu0_fp16.engine
labelfile-path=models/LP/LPR/labels_us.txt
tlt-encoded-model=models/LP/LPR/us_lprnet_baseline18_deployable.etlt
tlt-model-key=nvidia_tlt
batch-size=16
## 0=FP32, 1=INT8, 2=FP16 mode
network-mode=2
num-detected-classes=3
gie-unique-id=3
output-blob-names=tf_op_layer_ArgMax;tf_op_layer_Max
#0=Detection 1=Classifier 2=Segmentation
network-type=1
parse-classifier-func-name=NvDsInferParseCustomNVPlate
custom-lib-path=nvinfer/libnvdsinfer_custom_impl_lpr.so
process-mode=2
operate-on-gie-id=2
net-scale-factor=0.00392156862745098
#net-scale-factor=1.0
#0=RGB 1=BGR 2=GRAY
model-color-format=0

What sample am I supposed to study now, or what 30 GB container needs to be pulled, to turn this into something usable?

Going to test the lprnet/lprnet.ipynb notebook now. I have my dataset and the LMDB metadata for it; I hope it will not be as big an ordeal as all the TAO experiments have been up to now.

For running OCDNet and OCRNet with deepstream, please take a look at deepstream_tao_apps/apps/tao_others/deepstream-nvocdr-app at b300bd1c9c6134b178f4ed67ab1e365422c15e4f · NVIDIA-AI-IOT/deepstream_tao_apps · GitHub.
Since you are currently using LPDNet + OCRNet, you can get help in the DeepStream forum for this case.
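
A minimal sketch of getting to that sample, assuming the usual clone-and-follow-the-README flow (the link above pins a specific commit, which may matter):

git clone https://github.com/NVIDIA-AI-IOT/deepstream_tao_apps.git
cd deepstream_tao_apps/apps/tao_others/deepstream-nvocdr-app
# model download and build steps are described in this app's README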

I have now finished retraining LPRNet, my first and only positive experience with TAO so far.

What I got was this:

Exported engine:
------------
total 158M
-rw-r--r-- 1 ubuntu ubuntu 29M May 29 07:04 lprnet_epoch-024.fp16.engine
-rw-r--r-- 1 ubuntu ubuntu 74M May 29 07:01 lprnet_epoch-024.fp32.engine
-rw-r--r-- 1 ubuntu ubuntu 56M May 29 06:57 lprnet_epoch-024.onnx
-rw-r--r-- 1 ubuntu ubuntu 524 May 29 07:04 status.json

Could you please elaborate on how this maps to my current model directory, which contains this?

.
├── labels_us.txt
├── us_lprnet_baseline18_deployable.etlt
└── us_lprnet_baseline18_deployable.etlt_b16_gpu0_fp16.engine

I’m missing a *.etlt file in the training results, not least because the configuration requires one:

[property]
gpu-id=0
model-engine-file=models/LP/LPR/us_lprnet_baseline18_deployable.etlt_b16_gpu0_fp16.engine
labelfile-path=models/LP/LPR/labels_us.txt
tlt-encoded-model=models/LP/LPR/us_lprnet_baseline18_deployable.etlt
tlt-model-key=nvidia_tlt
batch-size=16
## 0=FP32, 1=INT8, 2=FP16 mode
network-mode=2
num-detected-classes=3
gie-unique-id=3
output-blob-names=tf_op_layer_ArgMax;tf_op_layer_Max
#0=Detection 1=Classifier 2=Segmentation
network-type=1
parse-classifier-func-name=NvDsInferParseCustomNVPlate
custom-lib-path=nvinfer/libnvdsinfer_custom_impl_lpr.so
process-mode=2
operate-on-gie-id=2
net-scale-factor=0.00392156862745098
#net-scale-factor=1.0
#0=RGB 1=BGR 2=GRAY
model-color-format=0

[class-attrs-all]
threshold=0.5
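
For what it’s worth, a hedged guess at the minimal mapping: nvinfer can build an engine directly from an ONNX file (exactly as the detector config at the top of this thread does with onnx-file=...), so no .etlt should be needed for the retrained model. The engine file name below is just what nvinfer would typically generate, and the label file name is a hypothetical placeholder:

[property]
# (all other keys as in the existing LPR config above)
onnx-file=models/LP/LPR/lprnet_epoch-024.onnx
# the engine is (re)built from the ONNX on first run and then cached:
#model-engine-file=models/LP/LPR/lprnet_epoch-024.onnx_b16_gpu0_fp16.engine
labelfile-path=models/LP/LPR/labels_de.txt
# verify the output tensor names of the new ONNX (e.g. with the onnx check
# shown earlier in this thread); the export may append ":0" to the old names:
#output-blob-names=tf_op_layer_ArgMax:0;tf_op_layer_Max:0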