Conditions:
I'm building an ALPR system based on the NVIDIA Developer Blog post "Creating a Real-Time License Plate Detection and Recognition App", and it has gone well so far: the Jetson TX2 recognizes car plates very well (see the YouTube video "ALPR with neuro-net trained with rus characters"). However, it misses trucks, and I suspect DashCamNet is the cause.
Missed trucks:
Volvo - 45s
MAZ - 58s
Question: How can I make it detect trucks? Should I retrain DashCamNet with images of trucks? If so, I would presumably also have to add images of cars, people, bicycles and road signs. However, the model card (https://ngc.nvidia.com/catalog/models/nvidia:tlt_dashcamnet) says NVIDIA trained it on:
Object Distribution

| Environment | Images | Cars | Persons | Road Signs | Two-Wheelers |
|---|---|---|---|---|---|
| Dashcam (5ft height) | 128,000 | 1.7M | 720,000 | 354,127 | 54,000 |
| Traffic signal content | 50,000 | 1.1M | 53,500 | 184,000 | 11,000 |
| Total | 178,000 | 2.8M | 773,500 | 538,127 | 65,000 |
An individual user cannot collect that many images.
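To put the scale of that table in perspective, here is a minimal sketch that recomputes the totals and the average number of labeled objects per image. The counts are copied from the table; "1.7M" and "1.1M" are rounded figures on the model card, so the car numbers are approximate, and the dictionary keys are my own names.

```python
# Counts copied from the DashCamNet "Object Distribution" table.
# The car counts are rounded on the model card ("1.7M", "1.1M"), so they are approximate.
rows = {
    "Dashcam (5ft height)": {
        "images": 128_000, "cars": 1_700_000, "persons": 720_000,
        "road_signs": 354_127, "two_wheelers": 54_000,
    },
    "Traffic signal content": {
        "images": 50_000, "cars": 1_100_000, "persons": 53_500,
        "road_signs": 184_000, "two_wheelers": 11_000,
    },
}

# Sum every column over both environments.
columns = list(next(iter(rows.values())))
totals = {col: sum(row[col] for row in rows.values()) for col in columns}

# Average number of labeled objects of each class per image.
per_image = {col: totals[col] / totals["images"] for col in columns if col != "images"}

for col, value in per_image.items():
    print(f"{col}: {totals[col]:,} objects, about {value:.1f} per image")
```

The totals match the "Total" row of the table, and the car column works out to roughly 16 labeled cars per image.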
Also, the VehicleTypeNet model card (https://ngc.nvidia.com/catalog/models/nvidia:tlt_vehicletypenet) describes the model as "a classification network, which aims to classify car images into 6 vehicle types":
- coupe
- sedan
- SUV
- van
- large vehicle
- truck
It also says: "VehicleTypeNet is generally cascaded with DashCamNet or TrafficCamNet for smart city applications. For example, DashCamNet or TrafficCamNet acts as a primary detector, detecting the objects of interest and for each detected car the VehicleTypeNet acts as a secondary classifier determining the type of the car." So the cascade should be able to take trucks detected by DashCamNet and classify them, but in my case it does not.
What would you advise?
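To make the cascade quoted from the model card concrete, here is a minimal sketch of the control flow as I understand it. It is not DeepStream or TLT code: `Detection`, `run_cascade` and the two stub functions are hypothetical names used only for illustration. It shows why a truck that the primary detector never reports cannot reach the secondary classifier.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (left, top, width, height) in pixels


@dataclass
class Detection:
    label: str                          # class name from the primary detector
    box: Box
    vehicle_type: Optional[str] = None  # set by the secondary classifier


def run_cascade(
    frame: object,
    primary_detect: Callable[[object], List[Detection]],
    classify_vehicle: Callable[[object, Box], str],
    classify_labels: Sequence[str] = ("car",),
) -> List[Detection]:
    """Run the primary detector, then classify only the detections it returned."""
    detections = primary_detect(frame)
    for det in detections:
        if det.label in classify_labels:
            det.vehicle_type = classify_vehicle(frame, det.box)
    return detections


# Hypothetical stubs: the frame contains a car and a truck,
# but the primary detector only reports the car.
def fake_primary_detect(frame: object) -> List[Detection]:
    return [Detection(label="car", box=(100, 200, 180, 90))]


def fake_classify_vehicle(frame: object, box: Box) -> str:
    return "sedan"


results = run_cascade("frame-0", fake_primary_detect, fake_classify_vehicle)
print([(d.label, d.vehicle_type) for d in results])
```

In this sketch the secondary classifier can only label boxes that the primary detector produced, so the "truck" type never appears if the detector misses the truck in the first place.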
PC configuration:
Driver Version: 465.19.01
CUDA Version: 11.3
TensorRT: 7.2.3
cudnn: 8.2.0.53
deepstream-app version 5.1.0
DeepStreamSDK 5.1.0
• TLT Version (Please run “tlt info --verbose” and share “docker_tag” here)
alex@jetson:~$ tlt info --verbose
Configuration of the TLT Instance
dockers:
    nvcr.io/nvidia/tlt-streamanalytics:
        docker_tag: v3.0-dp-py3
        tasks:
            - augment
            - classification
            - detectnet_v2
            - dssd
            - emotionnet
            - faster_rcnn
            - fpenet
            - gazenet
            - gesturenet
            - heartratenet
            - lprnet
            - mask_rcnn
            - retinanet
            - ssd
            - unet
            - yolo_v3
            - yolo_v4
            - tlt-converter
    nvcr.io/nvidia/tlt-pytorch:
        docker_tag: v3.0-dp-py3
        tasks:
            - speech_to_text
            - text_classification
            - question_answering
            - token_classification
            - intent_slot_classification
            - punctuation_and_capitalization
format_version: 1.0
tlt_version: 3.0
published_date: 02/02/2021
tutorial_spec.txt:
random_seed: 42
lpr_config {
  hidden_units: 512
  max_label_length: 9
  arch: "baseline"
  nlayers: 18 #setting nlayers to be 10 to use baseline10 model
}
training_config {
  batch_size_per_gpu: 32
  num_epochs: 24
  learning_rate {
    soft_start_annealing_schedule {
      min_learning_rate: 1e-6
      max_learning_rate: 1e-5
      soft_start: 0.001
      annealing: 0.5
    }
  }
  regularizer {
    type: L2
    weight: 5e-4
  }
}
eval_config {
  validation_period_during_training: 5
  batch_size: 1
}
augmentation_config {
  output_width: 96
  output_height: 48
  output_channel: 3
  keep_original_prob: 0.3
  transform_prob: 0.5
  rotate_degree: 5
}
dataset_config {
  data_sources: {
    label_directory_path: "/workspace/tlt-experiments/data/openalpr/train/label"
    image_directory_path: "/workspace/tlt-experiments/data/openalpr/train/image"
  }
  characters_list_file: "/workspace/tlt-experiments/lprnet/specs/ru_lp_characters.txt"
  validation_data_sources: {
    label_directory_path: "/workspace/tlt-experiments/data/openalpr/val/label"
    image_directory_path: "/workspace/tlt-experiments/data/openalpr/val/image"
  }
}
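For reference, here is a rough sketch of how I read the soft_start_annealing_schedule values in the spec above. It assumes the learning rate ramps from min to max over the first soft_start fraction of training, holds at max, and then decays back to min after the annealing fraction, with log-linear interpolation in both ramps. The interpolation shape is my assumption and the exact curve used by TLT may differ; `soft_start_annealing_lr` is a hypothetical helper, not a TLT function.

```python
import math


def soft_start_annealing_lr(
    progress: float,
    min_lr: float = 1e-6,
    max_lr: float = 1e-5,
    soft_start: float = 0.001,
    annealing: float = 0.5,
) -> float:
    """Learning rate at a given training progress in [0, 1].

    Assumed shape: log-linear ramp up, plateau at max_lr, log-linear ramp down.
    """
    if not 0.0 <= progress <= 1.0:
        raise ValueError("progress must be between 0 and 1")
    if progress < soft_start:
        # Warm-up: interpolate from min_lr to max_lr on a log scale.
        t = progress / soft_start
        return min_lr * (max_lr / min_lr) ** t
    if progress < annealing or annealing >= 1.0:
        # Plateau at the maximum learning rate.
        return max_lr
    # Annealing: interpolate from max_lr back to min_lr on a log scale.
    t = (progress - annealing) / (1.0 - annealing)
    return max_lr * (min_lr / max_lr) ** t


for p in (0.0, 0.0005, 0.25, 0.5, 0.75, 1.0):
    print(f"progress={p:<6} lr={soft_start_annealing_lr(p):.3e}")
```

With num_epochs: 24 and soft_start: 0.001, the warm-up under this reading covers only 0.024 of an epoch, and the decay starts at epoch 12.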
ru_lp_characters.txt contains:
0
1
2
3
4
5
6
7
8
9
A
B
E
K
M
H
O
P
C
T
Y
X
D
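As a sanity check on the character list and on max_label_length: 9 from the spec, here is a small sketch that validates label strings before training. It assumes the letters in ru_lp_characters.txt are Latin characters, so a label typed with Cyrillic look-alikes would be flagged. `load_characters` and `check_label` are hypothetical helpers, and the sample plates are made up.

```python
from typing import Iterable, List, Set


def load_characters(lines: Iterable[str]) -> Set[str]:
    """Build the allowed character set from the lines of the characters file."""
    return {line.strip() for line in lines if line.strip()}


def check_label(label: str, characters: Set[str], max_label_length: int = 9) -> List[str]:
    """Return a list of problems found in one label; an empty list means it is fine."""
    problems = []
    if len(label) > max_label_length:
        problems.append(f"length {len(label)} exceeds max_label_length {max_label_length}")
    unknown = sorted(set(label) - characters)
    if unknown:
        codes = ", ".join(f"U+{ord(ch):04X}" for ch in unknown)
        problems.append(f"characters not in list: {codes}")
    return problems


# The 23 entries listed above, assumed to be Latin letters and ASCII digits.
characters = load_characters("0123456789ABEKMHOPCTYXD")

samples = [
    "A123BC77",       # made-up plate using only listed characters
    "\u0410123BC77",  # same plate, but the first letter is Cyrillic A (U+0410)
    "A123BC7777",     # ten characters, longer than max_label_length
]
for label in samples:
    print(repr(label), check_label(label, characters))
```

In practice the set would be built from the real file, for example `load_characters(open(path, encoding="utf-8"))`, and each label file would be passed through `check_label`.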