Detecting Rotated Objects Using the NVIDIA Object Detection Toolkit

Originally published at: Detecting Rotated Objects Using the NVIDIA Object Detection Toolkit | NVIDIA Technical Blog

Figure 1. A portion of the International Society for Photogrammetry and Remote Sensing (ISPRS) Potsdam dataset. Rotated bounding boxes of the vehicle class, calculated using the segmentation mask labels, are shown in green. Object detection and classification in imagery using deep neural networks (DNNs) and convolutional neural networks (CNNs) is a well-studied area. For some…


I'm interested in using this approach with an overhead camera system to track moving objects using a Jetson Nano. The FPS numbers on T4 and V100 are great, but I wanted to get thoughts on running on a Jetson Nano.

Is there a GitHub repository with the augmentation code for rotated bounding boxes? I mean the code in which, for example, _corners2rotatedbbox() is called. I cannot find this part on the ODTK GitHub page.

I am working through a generic example using a small COCO dataset for microcontrollers at GitHub - TannerGilbert/Detectron2-Train-a-Instance-Segmentation-Model: Learn how to train a custom instance segmentation model with Detectron2

Using the following command line, training runs to iteration 10000 and then hangs. I waited for over an hour, pressed CTRL-C, and have included the resulting stack trace.

Any suggestions on why it is hanging? I am running on Ubuntu 18 with two 1080 GPU cards.

I also wanted to confirm: if I use the resize and jitter options, the mask will need to change, as well as the corresponding rotated bounding box. Do the resize and jitter transformations get applied to the mask and the rotated bounding box?

odtk train model.pth --backbone ResNet18FPN --iters 10000 --val-iters 1000 --lr 0.0001 --images /data/micro/segmentation/train/ --annotations /data/micro/segmentation/train.json --val-images /data/micro/segmentation/test --val-annotations /data/micro/segmentation/test.json --rotated-bbox

Initializing model…
model: RetinaNet
backbone: ResNet18FPN
classes: 80, anchors: 27
Selected optimization level O2: FP16 training with FP32 batchnorm and FP32 master weights.

Defaults for this optimization level are:
enabled : True
opt_level : O2
cast_model_type : torch.float16
patch_torch_functions : False
keep_batchnorm_fp32 : True
master_weights : True
loss_scale : dynamic
Processing user overrides (additional kwargs that are not None)…
After processing overrides, optimization options are:
enabled : True
opt_level : O2
cast_model_type : torch.float16
patch_torch_functions : False
keep_batchnorm_fp32 : True
master_weights : True
loss_scale : 128.0
Preparing dataset…
loader: pytorch
resize: [640, 1024], max: 1333
device: 2 GPUs
batch: 4, precision: mixed
BBOX type: rotated
Training model for 10000 iterations…
[ 53/10000] focal loss: 1.608, box loss: 27.825, 1.138s/4-batch (fw: 0.431s, bw: 0.603s), 3.5 im/s, lr: 1.5e-05
/opt/conda/lib/python3.6/site-packages/torch/optim/lr_scheduler.py:200: UserWarning: Please also save or load the state of the optimzer when saving or loading the scheduler.
warnings.warn(SAVE_STATE_WARNING, UserWarning)
[ 109/10000] focal loss: 1.573, box loss: 27.593, 1.081s/4-batch (fw: 0.402s, bw: 0.583s), 3.7 im/s, lr: 2e-05
[ 165/10000] focal loss: 1.616, box loss: 27.829, 1.081s/4-batch (fw: 0.402s, bw: 0.628s), 3.7 im/s, lr: 2.5e-05
[ 219/10000] focal loss: 1.736, box loss: 27.580, 1.118s/4-batch (fw: 0.408s, bw: 0.607s), 3.6 im/s, lr: 3e-05
[ 274/10000] focal loss: 1.630, box loss: 27.808, 1.105s/4-batch (fw: 0.418s, bw: 0.635s), 3.6 im/s, lr: 3.5e-05
[ 326/10000] focal loss: 1.788, box loss: 27.528, 1.168s/4-batch (fw: 0.417s, bw: 0.640s), 3.4 im/s, lr: 3.9e-05
[ 380/10000] focal loss: 1.582, box loss: 27.621, 1.124s/4-batch (fw: 0.426s, bw: 0.644s), 3.6 im/s, lr: 4.4e-05
[ 433/10000] focal loss: 1.838, box loss: 22.619, 1.135s/4-batch (fw: 0.405s, bw: 0.623s), 3.5 im/s, lr: 4.9e-05
[ 488/10000] focal loss: 1.572, box loss: 3.746, 1.098s/4-batch (fw: 0.395s, bw: 0.653s), 3.6 im/s, lr: 5.4e-05
[ 541/10000] focal loss: 1.879, box loss: 1.380, 1.175s/4-batch (fw: 0.428s, bw: 0.642s), 3.4 im/s, lr: 5.9e-05
[ 595/10000] focal loss: 1.593, box loss: 2.943, 1.125s/4-batch (fw: 0.436s, bw: 0.636s), 3.6 im/s, lr: 6.4e-05
[ 649/10000] focal loss: 1.712, box loss: 5.588, 1.112s/4-batch (fw: 0.406s, bw: 0.603s), 3.6 im/s, lr: 6.8e-05
[ 701/10000] focal loss: 1.625, box loss: 3.093, 1.154s/4-batch (fw: 0.449s, bw: 0.652s), 3.5 im/s, lr: 7.3e-05
[ 755/10000] focal loss: 1.594, box loss: 3.657, 1.121s/4-batch (fw: 0.428s, bw: 0.642s), 3.6 im/s, lr: 7.8e-05
[ 807/10000] focal loss: 1.633, box loss: 2.056, 1.155s/4-batch (fw: 0.412s, bw: 0.635s), 3.5 im/s, lr: 8.3e-05
[ 862/10000] focal loss: 1.599, box loss: 2.000, 1.099s/4-batch (fw: 0.400s, bw: 0.647s), 3.6 im/s, lr: 8.8e-05
[ 914/10000] focal loss: 1.583, box loss: 4.031, 1.170s/4-batch (fw: 0.417s, bw: 0.644s), 3.4 im/s, lr: 9.2e-05
[ 970/10000] focal loss: 1.570, box loss: 2.684, 1.085s/4-batch (fw: 0.412s, bw: 0.623s), 3.7 im/s, lr: 9.7e-05
No detections!
[ 1019/10000] focal loss: 1.596, box loss: 6.587, 1.229s/4-batch (fw: 0.445s, bw: 0.630s), 3.3 im/s, lr: 0.0001
[ 1073/10000] focal loss: 1.579, box loss: 2.010, 1.117s/4-batch (fw: 0.429s, bw: 0.635s), 3.6 im/s, lr: 0.0001
[ 1124/10000] focal loss: 1.641, box loss: 2.213, 1.188s/4-batch (fw: 0.438s, bw: 0.639s), 3.4 im/s, lr: 0.0001
[ 1178/10000] focal loss: 1.600, box loss: 2.111, 1.130s/4-batch (fw: 0.434s, bw: 0.647s), 3.5 im/s, lr: 0.0001
[ 1230/10000] focal loss: 1.606, box loss: 1.746, 1.159s/4-batch (fw: 0.423s, bw: 0.628s), 3.4 im/s, lr: 0.0001
[ 1285/10000] focal loss: 1.561, box loss: 1.938, 1.092s/4-batch (fw: 0.405s, bw: 0.636s), 3.7 im/s, lr: 0.0001
[ 1336/10000] focal loss: 1.779, box loss: 2.265, 1.192s/4-batch (fw: 0.442s, bw: 0.637s), 3.4 im/s, lr: 0.0001
[ 1392/10000] focal loss: 1.554, box loss: 1.222, 1.076s/4-batch (fw: 0.403s, bw: 0.624s), 3.7 im/s, lr: 0.0001
[ 1446/10000] focal loss: 1.633, box loss: 2.403, 1.124s/4-batch (fw: 0.393s, bw: 0.624s), 3.6 im/s, lr: 0.0001
[ 1500/10000] focal loss: 1.610, box loss: 1.566, 1.124s/4-batch (fw: 0.418s, bw: 0.653s), 3.6 im/s, lr: 0.0001
[ 1552/10000] focal loss: 1.660, box loss: 0.746, 1.156s/4-batch (fw: 0.404s, bw: 0.647s), 3.5 im/s, lr: 0.0001
[ 1607/10000] focal loss: 1.583, box loss: 1.120, 1.091s/4-batch (fw: 0.401s, bw: 0.641s), 3.7 im/s, lr: 0.0001
[ 1660/10000] focal loss: 1.653, box loss: 0.905, 1.152s/4-batch (fw: 0.419s, bw: 0.626s), 3.5 im/s, lr: 0.0001
[ 1715/10000] focal loss: 1.575, box loss: 2.376, 1.094s/4-batch (fw: 0.408s, bw: 0.635s), 3.7 im/s, lr: 0.0001
[ 1767/10000] focal loss: 1.689, box loss: 1.207, 1.171s/4-batch (fw: 0.411s, bw: 0.652s), 3.4 im/s, lr: 0.0001
[ 1819/10000] focal loss: 1.614, box loss: 2.224, 1.157s/4-batch (fw: 0.444s, bw: 0.659s), 3.5 im/s, lr: 0.0001
[ 1873/10000] focal loss: 1.503, box loss: 0.833, 1.125s/4-batch (fw: 0.400s, bw: 0.623s), 3.6 im/s, lr: 0.0001
[ 1930/10000] focal loss: 1.545, box loss: 0.840, 1.058s/4-batch (fw: 0.393s, bw: 0.616s), 3.8 im/s, lr: 0.0001
[ 1983/10000] focal loss: 1.678, box loss: 1.809, 1.145s/4-batch (fw: 0.411s, bw: 0.628s), 3.5 im/s, lr: 0.0001
No detections!
[ 2036/10000] focal loss: 1.584, box loss: 1.383, 1.150s/4-batch (fw: 0.432s, bw: 0.629s), 3.5 im/s, lr: 0.0001
[ 2090/10000] focal loss: 1.679, box loss: 1.153, 1.114s/4-batch (fw: 0.395s, bw: 0.615s), 3.6 im/s, lr: 0.0001
[ 2144/10000] focal loss: 1.568, box loss: 1.296, 1.114s/4-batch (fw: 0.411s, bw: 0.649s), 3.6 im/s, lr: 0.0001
[ 2197/10000] focal loss: 1.732, box loss: 0.429, 1.153s/4-batch (fw: 0.420s, bw: 0.625s), 3.5 im/s, lr: 0.0001
[ 2252/10000] focal loss: 1.495, box loss: 1.195, 1.102s/4-batch (fw: 0.415s, bw: 0.635s), 3.6 im/s, lr: 0.0001
[ 2305/10000] focal loss: 1.740, box loss: 0.764, 1.162s/4-batch (fw: 0.420s, bw: 0.634s), 3.4 im/s, lr: 0.0001
[ 2360/10000] focal loss: 1.341, box loss: 1.305, 1.097s/4-batch (fw: 0.405s, bw: 0.641s), 3.6 im/s, lr: 0.0001
[ 2413/10000] focal loss: 1.620, box loss: 1.702, 1.161s/4-batch (fw: 0.411s, bw: 0.642s), 3.4 im/s, lr: 0.0001
[ 2466/10000] focal loss: 1.300, box loss: 1.425, 1.146s/4-batch (fw: 0.443s, bw: 0.648s), 3.5 im/s, lr: 0.0001
[ 2521/10000] focal loss: 1.109, box loss: 0.828, 1.135s/4-batch (fw: 0.414s, bw: 0.619s), 3.5 im/s, lr: 0.0001
[ 2576/10000] focal loss: 1.254, box loss: 1.163, 1.106s/4-batch (fw: 0.429s, bw: 0.625s), 3.6 im/s, lr: 0.0001
[ 2629/10000] focal loss: 1.376, box loss: 1.663, 1.143s/4-batch (fw: 0.408s, bw: 0.630s), 3.5 im/s, lr: 0.0001
[ 2683/10000] focal loss: 1.228, box loss: 1.062, 1.129s/4-batch (fw: 0.439s, bw: 0.638s), 3.5 im/s, lr: 0.0001
[ 2737/10000] focal loss: 1.470, box loss: 1.112, 1.157s/4-batch (fw: 0.411s, bw: 0.642s), 3.5 im/s, lr: 0.0001
[ 2791/10000] focal loss: 1.174, box loss: 2.000, 1.119s/4-batch (fw: 0.412s, bw: 0.653s), 3.6 im/s, lr: 0.0001
[ 2845/10000] focal loss: 1.341, box loss: 0.803, 1.123s/4-batch (fw: 0.403s, bw: 0.616s), 3.6 im/s, lr: 0.0001
[ 2900/10000] focal loss: 1.133, box loss: 1.144, 1.098s/4-batch (fw: 0.417s, bw: 0.629s), 3.6 im/s, lr: 0.0001
[ 2955/10000] focal loss: 1.146, box loss: 2.585, 1.103s/4-batch (fw: 0.411s, bw: 0.591s), 3.6 im/s, lr: 0.0001
No detections!
[ 3008/10000] focal loss: 1.077, box loss: 1.042, 1.143s/4-batch (fw: 0.422s, bw: 0.633s), 3.5 im/s, lr: 0.0001
[ 3061/10000] focal loss: 1.310, box loss: 1.282, 1.171s/4-batch (fw: 0.427s, bw: 0.634s), 3.4 im/s, lr: 0.0001
[ 3116/10000] focal loss: 1.007, box loss: 0.955, 1.102s/4-batch (fw: 0.428s, bw: 0.624s), 3.6 im/s, lr: 0.0001
[ 3169/10000] focal loss: 1.185, box loss: 1.076, 1.182s/4-batch (fw: 0.435s, bw: 0.640s), 3.4 im/s, lr: 0.0001
[ 3222/10000] focal loss: 0.944, box loss: 0.855, 1.136s/4-batch (fw: 0.407s, bw: 0.676s), 3.5 im/s, lr: 0.0001
[ 3277/10000] focal loss: 0.950, box loss: 0.866, 1.143s/4-batch (fw: 0.399s, bw: 0.641s), 3.5 im/s, lr: 0.0001
[ 3332/10000] focal loss: 0.830, box loss: 1.257, 1.103s/4-batch (fw: 0.404s, bw: 0.648s), 3.6 im/s, lr: 0.0001
[ 3385/10000] focal loss: 0.737, box loss: 0.568, 1.136s/4-batch (fw: 0.412s, bw: 0.620s), 3.5 im/s, lr: 0.0001
[ 3440/10000] focal loss: 0.749, box loss: 0.626, 1.109s/4-batch (fw: 0.422s, bw: 0.637s), 3.6 im/s, lr: 0.0001
[ 3493/10000] focal loss: 0.774, box loss: 0.862, 1.162s/4-batch (fw: 0.418s, bw: 0.639s), 3.4 im/s, lr: 0.0001
[ 3546/10000] focal loss: 0.714, box loss: 1.785, 1.134s/4-batch (fw: 0.411s, bw: 0.668s), 3.5 im/s, lr: 0.0001
[ 3599/10000] focal loss: 0.666, box loss: 1.586, 1.133s/4-batch (fw: 0.422s, bw: 0.657s), 3.5 im/s, lr: 0.0001
[ 3650/10000] focal loss: 0.703, box loss: 2.287, 1.186s/4-batch (fw: 0.418s, bw: 0.656s), 3.4 im/s, lr: 0.0001
[ 3704/10000] focal loss: 0.665, box loss: 1.457, 1.113s/4-batch (fw: 0.426s, bw: 0.636s), 3.6 im/s, lr: 0.0001
[ 3756/10000] focal loss: 0.667, box loss: 1.809, 1.160s/4-batch (fw: 0.410s, bw: 0.639s), 3.4 im/s, lr: 0.0001
[ 3809/10000] focal loss: 0.657, box loss: 0.753, 1.136s/4-batch (fw: 0.428s, bw: 0.655s), 3.5 im/s, lr: 0.0001
[ 3861/10000] focal loss: 0.654, box loss: 0.820, 1.160s/4-batch (fw: 0.393s, bw: 0.657s), 3.4 im/s, lr: 0.0001
[ 3916/10000] focal loss: 0.618, box loss: 0.965, 1.094s/4-batch (fw: 0.414s, bw: 0.627s), 3.7 im/s, lr: 0.0001
[ 3968/10000] focal loss: 0.636, box loss: 1.640, 1.159s/4-batch (fw: 0.398s, bw: 0.654s), 3.5 im/s, lr: 0.0001
No detections!
[ 4020/10000] focal loss: 0.600, box loss: 2.283, 1.157s/4-batch (fw: 0.415s, bw: 0.654s), 3.5 im/s, lr: 0.0001
[ 4071/10000] focal loss: 0.627, box loss: 0.755, 1.190s/4-batch (fw: 0.433s, bw: 0.648s), 3.4 im/s, lr: 0.0001
[ 4125/10000] focal loss: 0.600, box loss: 1.796, 1.112s/4-batch (fw: 0.398s, bw: 0.663s), 3.6 im/s, lr: 0.0001
[ 4177/10000] focal loss: 0.546, box loss: 0.562, 1.172s/4-batch (fw: 0.409s, bw: 0.656s), 3.4 im/s, lr: 0.0001
[ 4230/10000] focal loss: 0.621, box loss: 1.456, 1.132s/4-batch (fw: 0.432s, bw: 0.647s), 3.5 im/s, lr: 0.0001
[ 4285/10000] focal loss: 0.561, box loss: 0.906, 1.129s/4-batch (fw: 0.420s, bw: 0.607s), 3.5 im/s, lr: 0.0001
[ 4341/10000] focal loss: 0.570, box loss: 1.330, 1.089s/4-batch (fw: 0.403s, bw: 0.635s), 3.7 im/s, lr: 0.0001
[ 4394/10000] focal loss: 0.565, box loss: 0.601, 1.146s/4-batch (fw: 0.393s, bw: 0.644s), 3.5 im/s, lr: 0.0001
[ 4448/10000] focal loss: 0.542, box loss: 1.136, 1.120s/4-batch (fw: 0.408s, bw: 0.659s), 3.6 im/s, lr: 0.0001
[ 4501/10000] focal loss: 0.498, box loss: 1.067, 1.148s/4-batch (fw: 0.407s, bw: 0.636s), 3.5 im/s, lr: 0.0001
[ 4555/10000] focal loss: 0.519, box loss: 1.624, 1.117s/4-batch (fw: 0.430s, bw: 0.635s), 3.6 im/s, lr: 0.0001
[ 4609/10000] focal loss: 0.446, box loss: 0.803, 1.165s/4-batch (fw: 0.402s, bw: 0.663s), 3.4 im/s, lr: 0.0001
[ 4662/10000] focal loss: 0.545, box loss: 1.195, 1.144s/4-batch (fw: 0.432s, bw: 0.658s), 3.5 im/s, lr: 0.0001
[ 4717/10000] focal loss: 0.512, box loss: 0.843, 1.143s/4-batch (fw: 0.412s, bw: 0.625s), 3.5 im/s, lr: 0.0001
[ 4771/10000] focal loss: 0.535, box loss: 1.508, 1.128s/4-batch (fw: 0.413s, bw: 0.662s), 3.5 im/s, lr: 0.0001
[ 4825/10000] focal loss: 0.473, box loss: 2.096, 1.166s/4-batch (fw: 0.448s, bw: 0.613s), 3.4 im/s, lr: 0.0001
[ 4879/10000] focal loss: 0.533, box loss: 2.383, 1.129s/4-batch (fw: 0.413s, bw: 0.664s), 3.5 im/s, lr: 0.0001
[ 4933/10000] focal loss: 0.464, box loss: 2.563, 1.166s/4-batch (fw: 0.410s, bw: 0.650s), 3.4 im/s, lr: 0.0001
[ 4988/10000] focal loss: 0.505, box loss: 2.629, 1.097s/4-batch (fw: 0.407s, bw: 0.639s), 3.6 im/s, lr: 0.0001
No detections!
[ 5041/10000] focal loss: 0.474, box loss: 1.094, 1.189s/4-batch (fw: 0.415s, bw: 0.632s), 3.4 im/s, lr: 0.0001
[ 5095/10000] focal loss: 0.501, box loss: 0.931, 1.125s/4-batch (fw: 0.437s, bw: 0.635s), 3.6 im/s, lr: 0.0001
[ 5149/10000] focal loss: 0.438, box loss: 0.670, 1.156s/4-batch (fw: 0.414s, bw: 0.638s), 3.5 im/s, lr: 0.0001
[ 5203/10000] focal loss: 0.450, box loss: 0.881, 1.129s/4-batch (fw: 0.434s, bw: 0.643s), 3.5 im/s, lr: 0.0001
[ 5257/10000] focal loss: 0.420, box loss: 0.725, 1.131s/4-batch (fw: 0.403s, bw: 0.625s), 3.5 im/s, lr: 0.0001
[ 5311/10000] focal loss: 0.474, box loss: 1.346, 1.114s/4-batch (fw: 0.408s, bw: 0.652s), 3.6 im/s, lr: 0.0001
[ 5365/10000] focal loss: 0.410, box loss: 0.584, 1.157s/4-batch (fw: 0.406s, bw: 0.647s), 3.5 im/s, lr: 0.0001
[ 5421/10000] focal loss: 0.437, box loss: 1.293, 1.077s/4-batch (fw: 0.409s, bw: 0.617s), 3.7 im/s, lr: 0.0001
[ 5473/10000] focal loss: 0.402, box loss: 0.727, 1.157s/4-batch (fw: 0.395s, bw: 0.652s), 3.5 im/s, lr: 0.0001
[ 5527/10000] focal loss: 0.422, box loss: 0.977, 1.122s/4-batch (fw: 0.416s, bw: 0.654s), 3.6 im/s, lr: 0.0001
[ 5581/10000] focal loss: 0.449, box loss: 0.997, 1.153s/4-batch (fw: 0.397s, bw: 0.652s), 3.5 im/s, lr: 0.0001
[ 5634/10000] focal loss: 0.433, box loss: 3.886, 1.132s/4-batch (fw: 0.419s, bw: 0.659s), 3.5 im/s, lr: 0.0001
[ 5689/10000] focal loss: 0.388, box loss: 1.617, 1.154s/4-batch (fw: 0.395s, bw: 0.658s), 3.5 im/s, lr: 0.0001
[ 5743/10000] focal loss: 0.478, box loss: 1.465, 1.118s/4-batch (fw: 0.402s, bw: 0.662s), 3.6 im/s, lr: 0.0001
[ 5797/10000] focal loss: 0.464, box loss: 2.164, 1.172s/4-batch (fw: 0.423s, bw: 0.644s), 3.4 im/s, lr: 0.0001
[ 5851/10000] focal loss: 0.429, box loss: 1.199, 1.119s/4-batch (fw: 0.417s, bw: 0.650s), 3.6 im/s, lr: 0.0001
[ 5905/10000] focal loss: 0.378, box loss: 0.750, 1.134s/4-batch (fw: 0.403s, bw: 0.626s), 3.5 im/s, lr: 0.0001
[ 5958/10000] focal loss: 0.406, box loss: 1.123, 1.139s/4-batch (fw: 0.442s, bw: 0.644s), 3.5 im/s, lr: 0.0001
No detections!
[ 6011/10000] focal loss: 0.402, box loss: 1.244, 1.134s/4-batch (fw: 0.420s, bw: 0.627s), 3.5 im/s, lr: 0.0001
[ 6062/10000] focal loss: 0.401, box loss: 1.378, 1.191s/4-batch (fw: 0.434s, bw: 0.647s), 3.4 im/s, lr: 0.0001
[ 6116/10000] focal loss: 0.394, box loss: 0.824, 1.124s/4-batch (fw: 0.429s, bw: 0.643s), 3.6 im/s, lr: 0.0001
[ 6169/10000] focal loss: 0.419, box loss: 1.083, 1.146s/4-batch (fw: 0.417s, bw: 0.622s), 3.5 im/s, lr: 0.0001
[ 6223/10000] focal loss: 0.401, box loss: 0.606, 1.112s/4-batch (fw: 0.413s, bw: 0.645s), 3.6 im/s, lr: 0.0001
[ 6275/10000] focal loss: 0.410, box loss: 1.546, 1.158s/4-batch (fw: 0.404s, bw: 0.645s), 3.5 im/s, lr: 0.0001
[ 6329/10000] focal loss: 0.401, box loss: 0.995, 1.131s/4-batch (fw: 0.438s, bw: 0.639s), 3.5 im/s, lr: 0.0001
[ 6381/10000] focal loss: 0.373, box loss: 1.228, 1.165s/4-batch (fw: 0.396s, bw: 0.663s), 3.4 im/s, lr: 0.0001
[ 6434/10000] focal loss: 0.391, box loss: 1.282, 1.145s/4-batch (fw: 0.416s, bw: 0.673s), 3.5 im/s, lr: 0.0001
[ 6486/10000] focal loss: 0.388, box loss: 1.842, 1.166s/4-batch (fw: 0.400s, bw: 0.655s), 3.4 im/s, lr: 0.0001
[ 6541/10000] focal loss: 0.376, box loss: 0.896, 1.095s/4-batch (fw: 0.407s, bw: 0.637s), 3.7 im/s, lr: 0.0001
[ 6593/10000] focal loss: 0.361, box loss: 0.763, 1.173s/4-batch (fw: 0.422s, bw: 0.642s), 3.4 im/s, lr: 0.0001
[ 6648/10000] focal loss: 0.383, box loss: 0.714, 1.106s/4-batch (fw: 0.424s, bw: 0.631s), 3.6 im/s, lr: 0.0001
[ 6700/10000] focal loss: 0.426, box loss: 0.775, 1.158s/4-batch (fw: 0.414s, bw: 0.633s), 3.5 im/s, lr: 0.0001
[ 6756/10000] focal loss: 0.375, box loss: 1.123, 1.085s/4-batch (fw: 0.407s, bw: 0.627s), 3.7 im/s, lr: 0.0001
[ 6808/10000] focal loss: 0.396, box loss: 1.614, 1.164s/4-batch (fw: 0.420s, bw: 0.632s), 3.4 im/s, lr: 0.0001
[ 6862/10000] focal loss: 0.386, box loss: 0.799, 1.125s/4-batch (fw: 0.428s, bw: 0.645s), 3.6 im/s, lr: 0.0001
[ 6914/10000] focal loss: 0.390, box loss: 0.618, 1.165s/4-batch (fw: 0.414s, bw: 0.645s), 3.4 im/s, lr: 0.0001
[ 6968/10000] focal loss: 0.377, box loss: 1.434, 1.113s/4-batch (fw: 0.407s, bw: 0.655s), 3.6 im/s, lr: 0.0001
No detections!
[ 7021/10000] focal loss: 0.385, box loss: 0.466, 1.188s/4-batch (fw: 0.417s, bw: 0.633s), 3.4 im/s, lr: 0.0001
[ 7075/10000] focal loss: 0.371, box loss: 1.603, 1.125s/4-batch (fw: 0.407s, bw: 0.665s), 3.6 im/s, lr: 0.0001
[ 7129/10000] focal loss: 0.361, box loss: 2.978, 1.159s/4-batch (fw: 0.408s, bw: 0.647s), 3.5 im/s, lr: 0.0001
[ 7184/10000] focal loss: 0.392, box loss: 0.868, 1.109s/4-batch (fw: 0.407s, bw: 0.649s), 3.6 im/s, lr: 0.0001
[ 7237/10000] focal loss: 0.364, box loss: 1.274, 1.176s/4-batch (fw: 0.413s, bw: 0.655s), 3.4 im/s, lr: 0.0001
[ 7290/10000] focal loss: 0.391, box loss: 1.061, 1.145s/4-batch (fw: 0.420s, bw: 0.669s), 3.5 im/s, lr: 0.0001
[ 7345/10000] focal loss: 0.375, box loss: 0.844, 1.130s/4-batch (fw: 0.400s, bw: 0.628s), 3.5 im/s, lr: 0.0001
[ 7399/10000] focal loss: 0.362, box loss: 0.732, 1.111s/4-batch (fw: 0.431s, bw: 0.626s), 3.6 im/s, lr: 0.0001
[ 7453/10000] focal loss: 0.404, box loss: 0.646, 1.139s/4-batch (fw: 0.398s, bw: 0.634s), 3.5 im/s, lr: 0.0001
[ 7508/10000] focal loss: 0.355, box loss: 0.695, 1.101s/4-batch (fw: 0.401s, bw: 0.646s), 3.6 im/s, lr: 0.0001
[ 7561/10000] focal loss: 0.362, box loss: 0.707, 1.153s/4-batch (fw: 0.416s, bw: 0.631s), 3.5 im/s, lr: 0.0001
[ 7615/10000] focal loss: 0.353, box loss: 1.325, 1.112s/4-batch (fw: 0.431s, bw: 0.629s), 3.6 im/s, lr: 0.0001
[ 7669/10000] focal loss: 0.346, box loss: 0.762, 1.158s/4-batch (fw: 0.408s, bw: 0.646s), 3.5 im/s, lr: 0.0001
[ 7722/10000] focal loss: 0.361, box loss: 0.610, 1.139s/4-batch (fw: 0.420s, bw: 0.666s), 3.5 im/s, lr: 0.0001
[ 7776/10000] focal loss: 0.347, box loss: 1.628, 1.119s/4-batch (fw: 0.416s, bw: 0.650s), 3.6 im/s, lr: 0.0001
[ 7830/10000] focal loss: 0.353, box loss: 0.710, 1.132s/4-batch (fw: 0.414s, bw: 0.617s), 3.5 im/s, lr: 0.0001
[ 7884/10000] focal loss: 0.357, box loss: 1.223, 1.115s/4-batch (fw: 0.411s, bw: 0.652s), 3.6 im/s, lr: 0.0001
[ 7935/10000] focal loss: 0.344, box loss: 2.945, 1.197s/4-batch (fw: 0.427s, bw: 0.661s), 3.3 im/s, lr: 0.0001
[ 7989/10000] focal loss: 0.361, box loss: 1.690, 1.124s/4-batch (fw: 0.426s, bw: 0.645s), 3.6 im/s, lr: 0.0001
No detections!
[ 8042/10000] focal loss: 0.369, box loss: 1.466, 1.151s/4-batch (fw: 0.394s, bw: 0.617s), 3.5 im/s, lr: 0.0001
[ 8096/10000] focal loss: 0.362, box loss: 0.753, 1.121s/4-batch (fw: 0.400s, bw: 0.669s), 3.6 im/s, lr: 0.0001
[ 8148/10000] focal loss: 0.354, box loss: 0.929, 1.176s/4-batch (fw: 0.412s, bw: 0.654s), 3.4 im/s, lr: 0.0001
[ 8203/10000] focal loss: 0.354, box loss: 0.899, 1.093s/4-batch (fw: 0.391s, bw: 0.652s), 3.7 im/s, lr: 0.0001
[ 8255/10000] focal loss: 0.434, box loss: 2.358, 1.159s/4-batch (fw: 0.403s, bw: 0.648s), 3.5 im/s, lr: 0.0001
[ 8310/10000] focal loss: 0.359, box loss: 1.237, 1.094s/4-batch (fw: 0.396s, bw: 0.645s), 3.7 im/s, lr: 0.0001
[ 8361/10000] focal loss: 0.357, box loss: 0.735, 1.198s/4-batch (fw: 0.434s, bw: 0.651s), 3.3 im/s, lr: 0.0001
[ 8414/10000] focal loss: 0.354, box loss: 0.787, 1.143s/4-batch (fw: 0.425s, bw: 0.664s), 3.5 im/s, lr: 0.0001
[ 8466/10000] focal loss: 0.357, box loss: 0.731, 1.159s/4-batch (fw: 0.417s, bw: 0.634s), 3.5 im/s, lr: 0.0001
[ 8522/10000] focal loss: 0.377, box loss: 0.574, 1.080s/4-batch (fw: 0.395s, bw: 0.634s), 3.7 im/s, lr: 0.0001
[ 8575/10000] focal loss: 0.378, box loss: 2.198, 1.136s/4-batch (fw: 0.401s, bw: 0.631s), 3.5 im/s, lr: 0.0001
[ 8630/10000] focal loss: 0.395, box loss: 0.847, 1.098s/4-batch (fw: 0.415s, bw: 0.629s), 3.6 im/s, lr: 0.0001
[ 8680/10000] focal loss: 0.423, box loss: 1.142, 1.206s/4-batch (fw: 0.450s, bw: 0.641s), 3.3 im/s, lr: 0.0001
[ 8734/10000] focal loss: 0.341, box loss: 1.229, 1.113s/4-batch (fw: 0.423s, bw: 0.638s), 3.6 im/s, lr: 0.0001
[ 8786/10000] focal loss: 0.350, box loss: 0.521, 1.168s/4-batch (fw: 0.399s, bw: 0.660s), 3.4 im/s, lr: 0.0001
[ 8840/10000] focal loss: 0.349, box loss: 0.756, 1.127s/4-batch (fw: 0.421s, bw: 0.652s), 3.5 im/s, lr: 0.0001
[ 8894/10000] focal loss: 0.395, box loss: 0.733, 1.119s/4-batch (fw: 0.374s, bw: 0.644s), 3.6 im/s, lr: 0.0001
[ 8948/10000] focal loss: 0.376, box loss: 2.057, 1.127s/4-batch (fw: 0.418s, bw: 0.656s), 3.5 im/s, lr: 0.0001
No detections!
[ 9001/10000] focal loss: 0.309, box loss: 1.173, 1.185s/4-batch (fw: 0.411s, bw: 0.630s), 3.4 im/s, lr: 0.0001
[ 9055/10000] focal loss: 0.387, box loss: 1.072, 1.127s/4-batch (fw: 0.439s, bw: 0.635s), 3.6 im/s, lr: 0.0001
[ 9108/10000] focal loss: 0.342, box loss: 1.415, 1.140s/4-batch (fw: 0.438s, bw: 0.647s), 3.5 im/s, lr: 0.0001
[ 9162/10000] focal loss: 0.368, box loss: 0.844, 1.128s/4-batch (fw: 0.377s, bw: 0.648s), 3.5 im/s, lr: 0.0001
[ 9215/10000] focal loss: 0.361, box loss: 0.831, 1.141s/4-batch (fw: 0.435s, bw: 0.651s), 3.5 im/s, lr: 0.0001
[ 9267/10000] focal loss: 0.344, box loss: 1.460, 1.156s/4-batch (fw: 0.397s, bw: 0.650s), 3.5 im/s, lr: 0.0001
[ 9321/10000] focal loss: 0.329, box loss: 1.530, 1.130s/4-batch (fw: 0.422s, bw: 0.655s), 3.5 im/s, lr: 0.0001
[ 9374/10000] focal loss: 0.343, box loss: 1.406, 1.146s/4-batch (fw: 0.396s, bw: 0.643s), 3.5 im/s, lr: 0.0001
[ 9428/10000] focal loss: 0.328, box loss: 0.683, 1.125s/4-batch (fw: 0.421s, bw: 0.653s), 3.6 im/s, lr: 0.0001
[ 9481/10000] focal loss: 0.406, box loss: 1.147, 1.141s/4-batch (fw: 0.413s, bw: 0.623s), 3.5 im/s, lr: 0.0001
[ 9533/10000] focal loss: 0.349, box loss: 0.781, 1.154s/4-batch (fw: 0.435s, bw: 0.665s), 3.5 im/s, lr: 0.0001
[ 9585/10000] focal loss: 0.310, box loss: 1.570, 1.159s/4-batch (fw: 0.405s, bw: 0.644s), 3.5 im/s, lr: 0.0001
[ 9640/10000] focal loss: 0.343, box loss: 1.507, 1.107s/4-batch (fw: 0.423s, bw: 0.630s), 3.6 im/s, lr: 0.0001
[ 9693/10000] focal loss: 0.319, box loss: 2.601, 1.139s/4-batch (fw: 0.414s, bw: 0.618s), 3.5 im/s, lr: 0.0001
[ 9746/10000] focal loss: 0.347, box loss: 1.979, 1.146s/4-batch (fw: 0.427s, bw: 0.665s), 3.5 im/s, lr: 0.0001
[ 9798/10000] focal loss: 0.315, box loss: 1.664, 1.166s/4-batch (fw: 0.416s, bw: 0.641s), 3.4 im/s, lr: 0.0001
[ 9852/10000] focal loss: 0.322, box loss: 0.816, 1.125s/4-batch (fw: 0.422s, bw: 0.649s), 3.6 im/s, lr: 0.0001
[ 9902/10000] focal loss: 0.331, box loss: 0.884, 1.213s/4-batch (fw: 0.426s, bw: 0.672s), 3.3 im/s, lr: 0.0001
[ 9957/10000] focal loss: 0.327, box loss: 2.321, 1.100s/4-batch (fw: 0.431s, bw: 0.614s), 3.6 im/s, lr: 0.0001
[10000/10000] focal loss: 0.327, box loss: 1.680, 1.128s/4-batch (fw: 0.429s, bw: 0.630s), 3.5 im/s, lr: 0.0001

^CTraceback (most recent call last):
  File "/opt/conda/bin/odtk", line 11, in <module>
    load_entry_point('odtk', 'console_scripts', 'odtk')()
  File "/workspace/retinanet/retinanet/main.py", line 245, in main
    torch.multiprocessing.spawn(worker, args=(args, world, model, state), nprocs=world)
  File "/opt/conda/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 200, in spawn
    return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
  File "/opt/conda/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 158, in start_processes
    while not context.join():
  File "/opt/conda/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 78, in join
    timeout=timeout,
  File "/opt/conda/lib/python3.6/multiprocessing/connection.py", line 911, in wait
    ready = selector.select(timeout)
  File "/opt/conda/lib/python3.6/selectors.py", line 376, in select
    fd_event_list = self._poll.poll(timeout)
KeyboardInterrupt

Hi. I'm quite new to using NVIDIA Docker containers and NVIDIA tools.
I tried this example on my dataset, but it seems something is wrong; I believe it's my labels.
I'm trying to detect USA license plate numbers when the plates are rotated and seen in perspective.
I am trying first with a synthetic database.
Here are 2 cases:

  • Non-rotated plates: they are axis aligned. I ran the example with my database and it worked quite well.
  • Augmented plates (rotation, perspective, affine, and noise), generated for each image of the original set. I used imgaug to augment the images and the boxes, converted each box to a polygon to get the rotated polygon, and from that used _corners2rotatedbbox for the annotation.
    In this second case, no matter how small the learning rate, it always says to reduce the lr as training is diverging.

Can anyone give me a hint? Where should I look for the error, or what should I do?
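One check that may help with a pipeline like the one described above: a pure rotation keeps a box rectangular, but shear and perspective transforms generally do not, so the four transformed corners may no longer describe a valid rotated rectangle. The sketch below uses NumPy only and is not taken from ODTK or imgaug; `box_corners`, `apply_affine`, `is_rectangle`, and `rotation_about` are hypothetical helper names introduced for illustration.

```python
import numpy as np

def box_corners(x, y, w, h):
    """Corners of an axis-aligned box, starting at (x, y) and going around."""
    return np.array([[x, y], [x + w, y], [x + w, y + h], [x, y + h]], dtype=float)

def apply_affine(corners, matrix):
    """Apply a 2x3 affine matrix to an (N, 2) array of points."""
    homogeneous = np.hstack([corners, np.ones((len(corners), 1))])
    return homogeneous @ np.asarray(matrix, dtype=float).T

def is_rectangle(corners, tol=1e-6):
    """True if every pair of consecutive edges is perpendicular."""
    edges = np.roll(corners, -1, axis=0) - corners
    for a, b in zip(edges, np.roll(edges, -1, axis=0)):
        if abs(np.dot(a, b)) > tol * np.linalg.norm(a) * np.linalg.norm(b):
            return False
    return True

def rotation_about(cx, cy, angle):
    """2x3 affine matrix rotating by `angle` radians about (cx, cy)."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, cx - c * cx + s * cy],
                     [s,  c, cy - s * cx - c * cy]])

corners = box_corners(100, 50, 200, 60)
rotated = apply_affine(corners, rotation_about(320, 240, np.deg2rad(20)))
sheared = apply_affine(corners, [[1.0, 0.3, 0.0], [0.0, 1.0, 0.0]])

print(is_rectangle(rotated))  # a rotation preserves right angles
print(is_rectangle(sheared))  # a shear does not
```

If a check like this fails for some augmented samples, one option is to refit a rectangle to the transformed corners before writing the annotation, or to restrict the augmentation to rotation, translation, and uniform scaling. Whether this is related to the divergence reported above is not something the sketch can confirm.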

I am also having a problem training against the COCO dataset. I have tried version 20.03 and the latest version of ODTK with the same result. I am training for about 10K iterations.

I have a visualizer for the rotated bounding boxes using TensorBoard, and they look correct. However, I am interpreting the rotation as being about the center of the box, as opposed to the xy min. Also, is the range -pi/4 to pi/4 or -pi/2 to pi/2?
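On the angle range question, a geometric point may be useful regardless of which convention ODTK uses: a w-by-h rectangle rotated by theta about its center covers the same area as an h-by-w rectangle rotated by theta ± pi/2, so any rotated rectangle can be written with an angle in [-pi/4, pi/4) if width and height are allowed to swap. The sketch below assumes a box given as (x_min, y_min, w, h, theta), with (x_min, y_min) the top-left corner before rotation, theta in radians, and rotation about the box center. That layout is an assumption made for illustration and should be checked against the ODTK documentation; `rotated_box_corners` and `canonicalize` are hypothetical names.

```python
import math

def rotated_box_corners(x_min, y_min, w, h, theta):
    """Corners of a w-by-h box rotated by theta (radians) about its center.

    (x_min, y_min) is the top-left corner of the box before rotation.
    """
    cx, cy = x_min + w / 2.0, y_min + h / 2.0
    c, s = math.cos(theta), math.sin(theta)
    offsets = [(-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)]
    return [(cx + c * dx - s * dy, cy + s * dx + c * dy) for dx, dy in offsets]

def canonicalize(x_min, y_min, w, h, theta):
    """Equivalent box with theta in [-pi/4, pi/4), swapping w and h if needed."""
    cx, cy = x_min + w / 2.0, y_min + h / 2.0
    # A rectangle is unchanged by a half turn, so first wrap into [-pi/2, pi/2).
    theta = (theta + math.pi / 2) % math.pi - math.pi / 2
    # A quarter turn is absorbed by swapping width and height.
    if theta >= math.pi / 4:
        theta -= math.pi / 2
        w, h = h, w
    elif theta < -math.pi / 4:
        theta += math.pi / 2
        w, h = h, w
    return cx - w / 2.0, cy - h / 2.0, w, h, theta

box = (10.0, 20.0, 40.0, 10.0, math.radians(60))
canonical = canonicalize(*box)
print(canonical)
print(sorted(rotated_box_corners(*box)))
print(sorted(rotated_box_corners(*canonical)))
```

The two sorted corner lists should agree up to floating-point noise, which gives a quick way to test a visualizer against either angle range.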

Have you made progress since you posted this question?

I was trying to use object detection on a given video, but it is failing; it seems a rotating object in video is hard to detect.

Did you find code for this “Many datasets (for example, COCO and ISPRS) come with segmentation masks. These masks can be converted into rotated bounding boxes by using a geometry package.”?
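For reference, one way such a conversion could look is a minimum-area enclosing rectangle fitted to the mask pixels. The sketch below is NumPy only and is not the code from the blog post or from ODTK. It relies on the known result that a minimum-area enclosing rectangle has one side along an edge of the convex hull. The output layout (cx, cy, w, h, theta), with theta in radians, is an assumption chosen for illustration, and `convex_hull`, `min_area_rect`, and `mask_to_rotated_bbox` are hypothetical names. OpenCV's `cv2.minAreaRect` provides a comparable fit if OpenCV is available.

```python
import numpy as np

def convex_hull(points):
    """Andrew's monotone chain convex hull of an (N, 2) point set."""
    pts = sorted(set(map(tuple, points)))
    if len(pts) <= 2:
        return np.array(pts, dtype=float)

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return np.array(lower[:-1] + upper[:-1], dtype=float)

def min_area_rect(points):
    """Return (cx, cy, w, h, theta) of the smallest enclosing rectangle."""
    hull = convex_hull(points)
    if len(hull) == 0:
        raise ValueError("no points given")
    best = None
    for i in range(len(hull)):
        edge = hull[(i + 1) % len(hull)] - hull[i]
        theta = np.arctan2(edge[1], edge[0])
        c, s = np.cos(theta), np.sin(theta)
        # Rotate the hull by -theta so the current edge is axis aligned.
        aligned = hull @ np.array([[c, -s], [s, c]])
        mins, maxs = aligned.min(axis=0), aligned.max(axis=0)
        w, h = maxs - mins
        if best is None or w * h < best[0]:
            # Rotate the box center back into the original frame.
            center = ((mins + maxs) / 2.0) @ np.array([[c, s], [-s, c]])
            best = (w * h, center[0], center[1], w, h, theta)
    return tuple(float(v) for v in best[1:])

def mask_to_rotated_bbox(mask):
    """Fit a rotated box to the nonzero pixels of a 2D mask."""
    ys, xs = np.nonzero(mask)
    return min_area_rect(np.column_stack([xs, ys]))

# Small example: a 30-by-10 pixel block in a 40-by-40 mask.
mask = np.zeros((40, 40), dtype=np.uint8)
mask[10:20, 5:35] = 1
box = mask_to_rotated_bbox(mask)
print(box)
```

The fit treats pixel indices as points, so the box spans pixel centers rather than pixel edges; converting the result to whatever annotation layout a given toolkit expects is left as a separate step.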