Hardware Platform (Jetson / GPU)
GPU
DeepStream Version
nvcr.io/nvidia/deepstream:6.0-devel
NVIDIA GPU Driver Version (valid for GPU only)
NVIDIA-SMI 495.46 Driver Version: 495.46 CUDA Version: 11.5
Issue Type( questions, new requirements, bugs)
I really like the TAO workflow, so I have been trying to replicate the Darknet YoloV4 results for the COCO dataset with it. So far I have been unable to match Darknet in terms of accuracy (mAP): my results are consistently lower.
Given the resources at your disposal, are you able to produce a training spec (with bonus points for an official model on NVIDIA NGC) that produces per-class accuracy similar to the figures below? These were calculated by running the official Darknet yolov4.weights and yolov4.cfg against the COCO2017 validation set (5000 images). I am sure this would be very helpful as a starting point for training custom YoloV4 models.
class_id = 0, name = person, ap = 79.29% (TP = 7956, FP = 3157)
class_id = 1, name = bicycle, ap = 60.28% (TP = 173, FP = 94)
class_id = 2, name = car, ap = 68.93% (TP = 1290, FP = 703)
class_id = 3, name = motorcycle, ap = 74.49% (TP = 266, FP = 134)
class_id = 4, name = airplane, ap = 90.64% (TP = 124, FP = 26)
class_id = 5, name = bus, ap = 84.81% (TP = 221, FP = 56)
class_id = 6, name = train, ap = 93.01% (TP = 168, FP = 37)
class_id = 7, name = truck, ap = 61.90% (TP = 253, FP = 214)
class_id = 8, name = boat, ap = 54.23% (TP = 223, FP = 132)
class_id = 9, name = traffic light, ap = 55.11% (TP = 371, FP = 216)
class_id = 10, name = fire hydrant, ap = 89.23% (TP = 86, FP = 11)
class_id = 11, name = stop sign, ap = 77.69% (TP = 56, FP = 19)
class_id = 12, name = parking meter, ap = 68.42% (TP = 38, FP = 14)
class_id = 13, name = bench, ap = 43.16% (TP = 178, FP = 195)
class_id = 14, name = bird, ap = 53.50% (TP = 223, FP = 102)
class_id = 15, name = cat, ap = 90.56% (TP = 167, FP = 52)
class_id = 16, name = dog, ap = 82.53% (TP = 178, FP = 65)
class_id = 17, name = horse, ap = 85.73% (TP = 226, FP = 70)
class_id = 18, name = sheep, ap = 78.52% (TP = 287, FP = 136)
class_id = 19, name = cow, ap = 80.75% (TP = 287, FP = 93)
class_id = 20, name = elephant, ap = 87.41% (TP = 228, FP = 64)
class_id = 21, name = bear, ap = 92.45% (TP = 62, FP = 5)
class_id = 22, name = zebra, ap = 91.89% (TP = 226, FP = 41)
class_id = 23, name = giraffe, ap = 93.04% (TP = 206, FP = 33)
class_id = 24, name = backpack, ap = 33.61% (TP = 132, FP = 189)
class_id = 25, name = umbrella, ap = 69.18% (TP = 283, FP = 163)
class_id = 26, name = handbag, ap = 33.49% (TP = 196, FP = 262)
class_id = 27, name = tie, ap = 57.87% (TP = 140, FP = 74)
class_id = 28, name = suitcase, ap = 71.06% (TP = 201, FP = 112)
class_id = 29, name = frisbee, ap = 88.07% (TP = 99, FP = 34)
class_id = 30, name = skis, ap = 51.67% (TP = 118, FP = 73)
class_id = 31, name = snowboard, ap = 56.31% (TP = 39, FP = 23)
class_id = 32, name = sports ball, ap = 62.70% (TP = 168, FP = 87)
class_id = 33, name = kite, ap = 67.57% (TP = 218, FP = 135)
class_id = 34, name = baseball bat, ap = 61.75% (TP = 83, FP = 36)
class_id = 35, name = baseball glove, ap = 65.70% (TP = 95, FP = 44)
class_id = 36, name = skateboard, ap = 79.67% (TP = 142, FP = 33)
class_id = 37, name = surfboard, ap = 63.34% (TP = 163, FP = 83)
class_id = 38, name = tennis racket, ap = 85.23% (TP = 188, FP = 64)
class_id = 39, name = bottle, ap = 58.25% (TP = 583, FP = 424)
class_id = 40, name = wine glass, ap = 58.73% (TP = 180, FP = 112)
class_id = 41, name = cup, ap = 64.70% (TP = 567, FP = 425)
class_id = 42, name = fork, ap = 59.49% (TP = 117, FP = 93)
class_id = 43, name = knife, ap = 35.51% (TP = 107, FP = 113)
class_id = 44, name = spoon, ap = 36.94% (TP = 89, FP = 139)
class_id = 45, name = bowl, ap = 61.86% (TP = 382, FP = 320)
class_id = 46, name = banana, ap = 43.44% (TP = 152, FP = 144)
class_id = 47, name = apple, ap = 29.17% (TP = 83, FP = 115)
class_id = 48, name = sandwich, ap = 57.77% (TP = 97, FP = 84)
class_id = 49, name = orange, ap = 40.90% (TP = 139, FP = 173)
class_id = 50, name = broccoli, ap = 45.10% (TP = 139, FP = 156)
class_id = 51, name = carrot, ap = 35.05% (TP = 162, FP = 275)
class_id = 52, name = hot dog, ap = 54.20% (TP = 60, FP = 36)
class_id = 53, name = pizza, ap = 73.71% (TP = 207, FP = 95)
class_id = 54, name = donut, ap = 62.85% (TP = 222, FP = 154)
class_id = 55, name = cake, ap = 62.36% (TP = 188, FP = 126)
class_id = 56, name = chair, ap = 56.48% (TP = 998, FP = 835)
class_id = 57, name = couch, ap = 65.76% (TP = 165, FP = 125)
class_id = 58, name = potted plant, ap = 52.67% (TP = 192, FP = 198)
class_id = 59, name = bed, ap = 72.57% (TP = 113, FP = 52)
class_id = 60, name = dining table, ap = 47.17% (TP = 368, FP = 401)
class_id = 61, name = toilet, ap = 85.77% (TP = 150, FP = 42)
class_id = 62, name = tv, ap = 83.08% (TP = 230, FP = 82)
class_id = 63, name = laptop, ap = 80.98% (TP = 180, FP = 74)
class_id = 64, name = mouse, ap = 82.85% (TP = 85, FP = 34)
class_id = 65, name = remote, ap = 60.85% (TP = 166, FP = 115)
class_id = 66, name = keyboard, ap = 76.71% (TP = 115, FP = 70)
class_id = 67, name = cell phone, ap = 62.18% (TP = 165, FP = 97)
class_id = 68, name = microwave, ap = 77.63% (TP = 44, FP = 22)
class_id = 69, name = oven, ap = 65.43% (TP = 90, FP = 65)
class_id = 70, name = toaster, ap = 60.70% (TP = 5, FP = 5)
class_id = 71, name = sink, ap = 65.99% (TP = 148, FP = 80)
class_id = 72, name = refrigerator, ap = 81.52% (TP = 100, FP = 47)
class_id = 73, name = book, ap = 26.10% (TP = 298, FP = 378)
class_id = 74, name = clock, ap = 73.27% (TP = 200, FP = 77)
class_id = 75, name = vase, ap = 58.27% (TP = 175, FP = 153)
class_id = 76, name = scissors, ap = 51.90% (TP = 17, FP = 8)
class_id = 77, name = teddy bear, ap = 71.03% (TP = 134, FP = 63)
class_id = 78, name = hair drier, ap = 7.12% (TP = 1, FP = 3)
class_id = 79, name = toothbrush, ap = 40.25% (TP = 27, FP = 28)
for conf_thresh = 0.25, precision = 0.69, recall = 0.65, F1-score = 0.67
for conf_thresh = 0.25, TP = 24077, FP = 10831, FN = 12704, average IoU = 56.97 %
IoU threshold = 50 %, used Area-Under-Curve for each unique Recall
mean average precision (mAP@0.50) = 0.703672, or 70.37 %
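
For anyone who wants to compare these numbers against their own evaluation output, here is a minimal Python sketch. It parses one per-class line in the format pasted above and recomputes precision, recall and F1 from the TP/FP/FN totals using the standard definitions. The function names `parse_class_line` and `precision_recall_f1` are my own (hypothetical helpers, not part of Darknet or TAO), and the regular expression assumes the exact line layout shown in this post.

```python
import re

# Assumed layout of one per-class line, copied from the output in this post:
# "class_id = 0, name = person, ap = 79.29% (TP = 7956, FP = 3157)"
CLASS_LINE = re.compile(
    r"class_id = (\d+), name = (.+?), ap = ([\d.]+)%\s*\(TP = (\d+), FP = (\d+)\)"
)


def parse_class_line(line):
    """Return a dict for one per-class line, or None if the line does not match."""
    match = CLASS_LINE.search(line)
    if match is None:
        return None
    return {
        "class_id": int(match.group(1)),
        "name": match.group(2),
        "ap": float(match.group(3)),  # AP in percent, as printed
        "tp": int(match.group(4)),
        "fp": int(match.group(5)),
    }


def precision_recall_f1(tp, fp, fn):
    """Standard precision, recall and F1 from raw detection counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    row = parse_class_line(
        "class_id = 52, name = hot dog, ap = 54.20% (TP = 60, FP = 36)"
    )
    print(row)

    # Totals reported above for conf_thresh = 0.25
    p, r, f1 = precision_recall_f1(tp=24077, fp=10831, fn=12704)
    print(f"precision = {p:.2f}, recall = {r:.2f}, F1-score = {f1:.2f}")
```

With the totals above this reproduces the reported precision = 0.69, recall = 0.65 and F1-score = 0.67. Parsing each class into a dict should make it easier to line up per-class AP from two evaluation runs by `class_id`.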