YOLO Object Detection Plugin for DeepStream 2.0

Hi all, we have released a new sample plugin for DeepStream 2.0 performing YOLO (You Only Look Once) object detection, accelerated with TensorRT. Supported models include YOLO v2 & v3.

Find the sources here: https://github.com/vat-nvidia/deepstream-plugins

Hi.
I am trying to use my yolov3 model with the plugin.

I modified it as follows.

#define MODEL_V3

const uint OUTPUT_CLASSES = 1;
const std::vector<std::string> CLASS_NAMES = {"person"};

const float PROB_THRESH = 0.5f;
const float NMS_THRESH = 0.5f;

When I test it, the accuracy is lower than 50% (when I tested my yolov3 model with darknet, the accuracy was higher than 95%).

Is there any parameter that I haven’t modified?

Hi dusty,

Did you try running yolov2-tiny using this code? When I use the cfg and associated weights file from pjreddie’s website the code runs without any problems (after also changing the kOUTPUT_BLOB_NAME variable to region_16 instead of region_32). For the rest I guess everything remains the same. The downsampling factor of 32 and anchor boxes are the same for both models.

However, it does not detect anything on the data/dog.jpg example. The original darknet code using this cfg and weights file does output 3 objects for this image.

After some investigation I noticed that yolov2-tiny contains a maxpool layer with size=2 and stride=1, so there is no downsampling after this layer. However, because the default padding option for the MaxPool layer is 0, the output size of this layer becomes 12x12 instead of 13x13.

Any idea how to correctly handle this situation?
Without padding the resulting dimensions are 12x12. With padding on both sides we get 14x14. We should be ending up with 13x13 though…
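
For reference, these numbers follow directly from the usual pooling output-size formula, out = (in + pad_total - kernel) / stride + 1. A tiny standalone check (illustrative only, not part of the plugin code):

#include <cstdio>

// out = floor((in + padTotal - kernel) / stride) + 1
int poolOut(int in, int kernel, int stride, int padTotal)
{
    return (in + padTotal - kernel) / stride + 1;
}

int main()
{
    printf("no padding:        %d\n", poolOut(13, 2, 1, 0)); // 12 -> one too small
    printf("pad on both sides: %d\n", poolOut(13, 2, 1, 2)); // 14 -> one too large
    printf("pad on one side:   %d\n", poolOut(13, 2, 1, 1)); // 13 -> what we want
    return 0;
}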


Adding an imaginary row and column filled with -FLT_MAX values should do the trick, but how can this be done in TensorRT?

Update: never mind, I have figured it out myself. If I find some time later on I can make a pull request on GitHub.

Hi Beerend,
How did you solve the max-pooling padding problem for tiny yolov3?
I don’t want to add a max-pooling plugin. :(

Hi wtiandong,

I had implemented a workaround using a padding layer, but it was actually not 100% correct. However, this problem has since been fixed in a better way in the original repository, so I now follow that approach.

Here is roughly what you need to change.

trt_utils.h defines a class that computes the padding sizes for MaxPool layers:

class YoloTinyMaxpoolPaddingFormula : public nvinfer1::IOutputDimensionsFormula

You can then give your network a pointer to a YoloTinyMaxpoolPaddingFormula instance, which will be used to compute the correct padding for every MaxPool layer added to the network.
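
If you haven't looked at the header yet, the class is roughly shaped like this. This is a sketch only: the compute() signature comes from TensorRT's IOutputDimensionsFormula interface, addSamePaddingLayer is the method used further down, and the member name m_SamePaddingLayers is just my placeholder, so details may differ from the actual trt_utils.h.

#include <set>
#include <string>
#include "NvInfer.h"

class YoloTinyMaxpoolPaddingFormula : public nvinfer1::IOutputDimensionsFormula
{
public:
    // Mark a maxpool layer (by name) as needing "same" padding.
    void addSamePaddingLayer(const std::string& layerName)
    {
        m_SamePaddingLayers.insert(layerName);
    }

private:
    // Called by TensorRT for every pooling layer added to the network.
    nvinfer1::DimsHW compute(nvinfer1::DimsHW inputDims, nvinfer1::DimsHW kernelSize,
                             nvinfer1::DimsHW stride, nvinfer1::DimsHW padding,
                             nvinfer1::DimsHW dilation, const char* layerName) const override;

    std::set<std::string> m_SamePaddingLayers;
};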

yolo.cpp

network->setPoolingOutputDimensionsFormula(m_TinyMaxpoolPaddingFormula.get());

Basically, it uses “valid” padding for all layers except the ones that are explicitly marked by name as requiring “same” padding.
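
The compute() override implements exactly that rule. Roughly (again a sketch under the same assumptions, not a copy of the repository code):

#include <cassert>

nvinfer1::DimsHW YoloTinyMaxpoolPaddingFormula::compute(
    nvinfer1::DimsHW inputDims, nvinfer1::DimsHW kernelSize, nvinfer1::DimsHW stride,
    nvinfer1::DimsHW padding, nvinfer1::DimsHW dilation, const char* layerName) const
{
    (void) dilation;                        // dilation is not used for pooling
    assert(inputDims.h() == inputDims.w()); // square inputs only, see the note below

    if (m_SamePaddingLayers.find(layerName) != m_SamePaddingLayers.end())
    {
        // "same" padding: output size = ceil(input / stride), so 13x13 stays 13x13
        int out = (inputDims.h() + stride.h() - 1) / stride.h();
        return nvinfer1::DimsHW(out, out);
    }
    // "valid" padding: the usual floor formula
    int out = (inputDims.h() + 2 * padding.h() - kernelSize.h()) / stride.h() + 1;
    return nvinfer1::DimsHW(out, out);
}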

yolo.cpp

else if (m_configBlocks.at(i).at("type") == "maxpool")
{
    // Add same padding layers
    if (m_configBlocks.at(i).at("size") == "2" && m_configBlocks.at(i).at("stride") == "1")
    {
        m_TinyMaxpoolPaddingFormula->addSamePaddingLayer("maxpool_" + std::to_string(i));
    }
    std::string inputVol = dimsToString(previous->getDimensions());
    nvinfer1::ILayer* out = netAddMaxpool(i, m_configBlocks.at(i), previous, network);
    previous = out->getOutput(0);
    assert(previous != nullptr);
    std::string outputVol = dimsToString(previous->getDimensions());
    tensorOutputs.push_back(out->getOutput(0));
    printLayerInfo(layerIndex, "maxpool", inputVol, outputVol, std::to_string(weightPtr));
}

Note that the formulas will only work for square network inputs. If your network input is rectangular, you need to make some additional minor changes. It is best to run this code in debug mode, because there are some assertions that will warn you when things go wrong.

Hi Beerend,
That’s really nice of you. Thanks, I will give it a try. :)