Yolo for Jetson

Question for folks who upgraded to JetPack 3.3: Are you still able to run the original models included in DeepStream SDK 1.5 after the JetPack upgrade? I’ve just upgraded TensorRT to 4.0 on our TX2 system, and while I can run the YOLO plugin, the original SDK 1.5 models are segfaulting now.

Just curious if the models included in SDK 1.5 can be run with TensorRT 4 at all… I think someone mentioned that they are not compatible?

-albertr

@albertr

  1. The nvvidconv plugin is available in DS 1.5 for Tegra platforms.

  2. Clear the GStreamer cache using

$ sudo rm -rf ${HOME}/.cache/*

Then clear all the ‘.cache’ files in the /home/nvidia/Model/ directory and all of its subdirectories.

Run the nvgstiva-app again and the default models should work.

Thank you for the suggestion on removing the model caches, that did it!

I’ll try looking into color space conversion next.

-albertr

Trying with multiple batch sizes.

I tried all batch_size values by changing the parameter.

But when I supply fewer images than the batch_size while testing with trt-yolo-app, the code crashes.

I want to test with fewer images than batch_size. Can it be done?

I am now able to get JetPack 3.3 working with DeepStream 1.5, thanks to the suggestion in Comment #22 by NvCJR. Removing the Model .cache files was the missing step. Thanks again!

@prince15046 Yes, it is possible. The app is only designed to serve as a reference for enabling YOLO in DeepStream / TensorRT. You can always modify it according to your requirements. You can start by modifying this line of code - https://github.com/vat-nvidia/deepstream-plugins/blob/master/sources/apps/trt-yolo/trt-yolo-app.cpp#L80

That line trims the number of inference images down to a multiple of the batch size, so in your case it would make the number of inference images 0. Keep in mind that the TensorRT context also requires the number of images being used in each forward pass; right now the batch size is being used for this value. Reference - https://github.com/vat-nvidia/deepstream-plugins/blob/master/sources/lib/yolov3.cpp#L76
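As a rough illustration only (this is not code from the repo; runInference, loadBatch, context and buffers are placeholders), handling a partial final batch instead of trimming the list could look something like this:

#include <algorithm>
#include <string>
#include <vector>

#include "NvInfer.h"

// Sketch: iterate over the image list in batches and pass the real number of
// images in each batch to the TensorRT execution context, so the last batch
// may be smaller than batchSize.
void runInference(nvinfer1::IExecutionContext* context, void** buffers,
                  const std::vector<std::string>& imageList, int batchSize)
{
    const int numImages = static_cast<int>(imageList.size());
    for (int start = 0; start < numImages; start += batchSize)
    {
        // The last batch may contain fewer images than batchSize.
        const int curBatchSize = std::min(batchSize, numImages - start);
        // loadBatch(...) is a hypothetical helper that copies curBatchSize
        // preprocessed images into the device input buffer.
        // loadBatch(imageList, start, curBatchSize, buffers);
        // Run the forward pass with the actual number of images in this batch.
        context->execute(curBatchSize, buffers);
    }
}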

@NvCJR

I am facing an issue with the results of the new weights and cfg for Tiny V3.

Files changed -
network_config.cpp
cfg file
weights file

After training for a good amount of time, the model still predicts multiple false positives.
Changes so far - masks adjusted according to the data, anchors, image size (increased to 800).

Any other changes that you would suggest ?

#ifdef MODEL_V3_TINY

const float kPROB_THRESH = 0.5f;
const float kNMS_THRESH = 0.5f;
const std::string kYOLO_CONFIG_PATH = "data/yolov3-tiny.cfg";
const std::string kTRAINED_WEIGHTS_PATH = "data/yolov3-tiny.weights";
const std::string kNETWORK_TYPE = "yolov3-tiny";
const std::string kCALIB_TABLE_PATH = kDS_LIB_PATH + "calibration/yolov3-tiny-calibration.table";
const uint kBBOXES = 3;
const std::vector<float> kANCHORS
    = {10.0, 14.0, 23.0, 27.0, 37.0, 58.0, 81.0, 82.0, 135.0, 169.0, 344.0, 319.0};
#endif

// Model V3 Tiny specific common global vars
namespace yoloV3Tiny
{
const uint kSTRIDE_1 = 32;
const uint kSTRIDE_2 = 16;
const uint kGRID_SIZE_1 = kINPUT_H / kSTRIDE_1;
const uint kGRID_SIZE_2 = kINPUT_H / kSTRIDE_2;
const uint64_t kOUTPUT_SIZE_1 = kGRID_SIZE_1 * kGRID_SIZE_1 * (kBBOXES * (5 + kOUTPUT_CLASSES));
const uint64_t kOUTPUT_SIZE_2 = kGRID_SIZE_2 * kGRID_SIZE_2 * (kBBOXES * (5 + kOUTPUT_CLASSES));
const std::vector<int> kMASK_1 = {3, 4, 5};
const std::vector<int> kMASK_2 = {0, 1, 2};
const std::string kOUTPUT_BLOB_NAME_1 = "yolo_17";
const std::string kOUTPUT_BLOB_NAME_2 = "yolo_24";
} // namespace yoloV3Tiny
} // namespace config

Resolved. The model is able to run inference properly now. The fix was a correction in decodeTensor to load the different mask values for each output head.
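Roughly, the change was along these lines (a sketch from memory, not the exact decodeTensor code; maskForTensor is just my own illustration built on the config constants above):

#include <string>
#include <vector>

// Sketch only: pick the mask set that matches the output tensor being decoded,
// so the stride-32 head uses kMASK_1 = {3, 4, 5} and the stride-16 head uses
// kMASK_2 = {0, 1, 2}.
const std::vector<int>& maskForTensor(const std::string& tensorName)
{
    return (tensorName == config::yoloV3Tiny::kOUTPUT_BLOB_NAME_1)
               ? config::yoloV3Tiny::kMASK_1
               : config::yoloV3Tiny::kMASK_2;
}

// The anchors for box b in a head are then kANCHORS[mask[b] * 2] (width)
// and kANCHORS[mask[b] * 2 + 1] (height).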

I think there’s a bug in the plugin code, in the function that parses the header of the input Darknet weights file.

sources/lib/trt_utils.cpp:loadWeights()

The code skips 4 or 5 int32 values (16 or 20 bytes) at the beginning of the weights file, depending on the network type.

if (networkType == "yolov2")
{
    // Remove 4 int32 bytes of data from the stream belonging to the header
    file.ignore(4 * 4);
}
else if ((networkType == "yolov3") || (networkType == "yolov3-tiny")
         || (networkType == "yolov2-tiny"))
{
    // Remove 5 int32 bytes of data from the stream belonging to the header
    file.ignore(4 * 5);
}

The proposed correction: the number of bytes to skip should be based on the major and minor version numbers, which are stored in the first two int32 values, as in load_weights_upto() in the Darknet GitHub parser.c:

fread(&major, sizeof(int), 1, fp);
fread(&minor, sizeof(int), 1, fp);
fread(&revision, sizeof(int), 1, fp);
if ((major*10 + minor) >= 2 && major < 1000 && minor < 1000){
    fread(net->seen, sizeof(size_t), 1, fp);
} else {
    int iseen = 0;
    fread(&iseen, sizeof(int), 1, fp);
    *net->seen = iseen;
}
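Applied to the plugin, the change would look roughly like this (a sketch only; file is the std::ifstream already open in loadWeights(), and this assumes the weights were written with a 64-bit size_t):

int major = 0, minor = 0, revision = 0;
file.read(reinterpret_cast<char*>(&major), sizeof(int));
file.read(reinterpret_cast<char*>(&minor), sizeof(int));
file.read(reinterpret_cast<char*>(&revision), sizeof(int));
if ((major * 10 + minor) >= 2 && major < 1000 && minor < 1000)
{
    // Newer weights files store the "seen" counter as a size_t (8 bytes).
    file.ignore(8);
}
else
{
    // Older weights files store the "seen" counter as an int (4 bytes).
    file.ignore(4);
}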

Could someone please double check my proposed change?
Thanks.

Hi NvCJR,

What exactly does a “YOLORegionPlugin” do?

I was looking at the decodeYoloOutput function and found this part a bit peculiar:

float prob = detections[bbindex + numGridCells * (b * (5 + numOutputClasses) + (5 + i))];

I would expect the class probability vector to be contiguous in memory, so that I could use std::max_element instead of a manual for loop to find the largest probability; this layout surprised me.
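For reference, this is how I currently read the layout (an assumption on my part; argmaxClass below is just my own illustration, not code from the plugin): the tensor seems to be stored channel-major per head, so consecutive class probabilities for one grid cell are numGridCells apart in memory.

// Illustration only, assuming a channel-major layout where
// index = channel * numGridCells + gridCell. The class-i probability for box b
// at grid cell bbindex then sits numGridCells away from class i+1, so a plain
// std::max_element over a contiguous range cannot be used.
inline int argmaxClass(const float* detections, int bbindex, int b,
                       int numGridCells, int numOutputClasses)
{
    int bestClass = 0;
    float bestProb = -1.0f;
    for (int i = 0; i < numOutputClasses; ++i)
    {
        const float prob
            = detections[bbindex + numGridCells * (b * (5 + numOutputClasses) + (5 + i))];
        if (prob > bestProb)
        {
            bestProb = prob;
            bestClass = i;
        }
    }
    return bestClass;
}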

Also it would be interesting if you could make these plugin layers open-source in another repository. Any chance of this happening?

NvCJR: It’s nice that the YOLO plugin runs outside of DeepStream as well (trt-yolo-app), using just TensorRT.

Any plans of providing a Python variant of this independent app?

Thanks.

One more question on the YOLO plugin, this time on the implementation.

The inference happens on the GPU, and the resulting bounding boxes are copied over to the CPU, where the rest of the processing happens (non-max suppression, probability filtering, IoU).

Wouldn’t it have been better to do these post processing steps also on the GPU?

I would like to run the trt-yolo-app, without DeepStream, on a Tesla platform.

The README file of the plugin makes mention of this use case:

“To use just the stand alone trt-yolo-app, Deepstream Installation can be skipped. However CUDA 9.2 and TensorRT 4.x should be installed.”

I would appreciate some more instructions on making this work, i.e. what exactly must be installed manually, and which Makefiles must be modified?

Thanks

@Beerend,
The YOLO v3 region layer is open-sourced and available in the same GitHub repo - https://github.com/vat-nvidia/deepstream-plugins

@prashanth.bhat,

  1. No python support is planned right now.

  2. Yes, the postprocessing can be further optimized by doing NMS on the GPU.

  3. You can either try CUDA 9.0 and TensorRT 4, or CUDA 10.0 and TensorRT 5. You don’t have to make any changes to the Makefile, but you will have to update Makefile.config with the install paths for OpenCV and TensorRT. I would suggest you go ahead with TensorRT 5 and CUDA 10. Please refer to the following bug for a minor change in code you would need to make for TensorRT 5 - https://devtalk.nvidia.com/default/topic/1043214/general/trt-for-yolov3-fp16-and-int8-optimization-failed/post/

NvCJR: thanks for replying. Regarding the last point - running trt-yolo-app without DeepStream - I haven’t been successful with this; perhaps it needs some changes. Have you or someone at NVIDIA actually tried this out on a setup without DeepStream installed?
I got error messages that DeepStream is required.

Thanks.

@prashanth.bhat Please create a new topic with more details regarding your setup and error messages so that we can help you out. The standalone implementation has been tested on both Tesla and Tegra devices and it is expected to work without any issues.

Thanks, NvCJR. I created a new topic in the “DeepStream for Tesla” forum.

Why can the deepstream-yolo app only be used on Tesla, but not Tegra?
And when will you provide a new DeepStream version for the TX2 that supports deepstream-yolo?

Hi NvCJR:

I’m using JetPack 3.3 and DeepStream 1.5 on a TX2, and I have run trt-yolo-app successfully. Now I want to use the deepstream-yolo application. I did as the README says, and my DeepStream sample application is working fine. But when I execute make && sudo make install in the source/apps/deepstream_yolo directory, it seems I’m missing the file named gstnvivameta_api.h. I searched my computer and it turns out that I don’t have that file. Did I miss some dependency installations? By the way, I’m using the code from the DS32 release, not the code from the original GitHub page given.

Please clarify.

Thanks a million!

Hi 545766231,

You could refer to the topic below to see if it helps; if not, please file a new topic with the log.
https://devtalk.nvidia.com/default/topic/1046293/jetson-tx2/makefile-deepstream-example-on-jetson-tx2/post/5309363/#5309363

Thanks

Ok, Thanks for replying. I’ll check that.