YOLO TensorRT model for sample_object_detector

Hi!
I'm trying to use tensorRT_optimization to generate a TensorRT model from YOLO.
The YOLO .prototxt file is from https://github.com/TLESORT/YOLO-TensorRT-GIE- and its .caffemodel was converted using https://github.com/xingwangsfu/caffe-yolo/.
I successfully generated an optimized.bin model with tensorRT_optimization, using this command line:
./tensorRT_optimization --modelType=caffe --prototxt=/path/to/my/prototxt/file.prototxt --caffemodel=/path/to/my/caffemodel/file.caffemodel --outputBlobs=result

Then I tried to use sample_object_detector with the converted model optimized.bin.
The command line:
./sample_object_detector --tensorRT_model=/path/to/optimized.bin

However, it fails with the following error:

mec-lab@meclab-System-Product-Name:~/LBX/driveworks-1.2/bin$ ./sample_object_detector --tensorRT_model=/home/mec-lab/LBX/driveworks-1.2/tools/dnn/optimized.bin
[19-11-2018 12:54:51] Initialize DriveWorks SDK v1.2.400
[19-11-2018 12:54:51] Release build with GNU 4.8.5 from v1.2.0-rc11-0-ga7f5475
[19-11-2018 12:54:51] Platform: Detected Generic x86 Platform
[19-11-2018 12:54:51] TimeSource: monotonic epoch time offset is 1542597839247350
[19-11-2018 12:54:51] Platform: number of GPU devices detected 1
[19-11-2018 12:54:51] Platform: currently selected GPU device discrete ID 0
[19-11-2018 12:54:51] SDK: Resources mounted from .././data/resources
[19-11-2018 12:54:51] SensorFactory::createSensor() -> camera.virtual, tensorRT_model=/home/mec-lab/LBX/driveworks-1.2/tools/dnn/optimized.bin,video=.././data/samples/sfm/triangulation/video_0.h264
[19-11-2018 12:54:51] CameraNVCUVID: no seek table found at .././data/samples/sfm/triangulation/video_0.h264.seek, seeking is not available.
Camera image: 1280x800
Camera image with 1280x800 at 30 FPS
[19-11-2018 12:54:52] Added linear block of size 12845056
[19-11-2018 12:54:52] Added linear block of size 12845056
[19-11-2018 12:54:52] Added linear block of size 12845056
[19-11-2018 12:54:52] Added linear block of size 1605632
[19-11-2018 12:54:53] Driveworks exception thrown: DW_INVALID_ARGUMENT: blobIndex is larger than output binding count

terminate called after throwing an instance of 'std::runtime_error'
  what():  [2018-11-19 12:54:53] DW Error DW_INVALID_ARGUMENT executing DW function:
 dwDNN_getOutputSize(&m_networkOutputDimensions[1], 1U, m_dnn)
 at /builds/driveav/dw/sdk/samples/dnn/sample_object_detector/main.cpp:271
Aborted (core dumped)

By the way, I didn't train the YOLO model on my own data. Could that be the reason?
Please give me some hints. Thanks!

Hello, is this specific to the DriveOS platform?

Moving to Drive PX2 for better coverage.

Dear NVES
Yes, I moved to the DRIVE PX2 to generate the YOLO TensorRT model, but it shows the same error,
just like the one mentioned here: https://devtalk.nvidia.com/default/topic/1023364/general/how-to-run-sample-code-using-my-trained-data-custom-caffemodel-amp-custom-prototxt-on-host-pc-/

I used the same command for sample_object_detector:
./sample_object_detector --tensorRT_model=/path/to/optimized.bin

By the way, may I ask what “better coverage” means?
Thank you very much for your reply!

Hello,

This seems like a DriveOS PX2 question, so I moved your topic to the PX2 forum where you'll get more visibility from PX2 experts, hence “better coverage” for your question.

Dear NVES
OK! I will move my topic to the PX2 forum. Thank you very much for your help!

FYI, I already moved your topic. No action required.

Dear imugly1029,
Could you please tell us your PDK version? We will look into the issue.

Dear SivaRamaKrishna
I have the NVIDIA DRIVE™ PX 2 AutoChauffeur (P2379) for the PDK and NVIDIA DRIVE 5.0.5.0bL for the SDK.

Dear imugly1029,
It seems the network prototxts are different. Could you please get the caffemodel corresponding to https://github.com/TLESORT/YOLO-TensorRT-GIE-/blob/master/yolo_small_modified.prototxt from the developer, give it a try, and let us know if you run into any issues?

Dear SivaRamaKrishna
So does that mean I should use yolo_small_modified.prototxt and train on my own data to generate my own caffemodel?

However, I have tried the prototxt from https://github.com/xingwangsfu/caffe-yolo/blob/master/prototxt/yolo_small_deploy.prototxt and a caffemodel generated with https://github.com/xingwangsfu/caffe-yolo/blob/master/create_yolo_caffemodel.py using the weights file from
wget http://pjreddie.com/media/files/yolo-small.weights
It still didn't work. I think it's because the weights file doesn't correspond to the prototxt, right?

So I came up with an idea: I can take that weights file and generate its matching prototxt and caffemodel using https://github.com/xingwangsfu/caffe-yolo/blob/master/create_yolo_prototxt.py and https://github.com/xingwangsfu/caffe-yolo/blob/master/create_yolo_caffemodel.py.

I will reply if I run into any issues. Many thanks for your help!

@imugly1029
You are using DriveWorks 1.2.
If your network has only one output, you must change every part of the code that uses the m_networkOutputDimensions variable to a single dimension.
I hope this helps.

Dear bcchoi
Yes, I found that problem while tracing the sample_object_detector code. Originally, the variable m_networkOutputDimensions is an array with two elements:

dwBlobSize m_networkOutputDimensions[2];

and so are m_totalSizesOutput, *m_dnnOutputsDevice, and m_dnnOutputsHost:

float32_t *m_dnnOutputsDevice[2];
std::unique_ptr<float32_t[]> m_dnnOutputsHost[2];
uint32_t m_totalSizesOutput[2];

Following your suggestion, I tried to reduce them to single variables:

dwBlobSize m_networkOutputDimensions;
float32_t *m_dnnOutputsDevice;
std::unique_ptr<float32_t[]> m_dnnOutputsHost;
uint32_t m_totalSizesOutput;

But I found something interesting in the last line of the function onProcess():

interpretOutput(m_dnnOutputsHost[m_cvgIdx].get(), m_dnnOutputsHost[m_bboxIdx].get(),
                        &m_detectionRegion);

It seems m_dnnOutputsHost is indexed by the variable m_cvgIdx, and I am not sure about the purpose of that variable. Is it something like a confidence score? Also, the function interpretOutput takes three input parameters. Following your suggestion, I should modify all of them, right?
So I really want to know the purpose of the variable m_cvgIdx. Do you have any idea?

Dear SivaRamaKrishna
As I mentioned above, I tried using the pre-trained Darknet19 448x448 cfg and weights files from https://pjreddie.com/darknet/imagenet/#extraction to generate their own prototxt and caffemodel,
and they can be converted to a TensorRT model successfully using tensorRT_optimization:

mec-lab@meclab-System-Product-Name:~/LBX/driveworks-1.2/tools/dnn$ ./tensorRT_optimization --outputBlobs=prob --out=yolo_trt.bin --modelType=caffe --caffemodel=/home/mec-lab/LBX/caffe-yolo/yolo_v1.caffemodel --prototxt=/home/mec-lab/LBX/caffe-yolo/prototxt/yolo_v1.prototxt
Initializing network optimizer on model /home/mec-lab/LBX/caffe-yolo/prototxt/yolo_v1.prototxt with weights from /home/mec-lab/LBX/caffe-yolo/yolo_v1.caffemodel
Input "data": 3x448x448
Output "prob": 1000x1x1
Iteration 0: 5.1801 ms.
Iteration 1: 5.20602 ms.
Iteration 2: 5.21466 ms.
Iteration 3: 5.67603 ms.
Iteration 4: 5.39424 ms.
Iteration 5: 5.22403 ms.
Iteration 6: 5.68115 ms.
Iteration 7: 5.38576 ms.
Iteration 8: 5.20051 ms.
Iteration 9: 5.21114 ms.
Average over 10 runs is 5.33736 ms.

However, the same error shows up when running sample_object_detector:

mec-lab@meclab-System-Product-Name:~/LBX/driveworks-1.2/bin$ ./sample_object_detector --tensorRT_model=/home/mec-lab/LBX/driveworks-1.2/tools/dnn/yolo_trt.bin
[23-11-2018 11:30:15] Initialize DriveWorks SDK v1.2.400
[23-11-2018 11:30:15] Release build with GNU 4.8.5 from v1.2.0-rc11-0-ga7f5475
[23-11-2018 11:30:15] Platform: Detected Generic x86 Platform
[23-11-2018 11:30:15] TimeSource: monotonic epoch time offset is 1542936490120149
[23-11-2018 11:30:15] Platform: number of GPU devices detected 1
[23-11-2018 11:30:15] Platform: currently selected GPU device discrete ID 0
[23-11-2018 11:30:15] SDK: Resources mounted from .././data/resources
[23-11-2018 11:30:15] SensorFactory::createSensor() -> camera.virtual, tensorRT_model=/home/mec-lab/LBX/driveworks-1.2/tools/dnn/yolo_trt.bin,video=.././data/samples/sfm/triangulation/video_0.h264
[23-11-2018 11:30:15] CameraNVCUVID: no seek table found at .././data/samples/sfm/triangulation/video_0.h264.seek, seeking is not available.
Camera image: 1280x800
Camera image with 1280x800 at 30 FPS
[23-11-2018 11:30:15] Added linear block of size 25690112
[23-11-2018 11:30:15] Added linear block of size 6422528
[23-11-2018 11:30:15] Added linear block of size 903168
[23-11-2018 11:30:17] DNN: Missing or incompatible parameter in metadata (ignoreAspectRatio). Parameter is set to default value.
[23-11-2018 11:30:17] DNN: Missing or incompatible parameter in metadata (doPerPlaneMeanNormalization). Parameter is set to default value.
[23-11-2018 11:30:17] DNN: Missing or incompatible parameter in metadata (tonemapType). Parameter is set to default value.
[23-11-2018 11:30:17] Driveworks exception thrown: DW_INVALID_ARGUMENT: blobIndex is larger than output binding count

terminate called after throwing an instance of 'std::runtime_error'
  what():  [2018-11-23 11:30:17] DW Error DW_INVALID_ARGUMENT executing DW function:
 dwDNN_getOutputSize(&m_networkOutputDimensions[1], 1U, m_dnn)
 at /builds/driveav/dw/sdk/samples/dnn/sample_object_detector/main.cpp:271
Aborted (core dumped)

I think it may be related to what bcchoi mentioned above…

Dear imugly1029,
The sample_object_detector expects your network to have two outputs: coverage and bounding boxes.
Can you confirm that you have set the correct blob names corresponding to these two? Also, can you double-check whether the network input and output sizes in the code match the prototxt?

Dear imugly1029,
Is this issue resolved?

Dear SivaRamaKrishna
So sorry for replying so late; I have been busy with other work, so I haven't tried it yet.
By the way, does confirming the correct two blob names mean confirming them in sample_object_detector?

// Get coverage and bounding box blob indices
const char *coverageBlobName = "coverage";
const char *boundingBoxBlobName = "bboxes";
CHECK_DW_ERROR(dwDNN_getOutputIndex(&m_cvgIdx, coverageBlobName, m_dnn));
CHECK_DW_ERROR(dwDNN_getOutputIndex(&m_bboxIdx, boundingBoxBlobName, m_dnn));

or in the .prototxt file?

And where should I check whether the input and output sizes match?

Dear SivaRamaKrishna
Is there any new hint?
I really don't know where to start with the modification.

I solved this problem using the approach bcchoi mentioned.
The YOLO network I used has only one output blob, and its size is 1 x 1 x 1470.
However, sample_object_detector expects the network to have two outputs,
so I changed all the arrays that correspond to two outputs to one.
Moreover, I commented out the function nonMaxSuppression and modified the function interpretOutput so that I can read the data stored in YOLO's output blob and use it to draw bounding boxes.