re : jetson nano start kit : hello world ai : detectnet failure

anton.reinhardt · July 21, 2019, 8:48am

Hi

Disclaimer: I am an utter rookie in this wonderful technology/science, inspired by the availability of the hardware (Nano) and the basic software tutorials from dusty-nv. At least I am reasonably fluent with Ubuntu as an OS and with C++ (Qt).

The entire setup is pretty easy, since a lot is already prepackaged in the SD card image. I downloaded the SD card image with version 32.2 of Ubuntu/JetPack and flashed it to a 32 GB SD card (Saturday 21/07).
I followed the setup meticulously, loaded only the default networks, and did not download PyTorch (yet); I believe PyTorch is only needed for the next step, transfer learning. I cloned the repo, ran cmake, then make and make install.
I attached an RPi v2.1 camera to the system.
Soon enough I was able to run the imagenet-console and imagenet-camera tests, which passed with flying colors, so I infer that a lot is working on my system.
However, detectnet-console and detectnet-camera fail dismally. After many attempts, I decided to re-flash the SD card and build everything from first principles again, in case I had missed some steps (that’s what beginners do!). In short, the debug printout reports that the maximum number of bounding boxes is 0, and finally that detectNet failed to initialize.
I then discovered that detectnet-console and detectnet-camera share the C++ class detectNet, so I started inserting debug printf messages of my own to report where in the init chain the class fails.

Long story short:

7. The allocDetections() function fails. The reason is that mMaxDetections = 0, so the size passed to cudaAllocMapped is zero and the call fails.

8. I then did something naughty and inserted mMaxDetections = 1, overriding the equations that set the value. I recompiled with make and ran all four of the detectnet-console examples, and all is well: the output JPG images match the reference samples on the website.

So I assume that I have done something wrong, but I do hope that the changes I have temporarily made on the ‘holy ground’ of the library source could help somebody understand my problem.

Regards

Anton Reinhardt

dusty_nv · July 21, 2019, 12:03pm

Hi Anton, can you provide the text from the terminal log when you run the detectnet-console program on an image? Thanks.

anton.reinhardt · July 21, 2019, 2:30pm

Hi Dusty

Thanks for your response. As requested:

[b]listing 1 : (this is the untampered standard version) :

command = ./detectnet-console peds-003.jpg output.jpg >anton.txt

output :[/b]

detectNet – loading detection network model from:
– prototxt networks/ped-100/deploy.prototxt
– model networks/ped-100/snapshot_iter_70800.caffemodel
– input_blob ‘data’
– output_cvg ‘coverage’
– output_bbox ‘bboxes’
– mean_pixel 0.000000
– mean_binary NULL
– class_labels networks/ped-100/class_labels.txt
– threshold 0.500000
– batch_size 1

[TRT] TensorRT version 5.1.6
[TRT] loading NVIDIA plugins…
[TRT] Plugin Creator registration succeeded - GridAnchor_TRT
[TRT] Plugin Creator registration succeeded - NMS_TRT
[TRT] Plugin Creator registration succeeded - Reorg_TRT
[TRT] Plugin Creator registration succeeded - Region_TRT
[TRT] Plugin Creator registration succeeded - Clip_TRT
[TRT] Plugin Creator registration succeeded - LReLU_TRT
[TRT] Plugin Creator registration succeeded - PriorBox_TRT
[TRT] Plugin Creator registration succeeded - Normalize_TRT
[TRT] Plugin Creator registration succeeded - RPROI_TRT
[TRT] Plugin Creator registration succeeded - BatchedNMS_TRT
[TRT] completed loading NVIDIA plugins.
[TRT] detected model format - caffe (extension ‘.caffemodel’)
[TRT] desired precision specified for GPU: FASTEST
[TRT] requested fasted precision for device GPU without providing valid calibrator, disabling INT8
[TRT] native precisions detected for GPU: FP32, FP16
[TRT] selecting fastest native precision for GPU: FP16
[TRT] attempting to open engine cache file networks/ped-100/snapshot_iter_70800.caffemodel.1.1.GPU.FP16.engine
[TRT] loading network profile from engine cache… networks/ped-100/snapshot_iter_70800.caffemodel.1.1.GPU.FP16.engine
[TRT] device GPU, networks/ped-100/snapshot_iter_70800.caffemodel loaded
[TRT] device GPU, CUDA engine context initialized with 3 bindings
[TRT] binding – index 0
– name ‘data’
– type FP32
– in/out INPUT
– # dims 3
– dim #0 3 (CHANNEL)
– dim #1 512 (SPATIAL)
– dim #2 1024 (SPATIAL)
[TRT] binding – index 1
– name ‘coverage’
– type FP32
– in/out OUTPUT
– # dims 3
– dim #0 1 (CHANNEL)
– dim #1 32 (SPATIAL)
– dim #2 64 (SPATIAL)
[TRT] binding – index 2
– name ‘bboxes’
– type FP32
– in/out OUTPUT
– # dims 3
– dim #0 4 (CHANNEL)
– dim #1 32 (SPATIAL)
– dim #2 64 (SPATIAL)
[TRT] binding to input 0 data binding index: 0
[TRT] binding to input 0 data dims (b=1 c=3 h=512 w=1024) size=6291456
[TRT] binding to output 0 coverage binding index: 1
[TRT] binding to output 0 coverage dims (b=1 c=1 h=32 w=64) size=8192
[TRT] binding to output 1 bboxes binding index: 2
[TRT] binding to output 1 bboxes dims (b=1 c=4 h=32 w=64) size=32768
device GPU, networks/ped-100/snapshot_iter_70800.caffemodel initialized.
detectNet – number object classes: 1
detectNet – maximum bounding boxes: 0
detectnet-console: failed to initialize detectNet

[b]listing 2 : (this is the standard version with additional debug messages (ar_dbg)) :

command = ./detectnet-console peds-003.jpg output.jpg >anton.txt

output : (only the tail shown) [/b]

[TRT] binding to input 0 data binding index: 0
[TRT] binding to input 0 data dims (b=1 c=3 h=512 w=1024) size=6291456
[TRT] binding to output 0 coverage binding index: 1
[TRT] binding to output 0 coverage dims (b=1 c=1 h=32 w=64) size=8192
[TRT] binding to output 1 bboxes binding index: 2
[TRT] binding to output 1 bboxes dims (b=1 c=4 h=32 w=64) size=32768
device GPU, networks/ped-100/snapshot_iter_70800.caffemodel initialized.
ar_dbg : entering allocDetections
ar_dbg : model type != MODEL_UFF and model type != MODEL_ONXX
detectNet – number object classes: 1
detectNet – maximum bounding boxes: 0
ar_dbg : cudaAllocMapped function failed . aborted
detectnet-console: failed to initialize detectNet

[b]listing 3 : (this is the ‘tampered’ version, where I set mMaxDetections to 1) :

command = ./detectnet-console peds-003.jpg output.jpg >anton.txt

output : (full log shown) [/b]

detectNet – loading detection network model from:
– prototxt networks/ped-100/deploy.prototxt
– model networks/ped-100/snapshot_iter_70800.caffemodel
– input_blob ‘data’
– output_cvg ‘coverage’
– output_bbox ‘bboxes’
– mean_pixel 0.000000
– mean_binary NULL
– class_labels networks/ped-100/class_labels.txt
– threshold 0.500000
– batch_size 1

[TRT] TensorRT version 5.1.6
[TRT] loading NVIDIA plugins…
[TRT] Plugin Creator registration succeeded - GridAnchor_TRT
[TRT] Plugin Creator registration succeeded - NMS_TRT
[TRT] Plugin Creator registration succeeded - Reorg_TRT
[TRT] Plugin Creator registration succeeded - Region_TRT
[TRT] Plugin Creator registration succeeded - Clip_TRT
[TRT] Plugin Creator registration succeeded - LReLU_TRT
[TRT] Plugin Creator registration succeeded - PriorBox_TRT
[TRT] Plugin Creator registration succeeded - Normalize_TRT
[TRT] Plugin Creator registration succeeded - RPROI_TRT
[TRT] Plugin Creator registration succeeded - BatchedNMS_TRT
[TRT] completed loading NVIDIA plugins.
[TRT] detected model format - caffe (extension ‘.caffemodel’)
[TRT] desired precision specified for GPU: FASTEST
[TRT] requested fasted precision for device GPU without providing valid calibrator, disabling INT8
[TRT] native precisions detected for GPU: FP32, FP16
[TRT] selecting fastest native precision for GPU: FP16
[TRT] attempting to open engine cache file networks/ped-100/snapshot_iter_70800.caffemodel.1.1.GPU.FP16.engine
[TRT] loading network profile from engine cache… networks/ped-100/snapshot_iter_70800.caffemodel.1.1.GPU.FP16.engine
[TRT] device GPU, networks/ped-100/snapshot_iter_70800.caffemodel loaded
[TRT] device GPU, CUDA engine context initialized with 3 bindings
[TRT] binding – index 0
– name ‘data’
– type FP32
– in/out INPUT
– # dims 3
– dim #0 3 (CHANNEL)
– dim #1 512 (SPATIAL)
– dim #2 1024 (SPATIAL)
[TRT] binding – index 1
– name ‘coverage’
– type FP32
– in/out OUTPUT
– # dims 3
– dim #0 1 (CHANNEL)
– dim #1 32 (SPATIAL)
– dim #2 64 (SPATIAL)
[TRT] binding – index 2
– name ‘bboxes’
– type FP32
– in/out OUTPUT
– # dims 3
– dim #0 4 (CHANNEL)
– dim #1 32 (SPATIAL)
– dim #2 64 (SPATIAL)
[TRT] binding to input 0 data binding index: 0
[TRT] binding to input 0 data dims (b=1 c=3 h=512 w=1024) size=6291456
[TRT] binding to output 0 coverage binding index: 1
[TRT] binding to output 0 coverage dims (b=1 c=1 h=32 w=64) size=8192
[TRT] binding to output 1 bboxes binding index: 2
[TRT] binding to output 1 bboxes dims (b=1 c=4 h=32 w=64) size=32768
device GPU, networks/ped-100/snapshot_iter_70800.caffemodel initialized.
ar_dbg : entering allocDetections
ar_dbg : model type != MODEL_UFF and model type != MODEL_ONXX
detectNet – number object classes: 1
detectNet – maximum bounding boxes: 1
ar_dbg : leaving allocDetections . return true
detectNet – loaded 1 class info entries
detectNet – number of object classes: 1
[image] loaded ‘peds-003.jpg’ (1024 x 611, 3 channels)
5 objects detected
detected obj 0 class #0 (person) confidence=0.872070
bounding box 0 (692.062500, 43.632202) (841.000000, 459.890869) w=148.937500 h=416.258667
detected obj 1 class #0 (person) confidence=0.899902
bounding box 1 (851.187500, 59.966309) (1014.125000, 490.470703) w=162.937500 h=430.504395
detected obj 2 class #0 (person) confidence=1.076172
bounding box 2 (16.687500, 13.723633) (227.250000, 558.939697) w=210.562500 h=545.216064
detected obj 3 class #0 (person) confidence=0.681152
bounding box 3 (374.250000, 34.756592) (619.109375, 598.320557) w=244.859375 h=563.563965
detected obj 4 class #0 (person) confidence=0.959961
bounding box 4 (549.156250, 130.001587) (617.781250, 319.223633) w=68.625000 h=189.222046

[TRT] ----------------------------------------------
[TRT] Timing Report networks/ped-100/snapshot_iter_70800.caffemodel
[TRT] ----------------------------------------------
[TRT] Pre-Process CPU 0.08740ms CUDA 8.18458ms
[TRT] Network CPU 243.50406ms CUDA 234.84735ms
[TRT] Post-Process CPU 2.01807ms CUDA 1.92625ms
[TRT] Visualize CPU 0.26687ms CUDA 61.72536ms
[TRT] Total CPU 245.87640ms CUDA 306.68356ms
[TRT] ----------------------------------------------

[TRT] note – when processing a single image, run ‘sudo jetson_clocks’ before
to disable DVFS for more accurate profiling/timing measurements

detectnet-console: writing 1024x611 image to ‘output.jpg’
detectnet-console: successfully wrote 1024x611 image to ‘output.jpg’
detectnet-console: shutting down…
detectnet-console: shutdown complete

Below is the code that I have changed:

// allocDetections
bool detectNet::allocDetections()
{
	printf("ar_dbg : entering allocDetections\n");

	// determine max detections
	if( IsModelType(MODEL_UFF) ) // TODO: fixme
	{
		printf("ar_dbg : model type = MODEL_UFF\n");
		printf("W = %u H = %u C = %u\n", DIMS_W(mOutputs[OUTPUT_UFF].dims), DIMS_H(mOutputs[OUTPUT_UFF].dims), DIMS_C(mOutputs[OUTPUT_UFF].dims));
		mMaxDetections = DIMS_H(mOutputs[OUTPUT_UFF].dims) * DIMS_C(mOutputs[OUTPUT_UFF].dims);
	}
	else if( IsModelType(MODEL_ONNX) )
	{
		printf("ar_dbg : model type = MODEL_ONNX\n");
		mMaxDetections = 1;
		mNumClasses = 1;
		printf("detectNet -- using ONNX model\n");
	}
	else
	{
		printf("ar_dbg : model type != MODEL_UFF and model type != MODEL_ONXX\n");
		mMaxDetections = DIMS_W(mOutputs[OUTPUT_CVG].dims) * DIMS_H(mOutputs[OUTPUT_CVG].dims) /** DIMS_C(mOutputs[OUTPUT_CVG].dims)*/ * mNumClasses;
		mNumClasses = DIMS_C(mOutputs[OUTPUT_CVG].dims);
		printf("detectNet -- number object classes: %u\n", mNumClasses);
	}

	//----------------------
	// ar forced values
	//---------------------
	mMaxDetections = 1;

	printf("detectNet -- maximum bounding boxes:  %u\n", mMaxDetections);

	// allocate array to store detection results
	const size_t det_size = sizeof(Detection) * mNumDetectionSets * mMaxDetections;

	if( !cudaAllocMapped((void**)&mDetectionSets[0], (void**)&mDetectionSets[1], det_size) )
	{
		printf("ar_dbg : cudaAllocMapped function failed . aborted\n");
		return false;
	}

	memset(mDetectionSets[0], 0, det_size);

	printf("ar_dbg : leaving allocDetections . return true \n");
	return true;
}

Thank you in advance.
Best Regards

Anton

dusty_nv · July 21, 2019, 9:40pm

Sorry Anton, looks like I broke that code on Friday - thanks for letting me know. Just checked in the fix on GitHub with commit f98f6b.

If you re-clone the repo, it should be working again without further modification.

anton.reinhardt · July 22, 2019, 6:41am

Hi Dusty

Thank you for your response and solution. I am new here, but what I have taken from this so far is that you are serving our community in a super awesome manner, so no need to be sorry. I am super excited about this NVIDIA Jetson platform; thank you for bringing it to our doorsteps. I am here to stay.

Best Regards

Anton
