Detection result difference between jetson-inference 2.3 and DIGITS 5.1

Hello!
I trained custom data using a Titan X and ran detection on a TX1, but the detection results differ between jetson-inference 2.3 and DIGITS 5.1. How can I fix this?
1. I trained my custom data (640x480) using the "Object Detection Model" of DIGITS 5.1 on the Titan X, and tested the detection results there. They are correct.
2. I copied deploy.prototxt, snapshot_iter_44718.caffemodel, and mean.binaryproto from the DIGITS job (desktop) to jetson-inference/data/networks/can-detect (TX1).
3. I added the following code to detectNet.cpp:

else if( networkType == CANNET )
        return Create("can-detect/deploy.prototxt", "can-detect/snapshot_iter_44718.caffemodel", NULL, threshold);

and added CANNET to the NetworkType enum:

enum NetworkType
{
        PEDNET = 0,     /**< Pedestrian / person detector */
        PEDNET_MULTI,   /**< Multi-class pedestrian + baggage detector */
        FACENET,        /**< Human facial detector trained on FDDB */
        CANNET          /**< Custom can detector */
};

4. I changed detectnet-console.cpp:

detectNet* net = detectNet::Create( detectNet::CANNET );

5. I compiled the source code and ran:
cd build/aarch64/bin/
./detectnet-console ~/000001.png test.png
I got the following log, and no bounding boxes were detected:
detectnet-console
args (3): 0 [./detectnet-console] 1 [/home/ubuntu/000001.png] 2 [test.png]

[GIE] attempting to open cache file can-detect/snapshot_iter_44718.caffemodel.tensorcache
[GIE] loading network profile from cache… can-detect/snapshot_iter_44718.caffemodel.tensorcache
[GIE] platform has FP16 support.
[GIE] can-detect/snapshot_iter_44718.caffemodel loaded
[GIE] CUDA engine context initialized with 3 bindings
[GIE] can-detect/snapshot_iter_44718.caffemodel input binding index: 0
[GIE] can-detect/snapshot_iter_44718.caffemodel input dims (c=3 h=480 w=640) size=3686400
[cuda] cudaAllocMapped 3686400 bytes, CPU 0x100be0000 GPU 0x100be0000
[GIE] can-detect/snapshot_iter_44718.caffemodel output 0 coverage binding index: 1
[GIE] can-detect/snapshot_iter_44718.caffemodel output 0 coverage dims (c=1 h=30 w=40) size=4800
[cuda] cudaAllocMapped 4800 bytes, CPU 0x100f80000 GPU 0x100f80000
[GIE] can-detect/snapshot_iter_44718.caffemodel output 1 bboxes binding index: 2
[GIE] can-detect/snapshot_iter_44718.caffemodel output 1 bboxes dims (c=4 h=30 w=40) size=19200
[cuda] cudaAllocMapped 19200 bytes, CPU 0x101080000 GPU 0x101080000
can-detect/snapshot_iter_44718.caffemodel initialized.
[cuda] cudaAllocMapped 16 bytes, CPU 0x101180000 GPU 0x101180000
maximum bounding boxes: 4800
[cuda] cudaAllocMapped 76800 bytes, CPU 0x101280000 GPU 0x101280000
[cuda] cudaAllocMapped 19200 bytes, CPU 0x101084c00 GPU 0x101084c00
loaded image /home/ubuntu/000001.png (640 x 480) 4915200 bytes
[cuda] cudaAllocMapped 4915200 bytes, CPU 0x101380000 GPU 0x101380000
detectnet-console: beginning processing network (1486882943368)
[GIE] layer data_zipper - 2.631052 ms
[GIE] layer deploy_transform - 12.228993 ms
[GIE] layer conv1/7x7_s2 + conv1/relu_7x7 - 53.802921 ms
[GIE] layer pool1/3x3_s2 - 19.526569 ms
[GIE] layer pool1/norm1 - 10.890100 ms
[GIE] layer conv2/3x3_reduce + conv2/relu_3x3_reduce - 6.804786 ms
[GIE] layer conv2/3x3 + conv2/relu_3x3 - 79.133278 ms
[GIE] layer conv2/norm2 - 30.936037 ms
[GIE] layer pool2/3x3_s2 - 11.421520 ms
[GIE] layer inception_3a/1x1 + inception_3a/relu_1x1 - 4.354577 ms
[GIE] layer inception_3a/pool - 8.521206 ms
[GIE] layer inception_3a/3x3_reduce + inception_3a/relu_3x3_reduce||inception_3a/5x5_reduce + inception_3a/relu_5x5_reduce - 8.130891 ms
[GIE] layer inception_3a/pool_proj + inception_3a/relu_pool_proj - 4.320050 ms
[GIE] layer inception_3a/3x3 + inception_3a/relu_3x3 - 19.570831 ms
[GIE] layer inception_3a/5x5 + inception_3a/relu_5x5 - 8.269312 ms
[GIE] layer inception_3a/output - 0.004579 ms
[GIE] layer inception_3b/1x1 + inception_3b/relu_1x1 - 10.281731 ms
[GIE] layer inception_3b/pool - 17.472097 ms
[GIE] layer inception_3b/3x3_reduce + inception_3b/relu_3x3_reduce||inception_3b/5x5_reduce + inception_3b/relu_5x5_reduce - 4.868997 ms
[GIE] layer inception_3b/pool_proj + inception_3b/relu_pool_proj - 1.542526 ms
[GIE] layer inception_3b/3x3 + inception_3b/relu_3x3 - 10.316626 ms
[GIE] layer inception_3b/5x5 + inception_3b/relu_5x5 - 5.218524 ms
[GIE] layer inception_3b/output - 0.001526 ms
[GIE] layer pool3/3x3_s2 - 3.395788 ms
[GIE] layer inception_4a/1x1 + inception_4a/relu_1x1 - 2.107104 ms
[GIE] layer inception_4a/pool - 1.957631 ms
[GIE] layer inception_4a/3x3_reduce + inception_4a/relu_3x3_reduce||inception_4a/5x5_reduce + inception_4a/relu_5x5_reduce - 1.083473 ms
[GIE] layer inception_4a/pool_proj + inception_4a/relu_pool_proj - 1.021736 ms
[GIE] layer inception_4a/3x3 + inception_4a/relu_3x3 - 2.887999 ms
[GIE] layer inception_4a/5x5 + inception_4a/relu_5x5 - 0.477894 ms
[GIE] layer inception_4a/output - 0.001526 ms
[GIE] layer inception_4b/1x1 + inception_4b/relu_1x1 - 2.262683 ms
[GIE] layer inception_4b/pool - 2.072526 ms
[GIE] layer inception_4b/3x3_reduce + inception_4b/relu_3x3_reduce||inception_4b/5x5_reduce + inception_4b/relu_5x5_reduce - 2.240104 ms
[GIE] layer inception_4b/pool_proj + inception_4b/relu_pool_proj - 0.983473 ms
[GIE] layer inception_4b/3x3 + inception_4b/relu_3x3 - 3.339946 ms
[GIE] layer inception_4b/5x5 + inception_4b/relu_5x5 - 0.742736 ms
[GIE] layer inception_4b/output - 0.001421 ms
[GIE] layer inception_4c/1x1 + inception_4c/relu_1x1 - 1.230631 ms
[GIE] layer inception_4c/pool - 2.033367 ms
[GIE] layer inception_4c/3x3_reduce + inception_4c/relu_3x3_reduce||inception_4c/5x5_reduce + inception_4c/relu_5x5_reduce - 2.198315 ms
[GIE] layer inception_4c/pool_proj + inception_4c/relu_pool_proj - 1.055736 ms
[GIE] layer inception_4c/3x3 + inception_4c/relu_3x3 - 3.942683 ms
[GIE] layer inception_4c/5x5 + inception_4c/relu_5x5 - 0.739473 ms
[GIE] layer inception_4c/output - 0.001684 ms
[GIE] layer inception_4d/1x1 + inception_4d/relu_1x1 - 1.157157 ms
[GIE] layer inception_4d/pool - 2.064420 ms
[GIE] layer inception_4d/3x3_reduce + inception_4d/relu_3x3_reduce||inception_4d/5x5_reduce + inception_4d/relu_5x5_reduce - 2.274368 ms
[GIE] layer inception_4d/pool_proj + inception_4d/relu_pool_proj - 0.934473 ms
[GIE] layer inception_4d/3x3 + inception_4d/relu_3x3 - 3.617524 ms
[GIE] layer inception_4d/5x5 + inception_4d/relu_5x5 - 0.873684 ms
[GIE] layer inception_4d/output - 0.001105 ms
[GIE] layer inception_4e/1x1 + inception_4e/relu_1x1 - 1.850052 ms
[GIE] layer inception_4e/pool - 1.319473 ms
[GIE] layer inception_4e/3x3_reduce + inception_4e/relu_3x3_reduce||inception_4e/5x5_reduce + inception_4e/relu_5x5_reduce - 1.708210 ms
[GIE] layer inception_4e/pool_proj + inception_4e/relu_pool_proj - 1.015473 ms
[GIE] layer inception_4e/3x3 + inception_4e/relu_3x3 - 4.248103 ms
[GIE] layer inception_4e/5x5 + inception_4e/relu_5x5 - 1.423105 ms
[GIE] layer inception_4e/output - 0.001210 ms
[GIE] layer inception_5a/1x1 + inception_5a/relu_1x1 - 2.831367 ms
[GIE] layer inception_5a/pool - 2.076315 ms
[GIE] layer inception_5a/3x3_reduce + inception_5a/relu_3x3_reduce||inception_5a/5x5_reduce + inception_5a/relu_5x5_reduce - 2.638472 ms
[GIE] layer inception_5a/pool_proj + inception_5a/relu_pool_proj - 1.510052 ms
[GIE] layer inception_5a/3x3 + inception_5a/relu_3x3 - 4.171524 ms
[GIE] layer inception_5a/5x5 + inception_5a/relu_5x5 - 1.419473 ms
[GIE] layer inception_5a/output - 0.001211 ms
[GIE] layer inception_5b/1x1 + inception_5b/relu_1x1 - 4.310366 ms
[GIE] layer inception_5b/pool - 1.914525 ms
[GIE] layer inception_5b/3x3_reduce + inception_5b/relu_3x3_reduce||inception_5b/5x5_reduce + inception_5b/relu_5x5_reduce - 2.785683 ms
[GIE] layer inception_5b/pool_proj + inception_5b/relu_pool_proj - 1.477104 ms
[GIE] layer inception_5b/3x3 + inception_5b/relu_3x3 - 5.633261 ms
[GIE] layer inception_5b/5x5 + inception_5b/relu_5x5 - 2.040999 ms
[GIE] layer inception_5b/output - 0.001105 ms
[GIE] layer cvg/classifier - 1.237052 ms
[GIE] layer bbox/regressor - 1.227894 ms
[GIE] layer coverage/sig - 0.026211 ms
[GIE] layer bboxes_unzipper - 0.010421 ms
[GIE] layer coverage_unzipper - 0.006263 ms
[GIE] layer network time - 423.834290 ms
detectnet-console: finished processing network (1486882943832)
0 bounding boxes detected
detectnet-console: writing 640x480 image to 'test.png'
detectnet-console: successfully wrote 640x480 image to 'test.png'

shutting down…

Hi jiangxuan11,
Thanks for reporting this issue; we are trying to clarify it now and will update with more information later.

Hi,

Did you enable FP16 for inference? FP16 may slightly degrade inference accuracy.
If so, could you turn it off and try again?

Hi AastaLLL,

I also tried FP32 mode, but the problem is not solved.
I call net->DisableFP16() in the Create() function to use FP32:

detectNet* detectNet::Create( const char* prototxt, const char* model, const char* mean_binary, float threshold, const char* input_blob, const char* coverage_blob, const char* bbox_blob )
{
        detectNet* net = new detectNet();

        if( !net )
                return NULL;

        net->EnableDebug();
        net->DisableFP16();
        // ... (rest of Create() unchanged)

The resulting log follows:
detectnet-console
args (3): 0 [./detectnet-console] 1 [/home/ubuntu/000001.png] 2 [test.png]

[GIE] attempting to open cache file can-detect/snapshot_iter_44718.caffemodel.tensorcache
[GIE] loading network profile from cache… can-detect/snapshot_iter_44718.caffemodel.tensorcache
[GIE] platform does not have FP16 support.
[GIE] can-detect/snapshot_iter_44718.caffemodel loaded
[GIE] enabling context debug sync.
[GIE] CUDA engine context initialized with 3 bindings
[GIE] can-detect/snapshot_iter_44718.caffemodel input binding index: 0
[GIE] can-detect/snapshot_iter_44718.caffemodel input dims (c=3 h=480 w=640) size=3686400
[cuda] cudaAllocMapped 3686400 bytes, CPU 0x100be0000 GPU 0x100be0000
[GIE] can-detect/snapshot_iter_44718.caffemodel output 0 coverage binding index: 1
[GIE] can-detect/snapshot_iter_44718.caffemodel output 0 coverage dims (c=1 h=30 w=40) size=4800
[cuda] cudaAllocMapped 4800 bytes, CPU 0x100f80000 GPU 0x100f80000
[GIE] can-detect/snapshot_iter_44718.caffemodel output 1 bboxes binding index: 2
[GIE] can-detect/snapshot_iter_44718.caffemodel output 1 bboxes dims (c=4 h=30 w=40) size=19200
[cuda] cudaAllocMapped 19200 bytes, CPU 0x101080000 GPU 0x101080000
can-detect/snapshot_iter_44718.caffemodel initialized.
[cuda] cudaAllocMapped 16 bytes, CPU 0x101180000 GPU 0x101180000
maximum bounding boxes: 4800
[cuda] cudaAllocMapped 76800 bytes, CPU 0x101280000 GPU 0x101280000
[cuda] cudaAllocMapped 19200 bytes, CPU 0x101084c00 GPU 0x101084c00
loaded image /home/ubuntu/000001.png (640 x 480) 4915200 bytes
[cuda] cudaAllocMapped 4915200 bytes, CPU 0x101380000 GPU 0x101380000
detectnet-console: beginning processing network (1486993267697)
[GIE] layer data_zipper - 2.690630 ms
[GIE] layer deploy_transform - 12.456941 ms
[GIE] layer conv1/7x7_s2 + conv1/relu_7x7 - 14.975150 ms
[GIE] layer pool1/3x3_s2 - 4.994155 ms
[GIE] layer pool1/norm1 - 2.163999 ms
[GIE] layer conv2/3x3_reduce + conv2/relu_3x3_reduce - 1.709578 ms
[GIE] layer conv2/3x3 + conv2/relu_3x3 - 20.219936 ms
[GIE] layer conv2/norm2 - 8.515470 ms
[GIE] layer pool2/3x3_s2 - 7.279943 ms
[GIE] layer inception_3a/1x1 + inception_3a/relu_1x1 - 0.902105 ms
[GIE] layer inception_3a/pool - 1.195421 ms
[GIE] layer inception_3a/3x3_reduce + inception_3a/relu_3x3_reduce||inception_3a/5x5_reduce + inception_3a/relu_5x5_reduce - 1.201946 ms
[GIE] layer inception_3a/pool_proj + inception_3a/relu_pool_proj - 0.807158 ms
[GIE] layer inception_3a/3x3 + inception_3a/relu_3x3 - 2.315578 ms
[GIE] layer inception_3a/5x5 + inception_3a/relu_5x5 - 1.045788 ms
[GIE] layer inception_3a/output - 0.154842 ms
[GIE] layer inception_3b/1x1 + inception_3b/relu_1x1 - 1.314895 ms
[GIE] layer inception_3b/pool - 1.530209 ms
[GIE] layer inception_3b/3x3_reduce + inception_3b/relu_3x3_reduce||inception_3b/5x5_reduce + inception_3b/relu_5x5_reduce - 2.033315 ms
[GIE] layer inception_3b/pool_proj + inception_3b/relu_pool_proj - 0.957842 ms
[GIE] layer inception_3b/3x3 + inception_3b/relu_3x3 - 4.172155 ms
[GIE] layer inception_3b/5x5 + inception_3b/relu_5x5 - 3.120315 ms
[GIE] layer inception_3b/output - 0.156684 ms
[GIE] layer pool3/3x3_s2 - 1.365841 ms
[GIE] layer inception_4a/1x1 + inception_4a/relu_1x1 - 1.139947 ms
[GIE] layer inception_4a/pool - 0.887947 ms
[GIE] layer inception_4a/3x3_reduce + inception_4a/relu_3x3_reduce||inception_4a/5x5_reduce + inception_4a/relu_5x5_reduce - 0.788789 ms
[GIE] layer inception_4a/pool_proj + inception_4a/relu_pool_proj - 0.653842 ms
[GIE] layer inception_4a/3x3 + inception_4a/relu_3x3 - 1.308788 ms
[GIE] layer inception_4a/5x5 + inception_4a/relu_5x5 - 0.518790 ms
[GIE] layer inception_4a/output - 0.175684 ms
[GIE] layer inception_4b/1x1 + inception_4b/relu_1x1 - 1.203841 ms
[GIE] layer inception_4b/pool - 0.879263 ms
[GIE] layer inception_4b/3x3_reduce + inception_4b/relu_3x3_reduce||inception_4b/5x5_reduce + inception_4b/relu_5x5_reduce - 1.180210 ms
[GIE] layer inception_4b/pool_proj + inception_4b/relu_pool_proj - 0.715579 ms
[GIE] layer inception_4b/3x3 + inception_4b/relu_3x3 - 1.422841 ms
[GIE] layer inception_4b/5x5 + inception_4b/relu_5x5 - 0.646052 ms
[GIE] layer inception_4b/output - 0.128579 ms
[GIE] layer inception_4c/1x1 + inception_4c/relu_1x1 - 0.819947 ms
[GIE] layer inception_4c/pool - 0.879526 ms
[GIE] layer inception_4c/3x3_reduce + inception_4c/relu_3x3_reduce||inception_4c/5x5_reduce + inception_4c/relu_5x5_reduce - 1.185526 ms
[GIE] layer inception_4c/pool_proj + inception_4c/relu_pool_proj - 0.770368 ms
[GIE] layer inception_4c/3x3 + inception_4c/relu_3x3 - 1.863525 ms
[GIE] layer inception_4c/5x5 + inception_4c/relu_5x5 - 0.765105 ms
[GIE] layer inception_4c/output - 0.151158 ms
[GIE] layer inception_4d/1x1 + inception_4d/relu_1x1 - 0.939841 ms
[GIE] layer inception_4d/pool - 0.899000 ms
[GIE] layer inception_4d/3x3_reduce + inception_4d/relu_3x3_reduce||inception_4d/5x5_reduce + inception_4d/relu_5x5_reduce - 1.214473 ms
[GIE] layer inception_4d/pool_proj + inception_4d/relu_pool_proj - 0.644105 ms
[GIE] layer inception_4d/3x3 + inception_4d/relu_3x3 - 2.072209 ms
[GIE] layer inception_4d/5x5 + inception_4d/relu_5x5 - 0.747316 ms
[GIE] layer inception_4d/output - 0.238736 ms
[GIE] layer inception_4e/1x1 + inception_4e/relu_1x1 - 1.621263 ms
[GIE] layer inception_4e/pool - 1.023947 ms
[GIE] layer inception_4e/3x3_reduce + inception_4e/relu_3x3_reduce||inception_4e/5x5_reduce + inception_4e/relu_5x5_reduce - 1.346473 ms
[GIE] layer inception_4e/pool_proj + inception_4e/relu_pool_proj - 0.992052 ms
[GIE] layer inception_4e/3x3 + inception_4e/relu_3x3 - 2.858419 ms
[GIE] layer inception_4e/5x5 + inception_4e/relu_5x5 - 1.278947 ms
[GIE] layer inception_4e/output - 0.134684 ms
[GIE] layer inception_5a/1x1 + inception_5a/relu_1x1 - 2.172262 ms
[GIE] layer inception_5a/pool - 1.370894 ms
[GIE] layer inception_5a/3x3_reduce + inception_5a/relu_3x3_reduce||inception_5a/5x5_reduce + inception_5a/relu_5x5_reduce - 1.956999 ms
[GIE] layer inception_5a/pool_proj + inception_5a/relu_pool_proj - 1.241894 ms
[GIE] layer inception_5a/3x3 + inception_5a/relu_3x3 - 2.802999 ms
[GIE] layer inception_5a/5x5 + inception_5a/relu_5x5 - 1.214157 ms
[GIE] layer inception_5a/output - 0.134790 ms
[GIE] layer inception_5b/1x1 + inception_5b/relu_1x1 - 3.170103 ms
[GIE] layer inception_5b/pool - 1.394105 ms
[GIE] layer inception_5b/3x3_reduce + inception_5b/relu_3x3_reduce||inception_5b/5x5_reduce + inception_5b/relu_5x5_reduce - 2.310209 ms
[GIE] layer inception_5b/pool_proj + inception_5b/relu_pool_proj - 1.330631 ms
[GIE] layer inception_5b/3x3 + inception_5b/relu_3x3 - 4.198577 ms
[GIE] layer inception_5b/5x5 + inception_5b/relu_5x5 - 8.941469 ms
[GIE] layer inception_5b/output - 0.235473 ms
[GIE] layer cvg/classifier - 1.211895 ms
[GIE] layer bbox/regressor - 1.159420 ms
[GIE] layer coverage/sig - 0.506000 ms
[GIE] layer bboxes_unzipper - 0.349736 ms
[GIE] layer coverage_unzipper - 0.321369 ms
[GIE] layer network time - 167.399658 ms
input width 640 height 480
cells x 40 y 30
cell width 16.000000 height 16.000000
scale x 1.000000 y 1.000000
detectnet-console: finished processing network (1486993267900)
0 bounding boxes detected
detectnet-console: writing 640x480 image to 'test.png'
detectnet-console: successfully wrote 640x480 image to 'test.png'

shutting down…

Hi,

Could you modify the threshold in the detectNet detection function? The default value may not be suitable for your use case.
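For example, a minimal sketch reusing the Create() overload quoted earlier (0.2f is just an illustrative value, not a recommendation):

// Lower the coverage threshold from the 0.5 default when creating the network;
// candidate boxes whose coverage score falls below this value are discarded.
detectNet* net = detectNet::Create( "can-detect/deploy.prototxt",
                                    "can-detect/snapshot_iter_44718.caffemodel",
                                    NULL, 0.2f /* threshold */ );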

Hi AastaLLL,

Yes, I have tried lowering the threshold from 0.5 to 0.2.
One bounding box was detected, and its coordinates are wrong.
With DIGITS I actually get 8 bounding boxes.
I don't think the threshold is the problem, because the default threshold in DIGITS (0.6) is higher than in jetson-inference (0.5).

Hi,

I have tested your modification with our own network, and everything works on my side.

Did you use a mean file in DIGITS?
If yes, could you try loading the mean file as well?

detectNet* net = detectNet::Create("can-detect/deploy.prototxt", "can-detect/snapshot_iter_44718.caffemodel", "can-detect/mean.binaryproto");

The mean file is located in the jobs folder of your dataset.

Hi AastaLLL,

Thank you for your test.
I tried using the mean file, but it does not work; no bounding boxes are detected.
Could you tell me the input size of your network?
Maybe the size of my custom data (640x480) causes the problem.
I can send you my network, if necessary.

Sure, please share your network with us for further investigation.

Thanks.

Hi,

Sorry for the late reply.
This issue is caused by duplicated mean subtraction.

In detectNet.cpp, the following call subtracts the mean with values (104, 116, 122):

if( CUDA_FAILED(cudaPreImageNetMean((float4*)rgba, width, height, mInputCUDA, mWidth, mHeight, make_float3(104.0069879317889f, 116.66876761696767f, 122.6789143406786f))) )

But in deploy.prototxt, the following layer also subtracts a mean of 127:

layer {
  name: "deploy_transform"
  type: "Power"
  bottom: "data"
  top: "transformed_data"
  power_param {
    shift: -127.0
  }
}
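For clarity, a Caffe Power layer computes y = (scale * x + shift)^power element-wise; with the defaults power = 1 and scale = 1 and the shift of -127 above, it reduces to a plain subtraction. A minimal sketch of the arithmetic (the function name is just for illustration):

// Caffe "Power" layer: y = (scale * x + shift) ^ power
// With power = 1, scale = 1 (defaults) and shift = -127.0, this is y = x - 127,
// i.e. the network already performs its own mean subtraction internally.
float deployTransform( float x )
{
        const float shift = -127.0f;    // from the power_param above
        return x + shift;
}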

Duplicate mean subtraction makes the results poor: for example, a pixel with value 130 becomes 130 - 116.7 ≈ 13.3 after the CUDA pre-processing, and then 13.3 - 127 ≈ -113.7 after the Power layer, far outside the range the network saw during training.

Please apply this change to turn off the CUDA mean subtraction:

diff --git a/detectNet.cpp b/detectNet.cpp
index 888f330..bcdd2fb 100644
--- a/detectNet.cpp
+++ b/detectNet.cpp
@@ -164,7 +164,7 @@ bool detectNet::Detect( float* rgba, uint32_t width, uint32_t height, float* bou
        
        // downsample and convert to band-sequential BGR
        if( CUDA_FAILED(cudaPreImageNetMean((float4*)rgba, width, height, mInputCUDA, mWidth, mHeight,
-                                                                 make_float3(104.0069879317889f, 116.66876761696767f, 122.6789143406786f))) )
+                                                                 make_float3(0.0f, 0.0f, 0.0f))) )
        {
                printf("detectNet::Classify() -- cudaPreImageNetMean failed\n");
                return false;

With the above change, I got 8 bounding boxes detected, which I think matches the DIGITS result:

8 bounding boxes detected
bounding box 0 (182.777344, 127.896484) (210.812500, 179.750000) w=28.035156 h=51.853516
bounding box 1 (214.523438, 127.635742) (236.656250, 178.625000) w=22.132812 h=50.989258
bounding box 2 (283.410156, 128.017578) (310.054688, 175.875000) w=26.644531 h=47.857422
bounding box 3 (329.093750, 127.913086) (353.156250, 175.078125) w=24.062500 h=47.165039
bounding box 4 (0.173340, 130.625000) (17.187500, 189.750000) w=17.014160 h=59.125000
bounding box 5 (17.638672, 129.140625) (71.183594, 185.875000) w=53.544922 h=56.734375
bounding box 6 (87.726562, 130.281250) (157.937500, 186.187500) w=70.210938 h=55.906250
bounding box 7 (241.193359, 128.992188) (264.609375, 179.687500) w=23.416016 h=50.695312
draw boxes 8 0 0.000000 200.000000 255.000000 100.00

Hi AastaLLL,

Thank you for your reply.
Your comment was very helpful to me.

Thanks

Why do you apply a mean correction of 0.0, instead of just deleting that step? Seems like it would waste resources doing nothing this way.

Yes, you can just delete it.

There is actually a variant of the function called cudaPreImageNet() which does no mean pixel subtraction; see here: jetson-inference/imageNet.cu at 29b7c6828f8ca6d6a81e5b1e23b3b5847e1a9877 · dusty-nv/jetson-inference · GitHub

The difference in performance is very small, though, because it's just a constant subtraction. Regardless of whether mean pixel subtraction is used, both functions still reorganize the image data into the C x H x W planar format that TensorRT expects (as opposed to the typical packed, band-interleaved RGB image from the camera).
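For illustration, here is a rough CPU sketch of that packed-to-planar reshuffle (the real kernels run on the GPU and also handle downsampling and the RGB-to-BGR channel swap; the struct and function names are made up for this example):

#include <cstddef>

struct Pixel { float r, g, b, a; };     // stand-in for CUDA's float4

// Convert packed, interleaved RGBA into the planar C x H x W layout TensorRT expects.
void packedToPlanar( const Pixel* in, float* out, size_t w, size_t h )
{
        for( size_t y = 0; y < h; y++ )
                for( size_t x = 0; x < w; x++ )
                {
                        const Pixel px = in[y * w + x];
                        out[0 * w * h + y * w + x] = px.r;      // plane 0
                        out[1 * w * h + y * w + x] = px.g;      // plane 1
                        out[2 * w * h + y * w + x] = px.b;      // plane 2 (alpha dropped)
                }
}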

Also, in a recent change to master, I added an alternative API to the detectNet class for networks that already have mean value subtraction as part of a network layer. See here: Jetson Inference: detectNet Class Reference

If a mean pixel of 0.0 is specified, the cudaPreImageNet() function is used; otherwise, cudaPreImageNetMean() is used to subtract the specified mean pixel value from the input image (from the if statement occurring here).
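In sketch form, that branch looks roughly like this (the local variable name mean is an assumption here, not the verbatim library source):

// Dispatch: skip mean subtraction entirely when the mean pixel is zero.
if( mean.x == 0.0f && mean.y == 0.0f && mean.z == 0.0f )
{
        if( CUDA_FAILED(cudaPreImageNet((float4*)rgba, width, height, mInputCUDA, mWidth, mHeight)) )
                return false;
}
else
{
        if( CUDA_FAILED(cudaPreImageNetMean((float4*)rgba, width, height, mInputCUDA, mWidth, mHeight, mean)) )
                return false;
}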