Different output value between TensorRT and Darknet

TensorRT version :
CUDA version : 9.0
cuDNN version : 7.3.1
Driver version : 384.145
GPU : V100


I have tested the ‘yolov3_onnx’ sample in ‘TensorRT5.0.2/samples/python’, but the output values differ from the Darknet detector.
I double-checked the ‘nms threshold’ and ‘obj threshold’, and there is no quantization (FP32), but the output values are still different.

Is this expected? I don’t understand why the model output differs between the ‘tensorrt model’ and the ‘original model’.

I found a caution in the developer guide:
the ‘yolo v3’ example has to use onnx-tensorrt because TensorRT does not support ‘leakyrelu’ and ‘upsample’.
Is that true?
However, I also found that the supported-operations list for ONNX in the developer guide does contain ‘leakyrelu’ and ‘upsample’.
Yes, I’m confused :(


Can you provide a specific example where the Darknet detector produces a different result compared to TRT?


Linux : Ubuntu 16.04
Python version 2.7.15


I downloaded ‘Darknet’ with ‘git clone https://github.com/pjreddie/darknet.git’.
I downloaded ‘TensorRT-’ from the download page and used the Python sample code to convert YOLOv3 to TensorRT.
I didn’t change the code at all, and I used the same ‘.cfg’ file and ‘.weights’ file (the ones used by ‘yolov3_to_onnx.py’).

I ran both detectors (Darknet and TensorRT) on the same image,
but I got different outputs.

from Darknet:
dog: 85%
dog: 84%
dog: 53%
person: 50%

from TensorRT:
[[180.32643636 6.89801149 107.12263617 162.63818179]
[189.34186364 57.27135054 94.68533009 115.64401249]
[ 74.13400486 107.20060694 55.11068005 70.82084432]]
[0.63056643 0.92838418 0.71350949]
[ 0 16 16]
(bounding boxes, confidences, and class IDs; in COCO, 0 = person and 16 = dog)

Darknet detects three dogs and one person, but TensorRT detects two dogs and one person.
The output values (object probabilities) also differ. E.g., TensorRT detects the person with 63% confidence while Darknet detects it with 50% confidence (a huge gap).

I thought that with no quantization the output values would be identical. Is that right?
Please clarify why the output values differ after converting to TensorRT.


p.s, I uploaded test image, TensorRT sample code and result images.
yolov3_TensorRT.zip (15.2 KB)



Any updates…?

Just to confirm: I think you have your file names reversed? TensorRT_predictions.png shows 3 dogs; Darknet_predictions.jpg shows 2 dogs and 1 person.


Per engineering: this appears to be an issue with using lightweight PIL (TRT sample) instead of OpenCV (Darknet) for loading and pre-processing the input.

The image you provided has a resolution of 177x284 px and has to be upscaled by a huge factor to 608x608.

The assumption is that using OpenCV (darknet) vs. using PIL in the sample may make a big difference. The interpolation method will also have an influence; we use PIL.Image.BICUBIC.
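To make the interpolation point concrete, here is a minimal, self-contained sketch (pure NumPy, not the sample's actual PIL/OpenCV code) showing that the choice of interpolation method alone already produces different resized pixels, and therefore a different network input:

```python
import numpy as np

def resize_nearest(img, out_h, out_w):
    """Nearest-neighbor upscale: each output pixel copies the closest source pixel."""
    h, w = img.shape
    rows = np.arange(out_h) * h // out_h
    cols = np.arange(out_w) * w // out_w
    return img[rows][:, cols]

def resize_bilinear(img, out_h, out_w):
    """Bilinear upscale: each output pixel blends its four source neighbors."""
    h, w = img.shape
    ys = np.linspace(0, h - 1, out_h)
    xs = np.linspace(0, w - 1, out_w)
    y0 = np.floor(ys).astype(int); y1 = np.minimum(y0 + 1, h - 1)
    x0 = np.floor(xs).astype(int); x1 = np.minimum(x0 + 1, w - 1)
    wy = (ys - y0)[:, None]; wx = (xs - x0)[None, :]
    top = img[y0][:, x0] * (1 - wx) + img[y0][:, x1] * wx
    bot = img[y1][:, x0] * (1 - wx) + img[y1][:, x1] * wx
    return top * (1 - wy) + bot * wy

# Tiny 2x2 "image" upscaled 2x, mimicking the large upscale to 608x608.
img = np.array([[0.0, 1.0],
                [1.0, 0.0]])
near = resize_nearest(img, 4, 4)
bili = resize_bilinear(img, 4, 4)

# The two 4x4 results disagree on interpolated pixels, so the network
# would see different inputs depending on the resize method alone.
print(np.abs(near - bili).max())  # > 0
```

The gap between methods grows with the upscale factor, which is why a 177x284 image blown up to 608x608 amplifies the Darknet-vs-sample preprocessing difference.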

The original dog image for this sample has a resolution of 576x786 px. Here is the difference between DarkNet and TensorRT for the original dog.jpg image.

[Darknet]
dog: 100%
truck: 92%
bicycle: 99%

[TensorRT]
[0.99854713 0.99880403 0.93829263] [16 1 7]
(16=dog, 1=bicycle, 7=truck)

The recommendation is either to supply higher resolution input image into sample, or modify TRT sample to use opencv instead of PIL.



Thanks for your comment.
I tried a high-resolution image but got the same problem.
I also modified both Darknet and the TRT sample to use OpenCV for the image resize.

I used the L2 norm to check that both resized images are the same, and I got exactly the same L2 norm value.

[TRT sample]
L2 norm of input : 804.7459466025994
[Darknet]
L2 norm of input : 804.7459466
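One caveat on this methodology (not something the thread itself raised): two different arrays can share the same L2 norm, so equal norms alone do not prove the two preprocessed inputs are pixel-for-pixel identical. A stricter check is the norm of the element-wise difference, which must be exactly zero. A small sketch with hypothetical stand-in arrays:

```python
import numpy as np

# Hypothetical stand-ins for the two preprocessed inputs; in the real
# comparison these would be the arrays fed to Darknet and to TensorRT.
a = np.array([3.0, 4.0])
b = np.array([5.0, 0.0])

# Both arrays have L2 norm 5.0, yet they are clearly different inputs.
print(np.linalg.norm(a), np.linalg.norm(b))

# The stricter check: the norm of the element-wise difference must be 0.
print(np.linalg.norm(a - b))  # nonzero here, so a != b
```

In this thread the matching norms (804.7459466...) make identical inputs very likely, but the difference-norm check removes any doubt.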

But, sadly…I got different output values.

[TRT sample]
dog: 0.9275025682158862
dog: 0.6527860108775222
[Darknet]
dog: 93%
dog: 57%

One dog has a similar confidence (probability), but the other dog's confidence differs hugely (by about 10%).
I also checked the parsed weight values for both Darknet and the TRT sample, and they are the same.
Where does the error come from?
Could you clarify this problem?

I attach the modified files for the TRT sample and Darknet.
yolov3_onnx.zip (15.4 KB)
modified_darknet.zip (16.4 KB)



It seems TRT has higher confidence than Darknet for the “2nd” dog? 0.65 vs. 0.57? Or is it reversed?


In reply #8, I’m seeing a better inference result for TRT than for Darknet. Or is the data reversed? In general, we cannot guarantee bit-correctness relative to open-source repos.


No, it’s not reversed.
But that’s not the point of this thread.
My point is that the same input and the same weights should produce the same output.

okay, thanks :)


In general, we cannot guarantee bit-correctness to open-source repos that are constantly updated by the community (the YOLO layer implementation in this sample is from ~July 2018)
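The bit-correctness point has a simple arithmetic root: FP32 addition is not associative, so two implementations that accumulate the same sums in a different order (for example, different convolution kernels in cuDNN vs. Darknet's own loops) can produce slightly different activations, and those small differences compound through a deep network. A minimal illustration:

```python
import numpy as np

# Three FP32 values whose sum depends on the order of accumulation.
a = np.float32(1e8)
b = np.float32(-1e8)
c = np.float32(0.1)

left = (a + b) + c   # the large terms cancel exactly first, leaving ~0.1
right = a + (b + c)  # 0.1 is lost to rounding when added to -1e8, leaving 0.0

print(left, right)   # same operands, different results
```

This is why identical weights and identical inputs still do not guarantee bit-identical outputs across frameworks, even in pure FP32; the final confidences should normally agree closely, but not exactly.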

When I execute the yolov3-to-ONNX sample code, it gives me an error message like this:

Traceback (most recent call last):
  File "yolov3_to_onnx.py", line 761, in <module>
  File "yolov3_to_onnx.py", line 754, in main
  File "/usr/local/lib/python2.7/dist-packages/onnx/checker.py", line 86, in check_model
onnx.onnx_cpp2py_export.checker.ValidationError: Node (086_upsample) has input size 1 not in range [min=2, max=2].

==> Context: Bad node spec: input: "085_convolutional_lrelu" output: "086_upsample" name: "086_upsample" op_type: "Upsample" attribute { name: "mode" s: "nearest" type: STRING } attribute { name: "scales" floats: 1 floats: 1 floats: 2 floats: 2 type: FLOATS }

Did anyone else have this problem?

By the way, my environment is:

TensorRT version :
CUDA version : 9.0
cuDNN version : 7.4.2

I had the same problem, and I fixed it by switching the onnx version to 1.2.1.
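A plausible explanation for this error: the ONNX `Upsample` operator changed between opsets. In opset 7, `scales` is a node attribute (so the node has one input), while opset 9 moved `scales` to a second input; newer onnx releases validate against the newer spec and reject the one-input node this converter emits. Pinning onnx to 1.2.1, as reported above, keeps the older checker:

```shell
pip install onnx==1.2.1
```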


I am trying the same repo to convert YOLO to ONNX to TensorRT.

I can successfully get the .trt file but get a segmentation fault during inference.

TensorRT version :
CUDA version : 10.1
cuDNN version : 7.4.2
GPU : V100 (AWS)

Error dump:

Loading ONNX file from path yolov3.onnx...
Beginning ONNX file parsing
Completed parsing of ONNX file
Building an engine from file yolov3.onnx; this may take a while...
Completed creating Engine
Running inference on image dog.jpg...
Fatal Python error: Segmentation fault

Current thread 0x00007fbb1850b700 (most recent call first):
  File "/workspace/tensorrt/samples/python/yolov3_onnx/../common.py", line 145 in do_inference
  File "onnx_to_tensorrt.py", line 160 in main
  File "onnx_to_tensorrt.py", line 183 in <module>
Segmentation fault (core dumped)

Below is the function where it throws the error:

import time

import pycuda.driver as cuda

def do_inference(context, bindings, inputs, outputs, stream, batch_size=1):
    start = time.time()
    # Transfer input data to the GPU.
    [cuda.memcpy_htod_async(inp.device, inp.host, stream) for inp in inputs]
    # Run inference.
    context.execute_async(batch_size=batch_size, bindings=bindings, stream_handle=stream.handle)
    # Transfer predictions back from the GPU.
    [cuda.memcpy_dtoh_async(out.host, out.device, stream) for out in outputs]
    # Synchronize the stream so the async copies finish before the host buffers are read.
    stream.synchronize()
    # Return only the host outputs.
    print("=> time: %.4f" % (time.time() - start))
    return [out.host for out in outputs]

Any help is appreciated.