TensorRT version : 5.0.2.10
CUDA version : 9.0
cuDNN version : 7.3.1
Driver version : 384.145
GPU : V100
Hello,
I have tested the ‘yolov3_onnx’ sample in ‘TensorRT5.0.2/samples/python’, but the output values differ from the darknet detector.
I double-checked the ‘nms threshold’ and ‘obj threshold’, and there is no quantization (fp32), yet the output values still differ.
Is this expected? I don’t know why the model output differs between the TensorRT model and the original model.
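For reference, these are the thresholds I double-checked. This is paraphrased from memory from the sample’s onnx_to_tensorrt.py, so the exact values may differ between releases:

# Post-processing parameters in onnx_to_tensorrt.py (paraphrased from memory):
postprocessor_args = {
    "yolo_masks": [(6, 7, 8), (3, 4, 5), (0, 1, 2)],   # which anchors each output layer uses
    "yolo_anchors": [(10, 13), (16, 30), (33, 23), (30, 61), (62, 45),
                     (59, 119), (116, 90), (156, 198), (373, 326)],
    "obj_threshold": 0.6,   # objectness confidence threshold
    "nms_threshold": 0.5,   # IoU threshold for non-max suppression
    "yolo_input_resolution": (608, 608),
}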
I also found a caution in the developer guide:
the ‘yolo v3’ example has to use onnx-tensorrt because TensorRT does not support ‘LeakyRelu’ and ‘Upsample’.
Is that true?
But the supported-operations list for ONNX in the same developer guide contains ‘LeakyRelu’ and ‘Upsample’.
Yes, I’m confused :(
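One way to check this is to dump the op types that actually appear in the exported model (a small sketch; it assumes the ‘onnx’ Python package and the yolov3.onnx produced by the sample):

import onnx

# Print the distinct ONNX op types in the exported graph, to see whether
# LeakyRelu / Upsample nodes are really present.
model = onnx.load("yolov3.onnx")
print(sorted({node.op_type for node in model.graph.node}))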
I downloaded ‘Darknet’ using ‘git clone https://github.com/pjreddie/darknet.git’.
And I downloaded ‘TensorRT-5.0.2.6’ from the download page and used the Python sample code to convert YOLOv3 to TensorRT.
I didn’t change the code at all, and I used the same ‘.cfg’ and ‘.weights’ files (the ones fetched by ‘yolov3_to_onnx.py’).
I ran both detectors (Darknet and TensorRT) on the same image,
but I got different outputs.
from Darknet:
dog: 85%
dog: 84%
dog: 53%
person: 50%
Darknet detects three dogs and one person, but TensorRT detects only two dogs and one person.
The output values (object probabilities) also differ: e.g., TensorRT detects the person with 63% confidence while Darknet detects the person with 50% confidence (a huge gap).
I thought that without quantization the output values should be identical. Is that right?
Please clarify why the output values differ after converting to TensorRT.
Thanks.
P.S. I uploaded the test image, the TensorRT sample code, and the result images.
Per engineering: this appears to be an issue with using the lightweight PIL (TRT example) instead of OpenCV (darknet) for loading and pre-processing the input.
The image you provided has a resolution of 177x284 px and has to be upscaled by a huge factor to 608x608.
The assumption is that using OpenCV (darknet) vs. using PIL in the sample may make a big difference. The interpolation method also has an influence - we use PIL.Image.BICUBIC.
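As a rough way to see the effect (a minimal sketch, not from the sample; the file name and the 608x608 network resolution are from this thread, and Darknet’s own C resizer may still differ slightly from cv2.resize):

import numpy as np
import cv2
from PIL import Image

# Resize the same image with PIL bicubic (as the TRT sample does) and with
# OpenCV bilinear, then measure how far apart the two network inputs are.
pil_in = np.array(Image.open("dog.jpg").resize((608, 608), Image.BICUBIC),
                  dtype=np.float32)
cv_in = cv2.resize(cv2.imread("dog.jpg"), (608, 608),
                   interpolation=cv2.INTER_LINEAR)[:, :, ::-1].astype(np.float32)  # BGR -> RGB

print("max abs pixel difference:", np.abs(pil_in - cv_in).max())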
The original dog image for this sample has a resolution of 576x786 px. Here is the difference between Darknet and TensorRT for the original dog.jpg image.
Thanks for your comment.
I have tried a high-resolution image but got the same problem.
I then modified both Darknet and the TRT sample to use OpenCV for the image resize,
and used the L2 norm to check that both resized images are the same. I got exactly the same L2 norm value:
[TRT sample]
L2 norm of input : 804.7459466025994
[Darknet]
L2 norm of input : 804.7459466
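(The check itself is simple; this is a sketch of what I computed above, where ‘preprocessed_input’ stands for the float32 tensor each pipeline feeds to its network:)

import numpy as np

def l2_norm(x):
    # Accumulate in float64 so the printed norm is stable.
    return float(np.sqrt(np.square(x.astype(np.float64)).sum()))

# print("L2 norm of input :", l2_norm(preprocessed_input))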
One dog has a similar confidence (probability), but the other dog’s confidence differs hugely (by about 10%).
Well… I also checked the parsed weight values for both Darknet and the TRT sample, and they are the same.
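(For the weight check on the Darknet side, I read the raw file roughly like this; the 20-byte header layout - 3 int32 fields plus one int64 ‘seen’ counter - is my assumption for the yolov3.weights revision used here:)

import numpy as np

with open("yolov3.weights", "rb") as f:
    header = np.fromfile(f, dtype=np.int32, count=3)  # major, minor, revision
    seen = np.fromfile(f, dtype=np.int64, count=1)    # images seen during training
    weights = np.fromfile(f, dtype=np.float32)        # all remaining float32 weights

print(header, seen, weights[:8])  # spot-check the first few parsed values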
So where does the error come from?
Could you clarify this problem?
I have attached the modified files for the TRT sample and Darknet: yolov3_onnx.zip (15.4 KB)
In the #8 reply, I’m seeing a better inference result for TRT than for darknet - or is the data reversed? In general, we cannot guarantee bit-correctness against open-source repos that are constantly updated by the community (the YOLO layer implementation in this sample is from ~July 2018).
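As a small illustration of why bit-correctness is hard to promise even in pure fp32 (a toy example, not from the sample): floating-point addition is not associative, so two implementations that merely accumulate in a different order can already disagree in the last bits, and such differences compound across 100+ layers.

import numpy as np

x = np.random.rand(100000).astype(np.float32)
# Same numbers, different accumulation order -> usually slightly different fp32 sums.
print(x.sum(), x[::-1].sum(), x.sum() == x[::-1].sum())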
I am using the same repo to convert YOLO to ONNX to TensorRT.
I can successfully generate the .trt file, but I am getting a segmentation fault during inference.
TensorRT version : 5.1.2.2
CUDA version : 10.1
cuDNN version : 7.4.2
GPU : V100 (AWS)
Error dump:
Loading ONNX file from path yolov3.onnx...
Beginning ONNX file parsing
Completed parsing of ONNX file
Building an engine from file yolov3.onnx; this may take a while...
Completed creating Engine
Running inference on image dog.jpg...
Fatal Python error: Segmentation fault
Current thread 0x00007fbb1850b700 (most recent call first):
File "/workspace/tensorrt/samples/python/yolov3_onnx/../common.py", line 145 in do_inference
File "onnx_to_tensorrt.py", line 160 in main
File "onnx_to_tensorrt.py", line 183 in <module>
Segmentation fault (core dumped)
Below is the function where it throws the error:
# From common.py; 'import time' and 'import pycuda.driver as cuda' are at the top of that file.
def do_inference(context, bindings, inputs, outputs, stream, batch_size=1):
    start = time.time()
    # Transfer input data to the GPU.
    [cuda.memcpy_htod_async(inp.device, inp.host, stream) for inp in inputs]
    # Run inference.
    context.execute_async(batch_size=batch_size, bindings=bindings, stream_handle=stream.handle)
    # Transfer predictions back from the GPU.
    [cuda.memcpy_dtoh_async(out.host, out.device, stream) for out in outputs]
    # Synchronize the stream.
    stream.synchronize()
    print("=> time: %.4f" % (time.time() - start))
    # Return only the host outputs.
    return [out.host for out in outputs]
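For comparison, here is roughly what the buffer allocation that feeds do_inference looks like in the same common.py (paraphrased from the TRT 5.x sample; details may differ in your version). A size mismatch between these host/device buffers and the engine’s actual bindings, or a missing CUDA context, is a typical cause of a segfault in memcpy_htod_async:

import pycuda.autoinit  # noqa: F401 -- creates the CUDA context do_inference relies on
import pycuda.driver as cuda
import tensorrt as trt

class HostDeviceMem(object):
    def __init__(self, host_mem, device_mem):
        self.host = host_mem      # page-locked numpy array on the host
        self.device = device_mem  # matching device allocation

def allocate_buffers(engine):
    inputs, outputs, bindings = [], [], []
    stream = cuda.Stream()
    for binding in engine:
        size = trt.volume(engine.get_binding_shape(binding)) * engine.max_batch_size
        dtype = trt.nptype(engine.get_binding_dtype(binding))
        host_mem = cuda.pagelocked_empty(size, dtype)  # host-side buffer
        device_mem = cuda.mem_alloc(host_mem.nbytes)   # device-side buffer
        bindings.append(int(device_mem))
        if engine.binding_is_input(binding):
            inputs.append(HostDeviceMem(host_mem, device_mem))
        else:
            outputs.append(HostDeviceMem(host_mem, device_mem))
    return inputs, outputs, bindings, stream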