YOLOv4 engine generated from Darknet in DeepStream cannot run properly with the TensorRT Python API?

I generated a YOLOv4 engine from this repo at a specific commit, GitHub - marcoslucianops/DeepStream-Yolo at ab6de54c4398c1daeb6fc31b06af29f97663f211 (using Darknet YOLOv4 .weights and .cfg files).
I can run it successfully, save predictions, and evaluate the engine model in DeepStream. But now I want to evaluate the YOLOv4 engine model with the TensorRT Python API. I load the shared library using ctypes.CDLL(path to the generated .so file), but the results seem wrong and nothing is detected.

I ran a debug session and noticed that there is a problem with the output of the engine model.
data[3] holds the class indices, which must be in the range 0 to 79, but I receive very small values. I checked several images and the values are very small for those too. It seems the values are not correct and look like uninitialized memory.
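
For reference, this is roughly how I load the engine and run inference (paths, file names and preprocessing are simplified placeholders, and the binding order data, num_detections, detection_boxes, detection_scores, detection_classes is an assumption):

import ctypes
import numpy as np
import pycuda.autoinit  # creates and activates a CUDA context
import pycuda.driver as cuda
import tensorrt as trt

# Placeholder paths for the plugin library and engine produced by DeepStream-Yolo.
PLUGIN_LIB = "libnvdsinfer_custom_impl_Yolo.so"
ENGINE_PATH = "model_b1_gpu0_fp16.engine"

# The engine contains custom YOLO layers, so load the plugin .so and
# register plugins before deserializing.
ctypes.CDLL(PLUGIN_LIB)
logger = trt.Logger(trt.Logger.INFO)
trt.init_libnvinfer_plugins(logger, "")

with open(ENGINE_PATH, "rb") as f, trt.Runtime(logger) as runtime:
    engine = runtime.deserialize_cuda_engine(f.read())
context = engine.create_execution_context()

# Allocate host/device buffers for every binding. With implicit batch the
# reported shapes are per sample, e.g. 3x608x608 for the input.
stream = cuda.Stream()
host_bufs, dev_bufs, bindings = [], [], []
for i in range(engine.num_bindings):
    shape = engine.get_binding_shape(i)
    dtype = trt.nptype(engine.get_binding_dtype(i))
    host = cuda.pagelocked_empty(trt.volume(shape), dtype)
    dev = cuda.mem_alloc(host.nbytes)
    host_bufs.append(host)
    dev_bufs.append(dev)
    bindings.append(int(dev))

# 'image' stands in for a preprocessed CHW float32 array matching the input binding.
image = np.zeros(tuple(engine.get_binding_shape(0)), dtype=np.float32)
np.copyto(host_bufs[0], image.ravel())

cuda.memcpy_htod_async(dev_bufs[0], host_bufs[0], stream)
context.execute_async(batch_size=1, bindings=bindings, stream_handle=stream.handle)
for i in range(1, engine.num_bindings):
    cuda.memcpy_dtoh_async(host_bufs[i], dev_bufs[i], stream)
stream.synchronize()

num_det = int(host_bufs[1][0])     # num_detections
classes = host_bufs[4][:num_det]   # detection_classes, expected to be 0..79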

Hi,

This looks like a DeepStream-related issue. We will move this post to the DeepStream forum.

Thanks!

Please provide complete information as applicable to your setup.
• Hardware Platform (Jetson / GPU)
• DeepStream Version
• JetPack Version (valid for Jetson only)
• TensorRT Version
• NVIDIA GPU Driver Version (valid for GPU only)
• Issue Type( questions, new requirements, bugs)
• How to reproduce the issue ? (This is for bugs. Including which sample app is using, the configuration files content, the command line used and other details for reproducing)
• Requirement details( This is for new requirement. Including the module name-for which plugin or for which sample application, the function description)

My environment:
JetPack 4.6.3
TensorRT 8.2.1.9
DeepStream 6.0

noticing the deepstream-app runs fine, is this test related to DeepStream?

could you elaborate on this test? Thanks!

It worked in DeepStream; the predictions in DeepStream are ok. But the engine generated by DeepStream does not work properly with the TensorRT Python API.

I attached a debug image and details in the first post. I used the execute_async() method on the context.

as you know, the DeepStream SDK is a C library; it uses the TensorRT API to generate the engine and do inference. if the deepstream-app works fine, the engine generated by deepstream-app should be correct.
did you test another model's TRT engine with this method? wondering if it's the right method.

Yeah, if I use the latest commit of this repo, the engine generated in DS works fine with the TensorRT Python API (I can use execute_async_v2()). But with the old commit, GitHub - marcoslucianops/DeepStream-Yolo at ab6de54c4398c1daeb6fc31b06af29f97663f211, it seems the engine uses implicit batch and I can only use execute_async(); if I use execute_async_v2(), an error occurs.

In both cases the original model is the same (Darknet YOLOv4), and I load the .so file in the inference code with the TensorRT Python API.

@fanzh
Please check for me again.

after checking the code, it does not use implicit batch because there is no “force-implicit-batch-dim=1”.

Yeah, the input shape is (3, 608, 608) (with 3 dimensions I can only use execute_async()); it is not (1, 3, 608, 608) (with 4 dimensions I could use execute_async_v2()). I read in the NVIDIA docs that execute_async() is for implicit batch and execute_async_v2() is for explicit batch.
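
A quick way to confirm which mode the engine was built in (plugin and engine paths are placeholders):

import ctypes
import tensorrt as trt

# Placeholder paths for the DeepStream-Yolo plugin library and the engine file.
ctypes.CDLL("libnvdsinfer_custom_impl_Yolo.so")
logger = trt.Logger(trt.Logger.INFO)
trt.init_libnvinfer_plugins(logger, "")

with open("model_b1_gpu0_fp16.engine", "rb") as f, trt.Runtime(logger) as runtime:
    engine = runtime.deserialize_cuda_engine(f.read())

# Implicit-batch engines report True here and must be run with execute_async();
# explicit-batch engines report False and use execute_async_v2().
print(engine.has_implicit_batch_dimension)
for i in range(engine.num_bindings):
    print(engine.get_binding_name(i), tuple(engine.get_binding_shape(i)))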

  1. yes, execute_async_v2 is for “no implicit batch”; please refer to IExecutionContext — NVIDIA TensorRT Standard Python API Documentation 8.6.1 documentation
  2. if you need explicit batch, please comment out “force-implicit-batch-dim=1”. after checking, the configuration files of the two branches are the same. can master’s app use the engine the branch creates? can the branch use the engine master creates? wondering if there is a difference between the two models.
  3. please refer to his C version tensorrt_yolov4.

When I set “force-implicit-batch-dim=1”, the input shape is also (3, 608, 608) when checking the binding shape, and it did not work either.
I checked a YOLOv4 engine generated from ONNX and it worked. But I want to check the YOLOv4 engine generated from Darknet.

can this model support dynamic batch? you can try this sample command:
/usr/src/tensorrt/bin/trtexec --fp16 --onnx=./models/yolov5/yolov5s.onnx
--saveEngine=./models/yolov5/1/yolov5s.onnx_b4_gpu0_fp16.engine --minShapes=images:1x3x672x672
--optShapes=images:4x3x672x672 --maxShapes=images:4x3x672x672 --shapes=images:4x3x672x672 --workspace=10000
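
if the engine is built with dynamic shapes like that, the Python side also changes slightly: set the input shape on the context before running and use execute_async_v2(). a rough sketch (engine path taken from the command above, input left as zeros, binding 0 assumed to be the input):

import pycuda.autoinit
import pycuda.driver as cuda
import tensorrt as trt

logger = trt.Logger(trt.Logger.INFO)
with open("./models/yolov5/1/yolov5s.onnx_b4_gpu0_fp16.engine", "rb") as f, trt.Runtime(logger) as runtime:
    engine = runtime.deserialize_cuda_engine(f.read())
context = engine.create_execution_context()

# with dynamic shapes, pick a concrete input shape (within the min/max given to trtexec)
context.set_binding_shape(0, (1, 3, 672, 672))

stream = cuda.Stream()
host_bufs, dev_bufs, bindings = [], [], []
for i in range(engine.num_bindings):
    shape = context.get_binding_shape(i)  # concrete shape after set_binding_shape
    dtype = trt.nptype(engine.get_binding_dtype(i))
    host = cuda.pagelocked_empty(trt.volume(shape), dtype)
    dev = cuda.mem_alloc(host.nbytes)
    host_bufs.append(host)
    dev_bufs.append(dev)
    bindings.append(int(dev))

cuda.memcpy_htod_async(dev_bufs[0], host_bufs[0], stream)
context.execute_async_v2(bindings=bindings, stream_handle=stream.handle)
for i in range(1, engine.num_bindings):
    cuda.memcpy_dtoh_async(host_bufs[i], dev_bufs[i], stream)
stream.synchronize()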

Thanks for the quick response.

I generated the engine from the Darknet YOLOv4 model (.weights and .cfg), not from an ONNX model.

  1. can you convert the .weights and .cfg to an ONNX model?
  2. can you share the whole log when force-implicit-batch-dim=1 is set?

I only want to check the engine generated directly from Darknet.
Here is my log file.
full_log.txt (18.0 KB)

the first number 1 is the batch-size. the Darknet YOLOv4 model does not support dynamic batch even if batch-size is set. you can use this command to check:
/usr/src/tensorrt/bin/trtexec --loadEngine=xx.engine --int8, and look for the line “Created input binding for Input with dimensions”.

Yeah, here is the output:

[07/11/2023-05:42:51] [I] Setting persistentCacheLimit to 0 bytes.
[07/11/2023-05:42:51] [I] Using random values for input data
[07/11/2023-05:42:51] [I] Created input binding for data with dimensions 3x608x608
[07/11/2023-05:42:51] [I] Using random values for output num_detections
[07/11/2023-05:42:51] [I] Created output binding for num_detections with dimensions 1
[07/11/2023-05:42:51] [I] Using random values for output detection_boxes
[07/11/2023-05:42:51] [I] Created output binding for detection_boxes with dimensions 22743x4
[07/11/2023-05:42:51] [I] Using random values for output detection_scores
[07/11/2023-05:42:51] [I] Created output binding for detection_scores with dimensions 22743
[07/11/2023-05:42:51] [I] Using random values for output detection_classes
[07/11/2023-05:42:51] [I] Created output binding for detection_classes with dimensions 22743
[07/11/2023-05:42:51] [I] Starting inference
[07/11/2023-05:42:54] [I] Warmup completed 9 queries over 200 ms
[07/11/2023-05:42:54] [I] Timing trace has 204 queries over 3.03516 s

Sorry for the late reply. Is this still a DeepStream issue to support? Thanks