I generated a YOLOv4 engine from this repo at a specific commit: GitHub - marcoslucianops/DeepStream-Yolo at ab6de54c4398c1daeb6fc31b06af29f97663f211 (using Darknet YOLOv4 .weights and .cfg files).
I can run it successfully, save predictions, and evaluate the engine model in DeepStream. But now I want to evaluate the YOLOv4 engine model with the TensorRT Python API. I load the custom shared library with ctypes.CDLL(path-to-generated-file-.so.), but the results look wrong and nothing is detected.
I ran a debug session and found a problem with the engine model's output. The output should hold class indices from 0 to 79, but I receive very small values instead. I checked several images and the values are very small for all of them. They do not look like valid class indices; they look like random values left over in the allocated memory.
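One possible cause of the "very small values" symptom, sketched below as an assumption rather than a confirmed diagnosis: if the detection_classes binding is INT32 (as some NMS plugins emit) but the host buffer is allocated as float32, the integer bit patterns get reinterpreted as denormal floats in the 1e-44 range. You can check the real dtype with engine.get_binding_dtype() before allocating. The effect is easy to reproduce with plain numpy:

```python
import numpy as np

# Hypothetical class-id output: suppose the engine writes INT32 ids.
ids = np.array([0, 15, 79], dtype=np.int32)

# Misreading the same bytes as float32 reinterprets the int bit
# patterns as denormal floats -- tiny values that look like garbage.
misread = ids.view(np.float32)   # e.g. 15 -> ~2.1e-44

# Reading with the matching dtype recovers the class indices.
correct = ids.tolist()
print(misread)
print(correct)  # [0, 15, 79]
```

If the binding is float32 instead (some plugins emit class ids as floats), the fix is the opposite: read as float32, then cast with .astype(np.int32).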
Please provide complete information as applicable to your setup.
• Hardware Platform (Jetson / GPU)
• DeepStream Version
• JetPack Version (valid for Jetson only)
• TensorRT Version
• NVIDIA GPU Driver Version (valid for GPU only)
• Issue Type (questions, new requirements, bugs)
• How to reproduce the issue? (This is for bugs. Include which sample app is used, the configuration file contents, the command line used, and other details for reproducing)
• Requirement details (This is for new requirements. Include the module name — for which plugin or for which sample application — and the function description)
As you know, the DeepStream SDK is a C library; it uses the TensorRT API to generate the engine and run inference. If the deepstream-app works fine, the engine generated by deepstream-app should be correct.
Did you test another model's TRT engine with this method? I wonder whether the method itself is right.
Yes, the input shape is (3, 608, 608) (3 dimensions, so only execute_async() can be used); it is not (1, 3, 608, 608) (4 dimensions, which would allow execute_async_v2()). I read in the NVIDIA docs that execute_async() is for implicit batch and execute_async_v2() is for explicit batch.
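The practical consequence of that distinction, as a minimal sketch: implicit-batch engines report binding shapes without the batch dimension, so the host/device buffers must be allocated as batch_size times the reported shape, with batch_size passed separately to execute_async(). The helper below (a hypothetical name, not part of any API) shows the allocation convention:

```python
import numpy as np

def host_buffer(binding_shape, batch_size, dtype):
    """Allocate a host buffer for an implicit-batch engine.

    Implicit-batch engines report binding shapes WITHOUT the batch
    dimension, e.g. (3, 608, 608); batch_size goes to
    execute_async(batch_size, bindings, stream). Explicit-batch
    engines include it, e.g. (1, 3, 608, 608), and use
    execute_async_v2(bindings, stream).
    """
    return np.empty((batch_size, *binding_shape), dtype=dtype)

buf = host_buffer((3, 608, 608), batch_size=1, dtype=np.float32)
print(buf.shape)  # (1, 3, 608, 608)
```

If inference code written for execute_async_v2() is pointed at an implicit-batch engine (or buffers are sized from the 3-D shape without the batch factor), the output buffers can end up mis-sized or never written, which also produces garbage-looking results.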
If you need explicit batch, please comment out "force-implicit-batch-dim=1". After checking, the configuration files of the two branches are the same. Can the master branch's app use the engine the other branch creates? Can the other branch use the engine master creates? I wonder whether there is a difference between the two models.
When I set "force-implicit-batch-dim=1", the binding shape is still (3, 608, 608), and it still does not work.
I checked a YOLOv4 engine built from ONNX and it worked. But I want to check the YOLOv4 engine built from Darknet.
Can this model support dynamic batch? You can try this sample command:
/usr/src/tensorrt/bin/trtexec --fp16 --onnx=./models/yolov5/yolov5s.onnx --optShapes=images:4x3x672x672 --maxShapes=images:4x3x672x672 --shapes=images:4x3x672x672 --workspace=10000
The first number in the shape is the batch size. The Darknet YOLOv4 model does not support dynamic batch even if batch-size is set. You can use this command to check:
/usr/src/tensorrt/bin/trtexec --loadEngine=xx.engine --int8
You can then see the "Created input binding for ... with dimensions" line.
[07/11/2023-05:42:51] [I] Setting persistentCacheLimit to 0 bytes.
[07/11/2023-05:42:51] [I] Using random values for input data
[07/11/2023-05:42:51] [I] Created input binding for data with dimensions 3x608x608
[07/11/2023-05:42:51] [I] Using random values for output num_detections
[07/11/2023-05:42:51] [I] Created output binding for num_detections with dimensions 1
[07/11/2023-05:42:51] [I] Using random values for output detection_boxes
[07/11/2023-05:42:51] [I] Created output binding for detection_boxes with dimensions 22743x4
[07/11/2023-05:42:51] [I] Using random values for output detection_scores
[07/11/2023-05:42:51] [I] Created output binding for detection_scores with dimensions 22743
[07/11/2023-05:42:51] [I] Using random values for output detection_classes
[07/11/2023-05:42:51] [I] Created output binding for detection_classes with dimensions 22743
[07/11/2023-05:42:51] [I] Starting inference
[07/11/2023-05:42:54] [I] Warmup completed 9 queries over 200 ms
[07/11/2023-05:42:54] [I] Timing trace has 204 queries over 3.03516 s
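As a sanity check on that log, the 22743 in the detection_boxes / detection_scores / detection_classes dimensions matches the YOLOv4 candidate-box count for a 608x608 input, assuming the standard YOLOv4 head (three detection scales at strides 8/16/32, three anchors per grid cell):

```python
# YOLOv4 candidate-box count for a 608x608 input:
# three scales (strides 8, 16, 32), three anchors per cell.
input_size = 608
strides = [8, 16, 32]
anchors_per_cell = 3
boxes = sum((input_size // s) ** 2 for s in strides) * anchors_per_cell
print(boxes)  # 22743  (= (76^2 + 38^2 + 19^2) * 3)
```

So the engine's output layout looks consistent with the model; the problem is more likely on the side of how the Python code reads those bindings.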