TensorRT Divide by 0 Errors with YOLOv8 Seg model


  • Hardware Platform: Jetson Orin AGX
  • DeepStream Version: 7.0
  • JetPack Version: 6.0
  • TensorRT Version: 8.6.2.3
  • Issue Type: Bug
  • How to reproduce:
    1. Clone DeepStream-Yolo-Seg
    2. Make sure ultralytics pip package is installed
    3. Change line 11 in utils/export_yoloV8_seg.py to read
      from ultralytics.utils.torch_utils import select_device
    4. Download yolov8n-seg.pt
    5. Run python utils/export_yoloV8_seg.py -w yolov8n-seg.pt
    6. Run CUDA_VER=12.2 make -C nvdsinfer_custom_impl_Yolo_seg
    7. Change lines 5 and 6 in config_infer_primary_yoloV8_seg.txt to
      onnx-file=yolov8n-seg.onnx
      model-engine-file=yolov8n-seg.onnx_b1_gpu0_fp32.engine
    8. Run deepstream-app -c deepstream_app_config.txt
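
Step 7 edits the config by line number, which breaks if the file layout differs between checkouts. As a rough sketch, the same change can be made by key name instead. The helper name `set_config_keys` and the sample config text below are hypothetical and only illustrate the idea; the real config_infer_primary_yoloV8_seg.txt may contain other keys.

```python
def set_config_keys(text, updates):
    """Return config text with each `key=value` line named in `updates` rewritten.

    Only the first uncommented occurrence of each key is replaced.
    Raises KeyError if a requested key is not present at all.
    """
    remaining = dict(updates)
    out = []
    for line in text.splitlines():
        key = line.split("=", 1)[0].strip()
        is_setting = "=" in line and not line.lstrip().startswith("#")
        if is_setting and key in remaining:
            out.append(f"{key}={remaining.pop(key)}")
        else:
            out.append(line)
    if remaining:
        raise KeyError(f"keys not found in config: {sorted(remaining)}")
    return "\n".join(out) + "\n"


# Hypothetical excerpt of the config file, for illustration only.
sample = (
    "[property]\n"
    "gpu-id=0\n"
    "onnx-file=yolov8s-seg.onnx\n"
    "model-engine-file=yolov8s-seg.onnx_b1_gpu0_fp32.engine\n"
)

patched = set_config_keys(
    sample,
    {
        "onnx-file": "yolov8n-seg.onnx",
        "model-engine-file": "yolov8n-seg.onnx_b1_gpu0_fp32.engine",
    },
)
print(patched)
```

To apply it to the real file, read the file's text, pass it through the function, and write the result back.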

The pipeline runs for a few seconds, then crashes with the error:

Error Code 1: Myelin (Division by 0 detected in the shape graph. Tensor (Divisor) "sp__mye3" is equal to 0.; )

This error is referenced in two other issues, neither with a real resolution. Someone mentioned getting it working with the container http://nvcr.io/nvidia/deepstream-l4t:6.2-base, but when I run deepstream-app in that container I get a "command not found" error, since that DeepStream installation seems to be missing the bin folder that contains deepstream-app. The http://nvcr.io/nvidia/deepstream-l4t:6.2-samples and http://nvcr.io/nvidia/deepstream-l4t:6.3-samples containers do include the deepstream-app binary, but it fails there as well because GLIBC_2.34 and GLIBCXX_3.4.29 are not found.

@fanzh posted 6 months ago that this divide-by-zero error was a TensorRT issue and would be fixed in a later release. Newer versions of TensorRT exist, but DeepStream does not support them, so I am not sure where to go from here. Any advice on working around this known bug, or a timeline for when DeepStream will support a fixed version of TensorRT, would be appreciated.

1. This is a bug in TRT-8.6, which ships as part of JetPack. DS-7.0 still depends on TRT-8.6, so don't upgrade TRT; doing so may cause DS-7.0 to stop working.

2. Some libraries and device nodes are shared between the host and the Docker container on Jetson. This means you need to flash the JetPack version that matches the DeepStream Docker image you want to run.

3. There is no clear timetable for upgrading TRT. As a workaround to avoid this issue, we recommend flashing JP-5.1.2 GA and using DS-6.3.
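
The host/container pairing rule in this reply can be sketched as a small pre-flight check. This is only an illustration: the function names are hypothetical, and the lookup table contains just the two DeepStream/JetPack pairs mentioned in this thread (DS-7.0 with JP-6.0, DS-6.3 with JP-5.1.2), so it is not a complete compatibility matrix.

```python
# DeepStream version -> JetPack version, taken only from the pairs
# mentioned in this thread. Assumption: not an exhaustive matrix.
DS_TO_JETPACK = {
    "7.0": "6.0",
    "6.3": "5.1.2",
}


def deepstream_version(image):
    """Extract the DeepStream version from an image reference such as
    'nvcr.io/nvidia/deepstream-l4t:6.3-samples' (returns '6.3')."""
    _, sep, tag = image.rpartition(":")
    if not sep or "/" in tag:
        raise ValueError(f"no tag found in image reference: {image!r}")
    return tag.split("-", 1)[0]


def image_matches_host(image, host_jetpack):
    """Return True or False if the pairing is known, or None if the
    image's DeepStream version is not in the table."""
    required = DS_TO_JETPACK.get(deepstream_version(image))
    if required is None:
        return None
    return required == host_jetpack


# The 6.3 container on a JetPack 6.0 host, as in the original post.
print(image_matches_host("nvcr.io/nvidia/deepstream-l4t:6.3-samples", "6.0"))
# The recommended workaround: the same container on JetPack 5.1.2.
print(image_matches_host("nvcr.io/nvidia/deepstream-l4t:6.3-samples", "5.1.2"))
```

A `None` result means the table has no entry for that image, so the official DeepStream release notes should be consulted for the matching JetPack version.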

