Unexpected exception an illegal memory access was encountered

Please provide complete information as applicable to your setup.

• Hardware Platform (Jetson / GPU)
GPU
• DeepStream Version
6.0
• JetPack Version (valid for Jetson only)
• TensorRT Version
8
• NVIDIA GPU Driver Version (valid for GPU only)
470.57.02 / 510.06
• Issue Type( questions, new requirements, bugs)
bugs
• How to reproduce the issue ? (This is for bugs. Including which sample app is using, the configuration files content, the command line used and other details for reproducing)
• Requirement details( This is for new requirement. Including the module name-for which plugin or for which sample application, the function description)

I am encountering an issue with NvInfer running a YOLOv5 ONNX model. The engine builds fine, but inference fails on the first frame. The error occurs on both a T4 (driver 470.57.02) and an RTX 3060 (driver 510.06):

Sample pipeline:

gst-launch-1.0 uridecodebin uri="file://test.mp4" ! nvvideoconvert ! "video/x-raw(memory:NVMM), format=NV12, width=640, height=640" ! m.sink_0 nvstreammux name="m" batch-size=1 ! nvinfer config-file-path="config.txt" ! fakesink

Config file:

[property]
gpu-id=0
net-scale-factor=1
model-color-format=0
onnx-file=yolov5l_fp16_640.onnx
batch-size=1
network-mode=2
interval=0
gie-unique-id=1
process-mode=1
network-type=100
output-tensor-meta=1

Log (RTX3060):

ERROR: nvdsinfer_context_impl.cpp:1763 Failed to synchronize on cuda copy-coplete-event, cuda err_no:700, err_str:cudaErrorIllegalAddress
ERROR: [TRT]: 1: [convolutionRunner.cpp::checkCaskExecError::440] Error Code 1: Cask (Cask Convolution execution)
ERROR: [TRT]: 1: [apiCheck.cpp::apiCatchCudaError::17] Error Code 1: Cuda Runtime (an illegal memory access was encountered)
ERROR: nvdsinfer_backend.cpp:506 Failed to enqueue trt inference batch
ERROR: nvdsinfer_context_impl.cpp:1643 Infer context enqueue buffer failed, nvinfer error:NVDSINFER_TENSORRT_ERROR
error: Failed to dequeue output from inferencing. NvDsInferContext error: NVDSINFER_CUDA_ERROR
0:00:04.332562194  2950 0x556eda4aaa80 WARN nvinfer gstnvinfer.cpp:2325:gst_nvinfer_output_loop: error: Failed to dequeue output from inferencing. NvDsInferContext error: NVDSINFER_CUDA_ERROR
NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::releaseBatchOutput() <nvdsinfer_context_impl.cpp:1789> [UID = 1]: Tried to release an unknown outputBatchID
0:00:04.332700098  2950 0x556eda4aaa80 WARN nvinfer gstnvinfer.cpp:635:gst_nvinfer_logger: NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::releaseBatchOutput() <nvdsinfer_context_impl.cpp:1789> [UID = 1]: Tried to release an unknown outputBatchID
error: Failed to queue input batch for inferencing
0:00:04.335288474  2950 0x556eda4aaad0 WARN nvinfer gstnvinfer.cpp:1324:gst_nvinfer_input_queue_loop: error: Failed to queue input batch for inferencing
ERROR: nvdsinfer_context_impl.cpp:341 Failed to make stream wait on event, cuda err_no:700, err_str:cudaErrorIllegalAddress
ERROR: nvdsinfer_context_impl.cpp:1619 Preprocessor transform input data failed., nvinfer error:NVDSINFER_CUDA_ERROR
error: Failed to queue input batch for inferencing
0:00:04.335523770  2950 0x556eda4aaad0 WARN nvinfer gstnvinfer.cpp:1324:gst_nvinfer_input_queue_loop: error: Failed to queue input batch for inferencing
ERROR: nvdsinfer_context_impl.cpp:341 Failed to make stream wait on event, cuda err_no:700, err_str:cudaErrorIllegalAddress
ERROR: nvdsinfer_context_impl.cpp:1619 Preprocessor transform input data failed., nvinfer error:NVDSINFER_CUDA_ERROR
error: Failed to queue input batch for inferencing
0:00:04.335633321  2950 0x556eda4aaad0 WARN nvinfer gstnvinfer.cpp:1324:gst_nvinfer_input_queue_loop: error: Failed to queue input batch for inferencing

Log (T4):

ERROR: [TRT]: Unexpected exception an illegal memory access was encountered
ERROR: [TRT]: [apiCheck.cpp::apiCatchCudaError::17] Error Code 1: Cuda Runtime (an illegal memory access was encountered)
ERROR: nvdsinfer_backend.cpp:506 Failed to enqueue trt inference batch
ERROR: nvdsinfer_context_impl.cpp:1643 Infer context enqueue buffer failed, nvinfer error:NVDSINFER_TENSORRT_ERROR
error: Failed to queue input batch for inferencing
error: Failed to queue input batch for inferencing
error: Failed to queue input batch for inferencing
error: Failed to queue input batch for inferencing
0:09:03.982781980 1 0x556393c95ed0 WARN nvinfer gstnvinfer.cpp:1324:gst_nvinfer_input_queue_loop: error: Failed to queue input batch for inferencing
ERROR: nvdsinfer_context_impl.cpp:341 Failed to make stream wait on event, cuda err_no:700, err_str:cudaErrorIllegalAddress
ERROR: nvdsinfer_context_impl.cpp:1619 Preprocessor transform input data failed., nvinfer error:NVDSINFER_CUDA_ERROR
0:09:03.983365184 1 0x556393c95ed0 WARN nvinfer gstnvinfer.cpp:1324:gst_nvinfer_input_queue_loop: error: Failed to queue input batch for inferencing
ERROR: nvdsinfer_context_impl.cpp:341 Failed to make stream wait on event, cuda err_no:700, err_str:cudaErrorIllegalAddress
ERROR: nvdsinfer_context_impl.cpp:1619 Preprocessor transform input data failed., nvinfer error:NVDSINFER_CUDA_ERROR
0:09:03.983417618 1 0x556393c95ed0 WARN nvinfer gstnvinfer.cpp:1324:gst_nvinfer_input_queue_loop: error: Failed to queue input batch for inferencing
ERROR: nvdsinfer_context_impl.cpp:341 Failed to make stream wait on event, cuda err_no:700, err_str:cudaErrorIllegalAddress
ERROR: nvdsinfer_context_impl.cpp:1619 Preprocessor transform input data failed., nvinfer error:NVDSINFER_CUDA_ERROR
0:09:03.983564729 1 0x556393c95ed0 WARN nvinfer gstnvinfer.cpp:1324:gst_nvinfer_input_queue_loop: error: Failed to queue input batch for inferencing

Any suggestions are welcome.

Thanks,

yolov5l_fp16_640.onnx (93.0 MB)

Hi @IvensaMDH ,
I can reproduce the T4 error, and from cuda-gdb it seems to crash in scatterKernel() of the ScatterND plugin.
Is it possible to change your ONNX model to an FP32 network and try again?

Thanks!

Interesting…

I am getting the same error with the FP32 model (Download).

Thanks,

EDIT:
It seems to work when setting inplace=False while converting the YOLO model to .onnx format (yolo.py#L61).
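For anyone hitting the same issue, a minimal sketch of why the inplace flag matters: in-place slice assignment (which YOLOv5's Detect head uses when inplace=True) is typically lowered to ScatterND nodes by torch.onnx.export, while the mathematically equivalent out-of-place version is not. These are toy modules for illustration, not actual YOLOv5 code:

```python
import torch
import torch.nn as nn


class InplaceHead(nn.Module):
    """Mimics the in-place style of YOLOv5's Detect post-processing:
    writing into a slice of an existing tensor, which ONNX export
    generally represents with ScatterND."""
    def forward(self, x):
        y = x.sigmoid()
        y[..., 0:2] = y[..., 0:2] * 2.0 - 0.5  # in-place write into a slice
        return y


class OutOfPlaceHead(nn.Module):
    """Same math without in-place writes: compute the pieces and
    reassemble the output with cat(), avoiding ScatterND."""
    def forward(self, x):
        y = x.sigmoid()
        xy = y[..., 0:2] * 2.0 - 0.5
        rest = y[..., 2:]
        return torch.cat((xy, rest), dim=-1)


if __name__ == "__main__":
    x = torch.randn(1, 3, 85)
    # Both heads produce identical values; only the exported graph differs.
    print(torch.allclose(InplaceHead()(x), OutOfPlaceHead()(x)))
```

Since both variants compute the same values, switching to the out-of-place form only changes the exported graph, not the model's outputs.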

Working FP16 model attached.

/M
yolov5l_fp16_1_noinplace.onnx (89.2 MB)

Cool! Thank you so much for the update!

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.