YOLOv3 implementation error during inference

Please provide the following info (check/uncheck the boxes after creating this topic):
Software Version
DRIVE OS Linux 5.2.6
[+] DRIVE OS Linux 5.2.6 and DriveWorks 4.0
DRIVE OS Linux 5.2.0
DRIVE OS Linux 5.2.0 and DriveWorks 3.5
NVIDIA DRIVE™ Software 10.0 (Linux)
NVIDIA DRIVE™ Software 9.0 (Linux)
other DRIVE OS version
other

Target Operating System
[+] Linux
QNX
other

Hardware Platform
[+] NVIDIA DRIVE™ AGX Xavier DevKit (E3550)
NVIDIA DRIVE™ AGX Pegasus DevKit (E3550)
other

SDK Manager Version
[+] 1.8.0.10363
other

Host Machine Version
[+] native Ubuntu 18.04
other

Hi team,

I am trying to implement YOLOv3 in DriveWorks 4.0 by modifying the object detection and tracking sample.
I followed the YOLOv3 TensorRT sample provided with TensorRT: I converted the model to ONNX format and then to a .bin file using the TensorRT optimization tool.
However, I am getting an 'illegal memory access' error. Following that, I ran cuda-memcheck; the result is attached.
I did not get any error while creating the TensorRT-optimized model, and the sample is able to initialize the DNN and return the output blob sizes, etc.
What could be the reason for this error?
I am attaching main.cpp, the cuda-memcheck result, and the data conditioner parameters used.

One thing to add: intermittently I was able to run the code and generate the output, but most of the time I got the error stated above.
cuda_mem_check_yolov3.txt (215.0 KB)
main.cpp (40.3 KB)
model32.bin.json (321 Bytes)
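For reference, the output buffers in my modified sample are sized from the blob dimensions the DNN reports, roughly like this (a simplified sketch, not the exact code in the attached main.cpp; the dwDNN_getOutputSize()/dwBlobSize usage and the CHECK_DW_ERROR/CHECK_CUDA_ERROR macros follow the stock DriveWorks sample, so please verify against the DW 4.0 headers):

// Simplified sketch: sizing device/host output buffers from the blob size the
// network reports. A mismatch here (a buffer smaller than
// batchsize * channels * height * width floats) is a typical cause of an
// "illegal memory access" later, during the device-to-host copy.
dwBlobSize outBlob{};
CHECK_DW_ERROR(dwDNN_getOutputSize(&outBlob, 0U /*output blob index*/, m_dnn));

size_t outElements = static_cast<size_t>(outBlob.batchsize) * outBlob.channels
                   * outBlob.height * outBlob.width;

float32_t* d_out = nullptr;
float32_t* h_out = nullptr;
CHECK_CUDA_ERROR(cudaMalloc(&d_out, outElements * sizeof(float32_t)));
CHECK_CUDA_ERROR(cudaMallocHost(&h_out, outElements * sizeof(float32_t)));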

Dear @nithin.m1,
Intermittently I was able to run the code and generate the output, but most of the time I got the error stated above.

So you could successfully load the model and perform inference a couple of times across multiple runs of the same executable (loading the same TensorRT model)?

Yes, but that was earlier. I have not changed the core code since then, other than adding some functions (i.e., I did not touch the input/output initialization/allocation or the inference part). Now I am getting this error every time.

Dear @nithin.m1,
Is the issue now occurring during TensorRT model loading? Were you able to narrow down which line/API/function is hitting the issue?

Hi Siva,

While running the code, right after the inference call (dwDNN_inferRaw) we copy the outputs from device to host, and that is where I get the 'illegal memory access' error. That is why I ran cuda-memcheck and attached the result.
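To illustrate where it fails, the post-inference copy looks roughly like this (a simplified sketch with hypothetical buffer and stream names, not the exact code in main.cpp):

// Simplified sketch of the failing step. dwDNN_inferRaw() launches work
// asynchronously, so an out-of-bounds access inside the network kernels is
// often only reported by the next CUDA call, which here is the
// device-to-host copy of the output blob.
CHECK_DW_ERROR(dwDNN_inferRaw(d_outputs, d_inputs, 1U /*batch size*/, m_dnn));

cudaError_t err = cudaMemcpyAsync(h_output, d_output, outputSizeBytes,
                                  cudaMemcpyDeviceToHost, m_cudaStream);
// Synchronizing immediately after inference helps narrow down whether the
// illegal access comes from the inference kernels or from the copy itself.
if (err == cudaSuccess)
    err = cudaStreamSynchronize(m_cudaStream);
if (err != cudaSuccess)
    std::cerr << "post-inference copy failed: " << cudaGetErrorString(err) << std::endl;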

Dear @nithin.m1,

Commands

/usr/local/driveworks-4.0/tools/dnn/tensorRT_optimization --modelType=onnx --onnxFile=/usr/src/tensorrt/samples/python/yolov3_onnx/yolov3.onnx --out=~/yolov3.bin
./sample_object_detector_tracker_yolov3 --tensorRT_model=/home/sindarapu/yolov3.bin --video=</path/to/DW_data>/samples/sfm/triangulation/video_0.h264 

I see output like below

[22-06-2022 15:06:01] Platform: Detected Generic x86 Platform
[22-06-2022 15:06:01] TimeSource: monotonic epoch time offset is 1655885559259828
[22-06-2022 15:06:01] Platform: number of GPU devices detected 1
[22-06-2022 15:06:01] Platform: currently selected GPU device discrete ID 0
[22-06-2022 15:06:01] Context::mountResourceCandidateDataPath resource FAILED to mount from '/home/sindarapu/DW-4.0-Host/src/dnn/sample_object_detector_tracker_yolov3/data/': VirtualFileSystem: Failed to mount '/home/sindarapu/DW-4.0-Host/src/dnn/sample_object_detector_tracker_yolov3/data/[.pak]'
[22-06-2022 15:06:01] Context::findDataRootInPathWalk data/DATA_ROOT found at: /usr/local/driveworks-4.0/data
[22-06-2022 15:06:01] Context::mountResourceCandidateDataPath resource FAILED to mount from '/usr/local/driveworks-4.0/data/': VirtualFileSystem: Failed to mount '/usr/local/driveworks-4.0/data/[.pak]'
[22-06-2022 15:06:01] Context::findDataRootInPathWalk data/DATA_ROOT found at: /usr/local/driveworks-4.0/data
[22-06-2022 15:06:01] Context::mountResourceCandidateDataPath resource FAILED to mount from '/usr/local/driveworks-4.0/data/': VirtualFileSystem: Failed to mount '/usr/local/driveworks-4.0/data/[.pak]'
[22-06-2022 15:06:01] SDK: No resources(.pak) mounted, some modules will not function properly
[22-06-2022 15:06:01] TimeSource: monotonic epoch time offset is 1655885559259828
[22-06-2022 15:06:01] Initialize DriveWorks SDK v4.0.0
[22-06-2022 15:06:01] Release build with GNU 7.4.0 from no-gitversion-build
[22-06-2022 15:06:01] SensorFactory::createSensor() -> camera.virtual, offscreen=0,profiling=1,tensorRT_model=/home/sindarapu/yolov3.bin,video=/home/sindarapu/DW-4.0-Host/_data/samples/sfm/triangulation/video_0.h264
[22-06-2022 15:06:01] CameraVirtual: defaulting to non SIPL
[22-06-2022 15:06:01] CameraBase: pool size set to 8
[22-06-2022 15:06:01] CameraNVCUVID: no seek table found at /home/sindarapu/DW-4.0-Host/_data/samples/sfm/triangulation/video_0.h264.seek, seeking is not available.
SimpleCamera: Camera image: 1280x800
Camera image with 1280x800 at 30 FPS
[22-06-2022 15:06:02] Loaded engine size: 351 MB
[22-06-2022 15:06:02] [MemUsageSnapshot] deserializeCudaEngine begin: CPU 154 MB, GPU 584 MB
[22-06-2022 15:06:03] [MemUsageChange] Init cuDNN: CPU +430, GPU +148, now: CPU 585, GPU 1084 (MB)
[22-06-2022 15:06:03] [MemUsageChange] Init cuBlas: CPU +58, GPU +44, now: CPU 643, GPU 1128 (MB)
[22-06-2022 15:06:03] Deserialize required 772327 microseconds.
[22-06-2022 15:06:03] [MemUsageSnapshot] deserializeCudaEngine end: CPU 643 MB, GPU 1110 MB
[22-06-2022 15:06:03] [MemUsageChange] Init cuDNN: CPU +0, GPU +10, now: CPU 643, GPU 1120 (MB)
[22-06-2022 15:06:03] [MemUsageChange] Init cuBlas: CPU +0, GPU +8, now: CPU 643, GPU 1128 (MB)
[22-06-2022 15:06:03] DNN: Metadata json file could not be found. Metadata has been filled with default values. Please place <network_filename>.json in the same directory as the network file if custom metadata is needed.
m_largeIdx= 0m_mediumIdx= 1m_smallIdx= 2
[22-06-2022 15:06:03] DataConditioner: Scale transformation has been configured with 1.0000000.
[22-06-2022 15:06:03] DataConditioner: Mean value subtract transformation has been configured with {0.00000, 0.00000, 0.00000}.
[22-06-2022 15:06:03] DataConditioner: Standard deviation has been configured with {1.00000, 1.00000, 1.00000}.
m_detectionRegion.x= 0
bbox_score= 1.50015e-06
bbox_score= 1.19788e-07
bbox_score= 8.95024e-08
bbox_score= 8.56612e-08
bbox_score= 3.45667e-08
bbox_score= 2.07359e-11
bbox_score= 5.34504e-18
bbox_score= 1.32322e-26
bbox_score= 3.1229e-23
bbox_score= 1.39939e-23



I used the main.cpp provided in the description, and I see no memory access issues on my host.

Hi @SivaRamaKrishnaNV ,

Thanks for trying it out. Yes, it also runs fine for me when I convert the ONNX file you used to a .bin file and try that.
Something must have gone wrong when I converted the model to ONNX format, even though, as I mentioned, I just followed this: TensorRT/yolov3_to_onnx.py at main · NVIDIA/TensorRT · GitHub
Let me check on that.

Thank you,
Nithin

Dear @nithin.m1,
Please use the files at /usr/src/tensorrt/samples/python/yolov3_onnx.
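Regenerating the ONNX file from that sample and re-optimizing it would look roughly like this (the yolov3_to_onnx.py invocation and its arguments vary between TensorRT releases, so please check the README in that directory):

cd /usr/src/tensorrt/samples/python/yolov3_onnx
python3 yolov3_to_onnx.py        # writes yolov3.onnx from the Darknet cfg/weights; older releases may require Python 2
/usr/local/driveworks-4.0/tools/dnn/tensorRT_optimization --modelType=onnx --onnxFile=yolov3.onnx --out=~/yolov3.bin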