Using custom TensorRT model on sample applications

Please provide the following info (check/uncheck the boxes after creating this topic):
Software Version
DRIVE OS Linux 5.2.6
DRIVE OS Linux 5.2.6 and DriveWorks 4.0
DRIVE OS Linux 5.2.0
[*] DRIVE OS Linux 5.2.0 and DriveWorks 3.5
NVIDIA DRIVE™ Software 10.0 (Linux)
NVIDIA DRIVE™ Software 9.0 (Linux)
other DRIVE OS version
other

Target Operating System
[*] Linux
QNX
other

Hardware Platform
NVIDIA DRIVE™ AGX Xavier DevKit (E3550)
[*] NVIDIA DRIVE™ AGX Pegasus DevKit (E3550)
other

SDK Manager Version
[*] 1.7.1.8928
other

Host Machine Version
[*] native Ubuntu 18.04
other

I ran the sample application on pre-recorded video data with a custom TensorRT model. The model was converted from ONNX to a TensorRT engine by following the notebook "4. Using PyTorch through ONNX.ipynb" in the NVIDIA/TensorRT GitHub repository (main branch), but we are getting the error shown below.

The command we used for the conversion inside the Jupyter notebook is:

# convert the ONNX model to a TRT engine using trtexec
if USE_FP16:
    !/usr/src/tensorrt/bin/trtexec --onnx=resnet50_pytorch.onnx --saveEngine=resnet_engine_pytorch.trt  --explicitBatch --inputIOFormats=fp16:chw --outputIOFormats=fp16:chw --fp16
else:
    !/usr/src/tensorrt/bin/trtexec --onnx=resnet50_pytorch.onnx --saveEngine=resnet_engine_pytorch.trt  --explicitBatch

Please advise on the right way to convert models to a TensorRT engine that is compatible with DriveWorks apps.

Nvidia_Drive@tegra-ubuntu:/usr/local/driveworks-3.5/bin$ ./sample_dnn_tensor --input-type=video --video=/usr/local/driveworks-3.5/tools/capture/camera_front_center_120fov.h264 --tensorRT_model=/home/Nvidia_Drive/Documents/resnet_engine_pytorch.trt
[04-03-2022 16:43:23] Platform: Detected DDPX - Tegra A
[04-03-2022 16:43:23] TimeSource: monotonic epoch time offset is 1646133716873257
[04-03-2022 16:43:23] TimeSource: Could not detect valid PTP time source at nvpps. Fallback to eth0
[04-03-2022 16:43:23] TimeSource Eth: Lost PTP time synchronizaton. Synchronized time will not be available from this timesource.
[04-03-2022 16:43:23] TimeSource: Could not detect valid PTP time source at 'eth0'. Fallback to CLOCK_MONOTONIC.
[04-03-2022 16:43:23] Platform: number of GPU devices detected 2
[04-03-2022 16:43:23] Platform: currently selected GPU device discrete ID 0
[04-03-2022 16:43:23] Context::getDataPathFromSelfLocation DATA_ROOT found at: /usr/local/driveworks-3.5/data
[04-03-2022 16:43:23] SDK: No resources(.pak) mounted, some modules will not function properly
[04-03-2022 16:43:23] SDK: Create NvMediaDevice
[04-03-2022 16:43:23] SDK: Create NvMedia2D
[04-03-2022 16:43:23] SDK: use EGL display as provided
[04-03-2022 16:43:24] TimeSource: monotonic epoch time offset is 1646133716873257
[04-03-2022 16:43:24] TimeSource: Could not detect valid PTP time source at nvpps. Fallback to eth0
[04-03-2022 16:43:24] TimeSource Eth: Lost PTP time synchronizaton. Synchronized time will not be available from this timesource.
[04-03-2022 16:43:24] TimeSource: Could not detect valid PTP time source at 'eth0'. Fallback to CLOCK_MONOTONIC.
[04-03-2022 16:43:24] Initialize DriveWorks SDK v3.5.75
[04-03-2022 16:43:24] Release build with GNU 7.3.1 from heads/buildbrain-branch-0-gc61a9a35bd0 against Drive PDK v5.2.0.0
[04-03-2022 16:43:24] Initialize DriveWorks VisualizationSDK v3.5.75
[04-03-2022 16:43:24] Initialize DriveWorksGL SDK v3.5.75
[04-03-2022 16:43:24] SensorFactory::createSensor() -> camera.virtual, camera-group=a,camera-index=0,camera-type=ar0231-rccb-bae-sf3324,input-type=video,offscreen=0,profiling=1,slave=0,tensorRT_model=/home/Nvidia_Drive/Documents/resnet_engine_pytorch.trt,video=/usr/local/driveworks-3.5/tools/capture/camera_front_center_120fov.h264
[04-03-2022 16:43:24] CameraVirtual: defaulting to non SIPL
[04-03-2022 16:43:24] CameraBase: pool size set to 8
[04-03-2022 16:43:24] CameraVirtualNvMedia: no seek table found at /usr/local/driveworks-3.5/tools/capture/camera_front_center_120fov.h264.seek, seeking is not available.
SimpleCamera: Camera image: 1920x1208
Camera image with 1920x1208 at 30 FPS
[04-03-2022 16:43:24] StreamConsumerGL: successfully initialized
[04-03-2022 16:43:24] StreamProducerCUDA: successfully initialized
[04-03-2022 16:43:24] DNN: TensorRT model file has wrong magic number. Please ensure that the model has been created by TensorRT_optimization tool in DriveWorks. The model might be incompatible.
[04-03-2022 16:43:24] DNN: TensorRT model file has wrong magic number. Please ensure that the model has been created by TensorRT_optimization tool in DriveWorks. The model might be incompatible.
[04-03-2022 16:43:25] INVALID_CONFIG: Using an engine plan file across different models of devices is not recommended and is likely to affect performance or even cause errors.
[04-03-2022 16:43:25] INVALID_CONFIG: Deserialize the cuda engine failed.
[04-03-2022 16:43:25] Driveworks exception thrown: DW_INTERNAL_ERROR: DNN: Unable to load model.

terminate called after throwing an instance of 'std::runtime_error'
  what():  [2022-03-04 16:43:25] DW Error DW_INTERNAL_ERROR executing DW function:
 dwDNN_initializeTensorRTFromFile(&m_dnn, tensorRTModel.c_str(), nullptr, DW_PROCESSOR_TYPE_GPU, m_sdk)
 at /dvs/git/dirty/gitlab-master_av/dw/sdk/samples/dnn/sample_dnn_tensor/main.cpp:234
Aborted (core dumped)

Dear @raji,
You need to use the TensorRT_optimization tool (DriveWorks SDK Reference: TensorRT Optimizer Tool) to integrate your model into DriveWorks. TensorRT models generated using trtexec cannot be loaded directly in DW.

Thanks @SivaRamaKrishnaNV ,

I’m able to convert the model using the TensorRT_optimization tool. Here is the log:

Nvidia_Drive@tegra-ubuntu:/usr/local/driveworks-3.5/tools/dnn$ sudo ./tensorRT_optimization --modelType=onnx --onnxFile=/home/Nvidia_Drive/Documents/Onnx/resnet50-v1-7.onnx
[sudo] password for Nvidia_Drive: 


--------------------------------------------------------------
WARNING: Using default Logger, most probably DriveWorks
         library was linked more than once.
--------------------------------------------------------------


DefaultLogger: [07-03-2022 11:53:12] DefaultLogger: WARNING: ExplicitBatch is enabled by default for ONNX models.
Initializing TensorRT generation on model /home/Nvidia_Drive/Documents/Onnx/resnet50-v1-7.onnx.
----------------------------------------------------------------
Input filename:   /home/Nvidia_Drive/Documents/Onnx/resnet50-v1-7.onnx
ONNX IR version:  0.0.3
Opset version:    8
Producer name:    
Producer version: 
Domain:           
Model version:    0
Doc string:       
----------------------------------------------------------------
Input "data": 1x3x224x224
Output "resnetv17_dense0_fwd": 1x1000
Building Engine...
Explicit batch network detected and batch size specified, use enqueue without batch size instead.
Explicit batch network detected and batch size specified, use enqueue without batch size instead.
Explicit batch network detected and batch size specified, use enqueue without batch size instead.
Explicit batch network detected and batch size specified, use enqueue without batch size instead.
Explicit batch network detected and batch size specified, use enqueue without batch size instead.
Explicit batch network detected and batch size specified, use enqueue without batch size instead.
Iteration 0: 2.6872 ms.
Explicit batch network detected and batch size specified, use enqueue without batch size instead.
Iteration 1: 2.90829 ms.
Explicit batch network detected and batch size specified, use enqueue without batch size instead.
Iteration 2: 2.79078 ms.
Explicit batch network detected and batch size specified, use enqueue without batch size instead.
Iteration 3: 2.76314 ms.
Explicit batch network detected and batch size specified, use enqueue without batch size instead.
Iteration 4: 2.77763 ms.
Explicit batch network detected and batch size specified, use enqueue without batch size instead.
Iteration 5: 2.79658 ms.
Explicit batch network detected and batch size specified, use enqueue without batch size instead.
Iteration 6: 2.81472 ms.
Explicit batch network detected and batch size specified, use enqueue without batch size instead.
Iteration 7: 2.78045 ms.
Explicit batch network detected and batch size specified, use enqueue without batch size instead.
Iteration 8: 2.76851 ms.
Explicit batch network detected and batch size specified, use enqueue without batch size instead.
Iteration 9: 2.78554 ms.
CUDA graph OFF, Average over 10 runs is 2.78728 ms.

But I’m now facing a blobIndex error, as shown below:

Nvidia_Drive@tegra-ubuntu:/usr/local/driveworks-3.5/bin$ ./sample_object_detector_tracker --input-type=video --video=/usr/local/driveworks-3.5/tools/capture/camera_front_center_120fov.h264 --tensorRT_model=/usr/local/driveworks-3.5/tools/dnn/optimized.bin
[07-03-2022 12:16:51] Platform: Detected DDPX - Tegra A
[07-03-2022 12:16:51] TimeSource: monotonic epoch time offset is 1646133716873257
[07-03-2022 12:16:51] TimeSource: Could not detect valid PTP time source at nvpps. Fallback to eth0
[07-03-2022 12:16:51] TimeSource Eth: Lost PTP time synchronizaton. Synchronized time will not be available from this timesource.
[07-03-2022 12:16:51] TimeSource: Could not detect valid PTP time source at 'eth0'. Fallback to CLOCK_MONOTONIC.
[07-03-2022 12:16:51] Platform: number of GPU devices detected 2
[07-03-2022 12:16:51] Platform: currently selected GPU device discrete ID 0
[07-03-2022 12:16:51] Context::getDataPathFromSelfLocation DATA_ROOT found at: /usr/local/driveworks-3.5/data
[07-03-2022 12:16:51] SDK: No resources(.pak) mounted, some modules will not function properly
[07-03-2022 12:16:51] SDK: Create NvMediaDevice
[07-03-2022 12:16:51] SDK: Create NvMedia2D
[07-03-2022 12:16:51] SDK: use EGL display as provided
[07-03-2022 12:16:51] TimeSource: monotonic epoch time offset is 1646133716873257
[07-03-2022 12:16:51] TimeSource: Could not detect valid PTP time source at nvpps. Fallback to eth0
[07-03-2022 12:16:51] TimeSource Eth: Lost PTP time synchronizaton. Synchronized time will not be available from this timesource.
[07-03-2022 12:16:51] TimeSource: Could not detect valid PTP time source at 'eth0'. Fallback to CLOCK_MONOTONIC.
[07-03-2022 12:16:51] Initialize DriveWorks SDK v3.5.75
[07-03-2022 12:16:51] Release build with GNU 7.3.1 from heads/buildbrain-branch-0-gc61a9a35bd0 against Drive PDK v5.2.0.0
[07-03-2022 12:16:51] Initialize DriveWorks VisualizationSDK v3.5.75
[07-03-2022 12:16:51] Initialize DriveWorksGL SDK v3.5.75
[07-03-2022 12:16:51] SensorFactory::createSensor() -> camera.virtual, camera-group=a,camera-index=0,camera-type=ar0231-rccb-bae-sf3324,input-type=video,offscreen=0,profiling=1,slave=0,tensorRT_model=/usr/local/driveworks-3.5/tools/dnn/optimized.bin,video=/usr/local/driveworks-3.5/tools/capture/camera_front_center_120fov.h264
[07-03-2022 12:16:51] CameraVirtual: defaulting to non SIPL
[07-03-2022 12:16:51] CameraBase: pool size set to 8
[07-03-2022 12:16:51] CameraVirtualNvMedia: no seek table found at /usr/local/driveworks-3.5/tools/capture/camera_front_center_120fov.h264.seek, seeking is not available.
SimpleCamera: Camera image: 1920x1208
Camera image with 1920x1208 at 30 FPS
[07-03-2022 12:16:51] StreamConsumerGL: successfully initialized
[07-03-2022 12:16:51] StreamProducerCUDA: successfully initialized
[07-03-2022 12:16:57] Deserialize required 3994112 microseconds.
[07-03-2022 12:16:57] DNN: Metadata json file could not be found. Metadata has been filled with default values. Please place <network_filename>.json in the same directory as the network file if custom metadata is needed.
[07-03-2022 12:16:57] Driveworks exception thrown: DW_INVALID_ARGUMENT: blobIndex is larger than output binding count

terminate called after throwing an instance of 'std::runtime_error'
  what():  [2022-03-07 12:16:57] DW Error DW_INVALID_ARGUMENT executing DW function:
 dwDNN_getOutputSize(&m_networkOutputDimensions[1], 1U, m_dnn)
 at /dvs/git/dirty/gitlab-master_av/dw/sdk/samples/dnn/sample_object_detector_tracker/main.cpp:275
Aborted (core dumped)

Here is the content of the Data conditioner file /usr/local/driveworks-3.5/data/samples/detector/pascal/tensorRT_model.bin.json:

{
  "dataConditionerParams" : {
    "meanValue" : [0.0, 0.0, 0.0],
    "stdev": [1.0, 1.0, 1.0],
    "splitPlanes" : true,
    "pixelScaleCoefficient": 1.0,
    "ignoreAspectRatio" : false,
    "doPerPlaneMeanNormalization" : false
  },
  "tonemapType" : "none",
  "__comment": "tonemapType can be one of {none, agtm}"
}

What changes should I make to resolve the issue?

Dear @raji,
Could you check whether this is due to a network input image size mismatch? Also, please confirm whether the network has only 1 input and 1 output. The error indicates that you are accessing out-of-range blob indices.
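
The input/output count can be read from the optimization log posted earlier in this thread, which lists one `Input` line and one `Output` line for the ResNet-50 model. The failing call in the trace is `dwDNN_getOutputSize(..., 1U, ...)`, which suggests the sample asks for a second output (index 1). Below is a minimal Python sketch of that check. The helper names `parse_bindings` and `missing_output_indices` are hypothetical and not part of DriveWorks, and the list of required indices is an assumption inferred from the trace.

```python
import re

# Matches log lines such as: Input "data": 1x3x224x224
BINDING_RE = re.compile(r'^(Input|Output) "([^"]+)": ([0-9x]+)$')


def parse_bindings(log_text):
    """Collect input/output bindings (name, dims) from the optimization log text."""
    bindings = {"Input": [], "Output": []}
    for line in log_text.splitlines():
        match = BINDING_RE.match(line.strip())
        if match:
            kind, name, dims = match.groups()
            shape = tuple(int(d) for d in dims.split("x"))
            bindings[kind].append((name, shape))
    return bindings


def missing_output_indices(bindings, required_indices):
    """Return the output blob indices the application requests but the model lacks."""
    output_count = len(bindings["Output"])
    return [i for i in required_indices if i >= output_count]


# Excerpt copied from the optimization log in this thread.
LOG_EXCERPT = '''\
Input "data": 1x3x224x224
Output "resnetv17_dense0_fwd": 1x1000
Building Engine...
'''

bindings = parse_bindings(LOG_EXCERPT)
# Assumption: the sample needs output indices 0 and 1, because the trace
# shows dwDNN_getOutputSize being called with blob index 1U.
missing = missing_output_indices(bindings, required_indices=[0, 1])
print("outputs:", bindings["Output"])
print("missing output indices:", missing)
```

If the missing list is not empty, the model has fewer outputs than the application requests. A single-output classification network such as this ResNet-50 would then need either a sample that reads only one output or a model that provides the outputs the detector sample expects.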

Dear @raji,
Do you still have the issue?