While attempting a custom Yocto build for the AGX Devkit, we found that our trained F-RCNN model yielded fewer detections than when we ran it on NVIDIA’s standard Ubuntu distribution on the same devkit. In our Yocto build we followed the steps described in this repo:
for building a libnvinfer_plugin.so that contains the necessary cropAndResizePlugin and proposalPlugin plugins. We followed Option #2 for DeepStream 4.0.2 and TRT 6.0. We found two problems with https://github.com/NVIDIA/TensorRT/tree/release/6.0:
1. The README.md is inaccurate regarding CMAKE_BUILD_TYPE. The top-level CMake files do not set ‘Release’ as the default. ‘Release’ becomes the build type only if it is specified directly when invoking cmake, or if the parsers are built, either implicitly (the default) or explicitly via -DBUILD_PARSERS=ON: the ONNX parser sets the build type to ‘Release’ if it has not already been defined. This matters because it is the ‘Release’ build type that sets the default CXX flags to ‘-O3 -DNDEBUG’; otherwise these flags are omitted, as in a debug build. We stumbled into this because the make install step fails in our Yocto build when the parsers and the samples are enabled. So we disabled both, and then discovered problem #2 below.
2. In debug builds, where ‘-O3’ is not specified, our own model yields fewer detections. We ran a sample image through a GStreamer pipeline: it yielded only 4 detections when the plugin was compiled without ‘-O3’ vs. 13 detections when compiled with it.
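For anyone who wants to check whether their own build tree is affected by the build-type issue described above, here is a minimal sketch that reads the CMAKE_BUILD_TYPE entry out of the text of a CMakeCache.txt. The helper name build_type_from_cache is made up for illustration and is not part of TensorRT or CMake. It relies only on the standard `NAME:TYPE=VALUE` cache entry format.

```python
import re
from typing import Optional

# Cache entries look like "CMAKE_BUILD_TYPE:STRING=Release".
# The type field may differ (e.g. UNINITIALIZED), so accept any uppercase type.
_BUILD_TYPE_ENTRY = re.compile(r"^CMAKE_BUILD_TYPE:[A-Z]+=(.*)$", re.MULTILINE)


def build_type_from_cache(cache_text: str) -> Optional[str]:
    """Return the CMAKE_BUILD_TYPE recorded in the given CMakeCache.txt text.

    Returns None if the entry is absent, and "" if it is present but empty
    (i.e. no build type was selected, so no '-O3 -DNDEBUG' defaults apply).
    """
    match = _BUILD_TYPE_ENTRY.search(cache_text)
    if match is None:
        return None
    return match.group(1).strip()


# Hypothetical cache excerpt, used here only to show the call.
sample = "//Choose the type of build\nCMAKE_BUILD_TYPE:STRING=Release\n"
print(build_type_from_cache(sample))
```

An empty string or None from this check would suggest the plugin was configured without a build type, which is the case where we saw the reduced detection count. Note that a Yocto recipe may also inject its own compiler flags, so this check alone does not prove which optimization level was used.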
To reproduce, run the Faster R-CNN sample from deepstream_4.x_apps with the plugin compiled both ways. Based on visual inspection alone, it also appears to yield different numbers of detections.
JetPack 4.3 on Jetson Xavier AGX Devkit