Error running multimedia api backend sample with detectnet model for object detection

We are trying to use the tegra-multimedia-api samples for running inference over input video files.

Specifically, we are checking to see if the ‘backend’ sample can be used as a video processing pipeline.

The network model was trained with DetectNet on images at a resolution of 1920x1080.

We encounter the following error when running the backend sample on a Jetson TX2 (flashed with JetPack 3.0):

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
nvidia@tegra-ubuntu:~/tegra_multimedia_api/samples/backend$ ./backend 1 ~/Videos/walsh_santomas_20170603_016.h264 H264 --gie-deployfile ~/Downloads/deploy.prototxt --gie-modelfile ~/Downloads/snapshot_iter_118980.caffemodel --gie-forcefp32 0 --gie-proc-interval 1 -fps 11
Net has batch_size, channel, net_height, net_width:1 3 1088 1920
Using cached GIE model
outputDim c 1 w 240 h 132
outputDimsBBOX c 4 w 240 h 132
Failed to query video capabilities: Inappropriate ioctl for device
NvMMLiteOpen : Block : BlockType = 261
TVMR: NvMMLiteTVMRDecBlockOpen: 7818: NvMMLiteBlockOpen
NvMMLiteBlockCreate : Block : BlockType = 261
Failed to query video capabilities: Inappropriate ioctl for device
Failed to query video capabilities: Inappropriate ioctl for device
Starting decoder capture loop thread
TVMR: cbBeginSequence: 1190: BeginSequence 1920x1088, bVPR = 0
TVMR: LowCorner Frequency = 0
TVMR: cbBeginSequence: 1583: DecodeBuffers = 5, pnvsi->eCodec = 4, codec = 0
TVMR: cbBeginSequence: 1654: Display Resolution : (1920x1080)
TVMR: cbBeginSequence: 1655: Display Aspect Ratio : (1920x1080)
TVMR: cbBeginSequence: 1697: ColorFormat : 5
TVMR: cbBeginSequence:1711 ColorSpace = NvColorSpace_YCbCr601
TVMR: cbBeginSequence: 1839: SurfaceLayout = 3
TVMR: cbBeginSequence: 1936: NumOfSurfaces = 7, InteraceStream = 0, InterlaceEnabled = 0, bSecure = 0, MVC = 0 Semiplanar = 1, bReinit = 1, BitDepthForSurface = 8 LumaBitDepth = 8, ChromaBitDepth = 8, ChromaFormat = 5
TVMR: cbBeginSequence: 1938: BeginSequence ColorPrimaries = 2, TransferCharacteristics = 2, MatrixCoefficients = 2
[INFO] (NvEglRenderer.cpp:109) Setting Screen width 1920 height 1200
libv4l2_nvvidconv (0):(761) (INFO) : Allocating (10) OUTPUT PLANE BUFFERS Layout=1
libv4l2_nvvidconv (0):(771) (INFO) : Allocating (10) CAPTURE PLANE BUFFERS Layout=0
libv4l2_nvvidconv (1):(761) (INFO) : Allocating (10) OUTPUT PLANE BUFFERS Layout=0
libv4l2_nvvidconv (1):(771) (INFO) : Allocating (10) CAPTURE PLANE BUFFERS Layout=0
Query and set capture successful
cuCtxSynchronize failed after memcpy
cuGraphicsEGLUnRegisterResource failed: 719
cudnnConvolutionLayer.cpp (234) - Cuda Error in execute1: 8
Cuda failure: 4Aborted (core dumped)
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Video information:

Input #0, h264, from ‘/home/nvidia/Videos/walsh_santomas_20170603_016.h264’:
Duration: N/A, bitrate: N/A
Stream #0:0: Video: h264 (High), yuv420p, 1920x1080 [SAR 1:1 DAR 16:9], 29.97 fps, 29.97 tbr, 1200k tbn, 59.94 tbc

Any assistance in troubleshooting this issue would be appreciated.

Please let me know if you need any other information.

Hi,

We are checking this issue internally and will post an update later.

Your input video is 1920x1080, and the TensorRT output is

outputDim c 1 w 240 h 132
outputDimsBBOX c 4 w 240 h 132

Is this correct?

Yes
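
As a quick sanity check on those numbers, the ratio between the network input size and the reported output grid can be computed directly. The sketch below uses only values copied from the log above (net input 1920x1088, output 240x132); the helper name `downscale_ratios` is hypothetical and not part of the sample.

```python
# Values are copied from the backend log above; the helper name is illustrative only.
def downscale_ratios(net_w, net_h, out_w, out_h):
    """Return (width_ratio, height_ratio) between network input and output grid."""
    return net_w / out_w, net_h / out_h


w_ratio, h_ratio = downscale_ratios(1920, 1088, 240, 132)
print(f"width ratio:  {w_ratio:.3f}")
print(f"height ratio: {h_ratio:.3f}")
```

If the two ratios differ, that may be worth a closer look at the deploy file, although on its own it does not show where the problem is.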

Here are our latest findings:

  1. The input image dimensions should be divisible by 16. -> Use 1920x1088 instead of 1920x1080.
  2. The mean needs to be subtracted:
layer {
  name: "deploy_transform"
  type: "Power"
  bottom: "data"
  top: "transformed_data"
  power_param {
    shift: -127.0
  }
}
  3. If the model was trained with the latest DIGITS, please use TensorRT 2.1 to avoid the incompatible protobuf issue.
  4. Check your model:
  • a. The output log ‘outputDimsBBOX c 4 w 240 h 132’ is calculated from a 960x540 input. If the model was trained with 1920x1088, this model is incorrect.
  • b. The bboxes channel is set to 1, but it should be 4.
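
The first two points can be illustrated with a small sketch in plain Python. It assumes the standard Caffe definition of the Power layer, y = (shift + scale * x) ^ power with power and scale defaulting to 1, so the layer above simply subtracts 127 from every input value. The helper names `round_up_to_multiple` and `power_layer` are hypothetical and exist only for this illustration.

```python
def round_up_to_multiple(value, multiple=16):
    """Smallest integer >= value that is divisible by `multiple`."""
    return ((value + multiple - 1) // multiple) * multiple


def power_layer(x, power=1.0, scale=1.0, shift=0.0):
    """Element-wise Caffe-style Power layer: y = (shift + scale * x) ** power."""
    return [(shift + scale * v) ** power for v in x]


# Point 1: pad the frame size so both dimensions are divisible by 16.
padded_w = round_up_to_multiple(1920)
padded_h = round_up_to_multiple(1080)
print(padded_w, padded_h)

# Point 2: with shift = -127 (power = scale = 1), the layer subtracts the mean.
pixels = [0.0, 127.0, 255.0]
shifted = power_layer(pixels, shift=-127.0)
print(shifted)
```

The width 1920 is already a multiple of 16 and stays unchanged, while the height 1080 is rounded up to 1088, which matches the net size printed in the log.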