Segmentation fault when running detectnet-console (installed by jetson-inference) on Jetson AGX Xavier

Hi, I trained a DetectNet model following the Two Days to a Demo (DIGITS) instructions.
Then I downloaded the model and tried to run detectnet-console on a Jetson AGX Xavier (# R32 (release), REVISION: 4.3, GCID: 21589087, BOARD: t186ref, EABI: aarch64, DATE: Fri Jun 26 04:34:27 UTC 2020) and got the following error:

detectnet-console ./desk.jpg inf.jpg --prototxt=./deploy.prototxt --model=./snapshot_iter_23000.caffemodel
[video]  created imageLoader from file:///home/zhoi/ai/deskm/./desk.jpg
------------------------------------------------
imageLoader video options:
------------------------------------------------
  -- URI: file:///home/zhoi/ai/deskm/./desk.jpg
     - protocol:  file
     - location:  ./desk.jpg
     - extension: jpg
  -- deviceType: file
  -- ioType:     input
  -- codec:      unknown
  -- width:      0
  -- height:     0
  -- frameRate:  0.000000
  -- bitRate:    0
  -- numBuffers: 4
  -- zeroCopy:   true
  -- flipMethod: none
  -- loop:       0
------------------------------------------------
[video]  created imageWriter from file:///home/zhoi/ai/deskm/inf.jpg
------------------------------------------------
imageWriter video options:
------------------------------------------------
  -- URI: file:///home/zhoi/ai/deskm/inf.jpg
     - protocol:  file
     - location:  inf.jpg
     - extension: jpg
  -- deviceType: file
  -- ioType:     output
  -- codec:      unknown
  -- width:      0
  -- height:     0
  -- frameRate:  0.000000
  -- bitRate:    0
  -- numBuffers: 4
  -- zeroCopy:   true
  -- flipMethod: none
  -- loop:       0
------------------------------------------------

detectNet -- loading detection network model from:
          -- prototxt     ./deploy.prototxt
          -- model        ./snapshot_iter_23000.caffemodel
          -- input_blob   'data'
          -- output_cvg   'coverage'
          -- output_bbox  'bboxes'
          -- mean_pixel   0.000000
          -- mean_binary  NULL
          -- class_labels NULL
          -- threshold    0.500000
          -- batch_size   1

[TRT]    TensorRT version 7.1.3
[TRT]    loading NVIDIA plugins...
[TRT]    Registered plugin creator - ::GridAnchor_TRT version 1
[TRT]    Registered plugin creator - ::NMS_TRT version 1
[TRT]    Registered plugin creator - ::Reorg_TRT version 1
[TRT]    Registered plugin creator - ::Region_TRT version 1
[TRT]    Registered plugin creator - ::Clip_TRT version 1
[TRT]    Registered plugin creator - ::LReLU_TRT version 1
[TRT]    Registered plugin creator - ::PriorBox_TRT version 1
[TRT]    Registered plugin creator - ::Normalize_TRT version 1
[TRT]    Registered plugin creator - ::RPROI_TRT version 1
[TRT]    Registered plugin creator - ::BatchedNMS_TRT version 1
[TRT]    Could not register plugin creator -  ::FlattenConcat_TRT version 1
[TRT]    Registered plugin creator - ::CropAndResize version 1
[TRT]    Registered plugin creator - ::DetectionLayer_TRT version 1
[TRT]    Registered plugin creator - ::Proposal version 1
[TRT]    Registered plugin creator - ::ProposalLayer_TRT version 1
[TRT]    Registered plugin creator - ::PyramidROIAlign_TRT version 1
[TRT]    Registered plugin creator - ::ResizeNearest_TRT version 1
[TRT]    Registered plugin creator - ::Split version 1
[TRT]    Registered plugin creator - ::SpecialSlice_TRT version 1
[TRT]    Registered plugin creator - ::InstanceNormalization_TRT version 1
[TRT]    detected model format - caffe  (extension '.caffemodel')
[TRT]    desired precision specified for GPU: FASTEST
[TRT]    requested fasted precision for device GPU without providing valid calibrator, disabling INT8
[TRT]    native precisions detected for GPU:  FP32, FP16, INT8
[TRT]    selecting fastest native precision for GPU:  FP16
[TRT]    attempting to open engine cache file ./snapshot_iter_23000.caffemodel.1.1.7103.GPU.FP16.engine
[TRT]    loading network plan from engine cache... ./snapshot_iter_23000.caffemodel.1.1.7103.GPU.FP16.engine
[TRT]    device GPU, loaded ./snapshot_iter_23000.caffemodel
[TRT]    Deserialize required 2705619 microseconds.
[TRT]    
[TRT]    CUDA engine context initialized on device GPU:
[TRT]       -- layers       74
[TRT]       -- maxBatchSize 1
[TRT]       -- workspace    0
[TRT]       -- deviceMemory 158651392
[TRT]       -- bindings     3
[TRT]       binding 0
                -- index   0
                -- name    'data'
                -- type    FP32
                -- in/out  INPUT
                -- # dims  3
                -- dim #0  3 (SPATIAL)
                -- dim #1  1088 (SPATIAL)
                -- dim #2  1920 (SPATIAL)
[TRT]       binding 1
                -- index   1
                -- name    'coverage'
                -- type    FP32
                -- in/out  OUTPUT
                -- # dims  3
                -- dim #0  1 (SPATIAL)
                -- dim #1  68 (SPATIAL)
                -- dim #2  120 (SPATIAL)
[TRT]       binding 2
                -- index   2
                -- name    'bboxes'
                -- type    FP32
                -- in/out  OUTPUT
                -- # dims  3
                -- dim #0  4 (SPATIAL)
                -- dim #1  68 (SPATIAL)
                -- dim #2  120 (SPATIAL)
[TRT]    
[TRT]    binding to input 0 data  binding index:  0
[TRT]    binding to input 0 data  dims (b=1 c=3 h=1088 w=1920) size=25067520
[TRT]    binding to output 0 coverage  binding index:  1
[TRT]    binding to output 0 coverage  dims (b=1 c=1 h=68 w=120) size=32640
[TRT]    binding to output 1 bboxes  binding index:  2
[TRT]    binding to output 1 bboxes  dims (b=1 c=4 h=68 w=120) size=130560
[TRT]    
[TRT]    device GPU, ./snapshot_iter_23000.caffemodel initialized.
[TRT]    detectNet -- number object classes:   1
[TRT]    detectNet -- maximum bounding boxes:  8160
[image] loaded './desk.jpg'  (1920x1080, 3 channels)
Segmentation fault (core dumped)

I also tried the same model and the same command on a Jetson Nano (# R32 (release), REVISION: 4.2, GCID: 20074772, BOARD: t210ref, EABI: aarch64, DATE: Thu Apr 9 01:22:12 UTC 2020), and everything works fine there.

What’s wrong with my AGX board?

BTW, one difference between the AGX and the Nano during the jetson-inference install stage: I installed the data sets on the Nano but nothing on the AGX. I'm not sure whether that matters for the rest of the installation.

It looks like the crash is caused by an incompatibility with the snapshot_iter_23000.caffemodel.1.1.7103.GPU.FP16.engine cache generated by TensorRT 7.1.3.0. The SSDv2 model works fine with detectnet-console.
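One way to rule out a stale engine cache is to delete it and let detectnet-console rebuild the engine from the prototxt/caffemodel on the next run (a minimal sketch, assuming the cache file name from the log above and the same working directory):

rm ./snapshot_iter_23000.caffemodel.1.1.7103.GPU.FP16.engine
detectnet-console ./desk.jpg inf.jpg --prototxt=./deploy.prototxt --model=./snapshot_iter_23000.caffemodel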

Hi,

There are two possible causes of this issue:

1. Please note that a TensorRT engine (e.g. snapshot_iter_23000.caffemodel.1.1.7103.GPU.FP16.engine) is not portable.
You will need to regenerate the engine whenever the device or software version changes.

2. Please make sure you are using the matching branch of jetson-inference.
Please use branch L4T-R32.4.2 for your Nano (32.4.2) and the latest master for the Xavier (32.4.3); switching branches is sketched below.
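For example, to switch branches and rebuild (a minimal sketch of the usual jetson-inference build steps; the clone location and job count are assumptions):

cd jetson-inference
git checkout L4T-R32.4.2    # on the Nano (L4T 32.4.2); use the master branch on the Xavier (L4T 32.4.3)
git submodule update --init
cd build
cmake ../
make -j$(nproc)
sudo make install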

Thanks.

The problem was solved after adding the "--labels=xxx.txt" argument to detectnet-console.
I'm not sure why the Nano works without the labels argument.
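For reference, the working invocation then looks like this (xxx.txt stands in for your own class-labels text file, as in the note above; the other arguments are unchanged):

detectnet-console ./desk.jpg inf.jpg --prototxt=./deploy.prototxt --model=./snapshot_iter_23000.caffemodel --labels=xxx.txt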

Good to know it works now.
Without the labels argument, detectnet-console will read the labels from the default path.

Thanks.