TensorRT inference on SSD with the Python API fails with an assertion error in nmsPlugin.cpp, while working fine in nvinfer

Hi,
We trained a custom ResNet-18 SSD network with TLT and then exported it to the .etlt format.
Using the DeepStream SDK on a TX2 we were able to run this network and obtain a TRT engine file.
The engine file works fine when used through the DeepStream SDK.
However, when we try to use it directly with the TRT Python API, we get an error.
The DeepStream nvinfer plugin configuration, for reference, is:

[property]
gpu-id=0
net-scale-factor=1.0
offsets=103.939;116.779;123.68
model-color-format=1
labelfile-path=labels.txt
model-engine-file=/home/deep/ssd_resnet18_epoch_030_fp32.etlt_b1_fp32.engine
input-dims=3;306;544;0 # where c = number of channels, h = height of the model input, w = width of model input, 0: implies CHW format.
uff-input-blob-name=Input
batch-size=1

network-mode=0
num-detected-classes=1
interval=0
gie-unique-id=1
output-blob-names=NMS
parse-bbox-func-name=NvDsInferParseCustomSSDUff
custom-lib-path=</path/to/libnvds_infercustomparser_ssd_uff.so>

[class-attrs-all]
threshold=0.3
eps=0.2
group-threshold=0
roi-top-offset=0
roi-bottom-offset=0
detected-min-w=0
detected-min-h=0
detected-max-w=0
detected-max-h=0

The TRT code we are using is:

import numpy as np
import tensorrt as trt
import pycuda.driver as cuda
import pycuda.autoinit

TRT_LOGGER = trt.Logger(trt.Logger.INFO)
trt.init_libnvinfer_plugins(TRT_LOGGER, '')
engine_file = 'ssd_resnet18_epoch_030_fp32.etlt_b1_fp32.engine'

# Deserialize the engine generated by DeepStream.
with open(engine_file, 'rb') as f, trt.Runtime(TRT_LOGGER) as runtime:
    engine = runtime.deserialize_cuda_engine(f.read())

# Allocate page-locked host buffers and device buffers for bindings 0 and 1.
h_input = cuda.pagelocked_zeros(trt.volume(engine.get_binding_shape(0)), dtype=np.float32)
h_output = cuda.pagelocked_empty(trt.volume(engine.get_binding_shape(1)), dtype=np.float32)
d_input = cuda.mem_alloc(h_input.nbytes)
d_output = cuda.mem_alloc(h_output.nbytes)
stream = cuda.Stream()

with engine.create_execution_context() as context:
    # Copy the (zeroed) input to the device and run synchronous inference.
    # cuda.memcpy_htod_async(d_input, h_input, stream)
    cuda.memcpy_htod(d_input, h_input)
    # context.execute_async(bindings=[int(d_input), int(d_output)], stream_handle=stream.handle)
    context.execute(bindings=[int(d_input), int(d_output)])
    print('execution complete')
    # Transfer predictions back from the GPU.
    cuda.memcpy_dtoh_async(h_output, d_output, stream)
    stream.synchronize()
    print('h_output', h_output.shape)

The error we get is:

#assertion/home/deep/TensorRT/plugin/nmsPlugin/nmsPlugin.cpp,118
Aborted (core dumped)

No other warnings are seen.

TensorRT version: 6.0.1.10
Do you know what the issue could be?
Do you know how to get more detailed information about the problem from the logger?
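
For reference, a minimal sketch of how the logger verbosity can be raised when deserializing the same engine file, in case that surfaces more detail before the assertion fires (the plugin assertion itself looks like a hard assert, so it may abort regardless of the logger severity):

import tensorrt as trt

# Most verbose severity, so TensorRT logs layer/plugin details
# during deserialization and execution.
TRT_LOGGER = trt.Logger(trt.Logger.VERBOSE)
trt.init_libnvinfer_plugins(TRT_LOGGER, '')

with open('ssd_resnet18_epoch_030_fp32.etlt_b1_fp32.engine', 'rb') as f, \
        trt.Runtime(TRT_LOGGER) as runtime:
    engine = runtime.deserialize_cuda_engine(f.read())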

Hi,

I looked into this a little bit with an internal TLT model and was able to do inference with roughly the following steps:

  1. Run TLT workflow to get .etlt file
  2. Run tlt-convert to convert .etlt to TRT engine file (.engine)
  3. Test inference with trtexec --loadEngine=model.engine, and it didn’t segfault.

This was all done with TRT 5.1.5; however, I don’t think that model used any plugins, which may be what’s associated with your problem.
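
In the meantime, here is a rough sketch (assuming the engine deserializes the same way it does in your script) for printing every binding the engine exposes; if the NMS plugin reports more output bindings than the two buffers your code allocates, that mismatch would be worth checking:

import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.INFO)
trt.init_libnvinfer_plugins(TRT_LOGGER, '')

with open('ssd_resnet18_epoch_030_fp32.etlt_b1_fp32.engine', 'rb') as f, \
        trt.Runtime(TRT_LOGGER) as runtime:
    engine = runtime.deserialize_cuda_engine(f.read())

# Print index, direction, name, shape, and dtype for every binding.
for i in range(engine.num_bindings):
    direction = 'input ' if engine.binding_is_input(i) else 'output'
    print(i, direction, engine.get_binding_name(i),
          engine.get_binding_shape(i), engine.get_binding_dtype(i))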

Can you share the relevant models (.etlt, .engine), scripts, and plugins needed to reproduce this issue?

I am having the exact same error. Is there any way I can share my files with you? @NVES_R