Resnet50 Classification as Primary-GIE

I installed DeepStream SDK 4.0 on a T4 system with the following components and was able to run the sample applications.
• Ubuntu 18.04
• Gstreamer 1.14.1
• NVIDIA driver 418+
• CUDA 10.1
• TensorRT 5.1.5

While trying to create a use case with ResNet50 classification as the primary GIE, I get the assertion below. How do I resolve this error? Please help!

# deepstream-app -c configs/deepstream-app/source1_infer_resnet50_int8.txt
0:00:00.448449951 45647 0x55a4a2264180 INFO nvinfer gstnvinfer.cpp:519:gst_nvinfer_logger:<primary_gie_classifier> NvDsInferContext[UID 1]:initialize(): Trying to create engine from model files
deepstream-app: regionFormat.cpp:65: size_t nvinfer1::RegionFormatB::memorySize(int, const nvinfer1::Dims&) const: Assertion `batchSize > 0' failed.
Aborted (core dumped)

Input config file: configs/deepstream-app/source1_infer_resnet50_int8.txt
[application]
enable-perf-measurement=1
perf-measurement-interval-sec=5
#gie-kitti-output-dir=streamscl

[source0]
enable=1
#Type - 1=CameraV4L2 2=URI 3=MultiURI
type=3
uri=file://../../streams/sample_1080p_h264.mp4
num-sources=1
gpu-id=0
cudadec-memtype=0

[sink0]
enable=1
#Type - 1=FakeSink 2=EglSink 3=File 4=RTSPStreaming 5=Overlay
type=1
source-id=0

[streammux]
##Boolean property to inform muxer that sources are live
live-source=0
batch-size=1
##time out in usec, to wait after the first buffer is available
##to push the batch even if the complete batch is not formed
batched-push-timeout=400000

## Set muxer output width and height
width=1980
height=1080

# config-file property is mandatory for any gie section.
# Other properties are optional and if set will override the properties set in
# the infer config file.

[primary-gie]
enable=1
gie-unique-id=1
batch-size=1
interval=0
config-file=primary_resnet50.txt

[tests]
file-loop=0

Below is the primary GIE config-file: primary_resnet50.txt
[property]
net-scale-factor=1
model-file=../../models/Secondary_resnet50/resnet50.caffemodel
proto-file=../../models/Secondary_resnet50/resnet50.prototxt
int8-calib-file=../../models/Secondary_resnet50/CalibrationTable
labelfile-path=../../models/Secondary_resnet50/labels.txt
process-mode=1
model-color-format=0

## 0=FP32, 1=INT8, 2=FP16 mode
network-mode=1
num-detected-classes=1000
classifier-threshold=0.1
is-classifier=1
output-blob-names=prob
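For context on the comments in the app config above: the `config-file` property is mandatory in a gie section, and any other properties set there override the values in the infer config file it points to. The sketch below illustrates that precedence rule only; it is not DeepStream's actual parser, and the section/key names are simply taken from the files above.

```python
import configparser

# Illustrative sketch of DeepStream-style config precedence: keys set in
# the app's [primary-gie] section override the values in the infer config
# file. This is NOT DeepStream's real parser, just the override rule.

APP_CONFIG = """
[primary-gie]
enable=1
batch-size=1
config-file=primary_resnet50.txt
"""

INFER_CONFIG = """
[property]
net-scale-factor=1
network-mode=1
batch-size=4
"""

def effective_gie_properties(app_text, infer_text):
    app = configparser.ConfigParser()
    app.read_string(app_text)
    infer = configparser.ConfigParser()
    infer.read_string(infer_text)
    # Start from the infer config file's [property] section...
    merged = dict(infer["property"])
    # ...then let the app-level [primary-gie] keys override it.
    for key, value in app["primary-gie"].items():
        if key != "config-file":
            merged[key] = value
    return merged

props = effective_gie_properties(APP_CONFIG, INFER_CONFIG)
print(props["batch-size"])  # app config's 1 wins over the infer file's 4
```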

Hi,

The sample works in our environment with a similar setup.
To figure out where the issue comes from, would you mind running the model with trtexec directly?

cd /usr/src/tensorrt/bin/
./trtexec --deploy=[path/to/prototxt] --model=[path/to/caffemodel] --output=prob

Thanks.

With trtexec the model runs fine, and the model also worked fine with DeepStream 3.0.

/usr/src/tensorrt/bin$ ./trtexec --deploy=/home/lab/deepstream-4.0/samples/models/Secondary_resnet50/resnet50.prototxt --model=/home/lab/deepstream-4.0/samples/models/Secondary_resnet50/resnet50.caffemodel --output=prob
&&&& RUNNING TensorRT.trtexec # ./trtexec --deploy=/home/lab/deepstream-4.0/samples/models/Secondary_resnet50/resnet50.prototxt --model=/home/lab/deepstream-4.0/samples/models/Secondary_resnet50/resnet50.caffemodel --output=prob
[I] deploy: /home/lab/deepstream-4.0/samples/models/Secondary_resnet50/resnet50.prototxt
[I] model: /home/lab/deepstream-4.0/samples/models/Secondary_resnet50/resnet50.caffemodel
[I] output: prob
[I] Input "data": 3x224x224
[I] Output "prob": 1000x1x1
[I] Average over 10 runs is 2.53807 ms (host walltime is 2.78107 ms, 99% percentile time is 2.55574).
[I] Average over 10 runs is 2.71687 ms (host walltime is 2.98011 ms, 99% percentile time is 2.872).
[I] Average over 10 runs is 2.92612 ms (host walltime is 3.19431 ms, 99% percentile time is 2.93242).
[I] Average over 10 runs is 2.92296 ms (host walltime is 3.19326 ms, 99% percentile time is 2.92864).
[I] Average over 10 runs is 2.91046 ms (host walltime is 3.18021 ms, 99% percentile time is 2.92851).
[I] Average over 10 runs is 2.89945 ms (host walltime is 3.16082 ms, 99% percentile time is 2.91078).
[I] Average over 10 runs is 2.90088 ms (host walltime is 3.15789 ms, 99% percentile time is 2.9079).
[I] Average over 10 runs is 2.89149 ms (host walltime is 3.15156 ms, 99% percentile time is 2.90368).
[I] Average over 10 runs is 2.87674 ms (host walltime is 3.13682 ms, 99% percentile time is 2.88768).
[I] Average over 10 runs is 2.87251 ms (host walltime is 3.13241 ms, 99% percentile time is 2.88272).
&&&& PASSED TensorRT.trtexec # ./trtexec --deploy=/home/lab/deepstream-4.0/samples/models/Secondary_resnet50/resnet50.prototxt --model=/home/lab/deepstream-4.0/samples/models/Secondary_resnet50/resnet50.caffemodel --output=prob

ResNet50 classification as primary GIE works fine in FP32/FP16 mode in DeepStream 4.0. It seems the previous INT8 calibration file is not compatible with SDK 4.0, so we get the error below. Any suggestions?

deepstream-app: regionFormat.cpp:65: size_t nvinfer1::RegionFormatB::memorySize(int, const nvinfer1::Dims&) const: Assertion `batchSize > 0' failed.
Aborted (core dumped)

Hi,

Thanks for the experiment.

We have verified the sample (including INT8 mode) on a T4, and it works correctly.
Let me check this issue with our internal team and get back to you with more information later.

Thanks.

Hi,

Would you mind double-checking the TensorRT version in your environment?

$ dpkg -l | grep TensorRT

Thanks.

Hi,

Here is the feedback from our internal team:

All the calibration files in the DeepStream 4.0 package are generated with TensorRT 5.1.6.
Since the cache file is sensitive to the TensorRT version, you will need to re-calibrate before using INT8 mode.
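The version mismatch can often be confirmed by inspecting the cache itself. A small sketch, assuming the cache begins with a text header of the form `TRT-<build>-EntropyCalibration2` (typical of TensorRT 5.1+ entropy calibration caches; caches written by older releases may lack this header):

```python
import tempfile

# Sketch: report which TensorRT build wrote an INT8 calibration cache.
# ASSUMPTION: the cache starts with a text header such as
# "TRT-5106-EntropyCalibration2" (common for TensorRT 5.1+ entropy
# calibration caches); older caches may not carry this header.
def calib_cache_trt_build(path):
    with open(path, "rb") as f:
        header = f.readline().decode("utf-8", errors="replace").strip()
    if header.startswith("TRT-"):
        # e.g. "TRT-5106-EntropyCalibration2" -> "5106" (TensorRT 5.1.6)
        return header.split("-")[1]
    return None  # no recognizable header

# Demonstration with a synthetic cache file:
with tempfile.NamedTemporaryFile("wb", suffix=".table", delete=False) as f:
    f.write(b"TRT-5106-EntropyCalibration2\nconv1: 3c010a14\n")
    cache_path = f.name
print(calib_cache_trt_build(cache_path))  # prints: 5106
```

If the build recorded in the cache does not match the installed TensorRT (here 5.1.5 vs. 5.1.6), re-calibration is the fix, as noted above.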

Thanks.

We cannot find TensorRT 5.1.6 at https://developer.nvidia.com/nvidia-tensorrt-5x-download