Preprocessing steps for dashcamnet

Description

I converted the DashCamNet model from the example code into a TensorRT engine using tlt-convert, and tried to run inference on it with the following preprocessing steps: convert to float, subtract the offsets 77.2, 21.2, 11.8 from the R, G, B channels respectively, and multiply by 1/255. However, the inference results differed from those produced through a modified version of the deepstream-infer-tensor-meta-test app via DeepStream. The pgie config for DashCamNet had net-scale-factor set to 1/255 and no offsets. What is missing from these preprocessing steps for the DashCamNet model?

Environment

TensorRT Version : 7.0.0.11
GPU Type : GTX1070
Nvidia Driver Version :450.66
CUDA Version : 10.2
CUDNN Version : 7.6.5
Operating System + Version : Ubuntu 18.04
Python Version (if applicable) :
TensorFlow Version (if applicable) :
PyTorch Version (if applicable) :
Baremetal or Container (if container which image + tag) : nvcr.io/nvidia/deepstream:5.0.1-20.09-triton

Relevant Files

Changed within the docker image:
/opt/nvidia/deepstream/deepstream-5.0/sources/apps/sample_apps/deepstream-infer-tensor-meta-test/deepstream_infer_tensor_meta_test.cpp
/opt/nvidia/deepstream/deepstream-5.0/sources/libs/nvdsinfer_customparser/nvdsinfer_custombboxparser.cpp

Steps To Reproduce

The same image with cars is used for both the deepstream app and the TensorRT inference script
Pull and run the docker image: syther22/deepstream_dashcam_264 from docker.io
Run the container, then run the deepstream-infer-tensor-meta-app:
/opt/nvidia/deepstream/deepstream-5.0/sources/apps/sample_apps/deepstream-infer-tensor-meta-test# ./deepstream-infer-tensor-meta-app ./test.h264

  • the floating-point inference results start with 3.33143e-07 2.8675e-08 3.02065e-07 1.90778e-07 1.97347e-07 2.38013e-07…

Using TensorRT + CUDA with the preprocessing steps mentioned above, the inference results are different; e.g. 1.33245e-06 7.46877e-07 1.81713e-06 1.9149e-06 1.86566e-06 2.54461e-06 5.59784e-06 8.99243e-06…
I can also send the source code for inference through TensorRT.

Hi @syther666,
Did you set ‘offsets’ and ‘net-scale-factor’ in your pgie config, e.g. offsets=77.2;21.2;11.8 and net-scale-factor=0.003921568627451 (0.003921568627451 = 1/255)?


Could you share your pgie config first?

Thanks!

For deepstream-infer-tensor-meta-test using DashCamNet I did not override the offsets (kept at the default), and net-scale-factor was overridden to ‘0.0039…’.

deepstream-infer-tensor-meta-test/dstensor_pgie_config.txt

[property]
gpu-id=0
net-scale-factor=0.0039215697906911373
model-engine-file=./resnet18_dashcam_unpruned.plan
proto-file=../../../../samples/models/Primary_Detector/resnet10.prototxt
int8-calib-file=../../../../samples/models/Primary_Detector/cal_trt.bin
force-implicit-batch-dim=1
batch-size=1
network-mode=1
process-mode=1
model-color-format=0
num-detected-classes=4
interval=0
gie-unique-id=1
output-blob-names=conv2d_bbox;conv2d_cov/Sigmoid

## 0=Detector, 1=Classifier, 2=Segmentation, 100=Other
network-type=100
# Enable tensor metadata output
output-tensor-meta=1

#scaling-filter=0
#scaling-compute-hw=0

[class-attrs-all]
pre-cluster-threshold=0.2
eps=0.2
group-threshold=1

Running the same .engine file with my own inference code; the conversion kernel is below:

__global__ void
kern(float *outBuffer,
     unsigned char *inBuffer,
     unsigned int width,
     unsigned int height,
     unsigned int pitch)
{
    float offsets[3] = {77.5, 21.2, 11.8};
    float scaleFactor = 0.003921568627451;
    unsigned int row = blockIdx.y * blockDim.y + threadIdx.y;
    unsigned int col = blockIdx.x * blockDim.x + threadIdx.x;

    if (col < width && row < height)
    {
        for (unsigned int k = 0; k < 3; k++)
        {
            outBuffer[width * height * k + row * width + col] =
                scaleFactor * (inBuffer[row * pitch + col * 1 + (2 - k)] - offsets[k]);
        }
    }
}

Theoretically those should have been the same, but they end up yielding different results.
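One way to pin down how far apart the two pipelines actually are is to dump both raw output tensors and compare them element-wise rather than eyeballing the first few values; a minimal helper for that (the function name is illustrative, not from the posted code):

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Largest element-wise difference between two output buffers, e.g. the
// DeepStream tensor-meta dump vs. the standalone TensorRT output.
// Compares only the overlapping prefix if the sizes differ.
float max_abs_diff(const std::vector<float>& a, const std::vector<float>& b)
{
    float worst = 0.0f;
    const std::size_t n = std::min(a.size(), b.size());
    for (std::size_t i = 0; i < n; ++i)
        worst = std::max(worst, std::fabs(a[i] - b[i]));
    return worst;
}
```

A preprocessing mismatch typically shows up as a large, structured difference across the whole tensor, whereas numerical noise between two correct pipelines stays tiny.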

Can you set ‘offsets’ and ‘net-scale-factor’ in DeepStream?

What do you mean? I already have them set in dstensor_pgie_config.txt.

Really? I see net-scale-factor is set in the config file, but I don’t see offsets being set.
Can you point out which line sets offsets?

Since offsets is 77.5,21.1,11.8 by default, the behavior should have been the same whether I set it or not.
Okay, for your sanity, I added the line ‘offsets=77.5;21.2;11.8’ to the config. But the issue remains that DeepStream using the engine model file always gets the correct results, while TensorRT inference using a separate script doesn’t.

“offsets=77.5;21.2;11.8” is just an example; the offsets are all zero by default.

If it fails only on TRT, then regarding the CUDA kern(): I don’t know the layout of inBuffer, but I suggest checking “[row * pitch + col * 1 + (2 - k)]”.
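To make the hint above concrete: if inBuffer is an interleaved 3-bytes-per-pixel frame (e.g. BGR, which the kernel’s (2 - k) channel swap suggests), then advancing the column by only one byte reads overlapping bytes from neighboring pixels. A hedged sketch of the corrected addressing, assuming that packed layout:

```cpp
// Byte offset of channel k's source value for pixel (row, col) in an
// interleaved BGR frame with 'pitch' bytes per row. The posted kernel
// used "col * 1", which advances one byte per pixel; with three
// interleaved bytes per pixel the column stride must be 3.
inline unsigned src_index(unsigned row, unsigned col, unsigned pitch, unsigned k)
{
    return row * pitch + col * 3 + (2 - k);
}
```

This assumes a packed BGR input; if the buffer is actually planar, RGBA, or NV12, the stride and channel mapping differ again, so the layout handed to the kernel should be confirmed first.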