Hi!
We trained a Faster RCNN model with a ResNet18 backbone in the TLT 3.0 container; training, evaluation tests, and inference work perfectly with INT8 calibration. Here is the TLT config:
faster_rcnn_config.txt (4.3 KB)
We export the model to an .etlt file and name our output tensor NMS with the -o option. After that we build the engine with the tlt-converter tool on a Jetson NX, passing the calibration options and input sizes; that step seems OK. We are using TensorRT in a C++ environment to run inference. The tensor input/output sizes are:
0 - input_image: 3 x 1080 x 1920
1 - NMS: 1 x 100 x 7
2 - NMS_1: 1 x 1 x 1
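(These bindings can be dumped with the TensorRT 7 binding API; a minimal sketch, assuming engine is the deserialized nvinfer1::ICudaEngine* and that <NvInfer.h> and <iostream> are included:)
for (int i = 0; i < engine->getNbBindings(); ++i){
    nvinfer1::Dims dims = engine->getBindingDimensions(i);
    std::cout << i << " - " << engine->getBindingName(i)
              << (engine->bindingIsInput(i) ? " (input): " : " (output): ");
    for (int d = 0; d < dims.nbDims; ++d)
        std::cout << dims.d[d] << (d + 1 < dims.nbDims ? " x " : "\n");
}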
At this point we get fewer detections than with the inference inside the TLT container, and we think the problem is the image preprocessing before the data is copied into the input tensor. We have an OpenCV Mat (image) holding the RGB source image.
As a result we get only 20-30% of the detections compared with the TLT tests. As you can see below, we reverse the RGB order to BGR, subtract the per-channel mean, and divide by 1.0, as the TLT documentation specifies for the input_image_config parameters.
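In OpenCV terms, the transform we are trying to apply should be equivalent to this sketch (our assumptions: image is a CV_8UC3 RGB Mat already at the network resolution of 1920 x 1080, hostDataBuffer / C / H / W are the same variables as in the loop below, and <opencv2/imgproc.hpp>, <vector> and <cstring> are included):
cv::Mat bgr;
cv::cvtColor(image, bgr, cv::COLOR_RGB2BGR);        // reverse RGB -> BGR
bgr.convertTo(bgr, CV_32FC3);                       // to float (scale factor 1.0)
bgr -= cv::Scalar(103.939f, 116.779f, 123.68f);     // subtract the B, G, R channel means
std::vector<cv::Mat> planes(3);
cv::split(bgr, planes);                             // interleaved HWC -> planar CHW
for (int c = 0; c < C; ++c)
    std::memcpy(hostDataBuffer + c * H * W, planes[c].ptr<float>(), H * W * sizeof(float));
The per-pixel loop we actually use to fill the input tensor is: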
float* hostDataBuffer = static_cast<float*>(buffers.getHostBuffer("input_image"));
// Per-channel means in BGR order (B, G, R), as in the TLT input_image_config.
float pixelMean[3]{103.939f, 116.779f, 123.68f};
for (int i = 0, volImg = C * H * W; i < 1; ++i){           // batch of 1
    for (int c = 0; c < C; ++c){                           // destination channel, BGR planar
        for (unsigned j = 0, volChl = H * W; j < volChl; ++j){
            // Read the interleaved RGB source in reverse channel order (RGB -> BGR),
            // subtract the channel mean, and divide by the scale factor 1.0.
            hostDataBuffer[i * volImg + c * volChl + j] =
                (float(image.data[j * C + 2 - c]) - pixelMean[c]) / 1.0F;
        }
    }
}
buffers.copyInputToDevice();
// Synchronous inference with batch size 1.
bool status = context->execute(1, buffers.getDeviceBindings().data());
buffers.copyOutputToHost();
const float* nms = static_cast<const float*>(buffers.getHostBuffer("NMS"));
// Each of the 100 kept detections is a row of 7 floats; index 3 is read as x1.
for (int det_id = 0; det_id < 100; det_id++){
    float x1 = nms[det_id * 7 + 3];
}
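For completeness, if the NMS plugin follows the usual TensorRT DetectionOutput layout (7 floats per detection: image_id, class label, confidence, xmin, ymin, xmax, ymax) and NMS_1 holds the int32 keep count, the full read-out would look roughly like this sketch (the coordinates may be normalized to [0, 1] depending on the plugin configuration; please correct us if the layout is different):
const float* dets      = static_cast<const float*>(buffers.getHostBuffer("NMS"));
const int*   keepCount = static_cast<const int*>(buffers.getHostBuffer("NMS_1"));  // assumed int32
for (int det_id = 0; det_id < keepCount[0]; ++det_id){
    const float* det = dets + det_id * 7;
    int   label      = static_cast<int>(det[1]);
    float confidence = det[2];
    float x1 = det[3], y1 = det[4], x2 = det[5], y2 = det[6];
    if (confidence < 0.5f) continue;   // example threshold
    // ... keep (label, confidence, x1, y1, x2, y2)
}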
Can anyone help?
Thanks
TensorRT Version: 4.
GPU Type: Jetson Xavier NX
JetPack: 4.6 (L4T 32.6)
Cuda: 10.2
cuDNN: 8.2.1
TensorRT: 7.2
Operating System + Version: Ubuntu 18.04 + JetPack