How to accelerate the GOTURN network with GPU on tegra tx1?

I am evaluating the OpenCV tracker GOTURN, which uses the dnn module but has no GPU support for now. I was wondering if I could use tensorNet to load its caffemodel, but then noticed that its prototxt has two inputs, “data1” and “data2”, while tensorNet::LoadNetwork only accepts one input blob.
Could you suggest how I can use the GOTURN network?

goturn.prototxt.zip (1023 Bytes)

Hi,

TensorRT can import a Caffe model directly, although some layers are not supported.

You can load a custom network as described here:
https://github.com/dusty-nv/jetson-inference#loading-custom-models-on-jetson

Select the application based on your use-case.
For example:
classification -> imagenet-console
detection -> detectnet-console

Hi, AastaLLL,

1. My use-case is neither recognition nor detection; it’s a tracking application.

2. A reference image and the current image are both passed to the DNN for tracking, i.e. two input blobs must be fed to the net. Quoting from goturn.prototxt:
input: “data1”
input_dim: 1
input_dim: 3
input_dim: 227
input_dim: 227

input: “data2”
input_dim: 1
input_dim: 3
input_dim: 227
input_dim: 227
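The two inputs are identical in shape (1 x 3 x 227 x 227), so whichever loader ends up being used must allocate two equally-sized device buffers. As a quick sanity check, the per-input element count and FP32 buffer size are:

```cpp
#include <cstddef>

// Element count of one GOTURN input blob: N x C x H x W = 1 x 3 x 227 x 227,
// matching the input_dim entries in goturn.prototxt above.
size_t goturn_input_elems() {
    const size_t n = 1, c = 3, h = 227, w = 227;
    return n * c * h * w;
}

// Size in bytes of one FP32 input buffer.
size_t goturn_input_bytes() {
    return goturn_input_elems() * sizeof(float);
}
```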

Following your link, neither imagenet-console nor detectnet-console accepts two blobs:

./imagenet-console bird_0.jpg output_0.jpg \
  --prototxt=$NET/goturn.prototxt \
  --model=$NET/goturn.caffemodel \
  --input_blob=data \
  --output_blob=softmax

I checked imageNet.h and detectNet.h, and they both inherit tensorNet::Create(const char* prototxt_path, const char* model_path, const char* mean_binary, const char* class_labels, const char* input="data", const char* output="prob", uint32_t maxBatchSize=2), which takes a single input blob.

Hi,

We don’t have a sample for dual input; you would need to modify a sample manually.
It’s recommended to check our TensorRT samples first, located at ‘/usr/src/tensorrt/’.
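A rough sketch of what such a modification might look like with the TensorRT 3.x Caffe parser follows. The parser handles multi-input prototxts by itself, so both “data1” and “data2” become network inputs; the output blob name “fc8” and the device pointers d_data1/d_data2/d_output are placeholders, not names taken from goturn.prototxt:

```cpp
#include <NvInfer.h>
#include <NvCaffeParser.h>

using namespace nvinfer1;
using namespace nvcaffeparser1;

// Build-time: parse the two-input prototxt and mark the output blob.
// "fc8" is a placeholder -- substitute the actual output blob name
// from your goturn.prototxt.
ICudaEngine* buildGoturnEngine(IBuilder* builder)
{
    INetworkDefinition* network = builder->createNetwork();
    ICaffeParser* parser = createCaffeParser();
    const IBlobNameToTensor* blobs = parser->parse(
        "goturn.prototxt", "goturn.caffemodel", *network, DataType::kFLOAT);
    network->markOutput(*blobs->find("fc8"));
    builder->setMaxBatchSize(1);
    builder->setMaxWorkspaceSize(16 << 20);
    ICudaEngine* engine = builder->buildCudaEngine(*network);
    parser->destroy();
    network->destroy();
    return engine;
}

// Run-time: the binding array has one slot per blob, ordered by binding
// index, so the two inputs are simply two entries in the array.
void infer(ICudaEngine* engine, IExecutionContext* ctx,
           void* d_data1, void* d_data2, void* d_output)
{
    void* buffers[3];
    buffers[engine->getBindingIndex("data1")] = d_data1;
    buffers[engine->getBindingIndex("data2")] = d_data2;
    buffers[engine->getBindingIndex("fc8")]   = d_output;
    ctx->execute(/*batchSize=*/1, buffers);
}
```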

By the way, please note that the Flatten layer is not supported by TensorRT currently. You may need to implement it yourself via the plugin API.

Hi,
TensorRT ships with JetPack L4T 3.1+. To support dual input, should I refer to the giexec source code?
The Flatten layer source code is below. I’m quite new to this, so from your expert view, which sample code is closest to an implementation of the Flatten layer via the plugin API?

#include &lt;vector&gt;

#include "caffe/layers/flatten_layer.hpp"

namespace caffe {

template &lt;typename Dtype&gt;
void FlattenLayer&lt;Dtype&gt;::Reshape(const vector&lt;Blob&lt;Dtype&gt;*&gt;&amp; bottom,
      const vector&lt;Blob&lt;Dtype&gt;*&gt;&amp; top) {
  CHECK_NE(top[0], bottom[0]) &lt;&lt; this-&gt;type() &lt;&lt; " Layer does not "
      "allow in-place computation.";
  const int start_axis = bottom[0]-&gt;CanonicalAxisIndex(
      this-&gt;layer_param_.flatten_param().axis());
  const int end_axis = bottom[0]-&gt;CanonicalAxisIndex(
      this-&gt;layer_param_.flatten_param().end_axis());
  vector&lt;int&gt; top_shape;
  for (int i = 0; i &lt; start_axis; ++i) {
    top_shape.push_back(bottom[0]-&gt;shape(i));
  }
  const int flattened_dim = bottom[0]-&gt;count(start_axis, end_axis + 1);
  top_shape.push_back(flattened_dim);
  for (int i = end_axis + 1; i &lt; bottom[0]-&gt;num_axes(); ++i) {
    top_shape.push_back(bottom[0]-&gt;shape(i));
  }
  top[0]-&gt;Reshape(top_shape);
  CHECK_EQ(top[0]-&gt;count(), bottom[0]-&gt;count());
}

template &lt;typename Dtype&gt;
void FlattenLayer&lt;Dtype&gt;::Forward_cpu(const vector&lt;Blob&lt;Dtype&gt;*&gt;&amp; bottom,
      const vector&lt;Blob&lt;Dtype&gt;*&gt;&amp; top) {
  top[0]-&gt;ShareData(*bottom[0]);
}

template &lt;typename Dtype&gt;
void FlattenLayer&lt;Dtype&gt;::Backward_cpu(const vector&lt;Blob&lt;Dtype&gt;*&gt;&amp; top,
    const vector&lt;bool&gt;&amp; propagate_down, const vector&lt;Blob&lt;Dtype&gt;*&gt;&amp; bottom) {
  bottom[0]-&gt;ShareDiff(*top[0]);
}

INSTANTIATE_CLASS(FlattenLayer);
REGISTER_LAYER_CLASS(Flatten);

}  // namespace caffe
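For reference, the shape arithmetic a plugin would have to reproduce is small. Here it is extracted into plain C++ with no Caffe dependency, mirroring FlattenLayer::Reshape above and using Caffe’s defaults (axis = 1, end_axis = -1):

```cpp
#include <cassert>
#include <vector>

// Compute the output shape of Caffe's Flatten layer for a given bottom
// shape, mirroring FlattenLayer<Dtype>::Reshape.  Negative axes count
// from the end, as in Caffe's CanonicalAxisIndex.
std::vector<int> flatten_shape(const std::vector<int>& bottom,
                               int axis = 1, int end_axis = -1)
{
    const int num_axes = static_cast<int>(bottom.size());
    if (axis < 0)     axis     += num_axes;
    if (end_axis < 0) end_axis += num_axes;

    std::vector<int> top;
    // Axes before `axis` are copied through unchanged.
    for (int i = 0; i < axis; ++i) top.push_back(bottom[i]);
    // Axes [axis, end_axis] collapse into a single dimension.
    int flattened = 1;
    for (int i = axis; i <= end_axis; ++i) flattened *= bottom[i];
    top.push_back(flattened);
    // Axes after `end_axis` are copied through unchanged.
    for (int i = end_axis + 1; i < num_axes; ++i) top.push_back(bottom[i]);
    return top;
}
```

The forward pass itself is just a view change (Caffe shares the data buffer), so inside a TensorRT plugin the enqueue step reduces to copying the input buffer to the output unchanged.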

Hi,

Please check TensorRT sampleFasterRCNN sample:
/usr/src/tensorrt/samples/sampleFasterRCNN

Thanks.

Hi, HooverLv and AastaLLL,
I am doing the same job now, accelerating the GOTURN (OpenCV tracker) network with the GPU on a TX1. The program runs normally on the TX1 (Ubuntu 16.04, L4T R28.2, TensorRT 3.0.4), but the tracking results are not stable in either FP16 or FP32 mode. I tried a lot of methods and GOTURN did not work as stably as it does under OpenCV. What do you think might be the cause, or could you share how you did this job?
Thanks a lot.
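One way to narrow such instability down is to mark intermediate blobs as outputs in both engines and compare them layer by layer; the first layer where the error jumps is the suspect (FP16 can saturate above roughly 6.5e4, which large fully-connected activations can hit). A small comparison helper, assuming you have already copied the two output buffers to host memory:

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Maximum absolute element-wise difference between two output buffers,
// e.g. the same blob produced by an FP32 and an FP16 engine.
float max_abs_diff(const std::vector<float>& a, const std::vector<float>& b)
{
    float worst = 0.0f;
    for (size_t i = 0; i < a.size() && i < b.size(); ++i)
        worst = std::max(worst, std::fabs(a[i] - b[i]));
    return worst;
}

// Relative version, guarded against division by near-zero values.
float max_rel_diff(const std::vector<float>& a, const std::vector<float>& b)
{
    float worst = 0.0f;
    for (size_t i = 0; i < a.size() && i < b.size(); ++i) {
        const float denom = std::max(std::fabs(a[i]), 1e-6f);
        worst = std::max(worst, std::fabs(a[i] - b[i]) / denom);
    }
    return worst;
}
```

If FP32 also tracks poorly, the divergence is more likely in pre/post-processing (mean subtraction, crop geometry, BGR/RGB order) than in precision, since OpenCV’s dnn path also runs in FP32.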