According to the TensorRT 3.0 RC release notes, a number of restrictions on the deconvolution layer have been lifted on both the Jetson and Tesla platforms.
We tried to load a simple FCN-AlexNet based Caffe model, using the deploy_fcnalex.prototxt file whose last two layers are shown below, with the jetson-inference segnet-console application. We changed the jetson-inference code slightly to accommodate different input parameters.
Command line:
./caffe/test/cpp/build/x86_64/bin/segnet_console -model caffe/models/experimental/foo_fcnalex_nov16_54000.caffemodel -prototxt caffe/test/deploy_fcnalex.prototxt -iblob data -oblob softmax_probabilities -input data/194064.jpg
It works on the Tesla platform, but on the Jetson TX2 we get the following error:
[GIE] building CUDA engine
[GIE] Internal error: could not find any implementation for node upscore_kitti, try increasing the workspace size with IBuilder::setMaxWorkspaceSize()
[GIE] cudnnBuilder2.cpp (452) - OutOfMemory Error in buildSingleLayer
[GIE] failed to build CUDA engine
Further details on running on the Jetson TX2:
1) batch size == 1
2) We tried increasing the workspace size:
builder->setMaxWorkspaceSize(16ULL << 31);  // 64-bit operand; a plain `16 << 31` overflows int arithmetic
3) We watched the memory usage on the Jetson while this was happening via "watch free -m". Memory usage never goes beyond 3 GB (the Jetson TX2 has 8 GB).
We are also using a 10 GB swap file, which ended up not being used either.
4) We use a pad value of 0 for the first convolution layer in the deploy prototxt.
--8<--------8<-------- last two layers of deploy_fcnalex.prototxt --8<--
layer {
  name: "upscore_kitti"
  type: "Deconvolution"
  bottom: "score_fr_kitti"
  top: "upscore_kitti"
  param {
    lr_mult: 0
  }
  convolution_param {
    num_output: 2
    bias_term: false
    kernel_size: 63
    stride: 32
  }
}
layer {
  name: "softmax_probabilities"
  type: "Softmax"
  bottom: "upscore_kitti"
  top: "softmax_probabilities"
  softmax_param {
    axis: 0
  }
}