Issues with TensorRT on Drive PX2

I am currently developing an application on the Drive PX2, based on this repo:

https://github.com/dusty-nv/jetson-inference

I am currently facing some issues with TensorRT, which I posted here:

https://github.com/dusty-nv/jetson-inference/issues/134

In summary, there are two issues:

  1. Does the Drive PX2 not support FP16?

The return of

builder->platformHasFastFp16()

is false.

  2. When loading a network I trained myself, I got the following error:

conv1: ERROR - 32-bit weights not found for 32-bit model

This is similar to the following issue:

https://github.com/dusty-nv/jetson-inference/issues/83

Does anyone have any clues how to solve such issues?

BTW, it runs the downloaded pretrained model perfectly; the difficulty is only with my own trained model, even though the model parameters are exactly the same.

Dear zhoubinxyz,

Please see below for your topics. Thanks.

  1. FP16 is supported on the iGPU of the DPX2, but not on the dGPU of the DPX2.
  2. Which version and branch of Caffe was used to train the network? There are syntax / protobuf definition differences between NVCaffe and BVLC Caffe. The prototxt and caffemodel files must come from the same branch/version of Caffe, and TensorRT only works with the BVLC and NVCaffe branches.

Dear SteveNV,

Thanks very much for your reply.

After switching to caffe-0.15, parsing the Batch Normalization layer of Caffe fails with the following error:

```
[GIE] loading ./networks/ParkSeg/new_deploy.prototxt ./networks/ParkSeg/new_model.caffemodel
segnet-console: caffeParser.cpp:861: nvinfer1::ILayer* parseBatchNormalization(nvinfer1::INetworkDefinition&, const ditcaffe::LayerParameter&, CaffeWeightFactory&, BlobNameToTensor&): Assertion `mean.count == variance.count && movingAverage.count == 1' failed.
```

It has been checked that

```
net.params['bn1_1'][0].data
net.params['bn1_1'][1].data
net.params['bn1_1'][2].data
net.params['bn1_1'][3].data
```

all have the same length of 16, so the assertion failure must come from movingAverage.count == 1.
However, we don't have the source code of TensorRT to debug. Could you suggest a possible solution?
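For reference, the assertion appears to match BVLC Caffe's BatchNorm blob convention, in which the third blob is a single-element moving-average scale factor. Below is a minimal numpy sketch of that convention, assuming BVLC semantics; the names (scale_factor, effective_stats, etc.) are illustrative, not TensorRT or Caffe internals:

```python
import numpy as np

# BVLC-Caffe BatchNorm stores three blobs per layer:
#   blobs[0] = running mean * scale_factor   -> count == num_channels
#   blobs[1] = running var  * scale_factor   -> count == num_channels
#   blobs[2] = scale_factor                  -> count == 1  (the "movingAverage")
num_channels = 16
rng = np.random.default_rng(0)
scale_factor = np.array([0.999])                       # single-element blob
mean_blob = rng.normal(size=num_channels) * scale_factor[0]
var_blob = rng.uniform(0.5, 1.5, size=num_channels) * scale_factor[0]

def effective_stats(mean_blob, var_blob, scale_factor):
    """Recover usable running statistics, as BVLC Caffe does at test time."""
    s = 1.0 / scale_factor[0] if scale_factor[0] != 0 else 0.0
    return mean_blob * s, var_blob * s

mean, var = effective_stats(mean_blob, var_blob, scale_factor)
```

A layer with four blobs all of length 16 has no single-element third blob, which is exactly what movingAverage.count == 1 rejects; that layout suggests the model was trained with a Caffe fork whose BatchNorm differs from BVLC's.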

Best,
Zhou Bin

Dear zhoubinxyz,

Could you please file a bug for this issue?

Please see the procedures below to file a bug.

Reporting a Bug

  1. Log in to developer.nvidia.com
  2. In the upper right, click the down arrow next to Hello,
  3. Select My Account
  4. In the left navigation menu, select My Bugs
  5. Select Submit a New Bug (in the upper-right green box, or within the text of the bounded green box)
  6. Fill in the details of your feedback, request, or issue

IMPORTANT: When filing the bug, be sure to include [DRIVE PX 2] in the Summary.
If you have any issues, please contact InfoDRIVEPX@nvidia.com

Thanks, I have submitted the issue.

Conv + BN is a linear operation at inference time. You can merge conv + BN into a single conv; then you can avoid running the BN layer in FP16.
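That merge can be sketched in numpy as follows. This is a minimal illustration, not TensorRT or Caffe code; fold_bn_into_conv is a hypothetical helper, and the sanity check uses a 1x1 convolution so the conv reduces to a per-pixel matrix multiply:

```python
import numpy as np

def fold_bn_into_conv(W, b, mean, var, gamma, beta, eps=1e-5):
    """Fold a BatchNorm (+ optional Scale) into the preceding convolution.

    W: (C_out, C_in, kH, kW) conv weights; b: (C_out,) conv bias.
    mean, var: BN running statistics; gamma, beta: Scale parameters
    (use gamma = 1, beta = 0 if there is no Scale layer).
    """
    scale = gamma / np.sqrt(var + eps)            # per-output-channel factor
    W_folded = W * scale[:, None, None, None]
    b_folded = (b - mean) * scale + beta
    return W_folded, b_folded

# Sanity check: conv -> BN must equal the folded conv alone.
rng = np.random.default_rng(0)
C_out, C_in, eps = 4, 3, 1e-5
W = rng.normal(size=(C_out, C_in, 1, 1))
b = rng.normal(size=C_out)
mean, var = rng.normal(size=C_out), rng.uniform(0.5, 2.0, size=C_out)
gamma, beta = rng.normal(size=C_out), rng.normal(size=C_out)

x = rng.normal(size=(C_in, 5, 5))                 # one input feature map
conv = np.einsum('oi,ihw->ohw', W[:, :, 0, 0], x) + b[:, None, None]
bn_out = (gamma[:, None, None] * (conv - mean[:, None, None])
          / np.sqrt(var + eps)[:, None, None] + beta[:, None, None])

W_f, b_f = fold_bn_into_conv(W, b, mean, var, gamma, beta, eps)
folded = np.einsum('oi,ihw->ohw', W_f[:, :, 0, 0], x) + b_f[:, None, None]
```

The same per-output-channel scaling applies unchanged to larger kernels, since BN acts only on the conv output channels.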