DenseNet121 transplanting using TensorRT

Hi Aasta,

Can we have a direct call to discuss the issue?


Sorry for the late reply.
It looks like there is some misunderstanding of the sample.

Please don’t create the PLAN file on your own.
If no PLAN file exists, the sample will compile the Caffe model into TensorRT and serialize it under the given “PLAN” name.

After that, the application can deserialize the model without recompiling.
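The load-or-build flow can be sketched like this in plain C++ (the actual TensorRT calls are only indicated in comments; `loadPlan`/`savePlan` are hypothetical helper names, not part of the TensorRT API):

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// loadPlan(): returns the serialized engine bytes if the PLAN file exists,
// or an empty vector if it does not (the caller then builds the engine
// from the Caffe model and caches it with savePlan()).
std::vector<char> loadPlan(const std::string& path)
{
    std::ifstream file(path, std::ios::binary | std::ios::ate);
    if (!file) return {};                          // no PLAN yet -> build & save
    std::streamsize size = file.tellg();
    file.seekg(0, std::ios::beg);
    std::vector<char> blob(static_cast<size_t>(size));
    file.read(blob.data(), size);
    return blob;                                   // pass to runtime->deserializeCudaEngine()
}

// savePlan(): writes the bytes returned by engine->serialize() to disk.
void savePlan(const std::string& path, const void* data, size_t size)
{
    std::ofstream file(path, std::ios::binary);
    file.write(static_cast<const char*>(data), static_cast<std::streamsize>(size));
}
```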



I have modified the process; the sample in #27 now runs successfully, and the processing time is acceptable.

Next, let’s focus on solving the serialization issue in our case. I will modify my code; please help check the issue in the serialization step in #30.

Thanks a lot


It looks like a stack overflow issue.
In general, this can be fixed by passing the variables by reference rather than by value.
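For illustration, here is a minimal sketch of the difference (`BigBuffer` and `sumByRef` are made-up names; the 4 MB size is an assumption chosen to exceed typical default stack limits when copied):

```cpp
#include <memory>

// A 4 MB struct. Passing it by value copies the whole struct onto the
// callee's stack frame, which can overflow the default stack; passing it
// by reference only passes an address.
struct BigBuffer { float data[1 << 20]; };  // 1M floats = 4 MB

// float sumByValue(BigBuffer buf);         // risky: 4 MB stack copy per call

float sumByRef(const BigBuffer& buf)        // safe: no copy is made
{
    float total = 0.f;
    for (float v : buf.data) total += v;
    return total;
}
```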



Now the serialization/deserialization in our case has been solved, and the total processing time, including pre-processing and inference, is acceptable.

But the final result is not correct; there must be some mistake in the operations that fetch the final result. Please see the attachment and help check.


As mentioned in comment #27, you will need to update the pre-processing steps based on your implementation:

const size_t plane = inputDims.d[1]*inputDims.d[2];  // elements per channel plane
for( size_t idx=0; idx<plane; idx++ )
{
    input_data[0*plane+idx] = float(img[3*idx+0])-128;  // 'img' is the packed BGR image buffer (name assumed)
    input_data[1*plane+idx] = float(img[3*idx+1])-128;
    input_data[2*plane+idx] = float(img[3*idx+2])-128;
}

The sample here uses the BGR color space and subtracts 128 from each channel as pre-processing.
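A self-contained version of that pre-processing, assuming a packed BGR (HWC) 8-bit input image (`preprocessBGR128` is a made-up name):

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Packed BGR (HWC) uint8 image -> planar (CHW) float tensor,
// subtracting 128 from each channel.
std::vector<float> preprocessBGR128(const uint8_t* img, size_t width, size_t height)
{
    const size_t plane = width * height;     // elements per channel plane
    std::vector<float> out(3 * plane);
    for (size_t idx = 0; idx < plane; ++idx)
    {
        out[0 * plane + idx] = float(img[3 * idx + 0]) - 128;  // B plane
        out[1 * plane + idx] = float(img[3 * idx + 1]) - 128;  // G plane
        out[2 * plane + idx] = float(img[3 * idx + 2]) - 128;  // R plane
    }
    return out;
}
```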


I am now using our own pre-processing module, which has run successfully on different AI platforms; please see the code in my last reply.

I have tested different images and the final results are all 1; there must be some mistake.

Let’s solve this issue today.

Two days have passed and I haven’t received any significant feedback.
Please give your feedback soon, thanks.


Different AI platforms have their own implementations of the network. The code @AastaLLL gave in #27 is verified on Jetson, so we recommend following it so that it works correctly on our platform.

float*  input_data;
float* output_data;
nvinfer1::Dims inputDims  = engine->getBindingDimensions(engine->getBindingIndex( INPUT_BLOB));
nvinfer1::Dims outputDims = engine->getBindingDimensions(engine->getBindingIndex(OUTPUT_BLOB));
cudaMallocManaged( &input_data, inputDims.d[0]* inputDims.d[1]* inputDims.d[2]*sizeof(float));
cudaMallocManaged(&output_data, outputDims.d[0]*outputDims.d[1]*outputDims.d[2]*sizeof(float));

Please make sure the input_data you allocated is the correct float type and has the correct size obtained from the network.
Also be aware that INPUT_BLOB and OUTPUT_BLOB should be correct.
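The size arithmetic from the snippet above can be factored into a small helper and checked without TensorRT (`Dims3` here is a stand-in for nvinfer1::Dims, assuming CHW bindings with d[0..2] = C, H, W):

```cpp
#include <cstddef>

// Stand-in for nvinfer1::Dims with d[0..2] = C, H, W (assumed layout).
struct Dims3 { int d[3]; };

// Element count of a binding, mirroring dims.d[0]*dims.d[1]*dims.d[2] above.
size_t volume(const Dims3& dims)
{
    return size_t(dims.d[0]) * size_t(dims.d[1]) * size_t(dims.d[2]);
}

// Byte size to pass to cudaMallocManaged for a float tensor.
size_t bufferBytes(const Dims3& dims)
{
    return volume(dims) * sizeof(float);
}
```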


Here is some information for your reference.

I checked the DenseNet-121 from this GitHub: GitHub - shicai/DenseNet-Caffe: DenseNet Caffe Models, converted from
and updated the pre-processing based on the model:

#define OUTPUT_BLOB "fc6"
const size_t plane = inputDims.d[1]*inputDims.d[2];  // elements per channel plane
for( size_t idx=0; idx<plane; idx++ )
{
    input_data[0*plane+idx] = 0.017*(float(img[3*idx+0])-103.94);  // 'img' is the packed BGR image buffer (name assumed)
    input_data[1*plane+idx] = 0.017*(float(img[3*idx+1])-116.78);
    input_data[2*plane+idx] = 0.017*(float(img[3*idx+2])-123.68);
}

And I can get the correct class type from the model:

iter=0 index=950: 15.7413