TensorRT C++ optimization profile

Hi @spolisetty !
I resolved the previous issue and have the next question. I'm having trouble in the preprocessImage function: when the application executes, it just freezes and memory allocation keeps growing. I found that it happens at the following step:

for (size_t i = 0; i < channels; ++i)
{
    chw.emplace_back(cv::cuda::GpuMat(input_size, CV_32FC1, gpu_input + i * input_width * input_height));
}

You can find script here, additional files I sent you via DM earlier.
trt_sample.cpp (8.6 KB)

Can you help with it?

@v.stadnichuk,

This looks like an error related to image preprocessing; please make sure the loop is not infinite.
Also, this looks out of scope for TensorRT.

Thank you.

Hi @spolisetty !
I resolved the last issue, but I have another one. The script now fails in postprocessing at this step:

std::vector<float> cpu_output(getSizeByDim(dims) * batch_size);
(line 146)

I get this error:

terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc

Can you help with this?

@spolisetty
The error above looks like it is within the scope of TensorRT optimization and memory handling:

terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc

Do you have any clue about it? Thanks.

Could you please share a repro for the issue: the new inference script and the test image you're using. If possible, also share verbose error logs for better debugging.

Thank you.

Hi @spolisetty !
I sent it to you via DM.
Thank you!

Sorry for the delayed response. Are you still facing this issue?

Dear @spolisetty ,

I’m working together with @v.stadnichuk .
As far as I know, the described issue still exists, and all logs and inputs were provided.
The ONNX model is very small and simple:
//input tensor
name: input_1
type: float32[N,64,64,3]
//output tensor
name: lr
type: float32[N,1]
Maybe the output tensor size is too small, or the data type is incorrect?

Could you please help us fix this issue?

Thank you for support,

Dear @spolisetty .

How are you doing? Do you have any updates regarding the current issue?

In the meantime, I'd like to ask you for some clarification.
I've gone back to the sampleOnnxMNIST.cpp code.
Since our model should use dynamic input shapes, I've added the following code to SampleOnnxMNIST::constructNetwork():

Case A:

// Create an optimization profile so that we can specify a range of input dimensions.
auto profile = builder->createOptimizationProfile();
profile->setDimensions("input_1", OptProfileSelector::kMIN, Dims4{1, 64, 64, 3});
profile->setDimensions("input_1", OptProfileSelector::kOPT, Dims4{20, 64, 64, 3});
profile->setDimensions("input_1", OptProfileSelector::kMAX, Dims4{100, 64, 64, 3});
config->addOptimizationProfile(profile);

Case B:

// Create an optimization profile so that we can specify a range of input dimensions.
auto profile = builder->createOptimizationProfile();
profile->setDimensions("input_1", OptProfileSelector::kMIN, Dims4{1, 64, 64, 3});
profile->setDimensions("input_1", OptProfileSelector::kOPT, Dims4{1, 64, 64, 3});
profile->setDimensions("input_1", OptProfileSelector::kMAX, Dims4{1, 64, 64, 3});
config->addOptimizationProfile(profile);

For both cases (A) and (B) I get a runtime error:

****************** infer() *******************
terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc

This happens when we try to get the buffers from the engine:

// Create RAII buffer manager object
samplesCommon::BufferManager buffers(mEngine, mParams.batchSize = 1);

However, when I build the engine with the trtexec tool:

sudo ./trtexec --verbose --onnx=/usr/src/tensorrt/data/mnist/apm_one_input.onnx --explicitBatch=1 --dumpProfile --int8 --shapes=input_1:1x64x64x3,input_1:20x64x64x3,input_1:100x64x64x3 --saveEngine=engine.trt

and then load the engine file in my app:

IRuntime* runtime = createInferRuntime(gLogger);
ICudaEngine *engine = runtime->deserializeCudaEngine(…)

no errors occur.

Could you please explain why trtexec accepts the shapes, but the app using an optimization profile hits a runtime error?

Thank you for support,

Hi,

Apologies for the delayed response. The errors above look like a bad pointer access or incorrect memory allocation. Since you are able to build the engine using trtexec, please make sure you are allocating memory correctly in your script.

Thank you.