TensorRT C++ optimization profile

Hi @spolisetty !
I resolved the previous issue and have the next question. I'm having trouble in the preprocessImage function: when the application executes, it just freezes and memory allocation keeps growing. I found that it happens at the following step:

for (size_t i = 0; i < channels; ++i)
{
    chw.emplace_back(cv::cuda::GpuMat(input_size, CV_32FC1, gpu_input + i * input_width * input_height));
}

You can find script here, additional files I sent you via DM earlier.
trt_sample.cpp (8.6 KB)

Can you help with it?

@v.stadnichuk,

This looks like an error related to image preprocessing; please make sure the loop is not infinite.
Also, this looks out of scope for TensorRT.

Thank you.

Hi @spolisetty !
I resolved the last issue, but I have another one. The script now fails in postprocessing at this step:

std::vector<float> cpu_output(getSizeByDim(dims) * batch_size);
(line 146)

I get this error:

terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc

Can you help with this?

@spolisetty
The error above looks like it is within the scope of TensorRT optimization and memory handling:

terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc

Do you have any clue about it? Thanks.

Could you please share a repro for the issue: the new inference script and the test image you're using. If possible, also share verbose error logs for better debugging.

Thank you.

Hi @spolisetty !
I sent it to you via DM.
Thank you!

Sorry for the delayed response. Are you still facing this issue?

Dear @spolisetty ,

I’m working together with @v.stadnichuk .
As far as I know, the described issue still exists, and all logs and inputs were provided.
The ONNX model is very small and simple:
//input tensor
name: input_1
type: float32[N,64,64,3]
//output tensor
name: lr
type: float32[N,1]
Maybe the output tensor size is too small, or the data type is incorrect?

Could you please help us fix this issue?

Thank you for support,

Dear @spolisetty .

How are you doing? Do you have any updates regarding the current issue?

In the meantime, I'd like to ask you for some clarification.
I've gone back to the sampleOnnxMNIST.cpp code.
Since our model should use dynamic input shapes, I've added the following code to SampleOnnxMNIST::constructNetwork():

Case A:

// Create an optimization profile so that we can specify a range of input dimensions.
auto profile = builder->createOptimizationProfile();
profile->setDimensions("input_1", OptProfileSelector::kMIN, Dims4{1, 64, 64, 3});
profile->setDimensions("input_1", OptProfileSelector::kOPT, Dims4{20, 64, 64, 3});
profile->setDimensions("input_1", OptProfileSelector::kMAX, Dims4{100, 64, 64, 3});
config->addOptimizationProfile(profile);

Case B:

// Create an optimization profile so that we can specify a range of input dimensions.
auto profile = builder->createOptimizationProfile();
profile->setDimensions("input_1", OptProfileSelector::kMIN, Dims4{1, 64, 64, 3});
profile->setDimensions("input_1", OptProfileSelector::kOPT, Dims4{1, 64, 64, 3});
profile->setDimensions("input_1", OptProfileSelector::kMAX, Dims4{1, 64, 64, 3});
config->addOptimizationProfile(profile);

For both cases (A) and (B) I get a runtime error:

****************** infer() *******************
terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc

This happens when we try to get the buffers from the engine:

// Create RAII buffer manager object
samplesCommon::BufferManager buffers(mEngine, mParams.batchSize = 1);

However, when I build the engine with the trtexec tool:

sudo ./trtexec --verbose --onnx=/usr/src/tensorrt/data/mnist/apm_one_input.onnx --explicitBatch=1 --dumpProfile --int8 --shapes=input_1:1x64x64x3,input_1:20x64x64x3,input_1:100x64x64x3 --saveEngine=engine.trt

and then load the engine file in my app:

IRuntime* runtime = createInferRuntime(gLogger);
ICudaEngine *engine = runtime->deserializeCudaEngine(…)

no errors occur.

Could you please explain why trtexec accepts the shapes, but the app using an optimization profile hits a runtime error?

Thank you for support,

Hi,

Apologies for the delayed response. The errors above look like a bad pointer access or incorrect memory allocation. Since you are able to build the engine using trtexec, please make sure you are allocating memory correctly in your script.

Thank you.