Use trtexec to test resnet50 in resolution 1920*1080

xidiantuoersuo · May 8, 2021, 9:10am

Hello

Description

Use trtexec on Xavier to measure the inference time of ResNet-50 at a resolution of 1920*1080.

Environment

TensorRT Version: 5.1
GPU Type: Xavier
CUDA Version: 10.0

Relevant Files

github.com

KaimingHe/deep-residual-networks/blob/master/prototxt/ResNet-50-deploy.prototxt

name: "ResNet-50"
input: "data"
input_dim: 1
input_dim: 3
input_dim: 224
input_dim: 224

layer {
	bottom: "data"
	top: "conv1"
	name: "conv1"
	type: "Convolution"
	convolution_param {
		num_output: 64
		kernel_size: 7
		pad: 3
		stride: 2
	}
}

This file has been truncated.

Steps To Reproduce

Change the ResNet50 input shape from 1 * 3 * 224 * 224 to 1 * 3 * 1080 * 1920.
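The shape change in this step can be scripted rather than edited by hand. The following is a minimal sketch, assuming the prototxt declares its input with four top-level `input_dim:` lines in N, C, H, W order, as in the excerpt above. The helper name `set_input_shape` is hypothetical and is not part of Caffe or TensorRT.

```python
import re


def set_input_shape(prototxt_text, height, width):
    """Return the prototxt text with the H and W input_dim values replaced.

    Assumes the first four input_dim lines are N, C, H, W (an assumption
    based on the ResNet-50 deploy file shown in this thread).
    """
    pattern = re.compile(r"^([ \t]*input_dim:[ \t]*)(\d+)[ \t]*$", re.MULTILINE)
    matches = list(pattern.finditer(prototxt_text))
    if len(matches) < 4:
        raise ValueError("expected at least four input_dim lines (N, C, H, W)")

    new_values = {2: height, 3: width}  # index 2 is H, index 3 is W
    pieces = []
    last = 0
    for i, m in enumerate(matches[:4]):
        # Copy everything up to the number, then write the old or new value.
        pieces.append(prototxt_text[last:m.start(2)])
        pieces.append(str(new_values.get(i, m.group(2))))
        last = m.end(2)
    pieces.append(prototxt_text[last:])
    return "".join(pieces)
```

For example, `set_input_shape(text, 1080, 1920)` leaves the batch and channel lines alone and rewrites only the last two dimensions. The rest of the file, including every layer definition, is passed through unchanged.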

./trtexec --avgRuns=10 --deploy=ResNet50_N2.prototxt --int8 --batch=1  --output=prob

result

avgRuns: 10
deploy: ResNet50_N2.prototxt
int8
batch: 1
output: prob
Parameter check failed at: ../builder/Network.cpp::addFullyConnected::78, condition: kernelWeights.values != nullptr
error parsing layer type InnerProduct index 226
Engine could not be created
Engine could not be created

Thanks

NVES · May 10, 2021, 5:15am

Hi,
Please refer to the installation steps at the link below in case you are missing anything.

However, the suggested approach is to use the TRT NGC containers to avoid any system-dependency issues.

To run the Python samples, make sure the TRT Python packages are installed when using the NGC container:
/opt/tensorrt/python/python_setup.sh
Thanks!

xidiantuoersuo · May 10, 2021, 6:22am

Hi,

Thank you for your reply, but the installation steps are correct.

If I use ResNet50 at 1 * 3 * 224 * 224, the result is OK, as follows:

 ./trtexec --avgRuns=10 --deploy=ResNet50_N2.prototxt --int8 --batch=1  --output=prob
avgRuns: 10
deploy: ResNet50_N2.prototxt
int8
batch: 1
output: prob
Input "data": 3x224x224
Output "prob": 1000x1x1
name=data, bindingIndex=0, buffers.size()=2
name=prob, bindingIndex=1, buffers.size()=2
Average over 10 runs is 2.24991 ms (host walltime is 2.34324 ms, 99% percentile time is 2.2753).
Average over 10 runs is 2.23246 ms (host walltime is 2.30345 ms, 99% percentile time is 2.25008).
Average over 10 runs is 2.23027 ms (host walltime is 2.29599 ms, 99% percentile time is 2.24582).
Average over 10 runs is 2.23384 ms (host walltime is 2.29854 ms, 99% percentile time is 2.25206).
Average over 10 runs is 2.23196 ms (host walltime is 2.29724 ms, 99% percentile time is 2.26346).
Average over 10 runs is 2.23042 ms (host walltime is 2.295 ms, 99% percentile time is 2.27827).
Average over 10 runs is 2.22992 ms (host walltime is 2.29304 ms, 99% percentile time is 2.27405).
Average over 10 runs is 2.22126 ms (host walltime is 2.28345 ms, 99% percentile time is 2.24029).
Average over 10 runs is 2.22063 ms (host walltime is 2.29072 ms, 99% percentile time is 2.22464).
Average over 10 runs is 2.22029 ms (host walltime is 2.28319 ms, 99% percentile time is 2.23456).

But if I modify ResNet50 as follows:

name: "ResNet-50"
input: "data"
input_dim: 1
input_dim: 3
input_dim: 1080
input_dim: 1920

layer {
	bottom: "data"
	top: "conv1"
	name: "conv1"
	type: "Convolution"
	convolution_param {
		num_output: 64
		kernel_size: 7
		pad: 3
		stride: 2
	}
}
...

I only changed the size of the input, and now the inference time cannot be measured:

./trtexec --avgRuns=10 --deploy=ResNet50_N2.prototxt --int8 --batch=1  --output=prob
avgRuns: 10
deploy:ResNet50_N2.prototxt
int8
batch: 1
output: prob
Parameter check failed at: ../builder/Network.cpp::addFullyConnected::78, condition: kernelWeights.values != nullptr
error parsing layer type InnerProduct index 226
Engine could not be created
Engine could not be created

spolisetty · May 10, 2021, 6:46am

Hi @xidiantuoersuo,

We noticed that you are using an old version of TensorRT. We recommend that you install the latest version and try again. Please let us know if you still face this issue.

Thank you.

xidiantuoersuo · May 10, 2021, 7:28am

Hi @spolisetty,
Thank you for your reply. I have changed the TensorRT version to 7.0.0. The result is as follows:

[05/10/2021-07:23:49] [I] Averages: 10 inferences
[05/10/2021-07:23:49] [I] Percentile: 99
[05/10/2021-07:23:49] [I] Dump output: Disabled
[05/10/2021-07:23:49] [I] Profile: Disabled
[05/10/2021-07:23:49] [I] Export timing to JSON file: 
[05/10/2021-07:23:49] [I] Export output to JSON file: 
[05/10/2021-07:23:49] [I] Export profile to JSON file: 
[05/10/2021-07:23:49] [I] 
[05/10/2021-07:24:06] [I] [TRT] Some tactics do not have sufficient workspace memory to run. Increasing workspace size may increase performance, please check verbose output.
[05/10/2021-07:24:34] [E] [TRT] ../rtSafe/safeRuntime.cpp (25) - Cuda Error in allocate: 2 (out of memory)
[05/10/2021-07:24:34] [W] [TRT] GPU memory allocation error during getBestTactic: fc1000
[05/10/2021-07:24:34] [E] [TRT] Internal error: could not find any implementation for node fc1000, try increasing the workspace size with IBuilder::setMaxWorkspaceSize()
[05/10/2021-07:24:34] [E] [TRT] ../builder/tacticOptimizer.cpp (1523) - OutOfMemory Error in computeCosts: 0
[05/10/2021-07:24:35] [E] Engine creation failed
[05/10/2021-07:24:35] [E] Engine set up failed

spolisetty · May 10, 2021, 7:49am

Hi @xidiantuoersuo,

It looks like the error is due to insufficient memory. We recommend increasing the workspace size using the --workspace option in trtexec.
Please make sure enough GPU memory is available. You can check GPU utilization with nvidia-smi.

Thank you.
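
A note on why fc1000 is the layer that fails in both logs: the thread does not confirm the root cause, but some rough shape arithmetic is suggestive. The sketch below assumes the standard ResNet-50 deploy layout (conv1 with kernel 7, pad 3, stride 2; pool1 with kernel 3, stride 2; three stride-2 1x1 convolutions at res3a, res4a and res5a; pool5 as a fixed 7x7 average pool with stride 1; fc1000 with 1000 outputs on 2048 channels). It also assumes Caffe's rounding rules (convolution rounds down, pooling rounds up). None of these values appear in the truncated excerpt above except conv1, so treat them as assumptions. The function names are illustrative only.

```python
import math


def conv_out(size, kernel, stride, pad=0):
    # Caffe convolution output size (rounds down).
    return (size + 2 * pad - kernel) // stride + 1


def pool_out(size, kernel, stride, pad=0):
    # Caffe pooling output size (rounds up).
    return math.ceil((size + 2 * pad - kernel) / stride) + 1


def fc1000_input_shape(height, width):
    """Shape (C, H, W) of the tensor feeding fc1000, under the assumptions above."""
    dims = []
    for s in (height, width):
        s = conv_out(s, kernel=7, stride=2, pad=3)  # conv1
        s = pool_out(s, kernel=3, stride=2)         # pool1
        for _ in range(3):                          # res3a, res4a, res5a (stride 2)
            s = conv_out(s, kernel=1, stride=2)
        s = pool_out(s, kernel=7, stride=1)         # pool5 (fixed 7x7 window)
        dims.append(s)
    return (2048, dims[0], dims[1])


def fc1000_weight_bytes(height, width, num_output=1000, bytes_per_weight=4):
    """Size of the fc1000 weight matrix, assuming FP32 storage."""
    c, h, w = fc1000_input_shape(height, width)
    return c * h * w * num_output * bytes_per_weight


print(fc1000_input_shape(224, 224))                      # (2048, 1, 1)
print(fc1000_input_shape(1080, 1920))                    # (2048, 28, 54)
print(round(fc1000_weight_bytes(1080, 1920) / 1e9, 1))   # 12.4
```

Under these assumptions, at 224x224 the pooled feature map is 1x1 and fc1000 sees 2048 inputs, which matches the `Output "prob": 1000x1x1` run above. At 1080x1920 the fixed 7x7 pool leaves a 28x54 map, so fc1000 would need roughly 3.1 million inputs per output and about 12.4 GB of FP32 weights. If that estimate holds, a larger --workspace value alone may not be enough, since the weights themselves would already take up most or all of the memory available on the device.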
