Using trtexec to benchmark ResNet50 at 1920*1080 resolution

Hello

Description

I am using trtexec on Xavier to measure the inference time of ResNet50 at a resolution of 1920*1080.

Environment

TensorRT Version: 5.1
GPU Type: Xavier
CUDA Version: 10.0

Relevant Files

Steps To Reproduce

Modify the ResNet50 input data shape from 1 * 3 * 224 * 224 to 1 * 3 * 1080 * 1920, then run:
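
The shape change can also be scripted instead of edited by hand. This is a minimal sketch, assuming the prototxt declares its input with four top-level `input_dim:` lines as in the excerpt later in this thread. The helper name `set_input_dims` is hypothetical and not part of TensorRT or Caffe.

```python
import re

def set_input_dims(prototxt_text, dims):
    """Overwrite the first len(dims) `input_dim:` values, in order (hypothetical helper)."""
    values = iter(dims)

    def replace_value(match):
        # Keep the original "input_dim: " prefix, swap only the number.
        return f"{match.group(1)}{next(values)}"

    return re.sub(
        r"^([ \t]*input_dim:[ \t]*)\d+",
        replace_value,
        prototxt_text,
        count=len(dims),
        flags=re.MULTILINE,
    )

# Header in the same layout as the ResNet-50 deploy prototxt.
original = (
    'name: "ResNet-50"\n'
    'input: "data"\n'
    "input_dim: 1\n"
    "input_dim: 3\n"
    "input_dim: 224\n"
    "input_dim: 224\n"
)

# Order is N, C, H, W, so 1080 is the height and 1920 the width.
print(set_input_dims(original, (1, 3, 1080, 1920)))
```

Only the header is touched. Layer definitions, including the final InnerProduct layer, are left exactly as they were.
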

./trtexec --avgRuns=10 --deploy=ResNet50_N2.prototxt --int8 --batch=1  --output=prob 

Result

avgRuns: 10
deploy: ResNet50_N2.prototxt
int8
batch: 1
output: prob
Parameter check failed at: ../builder/Network.cpp::addFullyConnected::78, condition: kernelWeights.values != nullptr
error parsing layer type InnerProduct index 226
Engine could not be created
Engine could not be created

Thanks

Hi,
Please refer to the installation steps at the link below in case you missed anything:
https://docs.nvidia.com/deeplearning/tensorrt/install-guide/index.html
However, the suggested approach is to use the TensorRT NGC containers to avoid any system-dependency issues:
https://ngc.nvidia.com/catalog/containers/nvidia:tensorrt

To run the Python samples in an NGC container, make sure the TensorRT Python packages are installed:
/opt/tensorrt/python/python_setup.sh
Thanks!

Hi,

Thank you for your reply, but the installation steps are correct.

If I use ResNet50 with a 1 * 3 * 224 * 224 input, the result is OK, as follows:

 ./trtexec --avgRuns=10 --deploy=ResNet50_N2.prototxt --int8 --batch=1  --output=prob
avgRuns: 10
deploy: ResNet50_N2.prototxt
int8
batch: 1
output: prob
Input "data": 3x224x224
Output "prob": 1000x1x1
name=data, bindingIndex=0, buffers.size()=2
name=prob, bindingIndex=1, buffers.size()=2
Average over 10 runs is 2.24991 ms (host walltime is 2.34324 ms, 99% percentile time is 2.2753).
Average over 10 runs is 2.23246 ms (host walltime is 2.30345 ms, 99% percentile time is 2.25008).
Average over 10 runs is 2.23027 ms (host walltime is 2.29599 ms, 99% percentile time is 2.24582).
Average over 10 runs is 2.23384 ms (host walltime is 2.29854 ms, 99% percentile time is 2.25206).
Average over 10 runs is 2.23196 ms (host walltime is 2.29724 ms, 99% percentile time is 2.26346).
Average over 10 runs is 2.23042 ms (host walltime is 2.295 ms, 99% percentile time is 2.27827).
Average over 10 runs is 2.22992 ms (host walltime is 2.29304 ms, 99% percentile time is 2.27405).
Average over 10 runs is 2.22126 ms (host walltime is 2.28345 ms, 99% percentile time is 2.24029).
Average over 10 runs is 2.22063 ms (host walltime is 2.29072 ms, 99% percentile time is 2.22464).
Average over 10 runs is 2.22029 ms (host walltime is 2.28319 ms, 99% percentile time is 2.23456).

But if I modify the ResNet50 prototxt as follows:

name: "ResNet-50"
input: "data"
input_dim: 1
input_dim: 3
input_dim: 1080
input_dim: 1920

layer {
	bottom: "data"
	top: "conv1"
	name: "conv1"
	type: "Convolution"
	convolution_param {
		num_output: 64
		kernel_size: 7
		pad: 3
		stride: 2
	}
}
...

I only changed the input size, and now the inference time cannot be measured:

./trtexec --avgRuns=10 --deploy=ResNet50_N2.prototxt --int8 --batch=1  --output=prob
avgRuns: 10
deploy:ResNet50_N2.prototxt
int8
batch: 1
output: prob
Parameter check failed at: ../builder/Network.cpp::addFullyConnected::78, condition: kernelWeights.values != nullptr
error parsing layer type InnerProduct index 226
Engine could not be created
Engine could not be created

Hi @xidiantuoersuo,

We noticed that you are using an old version of TensorRT. We recommend installing the latest version and trying again. Please let us know if you still face this issue.

Thank you.

Hi @spolisetty,
Thank you for your reply. I have changed the TensorRT version to 7.0.0. The result is as follows:

[05/10/2021-07:23:49] [I] Averages: 10 inferences
[05/10/2021-07:23:49] [I] Percentile: 99
[05/10/2021-07:23:49] [I] Dump output: Disabled
[05/10/2021-07:23:49] [I] Profile: Disabled
[05/10/2021-07:23:49] [I] Export timing to JSON file: 
[05/10/2021-07:23:49] [I] Export output to JSON file: 
[05/10/2021-07:23:49] [I] Export profile to JSON file: 
[05/10/2021-07:23:49] [I] 
[05/10/2021-07:24:06] [I] [TRT] Some tactics do not have sufficient workspace memory to run. Increasing workspace size may increase performance, please check verbose output.
[05/10/2021-07:24:34] [E] [TRT] ../rtSafe/safeRuntime.cpp (25) - Cuda Error in allocate: 2 (out of memory)
[05/10/2021-07:24:34] [W] [TRT] GPU memory allocation error during getBestTactic: fc1000
[05/10/2021-07:24:34] [E] [TRT] Internal error: could not find any implementation for node fc1000, try increasing the workspace size with IBuilder::setMaxWorkspaceSize()
[05/10/2021-07:24:34] [E] [TRT] ../builder/tacticOptimizer.cpp (1523) - OutOfMemory Error in computeCosts: 0
[05/10/2021-07:24:35] [E] Engine creation failed
[05/10/2021-07:24:35] [E] Engine set up failed
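
A rough size estimate may help explain why fc1000 is the layer that fails in both logs. The sketch below traces the spatial size through the network and counts the weights the final InnerProduct layer would need. Only the conv1 parameters come from the prototxt excerpt above. The pool1, stride-2 stage and pool5 parameters, and the 2048 channels, are assumptions taken from the standard ResNet-50 Caffe definition and may not match ResNet50_N2.prototxt exactly.

```python
import math

def conv_out(size, kernel, stride, pad=0):
    # Caffe convolution rounds the output size down.
    return (size + 2 * pad - kernel) // stride + 1

def pool_out(size, kernel, stride, pad=0):
    # Caffe pooling rounds the output size up.
    return math.ceil((size + 2 * pad - kernel) / stride) + 1

def fc_input_size(size):
    """Spatial size reaching fc1000 along one dimension."""
    size = conv_out(size, kernel=7, stride=2, pad=3)  # conv1, from the prototxt excerpt
    size = pool_out(size, kernel=3, stride=2)         # pool1 (assumed)
    for _ in range(3):                                # res3a/res4a/res5a stride-2 1x1 convs (assumed)
        size = conv_out(size, kernel=1, stride=2)
    return pool_out(size, kernel=7, stride=1)         # pool5, 7x7 average pooling (assumed)

def fc_weight_count(height, width, channels=2048, num_output=1000):
    # An InnerProduct layer needs one weight per (input element, output) pair.
    return channels * fc_input_size(height) * fc_input_size(width) * num_output

for h, w in [(224, 224), (1080, 1920)]:
    n = fc_weight_count(h, w)
    print(f"{h}x{w}: pool5 output {fc_input_size(h)}x{fc_input_size(w)}, "
          f"fc1000 weights {n:,} (~{n * 4 / 2**30:.2f} GiB in FP32)")
```

Under these assumptions, a 224x224 input reaches fc1000 as 2048x1x1, but a 1080x1920 input reaches it as 2048x28x54. The layer would then need about 3.1 billion weights, roughly 11.5 GiB in FP32, which may exceed the memory available to the GPU.
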

Hi @xidiantuoersuo,

It looks like the error is due to insufficient memory. We recommend increasing the workspace size using the --workspace option in trtexec.
Please make sure enough GPU memory is available. You can check GPU utilization with nvidia-smi.

Thank you.
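
The suggestion above amounts to adding `--workspace` to the original command. This is a minimal sketch that assembles that command line, assuming the TensorRT 7 trtexec, where `--workspace` takes a size in megabytes. The helper name `trtexec_command` and the 4096 MB value are illustrative only, not recommended settings.

```python
def trtexec_command(deploy, output="prob", batch=1, avg_runs=10,
                    int8=True, workspace_mb=None):
    """Build the trtexec argument list used in this thread (hypothetical helper)."""
    cmd = [
        "./trtexec",
        f"--avgRuns={avg_runs}",
        f"--deploy={deploy}",
        f"--batch={batch}",
        f"--output={output}",
    ]
    if int8:
        cmd.append("--int8")
    if workspace_mb is not None:
        # Workspace size in megabytes; 4096 below is an arbitrary example value.
        cmd.append(f"--workspace={workspace_mb}")
    return cmd

command = trtexec_command("ResNet50_N2.prototxt", workspace_mb=4096)
print(" ".join(command))
# To actually run it: subprocess.run(command, check=True)
```

On Jetson devices such as Xavier, `tegrastats` is commonly used to watch memory use while the engine builds. If the fully connected layer itself needs more memory than the device has, a larger workspace alone may not be enough.
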