I am running TensorRT inference using a model written in TensorFlow and converted to UFF. The model is saved with an input size of (1, 601, 400). However, when I try to create the TensorRT engine (at the call to buildCudaEngine) with the input height set to 480, I get the following error:
INFO: --------------- Timing deconv4/conv2d_transpose(4)
INFO: Tactic 1 time 2.20518
ERROR: c:\p4sw\sw\gpgpu\MachineLearning\DIT\release.0\builder\cudnnBuilderUtils.cpp (255) - Cuda Error in nvinfer1::cudnn::findFastestTactic: 77
ERROR: c:\p4sw\sw\gpgpu\MachineLearning\DIT\release.0\engine\runtime.cpp (30) - Cuda Error in nvinfer1::`anonymous-namespace'::DefaultAllocator::free: 77
If I change the input height to 478 or 481 without changing anything else, the engine builds and I can run inference without any issues. A height of 479 does not work either. I also tried creating the engine from a different UFF file with the input set to (1, 480, 400), and got the same error. Why could I be getting this error, and how can I fix it?
1, I cannot reproduce this issue on Linux with the current release (5.0.2) or with the next release.
I tested parameter ranges like this:
for (int imageH = 475; imageH < 485; ++imageH)
    for (int imageW = 395; imageW < 405; ++imageW)
        for (int maxBatchSize = 1; maxBatchSize < 5; ++maxBatchSize)
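A sweep like the one above could be wrapped in a small harness that reports the first failing configuration. This is only a sketch, not the attached test code: `Config`, `BuildFn`, and `findFirstFailure` are hypothetical names, and the callback stands in for whatever parses the UFF file and calls buildCudaEngine. It stops at the first failure because an illegal-address error leaves the process unusable for further CUDA work.

```cpp
#include <functional>

// One point in the parameter sweep.
struct Config {
    int imageH;
    int imageW;
    int maxBatchSize;
};

// Hypothetical callback: returns true if the engine built successfully.
// In a real test it would wrap the UFF parser and the buildCudaEngine call.
using BuildFn = std::function<bool(const Config&)>;

// Walks the same ranges as the loops above. Returns true and fills `failed`
// with the first configuration whose build failed; returns false if every
// build succeeded. The sweep stops at the first failure on purpose: after
// CUDA error 77 the process has to be restarted anyway.
bool findFirstFailure(const BuildFn& tryBuild, Config& failed) {
    for (int imageH = 475; imageH < 485; ++imageH)
        for (int imageW = 395; imageW < 405; ++imageW)
            for (int maxBatchSize = 1; maxBatchSize < 5; ++maxBatchSize) {
                const Config cfg{imageH, imageW, maxBatchSize};
                if (!tryBuild(cfg)) {
                    failed = cfg;
                    return true;
                }
            }
    return false;
}
```

To continue past a failing configuration, one option would be to run each build in a fresh process and restart the sweep from the next configuration.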
2, From the error message in the bug description, I suspect the error is caused by a kernel failure, and the next call to cudaFree then caught it. Error 77 means:
/**
* The device encountered a load or store instruction on an invalid memory address.
* This leaves the process in an inconsistent state and any further CUDA work
* will return the same error. To continue using CUDA, the process must be terminated
* and relaunched.
*/
cudaErrorIllegalAddress = 77,
3, The tactic number in the log is 1, which is a sign of a cuDNN-related call, because 1) cask kernels use hashed values as tactic numbers, so those numbers are very large, and 2) inside TRT, the cuDNN algorithm type (in the range 0 to 7) is used as the tactic for cuDNN-related layers.
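The reasoning in point 3 can be sketched as a small log check. This is an illustration only: `TacticKind`, `classifyTactic`, and `parseTacticLine` are hypothetical helpers, the 0 to 7 range is taken from the explanation above, and the log line format is assumed from the "INFO: Tactic 1 time 2.20518" line quoted in the question.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

enum class TacticKind { CudnnAlgo, HashedKernel };

// Heuristic from point 3: cuDNN algorithm types fall in the range 0..7,
// while cask kernels use large hashed values as their tactic number.
TacticKind classifyTactic(std::int64_t tactic) {
    return (tactic >= 0 && tactic <= 7) ? TacticKind::CudnnAlgo
                                        : TacticKind::HashedKernel;
}

// Parses a builder log line of the assumed form "INFO: Tactic <id> time <ms>".
// Returns false if the line does not have that shape.
bool parseTacticLine(const std::string& line, std::int64_t& tactic) {
    std::istringstream in(line);
    std::string level;
    std::string word;
    if (!(in >> level >> word)) return false;
    if (level != "INFO:" || word != "Tactic") return false;
    return static_cast<bool>(in >> tactic);
}
```

Applied to the log in the question, the line before the failure parses to tactic 1, which this heuristic classifies as a cuDNN algorithm rather than a hashed kernel.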
So my suggestions are:
1) Link against the next version of TRT (coming soon) and check whether it works.
2) Use ldd to check that the correct version of cuDNN is being loaded, or re-install the latest version of cuDNN, and re-run.
For reference, the engineering test code snippet is attached.