TF-TRT Cuda error on Pegasus

mike19931010 · June 10, 2019, 9:03am

Hi,

I’m using Pegasus and trying to build a TF-TRT model using trt.create_inference_graph.

When I try to generate the model, these errors occur:

2019-06-10 16:51:26.724539: E tensorflow/contrib/tensorrt/log/trt_logger.cc:38] DefaultLogger engine.cpp (99) - Cuda Error in initializeCommonContext: 4 (Could not initialize cudnn, please check cudnn installation.)
2019-06-10 16:51:26.738716: E tensorflow/contrib/tensorrt/log/trt_logger.cc:38] DefaultLogger engine.cpp (99) - Cuda Error in initializeCommonContext: 4 (Could not initialize cudnn, please check cudnn installation.)
2019-06-10 16:51:26.739282: W tensorflow/contrib/tensorrt/kernels/trt_engine_op.cc:511] Engine creation for batch size 16 failed Internal: Failed to build TensorRT engine
2019-06-10 16:51:26.739366: W tensorflow/contrib/tensorrt/kernels/trt_engine_op.cc:290] Engine retrieval for batch size 1 failed. Running native segment for TRTEngineOp_0
2019-06-10 16:51:26.846929: E tensorflow/stream_executor/cuda/cuda_dnn.cc:334] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2019-06-10 16:51:26.861626: E tensorflow/stream_executor/cuda/cuda_dnn.cc:334] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2019-06-10 16:51:26.863472: E tensorflow/contrib/tensorrt/kernels/trt_engine_op.cc:180] Failed to execute native segment TRTEngineOp_0: Unknown: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
[[{{node MobilenetV2/Conv/Conv2D}}]]
Exception in thread Thread-8:
Traceback (most recent call last):
  File "/home/nvidia/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1334, in _do_call
    return fn(*args)
  File "/home/nvidia/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1319, in _run_fn
    options, feed_dict, fetch_list, target_list, run_metadata)
  File "/home/nvidia/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1407, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
[[{{node MobilenetV2/Conv/Conv2D}}]]
[[{{node SemanticPredictions}}]]

I’m not sure whether the problem is that the TensorFlow version does not match the cuDNN or TRT version.
All of this code runs on Pegasus.

TRT 5.0.3
TensorFlow 1.13.0-rc0
cuDNN 7.3.1
CUDA 10.0

Could you help me with this issue?
Thanks.

Pooya-Davoodi · July 1, 2019, 10:37pm

This could be due to the GPU running out of memory (OOM). Could you try reducing the TF GPU memory fraction, config.gpu_options.per_process_gpu_memory_fraction?
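For anyone unsure where that option goes, here is a minimal sketch for the TF 1.x session API. The helper name make_session_config and the 0.5 default are assumptions for illustration only, not values from this thread. The right fraction depends on how much GPU memory TensorRT and cuDNN need on your board, so treat it as a starting point to tune.

```python
def make_session_config(memory_fraction=0.5):
    """Build a TF 1.x session config that caps TensorFlow's share of GPU memory.

    make_session_config is a hypothetical helper; 0.5 is an arbitrary
    starting value, not a recommendation from the thread.
    """
    # Validate first so a bad value fails before TensorFlow is loaded.
    if not 0.0 < memory_fraction <= 1.0:
        raise ValueError("memory_fraction must be in the range (0, 1]")

    import tensorflow as tf  # TF 1.x API (tf.ConfigProto), imported lazily

    config = tf.ConfigProto()
    # Limit how much of the GPU's memory the TensorFlow process may allocate,
    # leaving the rest available to other GPU users such as TensorRT and cuDNN.
    config.gpu_options.per_process_gpu_memory_fraction = memory_fraction
    return config


# Usage (requires TensorFlow 1.x on the target machine):
# with tf.Session(config=make_session_config(0.5)) as sess:
#     ...  # run the TF-TRT graph here
```

If the cuDNN initialization error persists, lowering the fraction further is a reasonable next step before suspecting a version mismatch.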

shadysource2 · September 5, 2019, 7:46pm

Thank you, limiting TF GPU mem fraction fixed this for me!
