TensorRT 4 INT8 building - ERROR: cudnnEngine.cpp (85) - Cuda Error in initializeCommonContext: 4

Hi,

I am using TensorRT to build an INT8 inference model, similar to the way it’s done in sampleINT8. My system is:

- OS: Ubuntu 16.04 (amd64)
- GPU: GeForce GTX 1050 Ti
- CUDA: 9.0.176
- cuDNN: 7.0.5.15
- TensorRT: 4.0.0.3 RC
- Driver: 384.111

I am running the code through nvidia-docker, using nvidia/cuda:9.0-devel-ubuntu16.04 as the base image, on top of which I install cuDNN as done here: https://gitlab.com/nvidia/cuda/blob/ubuntu16.04/9.0/devel/cudnn7/Dockerfile

When trying to generate the model, I get this error:

INFO: --------------- Timing Softmax(11)
INFO: Tactic 0 is the only option, timing skipped
INFO: After reformat layers: 11 layers
INFO: Block size 268435456
INFO: Block size 147968
INFO: Block size 100352
INFO: Total Activation Memory: 268683776
INFO: Calculating Maxima
INFO: Calibrating with batch 0
ERROR: cudnnEngine.cpp (85) - Cuda Error in initializeCommonContext: 4 (Could not initialize cudnn, please check cudnn installation.)
ERROR: cudnnEngine.cpp (85) - Cuda Error in initializeCommonContext: 4 (Could not initialize cudnn, please check cudnn installation.)
ERROR: Unable to create engine

What’s the problem? cuDNN is properly installed, and I can successfully build and run FP32 models with this setup. The TensorRT documentation says it’s compatible with cuDNN 7.0, which is what I have.

Thanks!

Any hints on this?

Looks like a cuDNN error.

Is cuDNN working for you otherwise?

Do you have more than one cuDNN installed on your system?

-Chris

Also … did you do step 2.4 in the cuDNN installation … where it tests the cuDNN installation? Did that pass?

We created a new “Deep Learning Training and Inference” section in Devtalk to improve the experience for deep learning, accelerated computing, and HPC users:
https://devtalk.nvidia.com/default/board/301/deep-learning-training-and-inference-/

We are moving active deep learning threads to the new section.

URLs for topics will not change with the re-categorization, so your bookmarks and links will continue to work as before.

-Siddharth

TensorRT provides the ./infer_device tool. It checks your system for a proper installation of CUDA, cuBLAS, and cuDNN, and reports information on your various devices. Can you see if this works for you?

Hi,

It turned out to be a totally unrelated issue. I had an out-of-bounds access on an array stored in CPU memory. It should have segfaulted there, but somehow I ended up with this cuDNN error instead. Very strange!

@ChrisGottbrath I am using a Docker image based on the official CUDA 9 / cuDNN 7 image provided by NVIDIA, so I think the installation is fine. I run it through nvidia-docker, and I can successfully build and run FP32 models on the GPU, for which cuDNN is required.

This issue can be closed then, thanks for the replies!

Where can I find the ./infer_device tool?

With respect to the question posted by @carlosgalvezp, I would like to share the steps I took to resolve this error. I ran into a similar error while working with TensorFlow, TensorRT, and Python. My versions are as follows:

tensorflow-gpu 1.13.1
tensorrt 5.0.2.6
tensorrtserver 1.2.0
python 3.6.8
cuda 10.0.130
cudnn 7.6.0

The error I was facing:
pp (99) - Cuda Error in initializeCommonContext: 4 (Could not initialize cudnn, please check cudnn installation.)
2019-07-02 11:22:31.895747: E tensorflow/contrib/tensorrt/log/trt_logger.cc:38] DefaultLogger engine.cpp (99) - Cuda Error in initializeCommonContext: 1 (Could not initialize cudnn, please check cudnn installation.)
terminate called after throwing an instance of ‘nvinfer1::CudaError’
terminate called recursively

Solution

First of all, check that cuDNN is properly installed and that the following lines are present in your .bashrc file:

export PATH=/usr/local/cuda-10.0/bin${PATH:+:${PATH}}
LD_LIBRARY_PATH=/usr/local/lib
export LD_LIBRARY_PATH=/usr/local/cuda-10.0/lib64:${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
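A note on the ${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}} part of the export line: the :+ expansion appends a colon plus the old value only when the variable is already set and non-empty, so the result never ends in a dangling colon (which the dynamic linker would treat as the current directory). A quick demonstration:

```shell
# The ${VAR:+:${VAR}} idiom expands to ":<old value>" only when VAR is
# set and non-empty; otherwise it expands to nothing.

# Case 1: variable empty -- no trailing colon is added.
LD_LIBRARY_PATH=""
LD_LIBRARY_PATH=/usr/local/cuda-10.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
echo "$LD_LIBRARY_PATH"   # /usr/local/cuda-10.0/lib64

# Case 2: variable already set -- the old value is kept after a colon.
LD_LIBRARY_PATH=/usr/local/lib
LD_LIBRARY_PATH=/usr/local/cuda-10.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
echo "$LD_LIBRARY_PATH"   # /usr/local/cuda-10.0/lib64:/usr/local/lib
```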

After adding these lines, run the command source .bashrc, then close the terminal.

Restart the terminal, go to your home directory with cd $HOME, and then run the mnistCUDNN sample:

cd cudnn_samples_v7
cd mnistCUDNN/
make clean && make
./mnistCUDNN

The test should pass. If you now run your code again, you will see that it picks up the cuDNN library correctly.

In a nutshell, add the CUDA/cuDNN paths to your .bashrc so that the cuDNN library can be found.