Cublas Algo ID 11 doesn't suit current config

adamcuellar · July 27, 2021, 3:26pm

Description

Hello,

I’ve converted the EfficientNetB0 model from TensorFlow to ONNX (using tf2onnx) and run it with code very similar to sampleOnnxMNIST.cpp; however, for some batches it prints the following:

“Cublas Algo ID 11 doesn’t suit current config, setting to default algo”

Is this something to worry about? Performance is still adequate, but I’d like it to be as fast as possible. I searched for this message but didn’t find anything explaining what it means.

Environment

nvidia/cuda:11.1-cudnn8-devel-ubuntu18.04 docker container

TensorRT Version: 7.2.3.4
GPU Type: Tesla V100 32 GB
Nvidia Driver Version: System has 418.126.02, not sure if it’s different inside the container
CUDA Version: 11.1 in container
CUDNN Version: 8.0.5 in container
Operating System + Version: Ubuntu 18.04
Python Version (if applicable): N/A
TensorFlow Version (if applicable): 2.5.0
PyTorch Version (if applicable): N/A
Baremetal or Container (if container which image + tag):
nvidia/cuda:11.1-cudnn8-devel-ubuntu18.04

NVES · July 27, 2021, 3:37pm

Hi,
Please refer to the link below for the sample guide.

Refer to the installation steps from the link in case you are missing anything.

However, the suggested approach is to use the TRT NGC containers to avoid any system-dependency issues.

To run the Python sample, make sure the TRT Python packages are installed when using the NGC container:
/opt/tensorrt/python/python_setup.sh

If you are trying to run a custom model, please share your model and script with us so that we can assist you better.
Thanks!

adamcuellar · July 27, 2021, 4:53pm

I’m not running Python; I’m using the C++ sample file. The container I’m running is the same as the TRT NGC containers: I copied the Dockerfile and added other dependencies. I’m also not running a custom model; it’s EfficientNetB0.

spolisetty · July 28, 2021, 10:38am

@adamcuellar,

Based on the info in the description, it looks like you’re using the CUDA container. Please share the complete verbose logs and a minimal repro script/model with us for better debugging.