CUBLAS_STATUS_NOT_INITIALIZED

quizz.0n · May 14, 2021, 1:26pm

After successfully training the network and saving it in SavedModel format, TF Serving displays the following error:

2021-05-11 13:41:27.083035: I tensorflow/cc/saved_model/loader.cc:277] SavedModel load for tags { serve }; Status: success: OK. Took 1886711 microseconds.
2021-05-11 13:41:27 (INFO) TensorflowModelServe: Source info :
2021-05-11 13:41:27 (INFO) TensorflowModelServe: Receptive field  : [160, 160]
2021-05-11 13:41:27 (INFO) TensorflowModelServe: Placeholder name : lr_input
2021-05-11 13:41:27 (INFO) TensorflowModelServe: Output spacing ratio: 0.25
2021-05-11 13:41:27 (INFO) TensorflowModelServe: The TensorFlow model is used in fully convolutional mode
2021-05-11 13:41:27 (INFO) TensorflowModelServe: Output field of expression: [512, 512]
2021-05-11 13:41:27 (INFO) TensorflowModelServe: Tiling disabled
2021-05-11 13:41:27 (WARNING): Streaming configuration through extended filename is used. Any previous streaming configuration (ram value, streaming mode ...) will be ignored.
2021-05-11 13:41:27 (INFO): File Image_test.tif will be written in 110 blocks of 512x512 pixels
Writing Image_test.tif?&gdal:co:COMPRESS=DEFLATE&streaming:type=tiled&streaming:sizemode=height&streaming:sizevalue=512...: 0% [                                                  ]
2021-05-11 13:41:27.770868: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.8
2021-05-11 13:41:28.572215: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.11
2021-05-11 13:41:28.573738: E tensorflow/stream_executor/cuda/cuda_blas.cc:226] failed to create cublas handle: CUBLAS_STATUS_NOT_INITIALIZED
2021-05-11 13:41:28.573882: W tensorflow/core/framework/op_kernel.cc:1763] OP_REQUIRES failed at conv_ops.cc:1106 : Not found: No algorithm worked!

From what I have read, the suggested fix is:

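import tensorflow as tf

# TF 1.x-style session config; in TF 2.x these names live under tf.compat.v1
config = tf.compat.v1.ConfigProto()
config.gpu_options.allow_growth = True  # allocate GPU memory on demand
sess = tf.compat.v1.Session(config=config)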

However, this solution conflicts with the code that I’m trying to run.

So my question is: why does this error occur in the first place?

TF 2.4.1, CUDA 11.3, CUDNN 8.2

Robert_Crovella · May 14, 2021, 1:41pm

If that solution fixes it, the problem is due to the fact that TF has a greedy allocation method (when you don’t set allow_growth). This greedy allocation method uses up nearly all GPU memory. When CUBLAS is asked to initialize (later), it requires some GPU memory to initialize. There is not enough memory left for CUBLAS to initialize, so the CUBLAS initialization fails.
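For TF 2.x code that does not create a session, the same on-demand allocation can be requested in other ways. The following is only a minimal sketch, not a confirmed fix for this particular setup: the helper name `request_gpu_memory_growth` is made up for illustration, and it assumes the call happens before TensorFlow has initialized any GPU.

```python
import os


def request_gpu_memory_growth():
    """Ask TensorFlow to grow GPU memory on demand instead of reserving it up front.

    Hypothetical helper for illustration. Returns the number of GPUs
    configured, or None if TensorFlow is not installed.
    """
    # Environment variable read by TensorFlow when it sets up the GPU;
    # it has to be set before that happens.
    os.environ["TF_FORCE_GPU_ALLOW_GROWTH"] = "true"

    try:
        import tensorflow as tf
    except ImportError:
        return None

    gpus = tf.config.list_physical_devices("GPU")
    for gpu in gpus:
        # Raises RuntimeError if this GPU has already been initialized.
        tf.config.experimental.set_memory_growth(gpu, True)
    return len(gpus)


if __name__ == "__main__":
    print("GPUs configured:", request_gpu_memory_growth())
```

When the TensorFlow code lives inside an application that cannot be edited, setting `TF_FORCE_GPU_ALLOW_GROWTH=true` in the environment before launching it should have the same effect, leaving memory free for CUBLAS to create its handle.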

This SO Question/Answer has additional relevant information. I won’t be able to sort this out for you here, and this particular forum is not really the right place to ask your question. I’m unlikely to respond to follow-up questions.

CUBLAS-specific questions should be asked on our accelerated libraries forum, but this question is really about TensorFlow behavior. NVIDIA doesn’t develop or support TensorFlow.

quizz.0n · May 18, 2021, 12:43pm

It makes sense now, thanks for explaining the process.
