YOLOv4 with cuDNN8 on Ubuntu 20.04 is raising error during training on GeForce RTX 3090

muhammad.bilal · October 30, 2021, 5:42pm

Hi all,

I recently bought a system with two GeForce RTX 3090 GPUs for deep learning with YOLOv4. The system runs Ubuntu 20.04. I followed all the installation steps and configured YOLOv4 for custom model training. Darknet raises an error when training is started with the -map switch. The training command that produces the error is as follows:
!./darknet detector train "obj.data" "custom-train.cfg" "weights/yolov4.conv.137" -dont_show -mjpeg_port 8090 -gpus 0,1 -map

The error is as follows:

calculation mAP (mean average precision)…
Detection layer: 139 - type = 28
Detection layer: 150 - type = 28
Detection layer: 161 - type = 28
4
cuDNN status Error in: file: /home/oem/mlworks/darknet/src/convolutional_kernels.cu : () : line: 555 : build time: Oct 30 2021 - 15:30:45

cuDNN Error: CUDNN_STATUS_BAD_PARAM
Darknet error location: /home/oem/mlworks/darknet/src/dark_cuda.c, cudnn_check_error, line #204
cuDNN Error: CUDNN_STATUS_BAD_PARAM: Success

I found that I may need to install an older version of the CUDA toolkit (10.1 update 2) and an older cuDNN library (v7.6.5), but I couldn’t find that cuDNN version for Ubuntu 20.04. So I am stuck with this error and can’t proceed any further.

I can remove -map to continue training, but that is not an option for me. Can someone guide me on how to resolve this error with CUDA toolkit 11 on Ubuntu 20.04?

I will greatly appreciate your help or any guidance.

Many Thanks and
Best Regards,
Bilal

rs277 · October 31, 2021, 7:17pm

Sorry, I can’t help you with the error directly, but I just wanted to point out that CUDA 10.1 would not help you, as it does not support SM 8.6 (RTX 30xx) cards. The minimum required is CUDA 11.1.
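
To make the version point concrete, here is a minimal Python sketch that checks whether a CUDA toolkit release can target a given compute capability. The table is deliberately partial and the names `MIN_CUDA_FOR_SM`, `parse_version` and `toolkit_supports` are made up for illustration; treat it as a sanity check under those assumptions, not as an official compatibility matrix.

```python
# Minimum CUDA toolkit release that can target a given compute capability.
# Partial table for illustration only; extend it from NVIDIA's documentation.
MIN_CUDA_FOR_SM = {
    (7, 0): (9, 0),    # Volta
    (7, 5): (10, 0),   # Turing
    (8, 0): (11, 0),   # Ampere (A100)
    (8, 6): (11, 1),   # Ampere (GeForce RTX 30 series)
}


def parse_version(text):
    """Turn a string such as '11.4' into (11, 4) so versions compare numerically."""
    major, minor = text.strip().split(".")[:2]
    return int(major), int(minor)


def toolkit_supports(compute_capability, cuda_version):
    """Return True if the CUDA release can build code for the given SM version."""
    sm = parse_version(compute_capability)
    if sm not in MIN_CUDA_FOR_SM:
        raise KeyError(f"compute capability {compute_capability} is not in the table")
    return parse_version(cuda_version) >= MIN_CUDA_FOR_SM[sm]


if __name__ == "__main__":
    # RTX 3090 is SM 8.6; compare it against a few toolkit releases.
    for cuda in ("10.1", "11.1", "11.4"):
        print(cuda, toolkit_supports("8.6", cuda))
```

Versions are compared as integer tuples rather than strings, so "10.1" sorts below "11.1" as intended.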

muhammad.bilal · October 31, 2021, 9:12pm

Thanks for this clarification. So it seems like a dead end to me then?

rs277 · October 31, 2021, 10:02pm

I know nothing about it, but I have just looked at the installation requirements in the AlexeyAB/darknet GitHub repository (YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection, Windows and Linux version of Darknet)
and it states: “Cuda >=10.2 cuDNN >= 8.0.2”

Looking at the NVIDIA page for cuDNN, I see a version for CUDA 11.4.

So if it were me, I’d install CUDA 11.4, use the cuDNN build for 11.4, and rebuild it from there…
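
Before rebuilding, it may be worth confirming that the installed cuDNN meets the "cuDNN >= 8.0.2" requirement quoted above. The sketch below parses the version macros that cuDNN 8 ships in `cudnn_version.h`. The header location varies by install method (for example under `/usr/include` or the CUDA `include` directory, which is an assumption about your setup), so the function takes the header text rather than a path. The function names and the sample values are hypothetical.

```python
import re

# Minimum cuDNN version taken from the requirement quoted in this thread.
REQUIRED_CUDNN = (8, 0, 2)


def cudnn_version_from_header(header_text):
    """Extract (major, minor, patch) from the text of cudnn_version.h."""
    parts = []
    for name in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        match = re.search(rf"^\s*#define\s+{name}\s+(\d+)", header_text, re.MULTILINE)
        if match is None:
            raise ValueError(f"{name} not found in header text")
        parts.append(int(match.group(1)))
    return tuple(parts)


def meets_requirement(version, required=REQUIRED_CUDNN):
    """Compare version tuples element by element, e.g. (8, 2, 4) >= (8, 0, 2)."""
    return tuple(version) >= tuple(required)


if __name__ == "__main__":
    # Hypothetical header contents; read the real file from your install instead.
    sample = (
        "#define CUDNN_MAJOR 8\n"
        "#define CUDNN_MINOR 2\n"
        "#define CUDNN_PATCHLEVEL 4\n"
    )
    version = cudnn_version_from_header(sample)
    print(version, meets_requirement(version))
```

To use it on a real system, read the header with `open(path).read()` and pass the text in. This only checks the version number; it does not verify that the cuDNN build matches the installed CUDA toolkit.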

Yuki_Ni · November 1, 2021, 3:44am

You may get more help from the cuDNN forum: Topics tagged cudnn
