YOLOv4 with cuDNN 8 on Ubuntu 20.04 raises an error during training on GeForce RTX 3090

Hi all,

I recently bought a system with two GeForce RTX 3090 cards for deep learning with YOLOv4. The system runs Ubuntu 20.04. I followed all the installation steps and configured YOLOv4 for custom model training. Darknet raises an error when training is started with the -map switch. The training command that gives this error is:
!./darknet detector train "obj.data" "custom-train.cfg" "weights/yolov4.conv.137" -dont_show -mjpeg_port 8090 -gpus 0,1 -map

The error is as follows:

calculation mAP (mean average precision)…
Detection layer: 139 - type = 28
Detection layer: 150 - type = 28
Detection layer: 161 - type = 28
cuDNN status Error in: file: /home/oem/mlworks/darknet/src/convolutional_kernels.cu : () : line: 555 : build time: Oct 30 2021 - 15:30:45

Darknet error location: /home/oem/mlworks/darknet/src/dark_cuda.c, cudnn_check_error, line #204

I found that I may need to install an older version of the CUDA toolkit (10.1 Update 2) and an older cuDNN library (v7.6.5). I couldn’t find cuDNN for Ubuntu 20.04, so I am stuck with this error and can’t proceed any further.

I can take out -map to continue training, but that is not an option for me. Can someone guide me on how to resolve this error with CUDA toolkit 11 on Ubuntu 20.04?

I will greatly appreciate your help or any guidance.

Many thanks and best regards,

Sorry, I can’t help you with the error directly, but I just wanted to point out that CUDA 10.1 would not help you, as it does not support SM 8.6 (RTX 30xx) cards. The minimum required is CUDA 11.1.
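
For anyone who wants to check their own toolkit against that minimum, here is a minimal sketch. It assumes the usual "release X.Y" wording in `nvcc --version` output; the sample strings are illustrative and were not captured from the original poster's machine.

```python
import re

# Assumption: CUDA 11.1 is the first toolkit release that can target SM 8.6,
# as stated in the reply above.
MIN_CUDA_FOR_SM86 = (11, 1)


def parse_nvcc_release(nvcc_output):
    """Return (major, minor) from `nvcc --version` text, or None if not found."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def supports_sm86(nvcc_output):
    """True if the reported toolkit is at least CUDA 11.1."""
    release = parse_nvcc_release(nvcc_output)
    return release is not None and release >= MIN_CUDA_FOR_SM86


# Illustrative sample strings (hypothetical, for demonstration only).
sample_old = "Cuda compilation tools, release 10.1, V10.1.243"
sample_new = "Cuda compilation tools, release 11.4, V11.4.48"
print(supports_sm86(sample_old))  # False
print(supports_sm86(sample_new))  # True
```

On a real machine the text could come from `subprocess.run(["nvcc", "--version"], capture_output=True, text=True).stdout`.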

Thanks for this clarification. So it seems like a dead end to me then?

I know nothing about it, but I have just looked at the installation requirements on the AlexeyAB/darknet GitHub page (YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection, Windows and Linux version of Darknet)
and it states: “Cuda >=10.2 cuDNN >= 8.0.2”
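
One way to compare an installed cuDNN against that requirement is to read the version macros from its header. This is only a sketch: it assumes cuDNN 8, where `CUDNN_MAJOR`, `CUDNN_MINOR` and `CUDNN_PATCHLEVEL` are defined in `cudnn_version.h`. The header's location depends on how cuDNN was installed, and the sample text below is illustrative.

```python
import re

REQUIRED_CUDNN = (8, 0, 2)  # taken from the requirement quoted above


def parse_cudnn_version(header_text):
    """Extract (major, minor, patch) from the text of cudnn_version.h."""
    parts = []
    for name in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        match = re.search(r"#define\s+" + name + r"\s+(\d+)", header_text)
        if match is None:
            return None  # macro missing, so the version cannot be determined
        parts.append(int(match.group(1)))
    return tuple(parts)


def meets_requirement(header_text, required=REQUIRED_CUDNN):
    """True if the header reports a cuDNN version >= the required one."""
    version = parse_cudnn_version(header_text)
    return version is not None and version >= required


# Illustrative header excerpt (hypothetical values, not from the thread).
sample_header = """
#define CUDNN_MAJOR 8
#define CUDNN_MINOR 2
#define CUDNN_PATCHLEVEL 4
"""
print(parse_cudnn_version(sample_header))  # (8, 2, 4)
print(meets_requirement(sample_header))    # True
```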

Looking at the NVIDIA page for cuDNN, I see a version for CUDA 11.4.

So if it were me, I’d install CUDA 11.4, use the cuDNN build for 11.4, and rebuild darknet from there…
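
When rebuilding, the architecture flags passed to nvcc also need to match the card. Below is a small sketch of how the `-gencode` flag for compute capability 8.6 is composed. That darknet's Makefile takes this flag through an `ARCH` variable is my assumption, so check the Makefile in your own checkout; `gencode_flag` is a hypothetical helper name.

```python
def gencode_flag(major, minor):
    """Build an nvcc -gencode flag that embeds both SASS and PTX for one GPU."""
    cc = f"{major}{minor}"
    return f"-gencode arch=compute_{cc},code=[sm_{cc},compute_{cc}]"


# Compute capability 8.6 corresponds to the RTX 30xx cards discussed above.
print(gencode_flag(8, 6))
# -gencode arch=compute_86,code=[sm_86,compute_86]
```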

You may get more help from the cuDNN forum (topics tagged cudnn).