CUDA 10.0 with GEMM is faster than cuDNN convolution

YOLOv3 runs faster for me on CUDA 10.0 alone than on CUDA 10.0 + cuDNN 7.4. Is cuDNN's optimized convolution actually slower than the plain CUDA (GEMM) implementation?

YOLOv3 runs 8 fps faster on CUDA 10.0 alone than on CUDA 10.0 + cuDNN 7.4.
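
For anyone who wants to reproduce the comparison, here is a minimal sketch of how the fps of each build could be timed. It is not the measurement used for the numbers above. `measure_fps` and `run_inference` are hypothetical names: `run_inference` stands for any zero-argument callable that processes one frame and returns only after the GPU work has finished.

```python
import time


def measure_fps(run_inference, warmup=10, iters=100):
    """Return the average frames per second of a zero-argument callable.

    `run_inference` is a hypothetical stand-in for one forward pass on one
    frame. It is assumed to block until the result is ready; otherwise the
    timing only covers the kernel launch, not the actual work.
    """
    # Warm-up calls are not timed, since the first iterations often
    # include one-time setup cost.
    for _ in range(warmup):
        run_inference()

    start = time.perf_counter()
    for _ in range(iters):
        run_inference()
    elapsed = time.perf_counter() - start

    return iters / elapsed
```

Running the same function against both builds, on the same input and the same GPU, should give two comparable fps values.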