OpenCV CUDA convolution extremely slower than bare CUDA code convolution on Jetson Nano using unified memory

AastaLLL · November 2, 2020, 6:00am

Hi,

Since we don’t maintain the OpenCV implementation, please check this issue with the OpenCV developers to get more information.

Based on their code, it seems that they implement convolution through cuFFT rather than cuDNN.
Depending on the use case, converting the spatial signal to the Fourier domain may not always yield a gain, because of the transformation overhead.
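
To illustrate the trade-off, here is a rough back-of-the-envelope sketch, not a measurement of OpenCV or cuFFT. It compares floating-point operation counts only, and every cost formula in it is an assumption: direct convolution at 2 flops per kernel tap, a 2-D FFT at about 5·N·log2(N) flops on the zero-padded size, and an FFT-based convolution needing two forward transforms, one pointwise complex multiply, and one inverse transform. The function names are hypothetical. Memory traffic, plan creation, and GPU occupancy are ignored, so real crossover points on a Jetson Nano will differ.

```python
import math


def direct_conv_flops(h, w, kh, kw):
    """Flops for direct spatial convolution: one multiply-add (2 flops) per kernel tap per pixel."""
    return 2 * h * w * kh * kw


def fft_conv_flops(h, w, kh, kw):
    """Rough flop estimate for FFT-based convolution (assumed cost model).

    Image and kernel are zero-padded to (h + kh - 1) x (w + kw - 1).
    Assumes ~5 * N * log2(N) flops per 2-D FFT, three transforms in total
    (image forward, kernel forward, product inverse), plus one pointwise
    complex multiply at 6 flops per element.
    """
    n = (h + kh - 1) * (w + kw - 1)
    fft_cost = 5 * n * math.log2(n)
    return 3 * fft_cost + 6 * n


def cheaper_method(h, w, kh, kw):
    """Return 'direct' or 'fft', whichever has the lower estimated flop count."""
    direct = direct_conv_flops(h, w, kh, kw)
    fft = fft_conv_flops(h, w, kh, kw)
    return "direct" if direct <= fft else "fft"


if __name__ == "__main__":
    # Compare a small and a large kernel on a 640x480 image.
    for k in (3, 31):
        print(k, cheaper_method(480, 640, k, k))
```

Under this model, a 3x3 kernel on a 640x480 image comes out cheaper with direct convolution, while a 31x31 kernel comes out cheaper with the FFT route. This matches the point above: for small kernels the transform overhead dominates.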

github.com

opencv/opencv_contrib/blob/master/modules/cudaarithm/src/arithm.cpp

As for the slow cuDNN issue, this is a known regression in cuDNN v8:
https://forums.developer.nvidia.com/t/darknet-slower-using-jetpack-4-4-cudnn-8-0-0-cuda-10-2-than-jetpack-4-3-cudnn-7-6-3-cuda-10-0/
Our internal team is working on this. We will share the latest status once we have an update.

Thanks.
