Why is the OpenCV threshold function slower on the GPU than on the CPU?

Hi,

OpenCV - 3.3.1
Cuda toolkit - 8.0.84
Cudnn - 6.0

I've written code comparing the performance of the OpenCV functions cvtColor() and threshold() on the CPU and on the GPU.
I observed that cvtColor() is faster on the GPU, but cuda::threshold() is slower than the CPU version.
Why do some functions perform worse on the GPU?

My code is shown below.

##################################### cvtcolor #########################

start_color = std::chrono::steady_clock::now();
for (int k = 0; k < 10000; k++)
{
	cv::cvtColor(im1, im1_, CV_BGR2GRAY, 1);
	cv::cvtColor(im2, im2_, CV_BGR2GRAY, 1);
}
stop_color = std::chrono::steady_clock::now();

std::chrono::duration<float> time_interval = stop_color - start_color;
t = 1000 * time_interval.count();
std::cout << "run time for cv::cvtColor = " << (t / 20000) << " milli sec" << std::endl;

################################ cuda::cvtcolor #########################

start_colorcuda = std::chrono::steady_clock::now();
for (int k = 0; k < 10000; k++)
{
	cuda::cvtColor(im1_gpu, im1_gpu_, CV_BGR2GRAY, 1);
	cuda::cvtColor(im2_gpu, im2_gpu_, CV_BGR2GRAY, 1);
}
stop_colorcuda = std::chrono::steady_clock::now();

std::chrono::duration<float> time_interval_cu = stop_colorcuda - start_colorcuda;
t_cu = 1000 * time_interval_cu.count();
std::cout << "run time for cuda::cvtColor = " << (t_cu / 20000) << " milli sec" << std::endl;

####################################### threshold #########################

start_threshold = std::chrono::steady_clock::now();
for (int k = 0; k < 10000; k++)
{
	threshold(im1_, th1, 0, 1, THRESH_BINARY);
	threshold(im2_, th2, 0, 1, THRESH_BINARY);
}
stop_threshold = std::chrono::steady_clock::now();

std::chrono::duration<float> time_th = stop_threshold - start_threshold;
t_th = 1000 * time_th.count();
std::cout << "run time for threshold = " << (t_th / 20000) << " milli sec" << std::endl;

#################################### cuda::threshold #########################

start_threshcuda = std::chrono::steady_clock::now();
for (int k = 0; k < 10000; k++)
{
	cuda::threshold(im1_gpu_, th1_gpu, 0, 1, THRESH_BINARY);
	cuda::threshold(im2_gpu_, th2_gpu, 0, 1, THRESH_BINARY);
}
stop_threshcuda = std::chrono::steady_clock::now();

std::chrono::duration<float> time_thcu = stop_threshcuda - start_threshcuda;
t_thcu = 1000 * time_thcu.count();
std::cout << "run time for cuda::threshold = " << (t_thcu / 20000) << " milli sec" << std::endl;

The results are shown below:

run time for cv::cvtColor = 0.143568 milli sec
run time for cuda::cvtColor = 0.097032 milli sec
run time for threshold = 0.0138205 milli sec
run time for cuda::threshold = 0.0794818 milli sec

This very specific question would be best answered by the OpenCV developers, as they know the details of their implementation best.

There's an opencv-users mailing list hosted as a Yahoo! Group (opencv@yahoogroups.com).

And SourceForge hosts the opencvlibrary-devel mailing list.

Also, when unsure about aspects of speed, there's the NVIDIA Visual Profiler (nvvp) as well as the Nsight Visual Studio profiler (or the Nsight Eclipse edition). These tools can show you what's going on on a microsecond timeline, with a graphical user interface. Transferring data (especially image data) to the GPU and back is often the bottleneck; and for a kernel as cheap as a binary threshold, the fixed per-call launch and synchronization overhead can easily exceed the compute time itself, which is why a function that does very little work per pixel can end up slower on the GPU.