How expensive is calling clEnqueueNDRangeKernel?

santevid · May 10, 2010, 6:39pm

Question #1

I have a function Run() that launches two kernels:

// As you see, I’m using events (eventRow, eventCol) because of profiling.

How expensive, in terms of time, is a call to enqueueNDRangeKernel (or clEnqueueNDRangeKernel)?

With the NVIDIA OpenCL Profiler, I got a total execution time on the GPU of 351 ms, but when I measured the running time of the Run() method, I got 622 ms.

Why is this difference so large?

When is the data transferred to the GPU: when clEnqueueNDRangeKernel is called, or when the buffer is created (clCreateBuffer)?

I tested on an NVIDIA GT240.

I also tested on an ATI HD 5670, and the difference there is much smaller.
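
To put the two timings quoted above side by side, here is a minimal Python sketch that works out how much of Run() falls outside the GPU time reported by the profiler. The helper name `timing_gap` is made up for illustration. The arithmetic does not say where the remaining time goes; host-side code, data transfers, launch overhead and synchronization are all candidates, and which one dominates is not something the numbers in this thread can settle.

```python
def timing_gap(wall_ms: float, device_ms: float) -> tuple:
    """Return (gap in ms, gap as a fraction of the wall-clock time).

    wall_ms   -- host-side wall-clock time of the whole call (here: Run())
    device_ms -- total GPU execution time reported by the profiler
    """
    if wall_ms <= 0:
        raise ValueError("wall_ms must be positive")
    gap_ms = wall_ms - device_ms
    return gap_ms, gap_ms / wall_ms


# The two figures from the post: 622 ms measured around Run(),
# 351 ms of GPU execution reported by the profiler.
gap_ms, gap_fraction = timing_gap(wall_ms=622.0, device_ms=351.0)
print(f"outside profiled GPU time: {gap_ms:.0f} ms ({gap_fraction:.1%} of Run())")
```

With these inputs the gap is 271 ms, a bit under half of the wall-clock time.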

madsen · May 12, 2010, 8:44am

Hi!

I measured the overhead of launching my kernel to be 41 µs here:
http://forums.nvidia.com/index.php?showtopic=162539

This is of course just on my machine, but it means that even though my calculations run much faster on the GPU, I can’t benefit, since the launch overhead dominates the execution time.

Madsen
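
The point about a fixed launch cost cancelling out a faster kernel can be illustrated with a rough cost model. This is only a sketch under stated assumptions: the 41 figure above is taken to be in microseconds, each call pays that cost once, and the CPU and kernel times in the loop are invented examples, not measurements from this thread. The function names are hypothetical as well.

```python
def launch_overhead_fraction(kernel_us: float, overhead_us: float = 41.0) -> float:
    """Fraction of one launch spent on fixed overhead rather than computation."""
    return overhead_us / (overhead_us + kernel_us)


def gpu_is_faster(cpu_us: float, kernel_us: float, overhead_us: float = 41.0) -> bool:
    """True if one launch (overhead + kernel time) beats the CPU time for the same work."""
    return overhead_us + kernel_us < cpu_us


# Hypothetical workloads: (CPU time, GPU kernel time) per call, in microseconds.
# In every case the kernel itself is 10x faster than the CPU version.
for cpu_us, kernel_us in [(50.0, 5.0), (30.0, 3.0), (5000.0, 500.0)]:
    share = launch_overhead_fraction(kernel_us)
    verdict = "GPU wins" if gpu_is_faster(cpu_us, kernel_us) else "CPU wins"
    print(f"cpu={cpu_us} us, kernel={kernel_us} us, overhead share={share:.0%}: {verdict}")
```

In the second case the kernel is ten times faster than the CPU code, yet the launch still loses, because 41 µs + 3 µs is more than the 30 µs the CPU needs. Under this model the usual way out is to do more work per launch so the fixed cost is amortized.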
