cudaDeviceSynchronize blocking effect cudaDeviceScheduleBlockingSync

anon766635 · June 30, 2012, 9:08am

Hi,

The documentation for cudaDeviceSynchronize seems to describe different behavior depending on whether the flag cudaDeviceScheduleBlockingSync is set.

What is the difference between the two?

Is “completed all preceding requested tasks” different from “device has finished its work”?

If they are the same, does cudaDeviceScheduleBlockingSync affect what is blocked, given that the second wording specifies “host thread” and the first doesn’t?

And finally, if the second blocks the “host thread”, what is blocked when the cudaDeviceScheduleBlockingSync flag is not set?

Edit:

Or maybe the difference is that, with the cudaDeviceScheduleBlockingSync flag set, the host thread calling cudaDeviceSynchronize blocks until the device has finished all its work, including requests made to the device by other host threads, whereas without the flag the host thread blocks only until its own requested calls have completed.

tera · June 30, 2012, 9:37am

The difference is between “block”, “yield”, or “spin”. In the default “spin” setting the host thread enters a tight busy-waiting loop, so it consumes 100% CPU cycles while waiting for the device to finish (unless the CPU scheduler hands the CPU to a different thread). In the “yield” setting, the busy-waiting loop includes an OS call to actively yield the CPU to other threads, while in the “block” setting the host thread sleeps, using 0% CPU, until the GPU becomes ready again.
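
To make the three modes concrete, here is a minimal sketch of how the waiting strategy might be selected. It assumes the modes map to the runtime flags cudaDeviceScheduleSpin, cudaDeviceScheduleYield and cudaDeviceScheduleBlockingSync passed to cudaSetDeviceFlags; the kernel add_one and the helper run_with_flag are hypothetical names made up for illustration. In every case cudaDeviceSynchronize waits for the same work; only how the host thread waits should differ.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel for illustration: adds 1.0f to every element.
__global__ void add_one(float *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] += 1.0f;
}

// Hypothetical helper: selects a host waiting strategy, runs the kernel,
// waits for it, and returns the first element (or -1.0f on any error).
float run_with_flag(unsigned int flags) {
    // Assumption: resetting first so the flag is set before the
    // device context is (re)created on this device.
    cudaDeviceReset();
    if (cudaSetDeviceFlags(flags) != cudaSuccess) return -1.0f;

    const int n = 1 << 20;
    float *d = nullptr;
    if (cudaMalloc(&d, n * sizeof(float)) != cudaSuccess) return -1.0f;
    cudaMemset(d, 0, n * sizeof(float));

    // The launch is asynchronous: control returns to the host at once.
    add_one<<<(n + 255) / 256, 256>>>(d, n);

    // Waits for all preceding work on the device. The flag chosen above
    // only changes HOW the host thread waits (spin, yield, or sleep).
    cudaError_t err = cudaDeviceSynchronize();

    float first = -1.0f;
    if (err == cudaSuccess) {
        cudaMemcpy(&first, d, sizeof(float), cudaMemcpyDeviceToHost);
    }
    cudaFree(d);
    return first;
}

int main() {
    printf("spin:  %f\n", run_with_flag(cudaDeviceScheduleSpin));
    printf("yield: %f\n", run_with_flag(cudaDeviceScheduleYield));
    printf("block: %f\n", run_with_flag(cudaDeviceScheduleBlockingSync));
    return 0;
}
```

Watching the process in a CPU monitor while a long-running kernel executes is one way to see the difference; with a kernel as short as this one it will hardly be visible.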

anon766635 · June 30, 2012, 11:57am

Thanks for your answer.

According to you the cudaDeviceSynchronize function should be documented as:

xnov · June 30, 2012, 7:04pm

I have an almost identical question:
when should cudaDeviceSynchronize be used?
Let's say I have 2 devices that split a simple array-sum task… should I call it after the kernel launch?
