Max threads/blocks

Curefab · September 6, 2024, 3:45pm

A kernel call within a stream is always a single stream operation: all of its blocks and threads finish before the stream continues.
It does not matter whether the next operation in the stream is the same kernel or a different one.
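
To make the stream ordering concrete, here is a minimal sketch. The kernels `fill` and `add_left_neighbor` and the helper `count_mismatches` are hypothetical names made up for this illustration. The second kernel reads elements written by other threads (often in other blocks) of the first kernel, and the two launches are separated by nothing but their order in the stream.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical kernel: every thread writes one element.
__global__ void fill(int *data, int n, int value) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] = value;
}

// Hypothetical kernel: every thread reads its own element plus one that was
// written by a different thread (often a different block) of the previous kernel.
__global__ void add_left_neighbor(const int *in, int *out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = in[i] + in[(i + n - 1) % n];
}

// Returns the number of output elements that differ from the expected value 2,
// or -1 if a CUDA call failed.
int count_mismatches(int n) {
    int *a = nullptr, *b = nullptr;
    cudaStream_t stream;
    if (cudaStreamCreate(&stream) != cudaSuccess) return -1;
    if (cudaMalloc(&a, n * sizeof(int)) != cudaSuccess) return -1;
    if (cudaMalloc(&b, n * sizeof(int)) != cudaSuccess) return -1;

    int threads = 256;
    int blocks = (n + threads - 1) / threads;

    // Two operations in the same stream, with no explicit synchronization
    // between them. The second launch starts only after the first has finished.
    fill<<<blocks, threads, 0, stream>>>(a, n, 1);
    add_left_neighbor<<<blocks, threads, 0, stream>>>(a, b, n);

    // The copy is a third operation in the same stream.
    std::vector<int> host(n);
    cudaMemcpyAsync(host.data(), b, n * sizeof(int),
                    cudaMemcpyDeviceToHost, stream);
    cudaError_t err = cudaStreamSynchronize(stream);

    cudaFree(a);
    cudaFree(b);
    cudaStreamDestroy(stream);
    if (err != cudaSuccess) return -1;

    int mismatches = 0;
    for (int v : host) {
        if (v != 2) ++mismatches;
    }
    return mismatches;
}

int main() {
    int bad = count_mismatches(1 << 20);
    printf("mismatches: %d\n", bad);
    return bad == 0 ? 0 : 1;
}
```

If the two kernels were placed in different streams instead, this ordering would no longer be implied and you would need an event or another form of synchronization between them.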

Within a kernel call, the blocks and their threads run asynchronously, and you can get all kinds of crazy race conditions, yes.

That is why you try to make blocks as independent as possible in your algorithm.
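
A small sketch of the difference, under assumed names (`racy_sum`, `independent_sum`, and `sum_ones` are hypothetical and exist only for this illustration). The first kernel lets every thread update one shared address without any protection. The second gives every thread its own output slot, so no block ever touches data owned by another block.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Racy: every thread of every block does an unsynchronized read-modify-write
// on the same address, so updates from other threads can be lost.
__global__ void racy_sum(const int *in, int n, int *total) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) *total += in[i];
}

// Independent: each thread accumulates into a private register over a
// grid-stride loop and writes only to its own slot of `partial`.
__global__ void independent_sum(const int *in, int n, int *partial) {
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    int stride = gridDim.x * blockDim.x;
    int acc = 0;
    for (int i = tid; i < n; i += stride) acc += in[i];
    partial[tid] = acc;
}

// Sums n ones on the GPU. Returns the result of the independent kernel
// (combined on the host), or -1 on a CUDA error. The racy kernel's result
// is stored in *racy_result for comparison.
int sum_ones(int n, int *racy_result) {
    const int threads = 256, blocks = 64;
    const int total_threads = threads * blocks;
    std::vector<int> host_in(n, 1), host_partial(total_threads);

    int *in = nullptr, *partial = nullptr, *total = nullptr;
    if (cudaMalloc(&in, n * sizeof(int)) != cudaSuccess ||
        cudaMalloc(&partial, total_threads * sizeof(int)) != cudaSuccess ||
        cudaMalloc(&total, sizeof(int)) != cudaSuccess) return -1;
    cudaMemcpy(in, host_in.data(), n * sizeof(int), cudaMemcpyHostToDevice);
    cudaMemset(total, 0, sizeof(int));

    racy_sum<<<(n + threads - 1) / threads, threads>>>(in, n, total);
    independent_sum<<<blocks, threads>>>(in, n, partial);

    // cudaMemcpy on the default stream waits for the kernels launched above.
    cudaMemcpy(racy_result, total, sizeof(int), cudaMemcpyDeviceToHost);
    cudaError_t err = cudaMemcpy(host_partial.data(), partial,
                                 total_threads * sizeof(int),
                                 cudaMemcpyDeviceToHost);
    cudaFree(in);
    cudaFree(partial);
    cudaFree(total);
    if (err != cudaSuccess) return -1;

    int sum = 0;
    for (int v : host_partial) sum += v;
    return sum;
}

int main() {
    const int n = 1 << 20;
    int racy = 0;
    int independent = sum_ones(n, &racy);
    printf("expected %d, independent %d, racy %d\n", n, independent, racy);
    return independent == n ? 0 : 1;
}
```

The racy result is not predictable and usually comes out far too small. The independent version pays for its safety with a final combine step, done here on the host for simplicity.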

The same applies to a lesser degree to warps, and to the least degree to the threads within a warp.

And where you have to share work or data, or reassign which thread is responsible for which data packet (that can make sense even within a kernel), you use synchronization primitives.
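
As one possible illustration, the sketch below uses two standard CUDA primitives: `__syncthreads()` as a barrier within a block and `atomicAdd` to combine results across blocks. The kernel `block_sum`, the helper `device_sum`, and the constant `kBlock` are hypothetical names for this example. It assumes the block size equals `kBlock` and is a power of two. Note how the responsibility changes in every step of the loop: half as many threads each take over a pair of values.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

constexpr int kBlock = 256;  // assumed block size, must be a power of two

// Hypothetical kernel: shared-memory tree reduction inside each block,
// then one atomicAdd per block to combine across blocks.
__global__ void block_sum(const int *in, int n, int *total) {
    __shared__ int tile[kBlock];
    int tid = threadIdx.x;
    int i = blockIdx.x * blockDim.x + tid;

    tile[tid] = (i < n) ? in[i] : 0;
    __syncthreads();  // all loads must be visible before anyone reads a neighbor

    for (int stride = blockDim.x / 2; stride > 0; stride /= 2) {
        // In each step, fewer threads stay active and each one becomes
        // responsible for combining two entries of the tile.
        if (tid < stride) tile[tid] += tile[tid + stride];
        __syncthreads();  // finish this step before the next one reads the tile
    }

    // Blocks cannot use __syncthreads() with each other, so the cross-block
    // update goes through an atomic operation.
    if (tid == 0) atomicAdd(total, tile[0]);
}

// Returns the sum of host_in computed on the GPU, or -1 on a CUDA error.
int device_sum(const std::vector<int> &host_in) {
    int n = static_cast<int>(host_in.size());
    int *in = nullptr, *total = nullptr;
    if (cudaMalloc(&in, n * sizeof(int)) != cudaSuccess ||
        cudaMalloc(&total, sizeof(int)) != cudaSuccess) return -1;
    cudaMemcpy(in, host_in.data(), n * sizeof(int), cudaMemcpyHostToDevice);
    cudaMemset(total, 0, sizeof(int));

    int blocks = (n + kBlock - 1) / kBlock;
    block_sum<<<blocks, kBlock>>>(in, n, total);

    int result = 0;
    cudaError_t err = cudaMemcpy(&result, total, sizeof(int),
                                 cudaMemcpyDeviceToHost);
    cudaFree(in);
    cudaFree(total);
    return err == cudaSuccess ? result : -1;
}

int main() {
    std::vector<int> data(1 << 20);
    int reference = 0;
    for (size_t i = 0; i < data.size(); ++i) {
        data[i] = static_cast<int>(i % 7);
        reference += data[i];
    }
    int gpu = device_sum(data);
    printf("host %d, device %d\n", reference, gpu);
    return gpu == reference ? 0 : 1;
}
```

Both `__syncthreads()` calls sit outside the `if`, so every thread of the block reaches them. Putting a barrier inside a branch that only some threads take is a common way to hang or corrupt a kernel.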
