CUDA Master/Slave

Is it possible to create a bunch of tasks, a lot of them, and have CUDA execute them master/slave style? That is, I have thousands of jobs to do, and hundreds of GPU threads that are able to compute. Each thread would run, complete a job, and as soon as possible take another one, until there are no jobs left to run. I was thinking about this approach because every job has a different execution complexity. Because of this, the threads that finish quickly are always waiting for the slower ones that are running the hard tasks, and I don't use the GPU efficiently.

You can probably make almost anything “work”, so nearly any work distribution strategy is probably “possible” on a CUDA GPU.

CUDA GPU threads generally need to be following the same code path, and loading and storing adjacent data, in order to get useful performance out of the GPU, at least up to the warp level (groups of 32 threads). It sounds like what you are describing is task-level parallelism rather than the kind of “data parallelism” that I described.

If your actual work fits the “data parallelism” that I described, then it may be a candidate, but having a large work discrepancy between threads doesn’t usually indicate “data parallelism”. If you are asking about task parallelism, i.e. disparate work between threads, with no commonality/similarity even at the warp level, it’s probably not a good fit for (CUDA) GPUs.
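If the jobs are similar enough at the warp level, the dynamic “grab the next job” scheme the question describes is sometimes implemented with a so-called persistent-threads kernel: launch fewer threads than jobs, and have each thread pull job indices from a global counter with `atomicAdd` until the queue is drained. The sketch below is a minimal, hypothetical illustration of that pattern; `do_job` is a placeholder for the per-job work, not a real API.

```cuda
#include <cstdio>

__device__ int next_job;   // global counter acting as the job queue head

// Placeholder for the real per-job work, which may vary in cost per job.
__device__ void do_job(int job, float *out) {
    out[job] = job * 2.0f;
}

__global__ void worker(int num_jobs, float *out) {
    while (true) {
        // Each thread atomically claims the next unclaimed job index.
        int job = atomicAdd(&next_job, 1);
        if (job >= num_jobs)
            break;         // queue is empty; this thread retires
        do_job(job, out);
    }
}

int main() {
    const int num_jobs = 10000;
    float *out = nullptr;
    cudaMalloc(&out, num_jobs * sizeof(float));

    int zero = 0;
    cudaMemcpyToSymbol(next_job, &zero, sizeof(int));

    // Launch far fewer threads than jobs; each loops until no work remains,
    // so fast threads simply pick up more jobs instead of idling.
    worker<<<32, 128>>>(num_jobs, out);
    cudaDeviceSynchronize();

    cudaFree(out);
    return 0;
}
```

Note that this only helps when the jobs share a common code path; with one atomic per thread you also give up coalescing across the warp, so a common refinement is to have one lane claim a batch of jobs per warp and share the indices. None of this changes the answer above: if the jobs are truly disparate, the warps still diverge and the GPU is still a poor fit.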