I understand that CUDA offers an instruction-level parallel execution engine, but in practice we often have two functions with no dependencies between them, so if we want to improve the efficiency of the program, we need to implement procedure-level parallelism.
Can we use CUDA to do this?
It may be too hard for CUDA C, but how about PTX?
Technically you can do something like this:
int tid = threadIdx.x + blockIdx.x * blockDim.x;
if (tid < 2048) {
    // do procedure 1
} else if (tid < 4096) {
    // do procedure 2
} else ...
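As a sketch of this pattern, here is a minimal self-contained kernel. The device functions procedure1 and procedure2 and the buffer names are hypothetical, standing in for whatever independent work the two procedures actually do:

```cuda
// Hypothetical independent tasks; the names and the placeholder
// arithmetic are illustrative, not from the original post.
__device__ void procedure1(float *out, int i) {
    out[i] = i * 2.0f;            // placeholder work for task 1
}

__device__ void procedure2(float *out, int i) {
    out[i] = i * 3.0f + 1.0f;     // placeholder work for task 2
}

// One fused kernel launched with 4096 threads total;
// the global thread index selects which task a thread runs.
__global__ void fusedTasks(float *out1, float *out2) {
    int tid = threadIdx.x + blockIdx.x * blockDim.x;
    if (tid < 2048) {
        procedure1(out1, tid);          // threads 0..2047 run task 1
    } else if (tid < 4096) {
        procedure2(out2, tid - 2048);   // threads 2048..4095 run task 2
    }
}

// Launch, e.g.: fusedTasks<<<16, 256>>>(d_out1, d_out2);
```

Note that with a block size that divides 2048 (e.g. 256), every block falls entirely on one side of the tid < 2048 split, so warps never diverge on the task-selection branch; the scheduler simply assigns whole blocks of task 1 or task 2 to whichever SM is free.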
That’s a poor man’s task parallelism. CUDA is smart enough to do load balancing in this situation. I’ve tested it on something like:
if (tid < 2048) {
    // compute a hundred MADs and store the result
} else if (tid < 4096) {
    // compute two hundred MADs and store the result
}
This code turned out to be exactly as fast as one that calculated 150 MADs with all 4096 threads.
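A hedged reconstruction of that experiment (the MAD counts and the 4096-thread total are from the post above; the kernel names, constants, and loop structure are assumptions):

```cuda
// Unbalanced kernel: half the 4096 threads do 100 MADs, half do 200.
__global__ void unbalanced(float *out) {
    int tid = threadIdx.x + blockIdx.x * blockDim.x;
    float x = 1.0f;
    const float a = 1.000001f, b = 0.000001f;
    if (tid < 2048) {
        for (int i = 0; i < 100; ++i) x = x * a + b;  // 100 MADs
    } else if (tid < 4096) {
        for (int i = 0; i < 200; ++i) x = x * a + b;  // 200 MADs
    }
    out[tid] = x;  // store so the compiler can't eliminate the work
}

// Balanced reference: all 4096 threads do 150 MADs (same total work).
__global__ void balanced(float *out) {
    int tid = threadIdx.x + blockIdx.x * blockDim.x;
    float x = 1.0f;
    const float a = 1.000001f, b = 0.000001f;
    for (int i = 0; i < 150; ++i) x = x * a + b;      // 150 MADs
    out[tid] = x;
}
```

If block scheduling balances the load as described, timing unbalanced<<<16, 256>>> against balanced<<<16, 256>>> should show roughly equal runtimes.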
Now, there are a few problems with this:
- your kernels get huge and ugly
- your register usage is for the worst-case branch
- for non-trivial kernels, load balancing may not turn out so great
- you have to manually partition the data and computation
We’ve been waiting for parallel kernel execution for a long time but apparently the GPU logic isn’t smart enough yet.