Threads beginner question

Ok… I read NVIDIA’s CUDA guide and I know how threads are grouped and so on. What confuses me is thread and block indices (threadIdx.x etc.) and the number of threads I have to use?! Please give me an example of how to add two arrays or matrices. I have experience with GPGPU using OpenGL + Cg…
Thank you!

The guide has an example, which is good;

the SDK samples are all worth reading.

Take your time to read them all :)

Please, please, please write me a kernel (pseudocode is fine) for adding two arrays. In OpenGL/Cg I had texture indices, and I don’t know how to do the same thing in CUDA. I don’t understand things like: C[threadIdx.x * blockIdx.x + blockDim.x] Why like this? I couldn’t find an explanation of what the values of the block/thread IDs are…

Thank you

Take a look at this thread:

http://forums.nvidia.com/index.php?showtopic=34309 (The Official NVIDIA Forums | NVIDIA)

Ok…This is what bugs me:

__global__ void add_arrays_gpu(float *in1, float *in2, float *out, int Ntot)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < Ntot)
        out[idx] = in1[idx] + in2[idx];
}

int idx=blockIdx.x*blockDim.x+threadIdx.x;

idx covers all elements of the arrays. Let’s assume we have arrays of 16 elements, so the values of idx will be 1 2 3 4 5… 16? Am I right? Can you explain why you are doing blockIdx.x*blockDim.x+threadIdx.x?

I am so stuck on this… :(

You need to map from a local index to a global index. You know how many blocks you have and how big each block is.

Let’s assume you have an array of 8 elements, and you are using 2 blocks with 5 threads each.

Block 0:
blockIdx.x=0
blockDim.x=5
threadIdx.x= 0,1,2,3,4
idx will span: 0,1,2,3,4

Block 1:
blockIdx.x=1
blockDim.x=5
threadIdx.x= 0,1,2,3,4
idx will span: 5,6,7,8,9

So idx is covering the initial range plus some (and there is a check in the kernel to see if idx is outside the initial range).

Oh… Thanks a lot, man! It is so much clearer now! One more question: what are the values of threadIdx.y and blockIdx.y?

Edit: What are the local index and global index? The thread index and the block’s/grid’s index?

This example is using a 1D decomposition, so threadIdx.y and blockIdx.y are both zero.

I am using global to refer to the index of the original problem, and local to refer to the index in the decomposed problem.

Tell me this:

If I have 64 blocks and 256 threads, I have 4 threads in each block, and I have allocated the arrays like this:

CUDA_SAFE_CALL(cudaMalloc((void**)&dinput, sizeof(float) * NUM_THREADS * 2));

CUDA_SAFE_CALL(cudaMalloc((void**)&doutput, sizeof(float) * NUM_BLOCKS));

CUDA_SAFE_CALL(cudaMalloc((void**)&dtimer, sizeof(clock_t) * NUM_BLOCKS * 2));

blockIdx.x = 0
threadIdx.x = 0 1 2 3

blockIdx.x = 1
threadIdx.x = 0 1 2 3

blockIdx.x = 2
threadIdx.x = 0 1 2 3

blockIdx.x = 3
threadIdx.x = 0 1 2 3

.
.
.

blockIdx.x = 63
threadIdx.x = 0 1 2 3

Is this correct???

What is a good example of using the threadIdx.y and blockIdx.y values in a kernel? (An SDK sample or something else?)

Edit: …a sample with a 2D decomposition