a question about SIMT

Xuejun · June 7, 2009, 8:04pm

As a newbie to CUDA programming, I find the concept of SIMT (single instruction, multiple threads) quite confusing. To my understanding, CUDA code gets the best performance if all threads execute the exact same instructions. But what about the case where different threads have to execute a different number of instructions? For example, in the following function:

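// "kernel" is a placeholder name; the post did not give one.
__global__ void kernel(int *n, int *a) {
    const int tid = blockIdx.x * blockDim.x + threadIdx.x;
    // Each thread runs its own number of iterations, n[tid].
    for (int i = 0; i < n[tid]; i++) {
        a[tid] += i;
    }
}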

Case 1) the array n holds values varying from 10 to 1000.
Case 2) every element of the array n is exactly 1000.

I expected that case 1) would violate the SIMT rule and give worse performance. However, my tests showed very similar performance for case 1) and case 2).

It is very confusing. Does anybody know why I get such similar performance in the two cases? Thanks!

Cygnus_X1 · June 7, 2009, 8:12pm

Your expectation for case 1 is correct. However, each SM has only 8 computing units, and they execute threads in SIMD fashion in groups of 32. Those groups of 32 threads are called warps. If your block contains more than 32 threads (and usually it should), you will have several warps interleaving with each other on those “poor” 8 units. However, two different warps may execute completely different instructions. That’s why, in your case, although your performance will degrade, it won’t hurt that much.
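To make the per-warp point concrete, here is a rough host-side sketch in plain C++ (no GPU needed, and it should also compile under nvcc). It is only a toy cost model, not a measurement: it assumes a warp keeps issuing loop iterations until its slowest thread finishes, so a warp's cost is the largest n[tid] inside it. The names warp_iterations and useful_iterations are made up for this illustration, and the default warp size of 32 is taken from the reply above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical cost model (an assumption, not a measurement): a warp keeps
// issuing loop iterations until its longest-running thread is done, so its
// cost is the maximum trip count among its threads.
long long warp_iterations(const std::vector<int>& n, std::size_t warp_size = 32) {
    assert(warp_size > 0);
    long long total = 0;
    for (std::size_t start = 0; start < n.size(); start += warp_size) {
        const std::size_t end = std::min(start + warp_size, n.size());
        const auto first = n.begin() + static_cast<std::ptrdiff_t>(start);
        const auto last = n.begin() + static_cast<std::ptrdiff_t>(end);
        // The slowest thread in this warp decides how long the warp runs.
        total += *std::max_element(first, last);
    }
    return total;
}

// Loop iterations that do useful work, summed over all threads.
long long useful_iterations(const std::vector<int>& n) {
    long long total = 0;
    for (int trips : n) {
        total += trips;
    }
    return total;
}
```

Under this model, a warp with a single 1000-iteration thread and 31 threads that need only 10 iterations costs as much as a warp where every thread needs 1000. So if the values in case 1 are scattered so that most warps contain at least one value near 1000, the modeled cost ends up close to case 2. That may be part of why the two timings looked alike, though the sketch cannot confirm it.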
