How would this code look in CUDA?

for (int i=0; i<N; i++)
for (int j=0; j<N; j++)

Thank you,

CUDA is an extension of C, so you can write the loop exactly the same way.

P.S. Why are you adding 2 in a loop over j? Why not just do a[i] = a[i] + 2*N - 1;

Why not just add 2 * N - 1 to each element rather than doing the inner loop?

I’d write a kernel with N threads that adds 2 * N - 1 to each element.

Something like:

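__global__ void kernel(int *array, int N) {
	// One thread per element: compute this thread's global index.
	unsigned int index = blockIdx.x * blockDim.x + threadIdx.x;

	// Skip the surplus threads in the last block.
	if (index < N) {
		array[index] += 2 * N - 1;
	}
}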
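
For completeness, here is a rough host-side sketch of how a kernel like this might be launched. The helper names blocksFor and addOnGpu and the block size of 256 are assumptions made for illustration, not something from the posts above, and error handling is kept to a minimum. It assumes n > 0.

```cuda
#include <cuda_runtime.h>

// Same kernel as in the post above, repeated so the sketch is self-contained.
__global__ void kernel(int *array, int N) {
	unsigned int index = blockIdx.x * blockDim.x + threadIdx.x;
	if (index < (unsigned int)N) {
		array[index] += 2 * N - 1;
	}
}

// Hypothetical helper: number of blocks needed to cover n elements (rounds up).
int blocksFor(int n, int threadsPerBlock) {
	return (n + threadsPerBlock - 1) / threadsPerBlock;
}

// Hypothetical wrapper: copies hostArray to the GPU, runs the kernel, and
// copies the result back. Returns the status of the first failing CUDA call,
// or cudaSuccess. Assumes n > 0.
cudaError_t addOnGpu(int *hostArray, int n) {
	int *devArray = NULL;
	size_t bytes = (size_t)n * sizeof(int);

	cudaError_t err = cudaMalloc((void **)&devArray, bytes);
	if (err != cudaSuccess) return err;

	err = cudaMemcpy(devArray, hostArray, bytes, cudaMemcpyHostToDevice);
	if (err == cudaSuccess) {
		const int threadsPerBlock = 256;  // assumed block size
		kernel<<<blocksFor(n, threadsPerBlock), threadsPerBlock>>>(devArray, n);
		err = cudaGetLastError();  // reports launch configuration errors
	}
	if (err == cudaSuccess) {
		// This blocking copy waits for the kernel to finish first.
		err = cudaMemcpy(hostArray, devArray, bytes, cudaMemcpyDeviceToHost);
	}
	cudaFree(devArray);
	return err;
}
```

Calling addOnGpu(a, N) on a host array a of N ints should then leave each element increased by 2 * N - 1. The rounding up in blocksFor is why the kernel needs its bounds check: the last block usually has more threads than remaining elements.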



Maybe it’s a zen koan.

Sorry for the silly code…
My idea is to compare the execution time of CPU code against CUDA code, so I need code that is simple but computationally expensive, to make the difference easy to see.
The code is simple: for each float, we want the average of that float and its left and right neighbors.
This code is written in C, but how would it look in CUDA?
Thank you!

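void example(const float *array, float *b, int n) {
	// sizeof(array) would give the size of the pointer, not the number of
	// elements, so the length n is passed in explicitly. Assumes n >= 2.
	for (int i = 0; i < n; i++) {
		if (i == 0) {
			// First element: no left neighbor.
			b[i] = (2 * array[i] + array[i + 1]) / 3;
		} else if (i == n - 1) {
			// Last element: no right neighbor.
			b[i] = (2 * array[i] + array[i - 1]) / 3;
		} else {
			b[i] = (array[i - 1] + array[i] + array[i + 1]) / 3;
		}
	}
}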
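
One possible CUDA version of this averaging loop, sketched under a few assumptions: the array length n is passed explicitly and is at least 2, each thread computes one output element, and the kernel is timed with CUDA events. The names smoothAt, exampleKernel and timeExampleOnGpu, and the block size of 256, are made up for illustration. The timing helper expects pointers that already live on the GPU (for example from cudaMalloc and cudaMemcpy).

```cuda
#include <cuda_runtime.h>

// Average of element i with its neighbors. The two ends count their own value
// twice, as in the CPU version. Assumes n >= 2. Callable from CPU and GPU.
__host__ __device__ float smoothAt(const float *array, int i, int n) {
	if (i == 0)     return (2 * array[i] + array[i + 1]) / 3;
	if (i == n - 1) return (2 * array[i] + array[i - 1]) / 3;
	return (array[i - 1] + array[i] + array[i + 1]) / 3;
}

// One thread per output element.
__global__ void exampleKernel(const float *array, float *b, int n) {
	int i = blockIdx.x * blockDim.x + threadIdx.x;
	if (i < n) {
		b[i] = smoothAt(array, i, n);
	}
}

// Hypothetical helper: runs the kernel on device pointers and returns the
// kernel time in milliseconds, measured with CUDA events (-1 on error).
float timeExampleOnGpu(const float *devArray, float *devB, int n) {
	const int threadsPerBlock = 256;  // assumed block size
	int blocks = (n + threadsPerBlock - 1) / threadsPerBlock;

	cudaEvent_t start, stop;
	if (cudaEventCreate(&start) != cudaSuccess) return -1.0f;
	if (cudaEventCreate(&stop) != cudaSuccess) {
		cudaEventDestroy(start);
		return -1.0f;
	}

	cudaEventRecord(start);
	exampleKernel<<<blocks, threadsPerBlock>>>(devArray, devB, n);
	cudaEventRecord(stop);
	cudaEventSynchronize(stop);  // wait until the kernel has finished

	float ms = -1.0f;
	if (cudaGetLastError() == cudaSuccess) {
		cudaEventElapsedTime(&ms, start, stop);
	}
	cudaEventDestroy(start);
	cudaEventDestroy(stop);
	return ms;
}
```

Keep in mind that this measures only the kernel. For a loop this light, copying the data to and from the GPU will often take longer than the computation itself, so for a fair CPU comparison it is worth stating whether the transfers are included.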