Hey guys, I'm trying to practice some of the stuff I've managed to learn so far, but there are a few things that I still don't understand.

Let's say I have a 1000x1000 matrix that I want to compute. What's the best way to map it onto the kernel launch configuration <<<GridBlock, numThreadsInEachBlock>>>?
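To make the question concrete, this is the host-side arithmetic I have in mind for a 2D launch, written as plain C++ so it runs without a GPU. The 16x16 block size and the names Dim2, ceilDiv, gridFor, globalIndex and inBounds are my own assumptions for illustration. In real CUDA code the sizes would be dim3 values and the index would come from blockIdx, blockDim and threadIdx.

```cpp
#include <cassert>

// Hypothetical stand-in for CUDA's dim3 (x and y only).
struct Dim2 { unsigned x, y; };

// Round-up integer division: how many blocks of size `block` cover n items.
unsigned ceilDiv(unsigned n, unsigned block) {
    assert(block > 0);
    return (n + block - 1) / block;
}

// Grid size that covers a width x height matrix with the given block size.
Dim2 gridFor(unsigned width, unsigned height, Dim2 block) {
    return Dim2{ ceilDiv(width, block.x), ceilDiv(height, block.y) };
}

// Mirrors the usual kernel expression blockIdx.x * blockDim.x + threadIdx.x.
unsigned globalIndex(unsigned blockIdx, unsigned blockDim, unsigned threadIdx) {
    return blockIdx * blockDim + threadIdx;
}

// Threads in the padded edge blocks must skip their work.
bool inBounds(unsigned x, unsigned y, unsigned width, unsigned height) {
    return x < width && y < height;
}
```

With a 16x16 block, gridFor(1000, 1000, ...) gives 63x63 blocks, i.e. 1008 threads per row and column. If I understand it correctly, the extra 8 threads in each direction then need a bounds check like inBounds at the top of the kernel.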

Let's say I have the following code:

```
// Walk every pixel; each (x, y) maps to a point c = c_re + i*c_im in the complex plane.
for (unsigned y = 0; y < ImageHeight; ++y)
{
    double c_im = MaxIm - y * Im_factor;
    for (unsigned x = 0; x < ImageWidth; ++x)
    {
        double c_re = MinRe + x * Re_factor;
        double Z_re = c_re, Z_im = c_im; // Z starts at c
        bool isInside = true;
        // Iterate Z = Z^2 + c at most 20 times.
        for (unsigned n = 0; n < 20; ++n)
        {
            double Z_re2 = Z_re * Z_re, Z_im2 = Z_im * Z_im;
            if (Z_re2 + Z_im2 > 4) // |Z| > 2: the point escapes
            {
                isInside = false;
                break;
            }
            Z_im = 2 * Z_re * Z_im + c_im;
            Z_re = Z_re2 - Z_im2 + c_re;
        }
        if (isInside) { putpixel(x, y); }
    }
}
```

The code above draws the Mandelbrot set (see fractals). I know that the first two for loops can stay outside the kernel function, on the host, where I generate the initial matrix.

I mean sending the matrix to the kernel already initialized with the points I need to check…

My question is about the third for loop. Since its bound is constant, does it cause branch divergence? I'd say it doesn't, since all threads in the same warp execute that for statement, but I'm not sure.
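One way I thought of checking this without a GPU is to count how many iterations the loop actually runs for 32 neighbouring pixels. This is only a CPU sketch: iterationsUsed, WarpSpread and warpSpread are names I made up, and it assumes a warp of 32 threads that covers 32 horizontally adjacent pixels. If the counts differ inside one group, I assume the threads would leave the loop at different iterations because of the break.

```cpp
#include <algorithm>

// Number of iterations the inner loop runs before the break fires
// (or maxIter if the point never escapes).
unsigned iterationsUsed(double c_re, double c_im, unsigned maxIter = 20) {
    double z_re = c_re, z_im = c_im;
    for (unsigned n = 0; n < maxIter; ++n) {
        double z_re2 = z_re * z_re, z_im2 = z_im * z_im;
        if (z_re2 + z_im2 > 4.0) return n;
        z_im = 2 * z_re * z_im + c_im;
        z_re = z_re2 - z_im2 + c_re;
    }
    return maxIter;
}

// Smallest and largest iteration count within one group of 32 pixels.
struct WarpSpread { unsigned minIter, maxIter; };

// Scan 32 adjacent points on one image row, starting at firstRe and
// stepping by reStep along the real axis.
WarpSpread warpSpread(double firstRe, double reStep, double c_im) {
    constexpr unsigned kWarpSize = 32; // assumed warp size
    WarpSpread s{ ~0u, 0u };
    for (unsigned lane = 0; lane < kWarpSize; ++lane) {
        unsigned it = iterationsUsed(firstRe + lane * reStep, c_im);
        s.minIter = std::min(s.minIter, it);
        s.maxIter = std::max(s.maxIter, it);
    }
    return s;
}
```

A group that lies fully inside or fully outside the set gives minIter == maxIter. A group that straddles the border of the set gives different values, and those are the pixels I'm worried about.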

Same thing about the “if(Z_re2 + Z_im2 > 4)”: how can I avoid that branch and place another instruction there instead, in order to avoid any branches?
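The only idea I have so far is to drop the break, always run all 20 iterations, and keep a flag that latches once the point has escaped. Below is a plain C++ sketch of that next to the original early-exit version. The function names are mine, and it is only my assumption that the compiler turns the ?: expressions into select or predicated instructions instead of real branches.

```cpp
// Reference: same logic as the original inner loop, with the early exit.
bool insideWithBreak(double c_re, double c_im, unsigned maxIter = 20) {
    double z_re = c_re, z_im = c_im;
    for (unsigned n = 0; n < maxIter; ++n) {
        double z_re2 = z_re * z_re, z_im2 = z_im * z_im;
        if (z_re2 + z_im2 > 4.0) return false;
        z_im = 2 * z_re * z_im + c_im;
        z_re = z_re2 - z_im2 + c_re;
    }
    return true;
}

// No if and no break in the loop body: every call runs maxIter iterations.
bool insidePredicated(double c_re, double c_im, unsigned maxIter = 20) {
    double z_re = c_re, z_im = c_im;
    bool escaped = false;
    for (unsigned n = 0; n < maxIter; ++n) {
        double z_re2 = z_re * z_re, z_im2 = z_im * z_im;
        // Latch the flag; once set it stays set.
        escaped = escaped | (z_re2 + z_im2 > 4.0);
        double new_im = 2 * z_re * z_im + c_im;
        double new_re = z_re2 - z_im2 + c_re;
        // Freeze Z after the escape so it cannot overflow to inf or NaN.
        z_im = escaped ? z_im : new_im;
        z_re = escaped ? z_re : new_re;
    }
    return !escaped;
}
```

Both versions should return the same result for every point, but the second one does the full 20 iterations even for points that escape immediately, so I don't know whether it would really be faster.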

And last, is it possible to call OpenGL commands directly from the GPU (glVertex and so on)?

Are there any existing CUDA samples that calculate the Mandelbrot set that I can use?

Thank you.