Problem with thread divergence

Hi all,

I have a problem with the following code:

__global__ void kernel_fun(unsigned char* Array1, unsigned char* Array2, unsigned char* Array3)
{
    // Some initialization here.

    for (j = 0; j < 560; j++) {
        for (k = 0; k < 255; k++) {
            if (/* condition1 */) {
                if (/* condition2 */) {
                    // some manipulation
                } else {
                    // default manipulation
                }
            } else {
                // assigning some fixed value to variables
            }
        }
    }

    for (k = 0; k < 560; k++) {
        for (j = 0; j < 255; j++) {
            if (/* condition3 */) {
                if (/* condition4 */) {
                    // some manipulation
                } else {
                    // default manipulation
                }
            } else {
                // other manipulations
            }
        }
    }
}

I got a reply from MisterAnderson42 in another thread ("modifying simulator to work with CUDA.?"),
but some things are still not clear to me:

– How can I deal with the heavy thread divergence in this code (i.e. multiple if/else conditions inside two nested for loops)?
– How should I execute this code in parallel (that is, how should I organize the blocks and the grid)?
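
To make both questions concrete, here is a minimal sketch of one mapping I could try. It assumes that every (j, k) iteration of the first loop nest is independent of the others, which may not hold for the real code. Each thread handles one (j, k) pair instead of looping, and the nested if/else is written as selects, which the compiler can usually turn into predicated instructions rather than divergent branches. The conditions, the three "manipulations", the one-output-per-element layout and the names `element`, `kernel_fun_sketch`, `count_mismatches`, `ROWS` and `COLS` are all hypothetical placeholders, not my actual code.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Loop bounds taken from the kernel above.
constexpr int ROWS = 560;  // range of j
constexpr int COLS = 255;  // range of k

// Hypothetical stand-ins for condition1/condition2 and the three branches.
__host__ __device__ inline unsigned char element(unsigned char a, unsigned char b)
{
    const bool c1 = a > 64;                                        // placeholder for condition1
    const bool c2 = b > 128;                                       // placeholder for condition2
    const unsigned char manip = static_cast<unsigned char>(a + b); // "some manipulation"
    const unsigned char deflt = a;                                 // "default manipulation"
    const unsigned char fixed = 0;                                 // "fixed value"
    // Nested if/else expressed as selects: every thread evaluates all three
    // candidate values and then picks one.
    return c1 ? (c2 ? manip : deflt) : fixed;
}

// One thread per (j, k) pair; the two for loops become the 2D grid.
__global__ void kernel_fun_sketch(const unsigned char* in1,
                                  const unsigned char* in2,
                                  unsigned char* out)
{
    const int k = blockIdx.x * blockDim.x + threadIdx.x;
    const int j = blockIdx.y * blockDim.y + threadIdx.y;
    if (j < ROWS && k < COLS) {  // guard for the partially filled edge blocks
        const int idx = j * COLS + k;
        out[idx] = element(in1[idx], in2[idx]);
    }
}

// Runs the kernel on synthetic data and returns the number of elements that
// differ from a CPU reference, or -1 if a CUDA call failed.
int count_mismatches()
{
    const int n = ROWS * COLS;
    std::vector<unsigned char> h1(n), h2(n), hout(n);
    for (int i = 0; i < n; ++i) {
        h1[i] = static_cast<unsigned char>(i % 251);
        h2[i] = static_cast<unsigned char>((i * 7) % 256);
    }

    unsigned char *d1 = nullptr, *d2 = nullptr, *dout = nullptr;
    cudaMalloc(&d1, n);
    cudaMalloc(&d2, n);
    cudaMalloc(&dout, n);
    cudaMemcpy(d1, h1.data(), n, cudaMemcpyHostToDevice);
    cudaMemcpy(d2, h2.data(), n, cudaMemcpyHostToDevice);

    const dim3 block(32, 8);  // 256 threads; x spans one warp's worth of k values
    const dim3 grid((COLS + block.x - 1) / block.x, (ROWS + block.y - 1) / block.y);
    kernel_fun_sketch<<<grid, block>>>(d1, d2, dout);

    // The blocking copy also surfaces errors from the launch above.
    const cudaError_t err = cudaMemcpy(hout.data(), dout, n, cudaMemcpyDeviceToHost);
    cudaFree(d1);
    cudaFree(d2);
    cudaFree(dout);
    if (err != cudaSuccess) return -1;

    int mismatches = 0;
    for (int i = 0; i < n; ++i)
        if (hout[i] != element(h1[i], h2[i])) ++mismatches;
    return mismatches;
}

int main()
{
    std::printf("mismatches: %d\n", count_mismatches());
    return 0;
}
```

The trade-off in this sketch is that every thread computes all branch results and discards the unused ones, so it only seems worthwhile when the branch bodies are cheap. Since divergence only costs something when threads of the same 32-thread warp take different paths, another option might be to arrange the data so that neighbouring threads usually take the same branch.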

Thanks.

So far I have not found an efficient solution for this. Please help me, or give me any clue.

Thanks.