Here is part of my code. I'm operating on a 2D integer array and three 1D arrays:
data - 2D array
idlist - 1D array
tem - 1D array
tem1 - 1D array
    while (...)
    {
        // SOME CODE HERE
        da1 = data[r][cols-1];
        for (int k = 1; k < rows; k++)
        {
            r = idlist[k];
            da2 = data[r][cols-1];
            if (da1 == da2)
                tem[val++] = idlist[k];
            else
                tem1[val1++] = idlist[k];
        }
        // SOME CODE HERE
    }
I have read a few examples of CUDA programs and they are understandable, but when it comes to my own program it looks very complicated. How can this kind of for loop be converted to parallel code? I don't need an exact solution, just some suggestions. Does it need 1 or 2 kernels to be written? For code that flows this way, is it possible to run it in parallel using threads in CUDA? Please help.
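
For reference, the loop is an order-preserving partition of idlist[1..rows-1] into tem and tem1. The only thing that ties the iterations together is the val++ / val1++ counters. One common way to remove that dependency is to split the work into three phases: compute a 0/1 flag per element, take an exclusive prefix sum of the flags, then scatter. The sketch below is plain host C++, not CUDA. It only shows the decomposition. The names partition_ids and PartitionResult are hypothetical, data is assumed to be flattened row-major, da1 is passed in, and val and val1 are assumed to start at 0 for this pass.

```cpp
#include <cassert>
#include <vector>

// Hypothetical result type (not from the original post).
struct PartitionResult {
    std::vector<int> tem;   // ids whose last-column value equals da1
    std::vector<int> tem1;  // all remaining ids
};

// Hypothetical helper. Assumes data is a row-major rows x cols array
// flattened into one vector, and that val and val1 both start at 0.
PartitionResult partition_ids(const std::vector<int>& data, int cols,
                              const std::vector<int>& idlist, int da1) {
    assert(cols > 0);
    // The original loop runs k = 1 .. rows-1, so element 0 is skipped.
    const int n = idlist.empty() ? 0 : static_cast<int>(idlist.size()) - 1;
    std::vector<int> flag(n), pos(n);

    // Phase 1 (map): every iteration is independent, so on a GPU this
    // could be one thread per k.
    for (int i = 0; i < n; ++i) {
        const int r = idlist[i + 1];
        flag[i] = (data[r * cols + (cols - 1)] == da1) ? 1 : 0;
    }

    // Phase 2 (exclusive prefix sum): pos[i] = number of matches before i.
    // Written serially here; this is the step a parallel scan replaces.
    int matches = 0;
    for (int i = 0; i < n; ++i) {
        pos[i] = matches;
        matches += flag[i];
    }

    // Phase 3 (scatter): independent again. A match goes to tem[pos[i]],
    // a non-match goes to tem1[i - pos[i]], which preserves the order.
    PartitionResult out;
    out.tem.resize(matches);
    out.tem1.resize(n - matches);
    for (int i = 0; i < n; ++i) {
        if (flag[i])
            out.tem[pos[i]] = idlist[i + 1];
        else
            out.tem1[i - pos[i]] = idlist[i + 1];
    }
    return out;
}
```

If this decomposition fits, phases 1 and 3 would each map to a small kernel with one thread per k, and phase 2 could be a library scan such as thrust::exclusive_scan, so roughly two kernels plus a scan. If the order inside tem and tem1 does not matter, an atomicAdd on the two counters inside a single kernel might be a simpler alternative.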