while (true)
{
    while (S[text[n++]]);
    if (n > to) break;
    matches[count++] = n;
}
return count;
}[/codebox]
And this is my kernel:
[codebox]__global__ void oneMatch_kernel(bool *S, char *text, int *matches, int *count, int from, int to) {
    if (from < 0) from = 1;
    int i = blockIdx.x*blockDim.x+threadIdx.x;
    if (i < from) return;
    while(S[text[i]]);
    if (i > to) return;
    matches[count[0]++] = i;
}[/codebox]
The function works by comparing characters, and each time the characters match it increments the variable count. But the values returned by the kernel are wrong, so if anyone can help it would be great.
Why count[0] on the last line? And while(S[text[i]]); is very strange. It looks like you have wrong assumptions about how a GPU program works. It runs in parallel; it is not a loop.
count is what I want to return from the kernel, but I couldn't seem to use an int variable, so I used an array and incremented its first element. I know I shouldn't use this while loop, but I want each thread to loop over the array. The thing is, it works perfectly in the serial code.
I tried to debug it using Visual Studio, but I couldn't find a way to debug the kernel. If you know any way I can debug it, that would be a great help.
You can’t just copy/paste serial code into CUDA :)
The count[0]++ means that all threads in all blocks will concurrently write to the same location in memory, creating a race condition and obviously producing faulty results.
Either use atomic functions or re-implement your algorithm to be thread-safe. This really is not CUDA/GPU specific; it is a general multi-threading issue.
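To make the atomic suggestion concrete, here is a minimal sketch, not a drop-in fix. It assumes one thread per character and that a position counts as a match when its S entry is false, which is how the serial loop above appears to behave; flip the test if your table means the opposite. The host wrapper countMatches is a hypothetical helper added only for illustration, and error checking is omitted for brevity. Note that the original while(S[text[i]]); never changes i, so a thread that enters it with a true entry spins forever.

```cuda
#include <cuda_runtime.h>

// Each thread examines exactly one character; there is no per-thread loop.
__global__ void oneMatch_kernel(const bool *S, const char *text,
                                int *matches, int *count, int from, int to)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < from || i > to) return;

    // Assumption: a position is a match when its table entry is false.
    // The cast avoids a negative index when char is signed.
    if (!S[(unsigned char)text[i]]) {
        // atomicAdd returns the old value, so every matching thread
        // reserves a distinct slot even when many threads hit at once.
        int slot = atomicAdd(count, 1);
        matches[slot] = i;
    }
}

// Hypothetical host wrapper: S is a 256-entry table, text holds n characters,
// matchesOut must have room for n ints. Returns the number of matches.
int countMatches(const bool *S, const char *text, int n, int *matchesOut)
{
    if (n <= 0) return 0;

    bool *dS = nullptr;
    char *dText = nullptr;
    int *dMatches = nullptr;
    int *dCount = nullptr;
    int count = 0;

    cudaMalloc(&dS, 256 * sizeof(bool));
    cudaMalloc(&dText, n * sizeof(char));
    cudaMalloc(&dMatches, n * sizeof(int));
    cudaMalloc(&dCount, sizeof(int));

    cudaMemcpy(dS, S, 256 * sizeof(bool), cudaMemcpyHostToDevice);
    cudaMemcpy(dText, text, n * sizeof(char), cudaMemcpyHostToDevice);
    cudaMemset(dCount, 0, sizeof(int));   // the counter must start at zero

    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    oneMatch_kernel<<<blocks, threads>>>(dS, dText, dMatches, dCount, 0, n - 1);

    // These blocking copies on the default stream wait for the kernel to finish.
    cudaMemcpy(&count, dCount, sizeof(int), cudaMemcpyDeviceToHost);
    cudaMemcpy(matchesOut, dMatches, count * sizeof(int), cudaMemcpyDeviceToHost);

    cudaFree(dS);
    cudaFree(dText);
    cudaFree(dMatches);
    cudaFree(dCount);
    return count;
}
```

The count comes out right, but the order of the entries in matches depends on thread scheduling, so sort the array on the host if you need the positions in ascending order as the serial version produces them.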