Flood Fill algorithm in CUDA

I have been given a task: I need to run a flood fill algorithm on CUDA. On the CPU I have a non-recursive method with a stack, but I don't have any idea how to move this code to the GPU. Can anybody help?

Edit:
This is my CPU code, it’s simple:

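#include <queue>
#include <vector>

// X, Y, COLOR_TARGET, COLOR_REPLACEMENT, k2ij() and ij2k() are defined elsewhere.
void cpuFloodFill(std::vector<std::vector<int>> *colorVector, int node)
{
	// Queue of linear cell indices that still have to be visited
	std::queue<int> q;
	q.push(node);

	int i,j;

	while(!q.empty())
	{
		int k = q.front();
		q.pop();

		// Convert the linear index k back to 2D coordinates (i, j)
		k2ij(k, &i, &j);
		if((*colorVector)[i][j] == COLOR_TARGET)
		{
			// Recolor the cell, then enqueue its in-bounds 4-connected neighbours
			(*colorVector)[i][j] = COLOR_REPLACEMENT;
			if(i - 1 >= 0)
				q.push(ij2k(i - 1, j));

			if(i + 1 < X)
				q.push(ij2k(i + 1, j));

			if(j - 1 >= 0)
				q.push(ij2k(i, j - 1));

			if(j + 1 < Y)
				q.push(ij2k(i, j + 1));
		}
	}
}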

That is essentially a BFS-type algorithm, and there has been some work in that area:

http://www.ijcaonline.org/volume10/number10/pxc3871992.pdf
http://ppl.stanford.edu/papers/ppopp070a-slides.pdf
https://github.com/pathscale/rodinia/blob/master/cuda/bfs/bfs.cu

You cannot expect someone here to write the CUDA version for you, but those papers will get you on your way.

Google is your friend…

I didn’t expect full code, just help with the general idea, like using BFS ;) Thanks for that, I hope it will help.
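
For anyone landing here later, this is a rough sketch of how the BFS idea suggested above could map onto the GPU. It is one possible level-synchronous approach, not the method from the linked papers. Instead of a queue, each kernel launch expands the whole current frontier in parallel, one thread per cell, and the host repeats the launch until no new cell gets filled. Everything named here is an assumption made for the example: the grid is a flat row-major int array (index = row * width + col, which may not match your k2ij/ij2k), and expandFrontier and gpuFloodFill are hypothetical names. CUDA error checking is mostly left out to keep it short.

```cuda
#include <cuda_runtime.h>
#include <utility>
#include <vector>

// One thread per cell. Threads whose cell is in the current frontier try to
// claim the 4-connected neighbours that still hold the target color.
__global__ void expandFrontier(int *grid, const unsigned char *frontier,
                               unsigned char *next, int *changed,
                               int width, int height,
                               int target, int replacement)
{
    int k = blockIdx.x * blockDim.x + threadIdx.x;
    if (k >= width * height || !frontier[k]) return;

    int row = k / width, col = k % width;   // assumed row-major layout
    const int dr[4] = {-1, 1, 0, 0};
    const int dc[4] = {0, 0, -1, 1};
    for (int d = 0; d < 4; ++d) {
        int r = row + dr[d], c = col + dc[d];
        if (r < 0 || r >= height || c < 0 || c >= width) continue;
        int n = r * width + c;
        // atomicCAS recolors the neighbour; only the thread that sees the
        // old target value adds it to the next frontier.
        if (atomicCAS(&grid[n], target, replacement) == target) {
            next[n] = 1;
            *changed = 1;   // several threads may store the same value 1
        }
    }
}

// Hypothetical host wrapper: fills the region of `target` cells connected to
// `seed` (a linear index) with `replacement`, updating `grid` in place.
void gpuFloodFill(std::vector<int> &grid, int width, int height, int seed,
                  int target, int replacement)
{
    int n = width * height;
    if (target == replacement || seed < 0 || seed >= n ||
        grid[seed] != target) return;
    grid[seed] = replacement;               // the seed is the first frontier

    int *dGrid = nullptr, *dChanged = nullptr;
    unsigned char *dFrontier = nullptr, *dNext = nullptr;
    cudaMalloc(&dGrid, n * sizeof(int));
    cudaMalloc(&dChanged, sizeof(int));
    cudaMalloc(&dFrontier, n);
    cudaMalloc(&dNext, n);
    cudaMemcpy(dGrid, grid.data(), n * sizeof(int), cudaMemcpyHostToDevice);
    cudaMemset(dFrontier, 0, n);
    unsigned char one = 1;
    cudaMemcpy(dFrontier + seed, &one, 1, cudaMemcpyHostToDevice);

    int threads = 256, blocks = (n + threads - 1) / threads;
    int changed = 1;
    while (changed) {
        cudaMemset(dNext, 0, n);            // next frontier starts empty
        cudaMemset(dChanged, 0, sizeof(int));
        expandFrontier<<<blocks, threads>>>(dGrid, dFrontier, dNext, dChanged,
                                            width, height, target, replacement);
        // Stop instead of spinning forever if the launch or the copy failed.
        if (cudaMemcpy(&changed, dChanged, sizeof(int),
                       cudaMemcpyDeviceToHost) != cudaSuccess) break;
        std::swap(dFrontier, dNext);        // next frontier becomes current
    }

    cudaMemcpy(grid.data(), dGrid, n * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(dGrid);
    cudaFree(dChanged);
    cudaFree(dFrontier);
    cudaFree(dNext);
}
```

You would call it as gpuFloodFill(grid, width, height, seed, COLOR_TARGET, COLOR_REPLACEMENT). Note that it needs one kernel launch per BFS level and scans every cell each time, so for long thin regions it can easily be slower than the CPU version. The frontier-queue approaches in the links above are the place to look if that becomes a problem.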