parallel search

I am new to CUDA and am trying to implement a sequential (linear) search in CUDA without using any of the provided libraries, but I don't know how to tell whether an element occurs at multiple places…

One thing I could do is create an array of the same size as the input array and store the index in it whenever an element matches the search value?

Please help…

Just off the top of my head…

  • You have a thread responsible for each element in the search space.
  • If that thread's value matches the search value, it writes a 1 into the output array at out[threadID] (and a 0 otherwise).
  • You can then do a stream compaction (link 1, link 2) on the output array to get the indexes of the matches.
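The three bullets above could be sketched roughly as follows. This is only a minimal illustration, not a tuned implementation: the names markMatches, scatterIndices and findMatches are made up for this example, the prefix sum is done on the host purely to keep the code short (a real version would scan on the GPU), and error checking of the CUDA calls is left out.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Step 1: one thread per element; write 1 where the element matches, else 0.
__global__ void markMatches(const int* data, int n, int target, int* flags) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) flags[i] = (data[i] == target) ? 1 : 0;
}

// Step 3: every flagged thread writes its own index to its scanned position.
__global__ void scatterIndices(const int* flags, const int* offsets, int n, int* out) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n && flags[i]) out[offsets[i]] = i;
}

// Hypothetical helper: returns the indices i with data[i] == target, in ascending order.
std::vector<int> findMatches(const std::vector<int>& data, int target) {
    int n = static_cast<int>(data.size());
    if (n == 0) return {};
    size_t bytes = n * sizeof(int);

    int *dData, *dFlags, *dOffsets, *dOut;
    cudaMalloc(&dData, bytes);
    cudaMalloc(&dFlags, bytes);
    cudaMalloc(&dOffsets, bytes);
    cudaMalloc(&dOut, bytes);
    cudaMemcpy(dData, data.data(), bytes, cudaMemcpyHostToDevice);

    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    markMatches<<<blocks, threads>>>(dData, n, target, dFlags);

    // Step 2: exclusive prefix sum of the flags. Done on the host here for
    // clarity only (an assumption of this sketch, not a recommendation).
    std::vector<int> flags(n), offsets(n);
    cudaMemcpy(flags.data(), dFlags, bytes, cudaMemcpyDeviceToHost);
    int count = 0;
    for (int i = 0; i < n; ++i) {
        offsets[i] = count;   // output slot for element i if it is a match
        count += flags[i];
    }
    cudaMemcpy(dOffsets, offsets.data(), bytes, cudaMemcpyHostToDevice);

    scatterIndices<<<blocks, threads>>>(dFlags, dOffsets, n, dOut);

    std::vector<int> result(count);
    if (count > 0) {
        cudaMemcpy(result.data(), dOut, count * sizeof(int), cudaMemcpyDeviceToHost);
    }

    cudaFree(dData);
    cudaFree(dFlags);
    cudaFree(dOffsets);
    cudaFree(dOut);
    return result;
}

int main() {
    std::vector<int> data = {7, 3, 9, 3, 1, 3};
    for (int idx : findMatches(data, 3)) printf("%d ", idx);
    printf("\n");
    return 0;
}
```

The prefix sum tells each matching element how many matches come before it, which is exactly the slot it should write its index to. That is all stream compaction really is: flag, scan, scatter.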

Of course, depending on how many elements you have, it may be quicker to just search on the CPU.

If the overhead of the stream compaction is too great (or too complicated to implement right now), you could do what you suggested and write the index number to the output array and iterate over that on the host…
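A rough sketch of that simpler fallback, for comparison. The names writeMatchIndex and findMatchesOnHost are hypothetical, and using -1 as the "no match" marker is an assumption of this example (any value that cannot be a valid index would do). Error checking is omitted.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// One thread per element: store the element's own index on a match,
// otherwise store -1 (sentinel chosen for this sketch).
__global__ void writeMatchIndex(const int* data, int n, int target, int* out) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = (data[i] == target) ? i : -1;
}

// Hypothetical helper: runs the kernel, then filters the output on the host.
std::vector<int> findMatchesOnHost(const std::vector<int>& data, int target) {
    int n = static_cast<int>(data.size());
    if (n == 0) return {};
    size_t bytes = n * sizeof(int);

    int *dData, *dOut;
    cudaMalloc(&dData, bytes);
    cudaMalloc(&dOut, bytes);
    cudaMemcpy(dData, data.data(), bytes, cudaMemcpyHostToDevice);

    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    writeMatchIndex<<<blocks, threads>>>(dData, n, target, dOut);

    // The whole output array comes back to the host, matches or not.
    std::vector<int> out(n);
    cudaMemcpy(out.data(), dOut, bytes, cudaMemcpyDeviceToHost);
    cudaFree(dData);
    cudaFree(dOut);

    // Host-side pass: keep only the entries that are real indices.
    std::vector<int> result;
    for (int v : out) {
        if (v >= 0) result.push_back(v);
    }
    return result;
}

int main() {
    std::vector<int> data = {7, 3, 9, 3, 1, 3};
    for (int idx : findMatchesOnHost(data, 3)) printf("%d ", idx);
    printf("\n");
    return 0;
}
```

The cost is an output array as large as the input plus a full copy back and a linear pass on the host, so it mainly makes sense as a first working version.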

But the second method is quite costly and uses a lot of memory.

Stream compaction is blowing my head off :wacko: