String searching in CUDA

Hello all!

I wonder if it is possible to develop a CUDA application that will search for given patterns in a text corpus (similar to grep in linux). The patterns will be loaded once in device memory at startup. Then, the data of each given file will be copied to the device where the string search will take place. The results will be copied back to host memory.

a) Is CUDA suitable for this kind of applications ?

b) Does anybody know if there are any good SIMD algorithms suitable for this job?

c) Do you know if something similar exists ?

Thanks in advance!

There are several published papers on using CUDA for gene sequence alignment (which is basically a specialized kind of string search), I would start by reading these:

Hello Simon and thanks for your reply!

I have already read those papers and it seems they don’t fit my needs.

The most algorithms for string searching I have seen until now (knuth-morris-patterson, boyer-moore, etc.) use at least one loop with complex conditions inside that makes difficult to parallelize in a SIMD way.

I wouldn’t assume that complex conditions necessarily make it inefficient to implement on the GPU. There’s no substitute for experiment!

There are several ways to parallelize string search - Googling for “parallel string searching” turns up lots of interesting stuff.


i think it’s already done