I am looking for an idea that I could solve or implement using CUDA. It could be a classic programming problem, as long as it would gain a performance boost from CUDA.
Image processing (e.g. filters such as low-pass) is too simple for my needs. I need something bigger (not necessarily more difficult).
I just implemented counting all simple cycles in a graph in CUDA. It was 6 times faster than an implementation running on a 2-core processor. So you can try some integer calculations too.
Here are a few ideas:
Prime searching, or factorization with the quadratic sieve.
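To give a feel for how the prime searching idea maps onto the GPU, here is a minimal sketch, assuming the simplest possible approach: one thread per candidate, each tested by plain trial division. This is not a sieve and not tuned for speed. The names `mark_primes` and `count_primes_gpu` are made up for illustration.

```cuda
#include <cstddef>
#include <vector>
#include <cuda_runtime.h>

// Kernel: thread i decides whether the integer i is prime by trial division.
// Candidates are independent, so no communication between threads is needed.
__global__ void mark_primes(unsigned char* is_prime, unsigned int n) {
    unsigned int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i > n) return;                      // surplus threads in the last block
    if (i < 2) { is_prime[i] = 0; return; } // 0 and 1 are not prime
    unsigned char prime = 1;
    // "d <= i / d" is the same bound as d*d <= i, without overflow risk.
    for (unsigned int d = 2; d <= i / d; ++d) {
        if (i % d == 0) { prime = 0; break; }
    }
    is_prime[i] = prime;
}

// Hypothetical host helper: counts the primes in [0, n].
// Returns -1 if any CUDA call fails (for example, when no device is present).
int count_primes_gpu(unsigned int n) {
    std::size_t bytes = static_cast<std::size_t>(n) + 1; // one flag per integer
    unsigned char* d_flags = nullptr;
    if (cudaMalloc((void**)&d_flags, bytes) != cudaSuccess) return -1;

    int threads = 256;
    int blocks = static_cast<int>((bytes + threads - 1) / threads);
    mark_primes<<<blocks, threads>>>(d_flags, n);
    if (cudaGetLastError() != cudaSuccess) { cudaFree(d_flags); return -1; }

    // cudaMemcpy waits for the kernel to finish before copying the flags back.
    std::vector<unsigned char> h_flags(bytes);
    cudaError_t err = cudaMemcpy(h_flags.data(), d_flags, bytes,
                                 cudaMemcpyDeviceToHost);
    cudaFree(d_flags);
    if (err != cudaSuccess) return -1;

    int count = 0;
    for (unsigned char f : h_flags) count += f;
    return count;
}
```

On a machine with a CUDA device, `count_primes_gpu(100)` should return 25. Note that threads testing large candidates run longer than those testing small or even ones, so the work is unevenly balanced. A segmented sieve would probably be a better fit if you take this idea further.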