GPU Search Engine

I am working on a GPU search engine project.

We want the GPU to perform the inverted-list intersection. The problem is that transferring the inverted lists to the GPU takes too long for the response time we need, and GPU memory cannot hold the whole inverted index. We need a GPU caching strategy.

Any comments?

Try the zero-copy mechanism. Load a portion of the data from zero-copy (page-locked, mapped) host RAM into GPU memory, which acts like a cache, perform the search on it, and store the results somewhere in GPU memory. Then continue with the next portion of zero-copy memory, and so on.
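
To make the loop concrete, here is a rough CPU-only sketch of the control flow in plain C++. It is not CUDA code and not a tuned implementation. The name intersect_in_chunks, the chunk_size parameter and the device_chunk buffer are all made up for illustration, and it assumes both posting lists are sorted and free of duplicates. The buffer copy stands in for the host-to-device transfer (in a real version this might be a cudaMemcpyAsync from page-locked memory, or reads through a mapped pointer), and std::set_intersection stands in for the intersection kernel.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <vector>

// Hypothetical helper: intersect a long sorted posting list that lives in host
// RAM with a shorter sorted list, streaming the long list through a small
// staging buffer that plays the role of the GPU-side cache.
// Assumption: both lists are sorted ascending and contain no duplicates.
std::vector<std::uint32_t> intersect_in_chunks(
    const std::vector<std::uint32_t>& host_list,   // too large for the "device"
    const std::vector<std::uint32_t>& query_list,  // assumed to fit on the "device"
    std::size_t chunk_size) {
  assert(chunk_size > 0);

  std::vector<std::uint32_t> device_chunk;  // stand-in for GPU memory
  device_chunk.reserve(chunk_size);
  std::vector<std::uint32_t> results;       // stand-in for the GPU result buffer

  for (std::size_t offset = 0; offset < host_list.size(); offset += chunk_size) {
    const std::size_t n = std::min(chunk_size, host_list.size() - offset);
    const auto first = host_list.begin() + static_cast<std::ptrdiff_t>(offset);
    const auto last = first + static_cast<std::ptrdiff_t>(n);

    // Stand-in for transferring one portion of host memory to the device.
    device_chunk.assign(first, last);

    // Only query entries inside this chunk's doc-ID range can match,
    // so narrow the query list with two binary searches.
    const auto q_begin = std::lower_bound(query_list.begin(), query_list.end(),
                                          device_chunk.front());
    const auto q_end = std::upper_bound(q_begin, query_list.end(),
                                        device_chunk.back());

    // Stand-in for the intersection kernel; matches are appended in order.
    std::set_intersection(device_chunk.begin(), device_chunk.end(),
                          q_begin, q_end, std::back_inserter(results));
  }
  return results;
}
```

For example, calling intersect_in_chunks on the lists {1, 3, 5, 7, 9, 11, 13, 15} and {3, 4, 9, 15, 20} with a chunk size of 3 walks three portions and returns {3, 9, 15}. Because the chunks cover increasing doc-ID ranges, the per-chunk results can simply be appended without a final merge. On a real GPU you would probably also want to overlap the transfer of the next portion with the search on the current one.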