Is there an external-memory sort library similar to Thrust?

I want to sort a large dataset, around 40 GB, using CUDA on GPUs. It is an important part of my application.

I have used Thrust's sort for a small dataset; it is fast and easy to use.

Is there a sort library similar to Thrust that supports a dataset too large to fit in GPU global memory? I have found some papers on external-memory sorting, but implementing one myself would take too much time. If there is a good library I can use, I can focus on my application problem.

Thanks a lot~~~~

You can split the data into chunks that fit in GPU memory, sort each chunk on the GPU, and then merge the sorted chunks on the CPU. In practice, just write a recursive merge sort and hand a range to the GPU as soon as it is small enough to fit in GPU memory. It is very easy.
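A rough sketch of that recursion in C++, assuming the whole array fits in host RAM (if it does not, the merge step would have to stream from disk instead). The names `sort_chunk`, `chunked_merge_sort` and `chunk_elems` are made up for illustration, and `std::sort` is only a CPU stand-in for the per-chunk GPU sort so that the example builds without CUDA:

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Sort one chunk in place. std::sort is a CPU stand-in here; in a CUDA build
// this is where the chunk would be copied to the device, sorted with Thrust
// (e.g. thrust::sort on a thrust::device_vector), and copied back.
void sort_chunk(int* first, int* last) {
    std::sort(first, last);
}

// Recursive merge sort over [first, last). A range of at most chunk_elems
// elements (the assumed "fits in GPU memory" limit) is handed to sort_chunk;
// a larger range is split in half, each half is sorted recursively, and the
// two sorted halves are merged on the CPU.
void chunked_merge_sort(int* first, int* last, std::size_t chunk_elems) {
    const std::size_t n = static_cast<std::size_t>(last - first);
    if (n <= chunk_elems || n < 2) {
        sort_chunk(first, last);
        return;
    }
    int* mid = first + n / 2;
    chunked_merge_sort(first, mid, chunk_elems);
    chunked_merge_sort(mid, last, chunk_elems);
    std::inplace_merge(first, mid, last);  // CPU-side merge of sorted halves
}
```

For a `std::vector<int> v` you would call `chunked_merge_sort(v.data(), v.data() + v.size(), chunk_elems)`, with `chunk_elems` chosen so that one chunk, plus whatever temporary storage the GPU sort needs, fits on the device.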

Thanks a lot, I will try it.