I am looking for a __device__ sorting function for an array of variable length, because I want a warp or a block of threads to sort such an array cooperatively.
Most of the sorting implementations I found online are __global__ kernels that are launched from the host side.
Is there a high-performance implementation of a __device__ sorting function?
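
To make the request concrete, here is a minimal sketch of the kind of routine meant: a __device__ function that every thread of a block calls to sort one array of arbitrary length in place. It uses a bitonic sorting network in which every comparator moves the smaller key to the lower index, so the array can be treated as padded with "infinite" keys up to the next power of two and comparisons that reach past the end are simply skipped. The names block_bitonic_sort, compare_exchange, sort_segments and the offsets layout are hypothetical, chosen only for illustration; this is not a tuned implementation.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: put the smaller of data[i], data[p] at index i (i < p).
template <typename T>
__device__ void compare_exchange(T* data, int i, int p, int n)
{
    // p >= n means the partner is virtual padding (+infinity): nothing to do.
    if (p > i && p < n && data[p] < data[i]) {
        T tmp = data[i];
        data[i] = data[p];
        data[p] = tmp;
    }
}

// Hypothetical block-cooperative sort of data[0..n) in ascending order.
// Every thread of the block must call it with the same data and n.
template <typename T>
__device__ void block_bitonic_sort(T* data, int n)
{
    // k is the size of the runs being merged: 2, 4, 8, ... up to the
    // next power of two >= n.
    for (int k = 2; (k >> 1) < n; k <<= 1) {
        // First merge step: compare each element with its mirror in the run.
        for (int i = threadIdx.x; i < n; i += blockDim.x)
            compare_exchange(data, i, i ^ (k - 1), n);
        __syncthreads();
        // Remaining merge steps: half-cleaners with distance j.
        for (int j = k >> 2; j > 0; j >>= 1) {
            for (int i = threadIdx.x; i < n; i += blockDim.x)
                compare_exchange(data, i, i ^ j, n);
            __syncthreads();
        }
    }
}

// Assumed layout: block b sorts keys[offsets[b] .. offsets[b + 1]).
__global__ void sort_segments(int* keys, const int* offsets)
{
    int begin = offsets[blockIdx.x];
    int end   = offsets[blockIdx.x + 1];
    block_bitonic_sort(keys + begin, end - begin);
}

int main()
{
    // Three segments of different lengths: 5, 0 and 7 keys.
    const int h_offsets[] = {0, 5, 5, 12};
    int h_keys[] = {9, 3, 7, 1, 8, 4, 2, 6, 0, 5, 11, 10};
    const int num_segments = 3, num_keys = 12;

    int *d_keys = nullptr, *d_offsets = nullptr;
    cudaMalloc(&d_keys, sizeof(h_keys));
    cudaMalloc(&d_offsets, sizeof(h_offsets));
    cudaMemcpy(d_keys, h_keys, sizeof(h_keys), cudaMemcpyHostToDevice);
    cudaMemcpy(d_offsets, h_offsets, sizeof(h_offsets), cudaMemcpyHostToDevice);

    sort_segments<<<num_segments, 128>>>(d_keys, d_offsets);
    if (cudaDeviceSynchronize() != cudaSuccess) {
        std::printf("kernel failed\n");
        return 1;
    }
    cudaMemcpy(h_keys, d_keys, sizeof(h_keys), cudaMemcpyDeviceToHost);

    // Print each segment on its own line.
    for (int s = 0; s < num_segments; ++s) {
        for (int i = h_offsets[s]; i < h_offsets[s + 1]; ++i)
            std::printf("%d ", h_keys[i]);
        std::printf("\n");
    }
    (void)num_keys;
    cudaFree(d_keys);
    cudaFree(d_offsets);
    return 0;
}
```

The sketch works on global or shared memory alike, since it only needs a pointer and a length, but it relies on __syncthreads(), so it is a block-level routine rather than a warp-level one. Library primitives that are callable from device code, such as CUB's cub::BlockRadixSort and cub::BlockMergeSort, are likely to be faster; the code above only illustrates the calling pattern being asked about.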