Matlab/MEX and CUDA and CUDPP

Hello!

I am quite new to programming in both MEX and CUDA.
I already read the tutorial by Boxed Cylon.

I would like to speed up the following (MATLAB notation):
[sorted_vector indices_of_original_vector] = sort(vector);

The vector's length is between 100k and 3M elements, each a 32-bit float (MATLAB single).
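
To make the target precise, here is a minimal CPU reference sketch in C++ of the result I am after. The names `SortResult` and `sort_with_indices` are my own placeholders, not part of MEX or CUDPP. It assumes the input contains no NaNs and that ties should keep their original order, as MATLAB's sort does. As far as I understand, a GPU key-value sort would have to reproduce this with the floats as keys and the original positions as values.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical result type: mirrors the two outputs of
// [sorted_vector indices_of_original_vector] = sort(vector).
struct SortResult {
    std::vector<float> sorted;          // keys in ascending order
    std::vector<unsigned int> indices;  // 1-based positions in the input
};

// CPU reference, assuming no NaNs in v (NaNs would break the comparator).
SortResult sort_with_indices(const std::vector<float>& v) {
    SortResult r;
    r.indices.resize(v.size());

    // Start with the identity permutation 0, 1, ..., n-1.
    std::iota(r.indices.begin(), r.indices.end(), 0u);

    // Sort the positions by the key they point to; stable_sort keeps
    // equal keys in their original order.
    std::stable_sort(r.indices.begin(), r.indices.end(),
                     [&v](unsigned int a, unsigned int b) { return v[a] < v[b]; });

    // Gather the keys in sorted order and shift indices to 1-based.
    r.sorted.reserve(v.size());
    for (unsigned int& i : r.indices) {
        r.sorted.push_back(v[i]);
        ++i;
    }
    return r;
}
```

For example, the input {3, 1, 2, 1} should give sorted values {1, 1, 2, 3} and indices {2, 4, 3, 1}. A GPU version could be checked against this reference on small inputs before timing it on the 3M-element case.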

CUDPP's sort should be able to do this for me, but I have no idea how to pass my vectors from MEX to the CUDPP API. Could anyone help me out?
Here is a small description of sort in CUDPP.
