In general, this question is about sorting. But what I am seeking is not a sorted array, but a reordering for sorting. My definition of reordering goes as follows:

First of all, how to generate this reordering permutation from a sequence? Second, is there a good parallel algorithm to generate this reordering? I need this reordering, not sorted array, because in DoA, it may be that only one array is used to extract this reordering, and we need to sort all arrays based on this reordering.

What size does the set S have, what is the maximum integer number within S? Dependent on it you can use register-local or shared memory or global lookup-tables.

Sort pairs of numbers: For each element in the one array, create a pair of (value, initial position), then sort those pairs by the value.

After sorting you remove the values from the pairs and just keep the initial positions in their order after sorting.

(Or instead of using pairs, create a lookup-table between value and initial position and resolve after normal sorting to get a function between initial and final position.)

1 < 2 => s(r(1)) < s(r(2)): r(k) is the initial position of the element, which lands at final position k after sorting. s(r(k)) is the value of the element. s(1..n) is the initial sequence s, s(r(1..n)) is the rordered and sorted final sequence.

You can invert the function r (it is an bijection) to get the final position of a certain element or create a function, which takes a value as argument.