Device-Level Reordering Algorithms

In general, this question is about sorting. But what I am seeking is not a sorted array, but a reordering for sorting. My definition of reordering goes as follows:

First of all, how to generate this reordering permutation from a sequence? Second, is there a good parallel algorithm to generate this reordering? I need this reordering, not sorted array, because in DoA, it may be that only one array is used to extract this reordering, and we need to sort all arrays based on this reordering.

Can you add details and a few numbers?

What size does the set S have, what is the maximum integer number within S? Dependent on it you can use register-local or shared memory or global lookup-tables.

Sort pairs of numbers: For each element in the one array, create a pair of (value, initial position), then sort those pairs by the value.

After sorting you remove the values from the pairs and just keep the initial positions in their order after sorting.

(Or instead of using pairs, create a lookup-table between value and initial position and resolve after normal sorting to get a function between initial and final position.)

1 < 2 => s(r(1)) < s(r(2)): r(k) is the initial position of the element, which lands at final position k after sorting. s(r(k)) is the value of the element. s(1..n) is the initial sequence s, s(r(1..n)) is the rordered and sorted final sequence.

You can invert the function r (it is an bijection) to get the final position of a certain element or create a function, which takes a value as argument.

Should be well parallelizable.

The size of S can be huge (millions or more).

How do you create this reordering from the one array?

The one array contains the sequence? And all other arrays have to be reordered the same way?

A = 1 5 4 3 2
B = 7 3 9 1 4 => 7 4 1 9 3

Or

A = 10 50 40 30 20
B = 7 3 9 1 4 => 7 4 1 9 3

Or (probably not)

A = 1 5 4 3 2
B = 7 3 9 1 4 => 7 4 1 9 3

Does the first array (continuously) have all integer numbers up to millions or are there gaps?

Probably this answer solves your problem (with initial keys 0…n-1), afterwards the sorted keys are the indices into the other arrays in order: