Removing a recurring integer from an array

Hi

I have an array of 100 million 4 byte integers. The integers themselves can range from 1 to N (where
N is some arbitrary integer). There are also zeroes in the array. I am trying to devise an algorithm
that will remove the zeroes from the array and pack the remaining integers into the list.
For example, let’s say I have this list of integers

0, 0, 5, 10, 11, 12, 0, 0, 0, 20, 30, 40

I want to remove the zeroes and pack it into

5, 10, 11, 12, 20, 30, 40

I tried to use the reduction sample but I’m not getting anywhere.

Can anyone suggest an algorithm or another example I should look at?

Thanks

Bob

You can pack the array this way:

  1. Start with array in memory:
    X = 0, 0, 5, 10, 11, 12, 0, 0, 0, 20, 30, 40
  2. Compute an integer array of same length as X with 0 if the corresponding entry in X is 0, 1 otherwise:
    P = 0, 0, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1
  3. Perform a parallel prefix sum on P (this is very similar to how a reduction is done):
    P = 0, 0, 1, 2, 3, 4, 4, 4, 4, 5, 6, 7
  4. Read in elements of X and P together. If X non-zero, write it to the output array at the offset indicated by the corresponding element in P.

If X has 100 million ints, then P is also 100 million ints, so that’s at least 800 MB of storage, not counting the packed output array.

This algorithm is implemented in the CUDPP library by the cudppCompact() function:

[url=“http://www.gpgpu.org/static/developer/cudpp/rel/cudpp_1.0a/html/group__public_interface.html#g275bcc2d5cb5b0277027140aae9c51bd”]http://www.gpgpu.org/static/developer/cudp...027140aae9c51bd[/url]