Multimap reduction

Hi, due to lack of basics of parallel algorithms, I’m having a hell of time thinking how to do reduction on a multimap External Image

I have a multimap like this in a global memory:
[font=“Courier New”]
Key: |1|2|2|2|2|2|3|4|5|
Val: |a|a|a|a|a|a|a|a|a|[/font]

How to reduce this multimap into this:

[font=“Courier New”]Key: |1|2 |3|4|5|
Val: |a|5a|a|a|a|[/font]

Any input will be appreciated. Thanks!

This is the last stage in a bucket sort algorithm recently discussed: [url=“http://forums.nvidia.com/index.php?showtopic=89164”]The Official NVIDIA Forums | NVIDIA