Compress Sparse Binary Vector, Get Indices of Non-Zero Elements

UserHCC · July 9, 2020, 10:44pm

I have a large sparse binary vector. It has between one million and ten million elements. Approximately half the elements are one and half are zero. I would like to create a new smaller vector where each element is an index into the non-zero elements of the sparse vector. For example:

Input vector: [0, 1, 1, 0, 1, 0, 0, 0, 1, 0]
Output vector: [1, 2, 4, 8]

This might be a common problem with SIMD programming. The binary vector corresponds to the result of some if-conditional applied to a vector of data and I need the indices so I can easily process those in the next step. How can I do this efficiently in CUDA? Is there a common name for this?

njuffa · July 10, 2020, 3:22am

If you can manage to make the value of the non-zero elements identical to their position, you would have a standard stream compaction problem. GPUs have been successfully applied to that.

[0, 1, 1, 0, 1, 0, 0, 0, 1, 0] → [0, 1, 2, 0, 4, 0, 0, 0, 8, 0] → [1, 2, 4, 8]

Your use case strikes me as a potentially common scenario, so there may already be code out there that does exactly what you need. I would start by looking what thrust has available in this regard.

UserHCC · July 10, 2020, 8:57pm

Yes, I can do that. I can make the non-zero elements identical to their position. Do you think someone has this already programmed in CUDA? (I will check thrust)

njuffa · July 10, 2020, 9:37pm

I have never used stream compaction on GPUs myself, but I have seen multiple publications in the past ten years that discussed GPU implementations. If I recall correctly, this has come up a few times on these forums and on Stackoverflow as well, so you may want to just do a Google search to see what is already out there.

I mentioned the Thrust library because it is focused on array processing and makes for easy programming as I have heard from multiple people who had no prior experience programming GPUs.

Topic		Replies	Views
How to put specific elements from one array to another array use CUDA? CUDA Programming and Performance cuda	6	1302	October 30, 2022
how to parallel this simple algorithm? someone has met this before? CUDA Programming and Performance	2	7275	December 3, 2010
Inverse of cusparse<t>gthr() ? CUDA Programming and Performance	2	719	May 13, 2015
Parallel Search Algorithms Parallel search algorithm for determining all non-zero array elements CUDA Programming and Performance	2	3498	November 14, 2011
All non-zero element indexes CUDA Programming and Performance	1	1900	December 28, 2020
simple question on scan example from Data-parallel Algorithms... CUDA Programming and Performance	1	1255	August 5, 2008
Hello and help with cusparse cusparse CUDA Programming and Performance	2	837	January 20, 2012
How to push back thread index which pass a condition in cuda kernel like numpy's “Where” op? CUDA Programming and Performance	2	415	April 19, 2021
Dynamic memory allocation problem CUDA Programming and Performance	3	918	April 15, 2013
Vector Reduction CUDA Programming and Performance	3	19663	March 9, 2011

Compress Sparse Binary Vector, Get Indices of Non-Zero Elements

Related topics