I am a CUDA beginner with a question about rearranging a one-dimensional array. I would appreciate any suggestions!

Suppose an array A = {a1, a2, a3, a4, …, aN}, indexed from 1 to N, is stored in global memory. Some of its elements are zero, and the rest are positive numbers.

The question is: how can I write an efficient kernel that picks out only the positive elements, re-indexes them from 1 to N' (where N' is the number of positive elements), and stores them consecutively in another array B?
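To make the desired result concrete, here is a serial CPU reference for the output I am after. It is only a sketch of the expected behavior, not the kernel itself. The function name compact_positive is made up for illustration, and the float element type is an assumption (any numeric type would do). Note that C++ indexes from 0, while the description above counts from 1.

```cpp
#include <cstddef>
#include <vector>

// Serial CPU reference only; this is NOT the CUDA kernel being asked about.
// Assumption: elements are float (the element type is not fixed above).
// Returns B, holding the positive elements of A in their original order.
std::vector<float> compact_positive(const std::vector<float>& A) {
    std::vector<float> B;
    B.reserve(A.size());              // N' can never exceed N
    for (std::size_t i = 0; i < A.size(); ++i) {
        if (A[i] > 0.0f) {            // keep positive entries, skip zeros
            B.push_back(A[i]);        // next free slot in B
        }
    }
    return B;                         // B.size() is N'
}
```

For example, A = {0, 3, 0, 5, 7, 0} should give B = {3, 5, 7}, so N' = 3. The kernel I am looking for should produce the same B on the GPU, ideally keeping the original order.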

Many thanks.