Hi,
I have a large 1D array a containing float/integer values. I have another array b containing some indices of array a. I want to find the maximum value in array a for each slice defined by array b. How can I do that in Numba CUDA?
For example,
a=[1,2,1,2,3,4,1,2,3,4,5,1,2,3]
b=[0,2,6,11]
This is a segmented reduction. As usual, you may wish to consider a library implementation. If you wish to “roll your own”, here is an example. That example is not “black-box ready to go” for your use case, but it generally depicts one possible method. In that example, the block sizes are fixed. In your example, the block sizes (the length of each segment) vary. So to adapt it to your case, each block would load its starting point and ending point from your b array, rather than computing them from a fixed size. Also, that example applies a transform to each element first. Your example doesn’t need that.
The block-per-segment approach is reasonably flexible (when using a block-stride loop, as indicated there), but it probably loses some efficiency when the segments are extremely small, or when the segments are extremely large and/or few in number. Again, a library approach is probably a good choice; otherwise the best “roll your own” algorithm may depend on specifics like the number of elements in your b array and the distance between consecutive entries.
No, the example won’t work for your case without modification. That’s why I said:
A library example would be Thrust’s reduce_by_key, and yes, it will work for varying segment lengths. However, it expects C++, and it would probably still require some “manipulation” of your b array in order to generate a proper key array. I don’t generally have immediate library recommendations for Numba CUDA, but you can use CUDA Python to interface. No, I don’t have a recipe for you.
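As an illustration of the “manipulation” of b mentioned above, here is a small CPU-only Python sketch. It is not a Thrust binding. It assumes b holds segment start offsets and that the last segment runs to the end of a; `keys_from_starts` and `reduce_by_key_max` are hypothetical helper names. The second function only mimics the reduce-by-key behavior of combining runs of consecutive equal keys.

```python
from itertools import groupby

def keys_from_starts(starts, n):
    """Expand segment start offsets into one key per element of a."""
    bounds = list(starts) + [n]   # assumption: last segment ends at n
    keys = []
    for seg, (lo, hi) in enumerate(zip(bounds[:-1], bounds[1:])):
        keys.extend([seg] * (hi - lo))
    return keys

def reduce_by_key_max(keys, values):
    """CPU stand-in: one max per run of consecutive equal keys."""
    out_keys, out_vals = [], []
    for k, run in groupby(zip(keys, values), key=lambda kv: kv[0]):
        out_keys.append(k)
        out_vals.append(max(v for _, v in run))
    return out_keys, out_vals

a = [1, 2, 1, 2, 3, 4, 1, 2, 3, 4, 5, 1, 2, 3]
b = [0, 2, 6, 11]
keys = keys_from_starts(b, len(a))
seg_ids, seg_max = reduce_by_key_max(keys, a)
```

The key array has the same length as a, so for very large inputs it costs extra memory compared with passing the offsets directly.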