CUB version of thrust::inclusive_scan_by_key

Hello all,

Does CUDA UnBound (CUB) provide an equivalent of thrust::inclusive_scan_by_key? That is, can I write a function using CUB call(s) that yields the same result as thrust::inclusive_scan_by_key? I would like to compare the performance of Thrust and CUB for this particular operation.

Thank you in advance for any pointer(s).

So I take it that no such operation exists within the CUB framework? I have found cub::DeviceSegmentedReduce::Sum, but I don't think it does the same thing as thrust::inclusive_scan_by_key: as far as I can tell, it produces one sum per segment rather than a running sum for every element.
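
To be concrete about the result I am after, here is a minimal CPU sketch of the semantics as I understand them. It assumes the default behaviour of thrust::inclusive_scan_by_key (keys compared with ==, values combined with +). The function names inclusive_scan_by_key_ref and segmented_sum_ref are my own, for illustration only, and are not part of Thrust or CUB. The second function shows the per-segment totals that I believe a segmented reduction would give instead.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference (not a Thrust or CUB API).
// Running sum that restarts whenever the key differs from the previous key,
// i.e. one output per input element.
std::vector<int> inclusive_scan_by_key_ref(const std::vector<int>& keys,
                                           const std::vector<int>& vals) {
    assert(keys.size() == vals.size());
    std::vector<int> out(vals.size());
    for (std::size_t i = 0; i < vals.size(); ++i) {
        const bool same_segment = (i > 0) && (keys[i] == keys[i - 1]);
        out[i] = same_segment ? out[i - 1] + vals[i] : vals[i];
    }
    return out;
}

// Hypothetical CPU reference (not a Thrust or CUB API).
// One total per run of consecutive equal keys, i.e. one output per segment.
std::vector<int> segmented_sum_ref(const std::vector<int>& keys,
                                   const std::vector<int>& vals) {
    assert(keys.size() == vals.size());
    std::vector<int> out;
    for (std::size_t i = 0; i < vals.size(); ++i) {
        if (i == 0 || keys[i] != keys[i - 1]) {
            out.push_back(vals[i]);   // a new segment starts here
        } else {
            out.back() += vals[i];    // accumulate into the current segment
        }
    }
    return out;
}
```

For keys {0, 0, 0, 1, 1, 2, 3, 3, 3, 3} and values that are all 1, the first function gives {1, 2, 3, 1, 2, 1, 1, 2, 3, 4}, while the second gives {3, 2, 1, 4}. It is the first, per-element result that I would like to reproduce with CUB.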