Using thrust::inclusive_scan in a CUDA kernel


I’m trying to use thrust::inclusive_scan(thrust::device, xxx) in a CUDA kernel to perform a global scan on a device array.

Can I use this?

Does this answer your question?

Thank you.

Since the function is called from the host side, I’m wondering whether it can also be used inside a CUDA kernel.
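
For reference, the host-side call being discussed looks roughly like the sketch below. The helper name scan_on_device and the use of int elements are assumptions made for illustration; only thrust::inclusive_scan and thrust::device come from the question.

```cuda
#include <thrust/copy.h>
#include <thrust/device_vector.h>
#include <thrust/execution_policy.h>
#include <thrust/scan.h>
#include <vector>

// Hypothetical helper (not from the thread): scans `input` on the GPU
// and returns the result in host memory.
std::vector<int> scan_on_device(const std::vector<int>& input)
{
    // Copy the input into device memory.
    thrust::device_vector<int> d_data(input.begin(), input.end());

    // Host-side call: Thrust launches the kernels that perform the
    // parallel scan. The scan is done in place here.
    thrust::inclusive_scan(thrust::device,
                           d_data.begin(), d_data.end(),
                           d_data.begin());

    // Copy the scanned values back to the host.
    std::vector<int> result(input.size());
    thrust::copy(d_data.begin(), d_data.end(), result.begin());
    return result;
}
```

Calling scan_on_device({1, 2, 3, 4}) should give the running sums 1, 3, 6, 10. The call is issued by the CPU, so nothing here runs inside a user-written kernel.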

With current CUDA versions, it is no longer possible to launch a parallelized Thrust scan from device code.
If you need this functionality, you can use CUB’s decoupled look-back API to write your own device-wide scan.
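
As a starting point, here is a minimal sketch of a scan that does run cooperatively inside a kernel, using cub::BlockScan. It is not the decoupled look-back scheme itself: it covers a single thread block only, and a device-wide scan would still have to combine the per-block results. The kernel name block_inclusive_sum, the helper scan_one_block, the int element type, and the block size of 256 are all assumptions for illustration.

```cuda
#include <cub/block/block_scan.cuh>
#include <cuda_runtime.h>
#include <thrust/copy.h>
#include <thrust/device_vector.h>
#include <vector>

// Hypothetical kernel: inclusive prefix sum over at most BLOCK_THREADS
// elements, computed cooperatively by the threads of one block.
template <int BLOCK_THREADS>
__global__ void block_inclusive_sum(const int* in, int* out, int n)
{
    using BlockScan = cub::BlockScan<int, BLOCK_THREADS>;
    __shared__ typename BlockScan::TempStorage temp_storage;

    const int idx = threadIdx.x;
    // Threads past the end contribute 0 so the sum is unaffected.
    const int value = (idx < n) ? in[idx] : 0;

    int result = 0;
    // Every thread of the block must take part in this call.
    BlockScan(temp_storage).InclusiveSum(value, result);

    if (idx < n) {
        out[idx] = result;
    }
}

// Hypothetical host helper; assumes input.size() <= 256.
std::vector<int> scan_one_block(const std::vector<int>& input)
{
    constexpr int kThreads = 256;
    const int n = static_cast<int>(input.size());

    thrust::device_vector<int> d_in(input.begin(), input.end());
    thrust::device_vector<int> d_out(n);

    block_inclusive_sum<kThreads><<<1, kThreads>>>(
        thrust::raw_pointer_cast(d_in.data()),
        thrust::raw_pointer_cast(d_out.data()), n);
    cudaDeviceSynchronize();

    std::vector<int> result(n);
    thrust::copy(d_out.begin(), d_out.end(), result.begin());
    return result;
}
```

Calling scan_one_block({1, 2, 3, 4}) should give 1, 3, 6, 10. For arrays larger than one block, each block would scan its own tile this way and then needs the sum of all preceding tiles, which is the part the look-back approach handles.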