Using thrust::cuda::par with thrust::cuda::par.on

I have been tasked with making a very old project that relies heavily on Thrust as non-blocking as possible, so I am adding streams left and right. At some point, however, I came across the following call, which uses its own execution policy (built from an allocator) to restrict it to a specific memory region:

thrust::transform_inclusive_scan( thrust::cuda::par(Allocator), input.begin(), input.end(), output.begin(), scanStencil(), thrust::plus<int>());

Is there a way to combine thrust::cuda::par.on(myFooStream) with thrust::cuda::par(Allocator) in a simple manner without writing my own execution policy backend?

(This is a duplicate of

Does this work?
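For reference, here is a minimal sketch of how the two policies might be chained. It assumes that the policy returned by thrust::cuda::par(alloc) also exposes .on(stream), which I believe is the case in Thrust's CUDA backend. SimpleAllocator and Square are hypothetical stand-ins for the Allocator and scanStencil from the question.

```cuda
#include <thrust/device_vector.h>
#include <thrust/host_vector.h>
#include <thrust/functional.h>
#include <thrust/transform_scan.h>
#include <thrust/system/cuda/execution_policy.h>
#include <cuda_runtime.h>
#include <cstddef>
#include <cstdio>
#include <new>

// Hypothetical allocator for Thrust's temporary storage (stand-in for
// the question's Allocator). Note that cudaFree itself synchronizes.
struct SimpleAllocator {
  typedef char value_type;

  char* allocate(std::ptrdiff_t num_bytes) {
    void* p = nullptr;
    if (cudaMalloc(&p, num_bytes) != cudaSuccess) throw std::bad_alloc();
    return static_cast<char*>(p);
  }

  void deallocate(char* p, std::size_t) { cudaFree(p); }
};

// Hypothetical unary functor (stand-in for the question's scanStencil).
struct Square {
  __host__ __device__ int operator()(int x) const { return x * x; }
};

int main() {
  cudaStream_t myFooStream;
  cudaStreamCreate(&myFooStream);

  thrust::device_vector<int> input(4, 2);
  thrust::device_vector<int> output(4);
  SimpleAllocator alloc;

  // Attach the allocator first, then bind the resulting policy to the stream.
  thrust::transform_inclusive_scan(thrust::cuda::par(alloc).on(myFooStream),
                                   input.begin(), input.end(), output.begin(),
                                   Square(), thrust::plus<int>());

  cudaStreamSynchronize(myFooStream);

  thrust::host_vector<int> result = output;  // copy back to inspect
  for (std::size_t i = 0; i < result.size(); ++i) std::printf("%d ", result[i]);
  std::printf("\n");

  cudaStreamDestroy(myFooStream);
  return 0;
}
```

The allocator is passed by reference, so it has to outlive the algorithm call.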


Note, however, that in the latest Thrust version, Thrust calls still block the host even when streams are used. To get Thrust calls that do not block the host, you need to use the new asynchronous API.
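As a rough illustration, and assuming "the new asynchronous API" means the thrust::async namespace (available in the Thrust versions that ship it, with C++14 or later), a non-blocking reduction might look like the sketch below. The vector contents are made up for the example.

```cuda
#include <thrust/async/reduce.h>
#include <thrust/device_vector.h>
#include <thrust/execution_policy.h>
#include <thrust/functional.h>
#include <cstdio>

int main() {
  // Hypothetical input data, only for illustration.
  thrust::device_vector<int> data(1000, 1);

  // The call returns a future right away instead of blocking the host.
  auto future_sum = thrust::async::reduce(thrust::device,
                                          data.begin(), data.end(),
                                          0, thrust::plus<int>());

  // ... other host work could go here while the reduction runs ...

  // get() waits for the result only at the point where it is needed.
  int sum = future_sum.get();
  std::printf("%d\n", sum);
  return 0;
}
```

Whether this fits depends on the Thrust version in use, so check that the thrust/async headers are available before relying on it.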

Greetings striker159,
It works perfectly, thank you very much. I also appreciate your warning regarding the API.

I’ve found nvcc’s --default-stream per-thread option to be a very helpful feature from NVIDIA for exactly the problem you’re describing.