Getting thrust to work with streams.

I’m currently working on expanding and improving FLANN’s CUDA support.
I have a line such as this:

thrust::transform(thrust::system::cuda::par.on(stream), id, id + knn*queries.rows, id, map_indices(thrust::raw_pointer_cast(&((*gpu_helper_->gpu_vind_))[0])));

I’ve similarly tried this:
thrust::transform(thrust::cuda::par.on(stream), id, id + knn*queries.rows, id, map_indices(thrust::raw_pointer_cast(&((*gpu_helper_->gpu_vind_))[0])));

Neither call seems to launch or synchronize on the stream I pass in.
If I break my execution in thrust::system::detail::cuda::bulk_::detail::synchronize, and then examine my stack, I find that async_in_stream is called from:
(async.inl: 88)
template<typename ExecutionGroup, typename Closure>
__host__ __device__
future<void> async(ExecutionGroup g, Closure c)
{
  return bulk::detail::async_in_stream(g, c, 0, 0);
} // end async()

instead of:
(async.inl: 97)
template<typename ExecutionGroup, typename Closure>
__host__ __device__
future<void> async(async_launch<ExecutionGroup> launch, Closure c)
{
  return launch.is_stream_valid() ?
    bulk::detail::async_in_stream(launch.exec(), c, launch.stream(), launch.before_event()) :
    bulk::detail::async(launch.exec(), c, launch.before_event());
} // end async()

So everything is launched on the null stream and synchronized there as well.
The API is clearly meant to execute on the given stream, but that does not seem to be happening.
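
To make the difference between the two overloads quoted above concrete, here is a host-only sketch that mimics their shape with mock types. Everything prefixed with mock_ (plus noop and check_dispatch) is hypothetical and exists only for illustration; none of it is real Bulk or Thrust code. It only shows that a stream can reach the launch when the caller passes the launch wrapper, and that a bare execution group ends up on stream 0. It does not claim to show where Thrust drops the stream.

```cpp
#include <cassert>

// Hypothetical host-only stand-ins; none of these are real Bulk or Thrust types.
typedef int mock_stream;                      // 0 plays the role of the null stream
struct mock_group {};                         // stands in for an ExecutionGroup
struct noop { void operator()() const {} };   // trivial closure

// Stands in for async_launch<ExecutionGroup>: a group plus a stream.
template<typename ExecutionGroup>
struct mock_async_launch
{
  ExecutionGroup group;
  mock_stream    s;
  bool is_stream_valid() const { return s != 0; }
  mock_stream stream() const { return s; }
  ExecutionGroup exec() const { return group; }
};

// Runs the closure and reports which stream the "launch" used.
template<typename ExecutionGroup, typename Closure>
mock_stream mock_async_in_stream(ExecutionGroup, Closure c, mock_stream s)
{
  c();
  return s;
}

// Shaped like the overload at async.inl:88: no stream parameter, always 0.
template<typename ExecutionGroup, typename Closure>
mock_stream mock_async(ExecutionGroup g, Closure c)
{
  return mock_async_in_stream(g, c, 0);
}

// Shaped like the overload at async.inl:97: forwards the stream if it is valid.
// Simplification: the invalid-stream branch here just falls back to stream 0.
template<typename ExecutionGroup, typename Closure>
mock_stream mock_async(mock_async_launch<ExecutionGroup> launch, Closure c)
{
  return mock_async_in_stream(launch.exec(), c,
                              launch.is_stream_valid() ? launch.stream() : 0);
}

// Exercises both call shapes and checks which stream each one reports.
inline void check_dispatch()
{
  mock_async_launch<mock_group> launch = { mock_group(), 7 };
  assert(mock_async(mock_group(), noop()) == 0);  // bare group: null stream
  assert(mock_async(launch, noop()) == 7);        // wrapped group: stream 7
}
```

Calling check_dispatch() from main exercises both paths. In the mock, the only way stream 7 survives is through the second overload, which matches what the stack trace suggests: the call arrives through the first overload, so the stream has already been lost by then.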

When reporting an issue with thrust, it’s usually helpful if you mention which version you are using. There is a known issue with streams and the thrust version that ships with CUDA 7:

https://github.com/thrust/thrust/issues/664

You can try updating to the latest master branch to work around this issue.

Thanks. I was running CUDA 7.5 and still had the issue, but it looks like it might be fixed in the master branch.