CUDA non-default stream synchronization

I believe striker159 has addressed your item a. For your item b, my suggestion would be to check the concurrentManagedAccess device property before trying to access a managed allocation from host code after one or more kernel launches without an intervening cudaDeviceSynchronize. If the concurrentManagedAccess property is false, then what you are trying to do is expected to seg fault. For discrete GPUs, the property would typically be true on Linux with a Pascal or newer GPU, but false on Maxwell or on Windows. Jetson devices have a somewhat different footprint, I believe.
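Here is a minimal sketch of that check, in case it helps. The kernel `increment`, the helper `concurrentManagedAccessSupported`, the buffer size, and the use of device 0 are all my own placeholders, not anything from your code. It queries the property via cudaDeviceGetAttribute with cudaDevAttrConcurrentManagedAccess and, when the property is false, falls back to cudaDeviceSynchronize before the host touches the managed allocation:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: adds 1 to each element of a managed buffer.
__global__ void increment(int *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] += 1;
}

// Hypothetical helper: true if the device reports concurrentManagedAccess.
// Returns false if the query itself fails.
bool concurrentManagedAccessSupported(int device) {
    int value = 0;
    cudaError_t err = cudaDeviceGetAttribute(
        &value, cudaDevAttrConcurrentManagedAccess, device);
    return err == cudaSuccess && value != 0;
}

int main() {
    const int n = 256;
    const int device = 0;  // assumption: the first device is the one in use
    int *data = nullptr;

    if (cudaMallocManaged(&data, n * sizeof(int)) != cudaSuccess) {
        fprintf(stderr, "cudaMallocManaged failed\n");
        return 1;
    }
    // No kernel has been launched yet, so host access is fine here.
    for (int i = 0; i < n; ++i) data[i] = 0;

    cudaStream_t stream;
    cudaStreamCreate(&stream);
    increment<<<(n + 127) / 128, 128, 0, stream>>>(data, n);

    if (concurrentManagedAccessSupported(device)) {
        // Host access while the GPU is busy is permitted, but we still
        // wait on the stream so the kernel's result is visible.
        cudaStreamSynchronize(stream);
    } else {
        // Property is false: be conservative and synchronize the whole
        // device before any host access to the managed allocation.
        cudaDeviceSynchronize();
    }

    printf("data[0] = %d\n", data[0]);

    cudaStreamDestroy(stream);
    cudaFree(data);
    return 0;
}
```

Error checking on the launch and the synchronization calls is left out to keep this short; you would want it in real code.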
