See related/original question here: https://stackoverflow.com/questions/57927742/nvidia-npp-on-cuda-streams-that-use-cudastreamnonblocking
Is there an official stance on how NPP functions interact with streams that are initialized to not synchronize with stream 0 (i.e., they use the cudaStreamNonBlocking flag)?
We discovered surprising behaviour when trying to use the new _Ctx functions, but we also got problematic behaviour when using the old nppSetStream with a cudaStreamNonBlocking stream.
My current suspicion is that cudaStreamNonBlocking streams are effectively incompatible with NPP, but some kind of official documentation would be appreciated.