See related/original question here: https://stackoverflow.com/questions/57927742/nvidia-npp-on-cuda-streams-that-use-cudastreamnonblocking
Is there an official stance on how NPP functions interact with streams that are initialized to not synchronize with stream 0 (i.e., streams created with the cudaStreamNonBlocking flag)?
We discovered surprising behaviour when trying to use the new _Ctx functions, and we also got problematic behaviour when using the old nppSetStream with a cudaStreamNonBlocking stream.
My current suspicion is that cudaStreamNonBlocking streams are effectively incompatible with NPP, but some official documentation on this would be appreciated.
Thank you!