Is the NPP stream variable kept in thread-local storage now?

I have 2 GPUs in my computer and I would like to dispatch work to them from 2 host threads. I create 2 GPU streams, one for each GPU. From host thread #i, I call nppSetStream with stream #i and then send jobs (i.e. call NPP API functions) to GPU #i. Is this safe? Please note that I am NOT using any of the NPP functions that are marked as thread-unsafe.

I read in an old forum thread (Can NPP be safely used in multi-threaded code? - GPU-Accelerated Libraries - NVIDIA Developer Forums) that starting from version 6.5, NPP should be keeping the current stream ID in thread-local storage, which should make it OK for me to do what I do. However, the latest online NPP documentation says:

“NPP is a stateless API, as of NPP 6.5 the ONLY state that NPP remembers between function calls is the current stream ID, i.e. the stream ID that was set in the most recent nppSetStream call and a few bits of device specific information about that stream.”

There is no mention of the stream ID being kept in thread-local storage. Could someone shed some light on this?
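
To make the concern concrete, here is a minimal sketch in plain C++ of why it matters where the "current stream" lives. It makes no NPP or CUDA calls: `g_stream`, `t_stream`, `run_worker`, `run_two_workers` and the integer stream IDs 1 and 2 are hypothetical stand-ins, and this is not a claim about how NPP is actually implemented.

```cpp
#include <atomic>
#include <thread>

// Hypothetical stand-ins for a library's "current stream" state (not NPP code).
static std::atomic<int> g_stream{0};   // process-wide: one value shared by all threads
static thread_local int t_stream = 0;  // thread-local: one independent copy per thread

struct Observed {
    int global_seen;  // what a process-wide "current stream" would report
    int tls_seen;     // what a thread-local "current stream" would report
};

// Each worker sets "its" stream, waits until both workers have done the same,
// then reads back which stream would be considered current.
Observed run_worker(int my_stream, std::atomic<int>& arrived) {
    g_stream.store(my_stream);
    t_stream = my_stream;
    arrived.fetch_add(1);
    while (arrived.load() < 2) {
        std::this_thread::yield();
    }
    return Observed{g_stream.load(), t_stream};
}

// Runs two workers with stream IDs 1 and 2 and records what each one observed.
void run_two_workers(Observed& a, Observed& b) {
    std::atomic<int> arrived{0};
    std::thread ta([&] { a = run_worker(1, arrived); });
    std::thread tb([&] { b = run_worker(2, arrived); });
    ta.join();
    tb.join();
}
```

With the thread-local variable, each worker reads back the ID it set. With the process-wide variable, both workers read back the same ID (whichever was stored last), so one of them would be issuing work on the other thread's stream. That is the situation I want to rule out for nppSetStream.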

Can you try setting a different device ID for the NPP functions?