How to use streams with NPP APIs in CUDA

Hi there!
I’m very new to CUDA and have been trying out some code lately. I am stuck on using streams in my code. I am applying an NPP box filter function to filter an image, and I also want to apply some other NPP functions in the same program using streams. I have used streams like this:

int nstreams = 2;
cudaStream_t *streams = (cudaStream_t *)malloc(nstreams * sizeof(cudaStream_t));
for (int i = 0; i < nstreams; i++)
    cudaStreamCreate(&streams[i]);

NppStreamContext nppStreamCtx1, nppStreamCtx2;


and then using the NPP contexts in two functions like:


The program runs fine and produces results, but I’m not sure the streams are actually running concurrently, because the run time barely changes.
Is this the correct way to use streams? And how can I tell whether streams are working in my program?

Thank you in advance!!