How do I use CUDA streams for matrix inversion?

I want to use CUDA streams to invert matrices concurrently, since the matrices are small and have different sizes.
The approach in the Stack Overflow question "cuda - cublas matrix inversion from device" is to use cublasSgetrfBatched and cublasSgetriBatched to do the inversion on the device. I am confused about how to combine this with streams.
Do I need to call cublasSetStream(handle, streams[i]) before cublasSgetrfBatched and cublasSgetriBatched, or should I just pass the stream in the kernel launch, like inv_kernel<<<1, 1, 0, streams[i]>>>(a_d, c_d, n)? Thanks so much if you can help.
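
To make the first option concrete, here is a minimal sketch in which the batched routines are called from the host, with cublasSetStream called before each pair of calls. cublasSetStream binds later cuBLAS calls on that handle to the given stream. The sketch assumes one matrix per stream with batchSize = 1. The sizes, the diagonal test matrices and the helper name invert_in_streams are made up for illustration, and most error checking is left out for brevity.

```cuda
#include <cmath>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>
#include <cublas_v2.h>

// Hypothetical helper: inverts three small diagonal test matrices, one per
// stream. Returns 0 if every check passes, non-zero otherwise.
int invert_in_streams() {
    const int kCount = 3;
    const int sizes[kCount] = {2, 3, 4};      // assumed sizes, one matrix per stream

    cublasHandle_t handle;
    if (cublasCreate(&handle) != CUBLAS_STATUS_SUCCESS) return -1;

    cudaStream_t streams[kCount];
    float *dA[kCount], *dC[kCount];           // input and output matrices (device)
    float **dAptr[kCount], **dCptr[kCount];   // device arrays holding one pointer each
    int *dPiv[kCount], *dInfo[kCount];        // pivots and status flags (device)

    // Setup pass: allocate and upload everything before any stream work starts.
    for (int i = 0; i < kCount; ++i) {
        const int n = sizes[i];
        std::vector<float> hA(n * n, 0.0f);
        for (int k = 0; k < n; ++k) hA[k * n + k] = float(k + 2);  // A = diag(2, 3, ...)

        cudaStreamCreate(&streams[i]);
        cudaMalloc(&dA[i], n * n * sizeof(float));
        cudaMalloc(&dC[i], n * n * sizeof(float));
        cudaMalloc(&dAptr[i], sizeof(float *));
        cudaMalloc(&dCptr[i], sizeof(float *));
        cudaMalloc(&dPiv[i], n * sizeof(int));
        cudaMalloc(&dInfo[i], sizeof(int));
        cudaMemcpy(dA[i], hA.data(), n * n * sizeof(float), cudaMemcpyHostToDevice);
        cudaMemcpy(dAptr[i], &dA[i], sizeof(float *), cudaMemcpyHostToDevice);
        cudaMemcpy(dCptr[i], &dC[i], sizeof(float *), cudaMemcpyHostToDevice);
    }

    // Launch pass: select the stream on the handle, then issue the cuBLAS calls.
    int failures = 0;
    for (int i = 0; i < kCount; ++i) {
        const int n = sizes[i];
        cublasSetStream(handle, streams[i]);
        // LU factorization in place, then out-of-place inversion into dC[i].
        if (cublasSgetrfBatched(handle, n, dAptr[i], n, dPiv[i], dInfo[i], 1)
                != CUBLAS_STATUS_SUCCESS) ++failures;
        if (cublasSgetriBatched(handle, n, dAptr[i], n, dPiv[i], dCptr[i], n,
                                dInfo[i], 1) != CUBLAS_STATUS_SUCCESS) ++failures;
    }
    cudaDeviceSynchronize();                  // wait for all streams to finish

    // Check pass: inv(diag(2, 3, ...)) should be diag(1/2, 1/3, ...).
    for (int i = 0; i < kCount; ++i) {
        const int n = sizes[i];
        std::vector<float> hC(n * n);
        int info = -1;
        cudaMemcpy(hC.data(), dC[i], n * n * sizeof(float), cudaMemcpyDeviceToHost);
        cudaMemcpy(&info, dInfo[i], sizeof(int), cudaMemcpyDeviceToHost);
        if (info != 0) ++failures;            // status written by the getri call
        for (int k = 0; k < n; ++k)
            if (std::fabs(hC[k * n + k] - 1.0f / float(k + 2)) > 1e-5f) ++failures;

        cudaFree(dA[i]); cudaFree(dC[i]); cudaFree(dAptr[i]);
        cudaFree(dCptr[i]); cudaFree(dPiv[i]); cudaFree(dInfo[i]);
        cudaStreamDestroy(streams[i]);
    }
    cublasDestroy(handle);
    return failures;
}

int main() {
    const int failures = invert_in_streams();
    std::printf("failures: %d\n", failures);
    return failures == 0 ? 0 : 1;
}
```

This should build with something like nvcc file.cu -lcublas. The <<<..., streams[i]>>> launch syntax only sets the stream for a kernel I launch myself, so it does not apply to these host-side cuBLAS calls. Whether one stream per matrix is faster than grouping same-sized matrices into a single batched call is not something this sketch measures.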