About asynchronous execution of cusparseDgtsv2StridedBatch()

I want to call cusparseDgtsv2StridedBatch three times to solve batches of tridiagonal systems, and I found this in the manual: “The routine supports asynchronous execution”. How should I execute the following calls asynchronously? Thank you so much!

istat = cusparseDgtsv2StridedBatch(handle,m_EX_1,A_EX_1,B_EX_1,C_EX_1,D_EX_1,batchcount_EX_1,batchstride_EX_1,buffer_EX_1)
istat = cusparseDgtsv2StridedBatch(handle,m_EY_1,A_EY_1,B_EY_1,C_EY_1,D_EY_1,batchcount_EY_1,batchstride_EY_1,buffer_EY_1)
istat = cusparseDgtsv2StridedBatch(handle,m_EZ_1,A_EZ_1,B_EZ_1,C_EZ_1,D_EZ_1,batchcount_EZ_1,batchstride_EZ_1,buffer_EZ_1)
!$acc wait

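The execution may already be asynchronous with respect to the host, in the same way that a CUDA kernel launch is asynchronous: each call returns once the work has been queued. Calls issued on the same stream still run one after another on the GPU, though. For additional control, check the cuSPARSE documentation on stream usage; cusparseSetStream lets you choose the stream that a handle's subsequent calls are issued on.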
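As a rough illustration of the stream approach, here is a minimal sketch, not a tested implementation. It assumes the NVIDIA HPC SDK compilers (nvfortran with the cudafor, cusparse and openacc modules), that the arrays are CUDA Fortran device arrays, and that the work buffer is declared as a character(1) device array. The module name gtsv_async, the helper gtsv_on_queue and the queue numbers 1 to 3 are hypothetical names chosen for the example. If the data is managed with OpenACC data directives instead, the calls would need to sit inside an !$acc host_data use_device(...) region.

```fortran
module gtsv_async
  use cudafor
  use cusparse
  use openacc
  implicit none
contains
  ! Hypothetical helper: enqueue one strided-batch tridiagonal solve on the
  ! CUDA stream that backs OpenACC async queue `queue`, then return to the
  ! host without waiting for the GPU to finish.
  integer function gtsv_on_queue(handle, queue, m, dl, dd, du, x, &
                                 batchcount, batchstride, buf) result(istat)
    type(cusparseHandle) :: handle
    integer :: queue, m, batchcount, batchstride
    real(8), device :: dl(*), dd(*), du(*), x(*)
    character(1), device :: buf(*)   ! assumed declaration of the work buffer

    ! Bind the handle to the stream behind the chosen OpenACC queue.
    ! All later calls on this handle use that stream until it is changed.
    istat = cusparseSetStream(handle, acc_get_cuda_stream(queue))
    if (istat /= CUSPARSE_STATUS_SUCCESS) return

    ! Enqueue the solve; x is overwritten with the solution on the device.
    istat = cusparseDgtsv2StridedBatch(handle, m, dl, dd, du, x, &
                                       batchcount, batchstride, buf)
  end function gtsv_on_queue
end module gtsv_async

! Possible use with the variables from the question (one queue per solve):
!   istat = gtsv_on_queue(handle, 1, m_EX_1, A_EX_1, B_EX_1, C_EX_1, D_EX_1, &
!                         batchcount_EX_1, batchstride_EX_1, buffer_EX_1)
!   istat = gtsv_on_queue(handle, 2, m_EY_1, A_EY_1, B_EY_1, C_EY_1, D_EY_1, &
!                         batchcount_EY_1, batchstride_EY_1, buffer_EY_1)
!   istat = gtsv_on_queue(handle, 3, m_EZ_1, A_EZ_1, B_EZ_1, C_EZ_1, D_EZ_1, &
!                         batchcount_EZ_1, batchstride_EZ_1, buffer_EZ_1)
!   !$acc wait   ! blocks until all three queues have drained
```

Using the streams behind OpenACC queues means the existing !$acc wait still covers the library calls. Each solve keeps its own work buffer, which matters once the solves may run at the same time. Whether the three solves actually overlap on the GPU depends on how much of the device each batch occupies, so it is worth checking with a profiler.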