For example, consider the following code with two kernels, addBoarder and linaInput. dMin is the output of the first kernel, addBoarder, and is then used as the input to the second kernel, linaInput.
addBoarder<<<blocksPerGrid,threadsPerBlock,0, stream[0]>>>(dIn,dMin,nWidth,nHeight,nWidth,nHeight,sinTheta,cosTheta,tx,ty,stepIn,stepOut,scale,channel);
linaInput<<<blocksPerGrid,threadsPerBlock,0, stream[1]>>>(dMin,dOut,nWidth,nHeight,nNewWidth,nNewHeight,sinTheta,cosTheta,tx,ty,stepIn,stepOut,scale,channel);
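Kernels launched into two different non-default streams have no implicit ordering between them. If the goal is for linaInput to start only after addBoarder has finished writing dMin, one common way to express that is to record an event in the first stream and make the second stream wait on it. The following is only a minimal sketch of that idea: `producer` and `consumer` are hypothetical stand-ins for addBoarder and linaInput (whose bodies are not shown here), and the buffer size `n` and the event name `midReady` are assumptions made for illustration.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical stand-in for addBoarder: writes the intermediate buffer.
__global__ void producer(const float* in, float* mid, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) mid[i] = in[i] + 1.0f;
}

// Hypothetical stand-in for linaInput: reads the intermediate buffer.
__global__ void consumer(const float* mid, float* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = mid[i] * 2.0f;
}

int main() {
    const int n = 1 << 20;  // assumed element count, for illustration only
    float *dIn, *dMin, *dOut;
    cudaMalloc(&dIn,  n * sizeof(float));
    cudaMalloc(&dMin, n * sizeof(float));
    cudaMalloc(&dOut, n * sizeof(float));
    cudaMemset(dIn, 0, n * sizeof(float));

    cudaStream_t stream[2];
    cudaStreamCreate(&stream[0]);
    cudaStreamCreate(&stream[1]);

    // Event used only for ordering, so timing is disabled.
    cudaEvent_t midReady;
    cudaEventCreateWithFlags(&midReady, cudaEventDisableTiming);

    dim3 threadsPerBlock(256);
    dim3 blocksPerGrid((n + threadsPerBlock.x - 1) / threadsPerBlock.x);

    // First kernel writes dMin in stream[0].
    producer<<<blocksPerGrid, threadsPerBlock, 0, stream[0]>>>(dIn, dMin, n);
    // Mark the point in stream[0] at which dMin is complete.
    cudaEventRecord(midReady, stream[0]);
    // Work submitted to stream[1] after this call waits for that point.
    cudaStreamWaitEvent(stream[1], midReady, 0);
    // Second kernel reads dMin in stream[1].
    consumer<<<blocksPerGrid, threadsPerBlock, 0, stream[1]>>>(dMin, dOut, n);

    // Block the host until the second kernel has finished.
    cudaStreamSynchronize(stream[1]);

    float first = 0.0f;
    cudaMemcpy(&first, dOut, sizeof(float), cudaMemcpyDeviceToHost);
    printf("out[0] = %f\n", first);

    cudaEventDestroy(midReady);
    cudaStreamDestroy(stream[0]);
    cudaStreamDestroy(stream[1]);
    cudaFree(dIn);
    cudaFree(dMin);
    cudaFree(dOut);
    return 0;
}
```

The wait happens on the device, so the host thread is not blocked between the two launches. If the two kernels have nothing else to overlap with, launching both into the same stream gives the same ordering with less code.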