CUDA 11.1 Update 1
cuDNN 8.0.5
RTX 3090
Windows 10
I use a non-blocking stream that is bound to my cuDNN handles. When a series of ops is enqueued with a single synchronize at the end:
cudnnConvolutionForward
cudnnConvolutionBackwardData
cudnnConvolutionBackwardFilter
…
cudnnConvolutionForward
cudnnConvolutionBackwardData
cudnnConvolutionBackwardFilter
cudaStreamSynchronize(stream)
cudnnConvolutionForward sometimes returns CUDNN_STATUS_EXECUTION_FAILED.
But if I call cudaStreamSynchronize(stream) after each API call, it works fine, e.g.:
cudnnConvolutionForward
cudaStreamSynchronize(stream)
cudnnConvolutionBackwardData
cudaStreamSynchronize(stream)
cudnnConvolutionBackwardFilter
cudaStreamSynchronize(stream)
…
cudnnConvolutionForward
cudaStreamSynchronize(stream)
cudnnConvolutionBackwardData
cudaStreamSynchronize(stream)
cudnnConvolutionBackwardFilter
cudaStreamSynchronize(stream)
Everything works fine on a Tesla V100 with this setup:
CUDA 10.0
cuDNN 7.6.5
Windows 10