CUDNN_STATUS_EXECUTION_FAILED

CUDA 11.1 Update 1
cuDNN 8.0.5
RTX 3090
Windows 10
I use a non-blocking stream that is bound to my cuDNN handles. When several operations are issued back to back, for example:
cudnnConvolutionForward
cudnnConvolutionBackwardData
cudnnConvolutionBackwardFilter

cudnnConvolutionForward
cudnnConvolutionBackwardData
cudnnConvolutionBackwardFilter
cudaStreamSynchronize(stream)
cudnnConvolutionForward sometimes returns a status of CUDNN_STATUS_EXECUTION_FAILED.
However, if I call cudaStreamSynchronize(stream) after each API call, it works well, e.g.:
cudnnConvolutionForward
cudaStreamSynchronize(stream)
cudnnConvolutionBackwardData
cudaStreamSynchronize(stream)
cudnnConvolutionBackwardFilter
cudaStreamSynchronize(stream)

cudnnConvolutionForward
cudaStreamSynchronize(stream)
cudnnConvolutionBackwardData
cudaStreamSynchronize(stream)
cudnnConvolutionBackwardFilter
cudaStreamSynchronize(stream)

It all works well on a Tesla V100 with:
CUDA 10.0
cuDNN 7.6.5
Windows 10
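
For anyone trying to reproduce this, here is a minimal sketch of the setup described above: a non-blocking stream bound to a single cuDNN handle, plus a debug helper that synchronizes after a call. It is not the poster's actual code. The helper names (cuda_ok, cudnn_ok, sync_ok, run_sketch) are hypothetical, and the convolution calls are left as a comment because their descriptors and buffers are not given in the post. Since GPU work is asynchronous, a failure status returned by one call may come from work enqueued earlier, so synchronizing after each call can help narrow down which operation actually fails.

```cuda
#include <cstdio>
#include <cuda_runtime.h>
#include <cudnn.h>

// Hypothetical helpers (not from the original post): print and flag failures.
static bool cuda_ok(cudaError_t e, const char* where) {
    if (e == cudaSuccess) return true;
    std::fprintf(stderr, "%s: %s\n", where, cudaGetErrorString(e));
    return false;
}

static bool cudnn_ok(cudnnStatus_t s, const char* where) {
    if (s == CUDNN_STATUS_SUCCESS) return true;
    std::fprintf(stderr, "%s: %s\n", where, cudnnGetErrorString(s));
    return false;
}

// Debug aid: block until the stream drains, so an asynchronous failure is
// reported right after the call that was just enqueued.
static bool sync_ok(cudaStream_t stream, const char* where) {
    return cuda_ok(cudaStreamSynchronize(stream), where);
}

// Returns 0 on success, 1 on any failure.
int run_sketch() {
    cudaStream_t stream = nullptr;
    cudnnHandle_t handle = nullptr;
    int rc = 1;

    // Non-blocking stream: no implicit synchronization with the legacy default stream.
    if (!cuda_ok(cudaStreamCreateWithFlags(&stream, cudaStreamNonBlocking),
                 "cudaStreamCreateWithFlags")) {
        return 1;
    }

    if (cudnn_ok(cudnnCreate(&handle), "cudnnCreate")) {
        cudaStream_t bound = nullptr;
        // Bind the handle to the stream, then read the binding back to confirm it.
        if (cudnn_ok(cudnnSetStream(handle, stream), "cudnnSetStream") &&
            cudnn_ok(cudnnGetStream(handle, &bound), "cudnnGetStream") &&
            bound == stream) {
            // Enqueue the convolutions here. While debugging, wrap each call as
            //   cudnn_ok(cudnnConvolutionForward(...), "forward") &&
            //   sync_ok(stream, "after forward")
            // and do the same for BackwardData and BackwardFilter.
            if (sync_ok(stream, "final cudaStreamSynchronize")) rc = 0;
        }
        cudnnDestroy(handle);
    }
    cudaStreamDestroy(stream);
    return rc;
}
```

Calling run_sketch() from main should return 0 on a working installation. Once the failing call has been identified, the per-call sync_ok checks can be removed so the operations overlap again.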

Hi @ttt,
Are you binding different cuDNN handles to different CUDA streams?
Is this single-threaded or multi-threaded?
Could you please share your code with us?

Thanks!