Hi,
I get the following ONNX Runtime error on an NVIDIA V100:
2025-08-07 18:03:20.734516997 [E:onnxruntime:Default, cuda_call.cc:123 CudaCall] CUDNN failure 5000: CUDNN_STATUS_EXECUTION_FAILED ; GPU=0 ; hostname=e7ff6c9ad68c ; file=/onnxruntime_src/onnxruntime/core/providers/cuda/reduction/reduction_ops.cc ; line=778 ; expr=cudnnReduceTensor(GetCudnnHandle(ctx), reduce_desc, indices_cuda.get(), indices_bytes, workspace_cuda.get(), workspace_bytes, &one, input_tensor, temp_X.get(), &zero, output_tensor, temp_Y.get());
2025-08-07 18:03:20.734593929 [E:onnxruntime:, sequential_executor.cc:572 ExecuteKernel] Non-zero status code returned while running ReduceSum node. Name:'/ReduceSum_1' Status Message: CUDNN failure 5000: CUDNN_STATUS_EXECUTION_FAILED ; GPU=0 ; hostname=e7ff6c9ad68c ; file=/onnxruntime_src/onnxruntime/core/providers/cuda/reduction/reduction_ops.cc ; line=778 ; expr=cudnnReduceTensor(GetCudnnHandle(ctx), reduce_desc, indices_cuda.get(), indices_bytes, workspace_cuda.get(), workspace_bytes, &one, input_tensor, temp_X.get(), &zero, output_tensor, temp_Y.get());
How can I fix or debug it?
I am using an NVIDIA V100. nvidia-smi reports: NVIDIA-SMI 535.183.01, Driver Version: 535.183.01, CUDA Version: 12.9
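In case it helps with diagnosis, the ONNX Runtime version and the available execution providers could be collected with a small script like the one below. This is only a sketch: it assumes the Python `onnxruntime` package is what is in use, and `collect_env` is a hypothetical helper name, not part of any library.

```python
import importlib.util
import platform


def collect_env():
    """Gather version details that are useful when reporting a CUDA/cuDNN failure."""
    info = {
        "python": platform.python_version(),
        "onnxruntime": None,   # stays None if the package is not installed
        "providers": [],       # e.g. CUDAExecutionProvider, CPUExecutionProvider
    }
    # Import onnxruntime only if it is installed, so the script still runs without it.
    if importlib.util.find_spec("onnxruntime") is not None:
        import onnxruntime as ort

        info["onnxruntime"] = ort.__version__
        info["providers"] = ort.get_available_providers()
    return info


if __name__ == "__main__":
    for key, value in collect_env().items():
        print(f"{key}: {value}")
```

Running it inside the same container as the failing job should show whether `CUDAExecutionProvider` is listed at all, and which ONNX Runtime build the error comes from.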
Thanks,
Gerald