[cuDNN bug] Different outputs with the same input (cuDNN 7.1.3)

Issue summary
I get different outputs when I pass the same input through a simple model multiple times.
Network architecture:
[image][/https://user-images.githubusercontent.com/9379079/45991826-1b0b2c80-c0b9-11e8-9e45-860fd054b7dd.png]

Steps to reproduce
The attached archive contains my reproduction code: a Python script and a prototxt file.
[debug_cudnn.zip][/https://github.com/BVLC/caffe/files/2413676/debug_cudnn.zip]

$ python debug_cudnn.py

iter: 0, meandiff:0.088871, min:0.000000, max:18.754311
iter: 1, meandiff:0.088867, min:0.000000, max:18.754311
iter: 2, meandiff:0.000000, min:0.000000, max:18.453726
iter: 3, meandiff:0.088826, min:0.000000, max:18.754311
iter: 4, meandiff:0.088791, min:0.000000, max:18.754311
iter: 5, meandiff:0.088705, min:0.000000, max:18.754311
iter: 6, meandiff:0.088907, min:0.000000, max:18.754311
iter: 7, meandiff:0.088798, min:0.000000, max:18.754311
iter: 8, meandiff:0.088865, min:0.000000, max:18.754311
iter: 9, meandiff:0.000000, min:0.000000, max:18.453726

meandiff should be 0 on every iteration, because the input is the same each time.
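For readers without the attachment, here is a minimal sketch of this kind of repeatability check. It is not the attached debug_cudnn.py. It assumes meandiff is the mean absolute difference against a reference output from an earlier forward pass, and that min and max describe the output itself. The names repeat_and_compare, is_deterministic, and toy_forward are hypothetical, and the toy NumPy "network" only stands in for a real model.

```python
import numpy as np


def repeat_and_compare(forward, x, iters=10):
    """Call forward(x) once for a reference, then `iters` more times,
    and record how far each later output is from the reference."""
    # np.array makes a copy, which matters if the framework reuses
    # the same output buffer on every forward pass.
    reference = np.array(forward(x), dtype=np.float64)
    rows = []
    for i in range(iters):
        out = np.array(forward(x), dtype=np.float64)
        meandiff = float(np.abs(out - reference).mean())
        rows.append((i, meandiff, float(out.min()), float(out.max())))
    return rows


def is_deterministic(rows, tol=0.0):
    """True if every repeated run matched the reference within tol."""
    return all(meandiff <= tol for _, meandiff, _, _ in rows)


# Hypothetical stand-in for a real network: fixed linear map plus ReLU.
rng = np.random.default_rng(0)
weights = rng.standard_normal((4, 8))


def toy_forward(x):
    return np.maximum(x @ weights, 0.0)


x = rng.standard_normal((2, 4))
rows = repeat_and_compare(toy_forward, x)
for i, meandiff, lo, hi in rows:
    print("iter: %d, meandiff:%f, min:%f, max:%f" % (i, meandiff, lo, hi))
```

With pycaffe, forward would typically be a small wrapper that copies x into the input blob and calls net.forward(). The blob names depend on the prototxt, so they are left out here.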

Tried solutions
The issue goes away when I uncomment line 131 in deploy.prototxt. That is, the outputs are identical if “conv2_1/linear” uses the CAFFE convolution engine (engine: CAFFE) instead of cuDNN. So I think cuDNN is causing this issue.
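For reference, the workaround amounts to a layer definition roughly like the fragment below. Only the layer name and the engine setting come from this report. The layer type is assumed to be a standard Convolution layer, and all other fields are omitted and stay as they are in deploy.prototxt.

```
layer {
  name: "conv2_1/linear"
  type: "Convolution"  # assumed; not shown in this report
  # bottom, top, and the other convolution settings are omitted (unchanged)
  convolution_param {
    # Use Caffe's own convolution instead of cuDNN for this layer only
    engine: CAFFE
  }
}
```

Setting the engine per layer keeps cuDNN enabled for the rest of the network, which helps narrow the problem down to this one convolution.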

$ python debug_cudnn.py

iter: 0, meandiff:0.000000, min:0.000000, max:7.555433
iter: 1, meandiff:0.000000, min:0.000000, max:7.555433
iter: 2, meandiff:0.000000, min:0.000000, max:7.555433
iter: 3, meandiff:0.000000, min:0.000000, max:7.555433
iter: 4, meandiff:0.000000, min:0.000000, max:7.555433
iter: 5, meandiff:0.000000, min:0.000000, max:7.555433
iter: 6, meandiff:0.000000, min:0.000000, max:7.555433
iter: 7, meandiff:0.000000, min:0.000000, max:7.555433
iter: 8, meandiff:0.000000, min:0.000000, max:7.555433
iter: 9, meandiff:0.000000, min:0.000000, max:7.555433

System configuration
Operating system: CentOS
Compiler: gcc
CUDA version (if applicable): CUDA 8.0
CUDNN version (if applicable): cuDNN 7.1.3
BLAS: ATLAS
Python version (if using pycaffe): python3.6m
MATLAB version (if using matcaffe): no

debug_cudnnv7.1.3.zip (1.26 KB)