Running mnistCUDNN as mentioned in the documentation [[link][https://docs.nvidia.com/deeplearning/sdk/cudnn-install/index.html#verify]] detects all the CUDA capable devices but the test runs only for device 0. Any chance I could run the test for all the devices and not just device 0 ?
Output on running ./mnistCUDNN
cudnnGetVersion() : 7300 , CUDNN_VERSION from cudnn.h : 7300 (7.3.0)
Host compiler version : GCC 5.4.0
There are 2 CUDA capable devices on your machine :
device 0 : sms 30 Capabilities 6.1, SmClock 1582.0 Mhz, MemSize (Mb) 12188, MemClock 5705.0 Mhz, Ecc=0, boardGroupID=0
device 1 : sms 8 Capabilities 6.1, SmClock 1480.5 Mhz, MemSize (Mb) 5053, MemClock 3504.0 Mhz, Ecc=0, boardGroupID=1
Using device 0
Testing single precision
Loading image data/one_28x28.pgm
Performing forward propagation …
Testing cudnnGetConvolutionForwardAlgorithm …
Fastest algorithm is Algo 1
Testing cudnnFindConvolutionForwardAlgorithm …
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.024608 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.049152 time requiring 3464 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 0.061536 time requiring 57600 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: 0.092032 time requiring 2057744 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: 0.105472 time requiring 207360 memory
Resulting weights from Softmax:
0.0000000 0.9999399 0.0000000 0.0000000 0.0000561 0.0000000 0.0000012 0.0000017 0.0000010 0.0000000
Loading image data/three_28x28.pgm
Performing forward propagation …
Resulting weights from Softmax:
0.0000000 0.0000000 0.0000000 0.9999288 0.0000000 0.0000711 0.0000000 0.0000000 0.0000000 0.0000000
Loading image data/five_28x28.pgm
Performing forward propagation …
Resulting weights from Softmax:
0.0000000 0.0000008 0.0000000 0.0000002 0.0000000 0.9999820 0.0000154 0.0000000 0.0000012 0.0000006
Result of classification: 1 3 5
Test passed!
Testing half precision (math in single precision)
Loading image data/one_28x28.pgm
Performing forward propagation …
Testing cudnnGetConvolutionForwardAlgorithm …
Fastest algorithm is Algo 1
Testing cudnnFindConvolutionForwardAlgorithm …
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.024288 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.043712 time requiring 3464 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: 0.057344 time requiring 207360 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: 0.078848 time requiring 2057744 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: 0.116640 time requiring 203008 memory
Resulting weights from Softmax:
0.0000001 1.0000000 0.0000001 0.0000000 0.0000563 0.0000001 0.0000012 0.0000017 0.0000010 0.0000001
Loading image data/three_28x28.pgm
Performing forward propagation …
Resulting weights from Softmax:
0.0000000 0.0000000 0.0000000 1.0000000 0.0000000 0.0000714 0.0000000 0.0000000 0.0000000 0.0000000
Loading image data/five_28x28.pgm
Performing forward propagation …
Resulting weights from Softmax:
0.0000000 0.0000008 0.0000000 0.0000002 0.0000000 1.0000000 0.0000154 0.0000000 0.0000012 0.0000006
Result of classification: 1 3 5
Test passed!