I’m installed all the package and tools to run a CNN on my GPU:
- Cuda 8
- Cudnn5
- tensorflow-gpu
When i run my CNN, it says that it recognizes my GPU but it still run on CPU
2017-12-06 12:25:30.681683: W c:\l\work\tensorflow-1.1.0\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE instructions, but these are available on your machine and could speed up CPU computations.
2017-12-06 12:25:30.681846: W c:\l\work\tensorflow-1.1.0\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE2 instructions, but these are available on your machine and could speed up CPU computations.
2017-12-06 12:25:30.682149: W c:\l\work\tensorflow-1.1.0\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE3 instructions, but these are available on your machine and could speed up CPU computations.
2017-12-06 12:25:30.683051: W c:\l\work\tensorflow-1.1.0\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-12-06 12:25:30.683420: W c:\l\work\tensorflow-1.1.0\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-12-06 12:25:30.684240: W c:\l\work\tensorflow-1.1.0\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-12-06 12:25:30.684699: W c:\l\work\tensorflow-1.1.0\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2017-12-06 12:25:30.685396: W c:\l\work\tensorflow-1.1.0\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
2017-12-06 12:25:31.342265: I c:\l\work\tensorflow-1.1.0\tensorflow\core\common_runtime\gpu\gpu_device.cc:887] Found device 0 with properties:
name: GeForce GTX 950M
major: 5 minor: 0 memoryClockRate (GHz) 0.928
pciBusID 0000:01:00.0
Total memory: 2.00GiB
Free memory: 1.65GiB
2017-12-06 12:25:31.342409: I c:\l\work\tensorflow-1.1.0\tensorflow\core\common_runtime\gpu\gpu_device.cc:908] DMA: 0
2017-12-06 12:25:31.343662: I c:\l\work\tensorflow-1.1.0\tensorflow\core\common_runtime\gpu\gpu_device.cc:918] 0: Y
2017-12-06 12:25:31.344105: I c:\l\work\tensorflow-1.1.0\tensorflow\core\common_runtime\gpu\gpu_device.cc:977] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 950M, pci bus id: 0000:01:00.0)
60000/60000 [==============================] - 4s - loss: 0.2455 - acc: 0.9233 - val_loss: 0.1215 - val_acc: 0.9622
Epoch 2/20
60000/60000 [==============================] - 2s - loss: 0.1019 - acc: 0.9694 - val_loss: 0.0751 - val_acc: 0.9754
Epoch 3/20
60000/60000 [==============================] - 2s - loss: 0.0754 - acc: 0.9778 - val_loss: 0.0918 - val_acc: 0.9751
Even if it runs on my gpu, i can’t see much difference in terms of computation time