I am playing around with DIGITS. I found that when I train, for instance, GoogLeNet with the SGD solver and a learning rate of 0.1, the accuracy reaches 100% within a few epochs, but the loss remains very high (around 80%) and does not change until training finishes. If I only change the learning rate to 0.01, things become "normal" with SGD: while the accuracy increases, the loss decreases as expected. Additionally, if I use AdaDelta under the same conditions with LR=0.1, things become normal too.
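To show what I mean, here is a toy NumPy sketch I put together (my own made-up 1-D quadratic, not DIGITS code, and the curvature value is just an assumption): plain SGD blows up once the learning rate exceeds the stability limit of 2/curvature, while AdaDelta's adaptive steps stay stable even at the same nominal rate.

```python
import numpy as np

def sgd(w, lr, steps, curvature=25.0):
    # Plain SGD on f(w) = 0.5 * curvature * w^2, so grad = curvature * w.
    # Each step multiplies w by (1 - lr * curvature); the update is
    # unstable whenever lr > 2 / curvature.
    for _ in range(steps):
        w -= lr * curvature * w
    return w

def adadelta(w, steps, rho=0.95, eps=1e-6, curvature=25.0):
    # AdaDelta: the effective step size is rebuilt each iteration from
    # running averages of squared gradients and squared updates, so the
    # nominal learning rate matters much less than for plain SGD.
    eg2 = edx2 = 0.0  # running averages E[g^2] and E[dx^2]
    for _ in range(steps):
        g = curvature * w
        eg2 = rho * eg2 + (1 - rho) * g * g
        dx = -np.sqrt(edx2 + eps) / np.sqrt(eg2 + eps) * g
        edx2 = rho * edx2 + (1 - rho) * dx * dx
        w += dx
    return w

print(abs(sgd(1.0, 0.1, 50)))    # lr=0.1 > 2/25=0.08 -> diverges
print(abs(sgd(1.0, 0.01, 50)))   # lr=0.01 < 0.08 -> converges
print(abs(adadelta(1.0, 200)))   # adaptive steps remain bounded
```

Of course a real GoogLeNet loss surface is nothing like a 1-D quadratic, but this matches the pattern I see: SGD at 0.1 misbehaves, SGD at 0.01 and AdaDelta at 0.1 behave normally.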
I wonder if there is any rule of thumb for choosing a "matching" pair of learning rate and solver type? A reference to a research paper would help too.
Sorry for my poor English if I said something very weird.
Thanks in advance!