MADGRAD: A Best-of-Both-Worlds Optimizer with The Generalization Performance of SGD and at Least as Fast Convergence as That of Adam, Often Faster

That's what I love about these optimizers, man. They keep getting better, but my code stays the same.

Find more #DSotD posts

Have an idea you would like to see featured here on Data Science of the Day?