An overview of gradient descent optimization algorithms

Warning: drink ☕ before reading!

Gradient descent is the grandfather of all optimization algorithms. The fundamental insight falls directly out of calculus: the gradient at a point gives the direction of steepest ascent, so repeatedly stepping the opposite way drives a function toward a minimum. There are many different algorithmic implementations, though, and each has its own trade-offs.
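To make the core idea concrete, here's a minimal sketch of vanilla gradient descent on a toy one-dimensional function, f(x) = (x − 3)². The learning rate and iteration count are illustrative choices, not taken from the article:

```python
def grad(x):
    # Derivative of f(x) = (x - 3)^2
    return 2 * (x - 3)

x = 0.0    # starting point
lr = 0.1   # learning rate (step size)
for _ in range(100):
    x -= lr * grad(x)  # step against the gradient

print(round(x, 4))  # converges toward the minimum at x = 3
```

The variants covered in the article (momentum, Adagrad, RMSprop, Adam, and friends) all modify how that update step is scaled or accumulated.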

This is an important topic, and I've never seen a better primer.
