The Unreasonable Effectiveness of Stochastic Gradient Descent (in 3 minutes)
Accelerate Gradient Descent with Momentum (in 3 minutes)
Gradient Descent in 3 minutes
Stochastic Gradient Descent, Clearly Explained!!!
Watching Neural Networks Learn
Optimization for Deep Learning (Momentum, RMSprop, AdaGrad, Adam)
Transformers (how LLMs work) explained visually | DL5
The Most Important Algorithm in Machine Learning