Theoretical and Practical Insights from Linear Transformers
50:41
Robust Gradient Descent: Agnostically Estimating an Unknown Affine Transformation...
27:14
Transformers (how LLMs work) explained visually | DL5
45:35
Learning Theory of Transformers: Generalization and Optimization of In-Context Learning
50:16
Jacob Andreas | What Learning Algorithm is In-Context Learning?
52:06
Generalization in the representations and computations of frontier language models.
1:55:27
Mechanistic Interpretability - Stella Biderman | Stanford MLSys #70
1:10:10