TILOS Seminar: Transformers learn in-context by (functional) gradient descent
1:11:07
TILOS Seminar: Off-the-shelf Algorithmic Stability
27:14
Transformers (how LLMs work) explained visually | DL5
1:04:58
TILOS Seminar: What Kinds of Functions do Neural Networks Learn? Theory and Practical Applications
1:01:31
MIT 6.S191: Recurrent Neural Networks, Transformers, and Attention
57:45
Visualizing transformers and attention | Talk for TNG Big Tech Day '24
49:40
TILOS Seminar: Towards Foundation Models for Graph Reasoning and AI 4 Science (2023-10-11)
20:33
Gradient descent, how neural networks learn | DL2
15:26