TILOS Seminar: Transformers learn in-context by (functional) gradient descent · Minideo

1:11:07 TILOS Seminar: Off-the-shelf Algorithmic Stability

27:14 Transformers (how LLMs work) explained visually | DL5

1:04:58 TILOS Seminar: What Kinds of Functions do Neural Networks Learn? Theory and Practical Applications

1:01:31 MIT 6.S191: Recurrent Neural Networks, Transformers, and Attention

57:45 Visualizing transformers and attention | Talk for TNG Big Tech Day '24

49:40 TILOS Seminar: Towards Foundation Models for Graph Reasoning and AI 4 Science (2023-10-11)

20:33 Gradient descent, how neural networks learn | DL2

15:26 Do pretrained transformers learn in-context by Gradient Descent? | ICML 2024 (Oral)