∞-former: Infinite Memory Transformer (aka Infty-Former / Infinity-Former, Research Paper Explained) - 48:06
Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention (Paper Explained) - 50:24
Linformer: Self-Attention with Linear Complexity (Paper Explained) - 57:00
xLSTM: Extended Long Short-Term Memory - 40:13
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding - 43:51
Feedback Transformers: Addressing Some Limitations of Transformers with Feedback Memory (Explained) - 39:13
DINO: Emerging Properties in Self-Supervised Vision Transformers (Facebook AI Research Explained) - 1:19:06
Hardware-aware Algorithms for Sequence Modeling - Tri Dao | Stanford MLSys #87 - 56:49