Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention (Paper Explained) [50:24]
Linformer: Self-Attention with Linear Complexity (Paper Explained) [48:21]
Synthesizer: Rethinking Self-Attention in Transformer Models (Paper Explained) [36:16]
The math behind Attention: Keys, Queries, and Values matrices [12:22]
Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention [54:39]
Rethinking Attention with Performers (Paper Explained) [58:58]
FlashAttention - Tri Dao | Stanford MLSys #67 [21:31]
Efficient Self-Attention for Transformers [34:30]