- Next-Gen AI: RecurrentGemma (Long Context Length)
- Mighty New TransformerFAM (Feedback Attention Mem)
- How I use LLMs
- No more Fine-Tuning: Unsupervised ICL+
- The math behind Attention: Keys, Queries, and Values matrices
- Graph Transformers: What every data scientist should know, from Stanford, NVIDIA, and Kumo
- Do we need Attention? - Linear RNNs and State Space Models (SSMs) for NLP
- RING Attention explained: 1 Mio Context Length