Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention (Paper Explained) (57:45)
Visualizing transformers and attention | Talk for TNG Big Tech Day '24 (51:38)
Linear Transformers Are Secretly Fast Weight Memory Systems (Machine Learning Paper Explained) (31:51)
MAMBA from Scratch: Neural Nets Better and Faster than Transformers (1:05:16)
Hopfield Networks is All You Need (Paper Explained) (36:16)
The math behind Attention: Keys, Queries, and Values matrices (27:14)
Transformers (how LLMs work) explained visually | DL5 (24:07)
AI can't cross this line and we don't know why. (12:22)