- FlashAttention - Tri Dao | Stanford MLSys #67 (48:13)
- Distributed and Decentralized Learning - Ce Zhang | Stanford MLSys #68 (56:32)
- Monarch Mixer: Making Foundation Models More Efficient - Dan Fu | Stanford MLSys #86 (47:47)
- MedAI #54: FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness | Tri Dao (59:40)
- Learn How to Build An AI Agent (57:45)
- Visualizing transformers and attention | Talk for TNG Big Tech Day '24 (1:21:37)
- Tri Dao on FlashAttention and sparsity, quantization, and efficient inference (57:05)
- Text2SQL: The Dream versus Reality - Laurel Orr | Stanford MLSys #89 (11:48)