Deep dive: model merging, part 2
Deep Dive: Parameter-Efficient Model Adaptation with LoRA and Spectrum (50:44)
Terence Tao at IMO 2024: AI and Mathematics (57:24)
Attention in transformers, visually explained | DL6 (26:10)
Run performant and cost-effective GenAI Applications with AWS Graviton and Arcee AI (1:05:15)
LLM inference optimization: Architecture, KV cache and Flash attention (44:06)
Quantum Computers, explained with MKBHD (17:01)
Transformers (how LLMs work) explained visually | DL5 (27:14)