Training LLMs at Scale - Deepak Narayanan | Stanford MLSys #83 (59:17)
Serving 100s of LLMs on 1 GPU with LoRAX - Travis Addair | Stanford MLSys #84 (1:19:06)
Hardware-aware Algorithms for Sequence Modeling - Tri Dao | Stanford MLSys #87 (58:06)
Stanford Webinar - Large Language Models Get the Hype, but Compound Systems Are the Future of AI (34:14)
Understanding the LLM Inference Workload - Mark Moyou, NVIDIA (1:00:05)
Open Pretrained Transformers - Susan Zhang | Stanford MLSys #77 (24:19)
A friendly introduction to distributed training (ML Tech Talks) (24:04)
Efficient Large-Scale Language Model Training on GPU Clusters Using Megatron-LM | Jared Casper (56:32)