Efficient Large-Scale Language Model Training on GPU Clusters Using Megatron-LM | Jared Casper · Minideo

18:28

SLICK: Driving SLO Culture At Meta | Dávid Bartók & Filip Klepo

19:28

Optics in AI Clusters - Meta Perspective

21:18

Turing-NLG, DeepSpeed and the ZeRO optimizer

21:40

Ray, a Unified Distributed Framework for the Modern AI Stack | Ion Stoica

57:45

Visualizing transformers and attention | Talk for TNG Big Tech Day '24

7:03

Forget About LLMs - Large Concept Models (LCM) Are Here Now!

55:59

Training LLMs at Scale - Deepak Narayanan | Stanford MLSys #83

1:07:40

Multi GPU Fine tuning with DDP and FSDP