| Video | Duration |
|-------|----------|
| [PyTorch 2.5 Live Q&A](https://www.youtube.com/watch?v=1OopuwTq6oE) | 30:06 |
| [PyTorch 2.6 Release Live Q&A](https://www.youtube.com/watch?v=ibgW_ali0Gc) | 20:37 |
| [State of PyTorch - Ji Li & Damien Sereni, Meta](https://www.youtube.com/watch?v=cvIeT4MlIx4) | 29:18 |
| [PyTorch Expert Exchange Hacker Cup AI](https://www.youtube.com/watch?v=HTcnp9NEHGY) | 33:29 |
| [How does batching work on modern GPUs?](https://www.youtube.com/watch?v=jk2FsJxZFo8) | 44:06 |
| [LLM inference optimization: Architecture, KV cache and Flash attention](https://www.youtube.com/watch?v=wd57g2IM3C4) | 24:21 |
| [Running State-of-Art Gen AI Models on-Device with NPU Acceleration - Felix Baum, Qualcomm](https://www.youtube.com/watch?v=Bh-jlh5vlF0) | 32:03 |
| [DistServe: disaggregating prefill and decoding for goodput-optimized LLM inference](https://www.youtube.com/watch?v=wjZofJX0v4M) | 27:14 |