Distributed Multi-Node Model Inference Using the LeaderWorkerSet API - Abdullah Gharaibeh, Rupeng Liu
36:10
ARM-Wrestling: Overcoming CPU Migration Challenges to Reduce Costs - Laurent Bernaille, Eric Mountain
35:12
Best Practices for Deploying LLM Inference, RAG and Fine Tuning Pipelines... M. Kaushik, S.K. Merla
36:51
Better Together! GPU, TPU and NIC Topological Alignment with DRA - John Belamaric & Patrick Ohly
54:01
Dapr - build distributed applications faster
30:03
Natural Language Processing | Data Analytics Club TXST
1:30:43
Evolution of software architecture with the co-creator of UML (Grady Booch)
30:52
The Evolution of Multi-GPU Inference in vLLM | Ray Summit 2024
26:52