Output Predictions - Faster Inference with OpenAI or vLLM (25:09)

Predicting Events with Large Language Models (32:19)

7 Essential AI Agent Tools for n8n! (Incredible Results) (1:21:12)

Fine tune Gemma 3, Qwen3, Llama 4, Phi 4 and Mistral Small with Unsloth and Transformers (1:04:22)

How to pick a GPU and Inference Engine? (12:46)

Speculative Decoding: When Two LLMs are Faster than One (59:55)

vLLM Office Hours - SOTA Tool-Calling Implementation in vLLM - November 7, 2024 (55:32)

Advanced Data Prep and Visualisation Techniques for Fine-tuning LLMs (48:20)