Kimi k1.5: Scaling Reinforcement Learning with LLMs · Minideo

10:23

Meta's Large Concept Models (LCM): The Era of AI After LLMs?

18:02

Group Robust Preference Optimization in Reward-free RLHF

17:44

Inference-Time Scaling for Diffusion Models beyond Scaling Denoising Steps

58:06

Stanford Webinar - Large Language Models Get the Hype, but Compound Systems Are the Future of AI

10:50

OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis

1:44:31

Stanford CS229 I Machine Learning I Building Large Language Models (LLMs)

26:19

Ditch RAG and Opt for Smarter CAG with KV Cache Optimization

11:23

Agent-R: Training Language Model Agents to Reflect via Iterative Self-Training