DeepSeek-V3 · Minideo
