Paper: DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning