UMD F25 NLP #16: PPO to GRPO for LLMs · Minideo

1:11:31

UMD F25 NLP #16: GRPO and reasoning models

1:11:35

UMD F25 NLP #15: RL for LLMs

2:15:13

Reinforcement Learning from Human Feedback explained with math derivations and the PyTorch code.

1:14:28

UMass CS685 S24 (Advanced NLP) #21: Detecting LLM-generated text / LLM security

53:51

How language model post-training is done today