1:11:31
UMD F25 NLP #16: GRPO and reasoning models
1:11:35
UMD F25 NLP #15: RL for LLMs
2:15:13
Reinforcement Learning from Human Feedback explained with math derivations and the PyTorch code.
1:14:28
UMass CS685 S24 (Advanced NLP) #21: Detecting LLM-generated text / LLM security
53:51