WARP: On the Benefits of Weight Averaged Rewarded Policies
28:52
CoDeF: Content Deformation Fields for Temporally Consistent Video Processing
35:52
Learning to (Learn at Test Time): RNNs with Expressive Hidden States
37:00
Visual AutoRegressive Modeling: Scalable Image Generation via Next-Scale Prediction
43:26
xLSTM: Extended Long Short-Term Memory
56:33
MLBBQ: “Are Transformers Effective for Time Series Forecasting?” by Joanne Wardell
1:02:30
Stable Diffusion 3: Scaling Rectified Flow Transformers for High-Resolution Image Synthesis
1:14:43
Mamba 2 - Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality
40:14