DeepSeek-V3 · Minideo

59:24

Titans: Learning to Memorize at Test Time

48:21

MiniMax-01: Scaling Foundation Models with Lightning Attention

38:55

CoPE - Contextual Position Encoding: Learning to Count What's Important

45:05

Byte Latent Transformer: Patches Scale Better Than Tokens

11:01

China's DeepSeek AI disrupts U.S. tech just as NASDAQ 100 turns 40

15:53

DEEPSEEK Vs CHATGPT There Is A Clear Winner !!

43:26

xLSTM: Extended Long Short-Term Memory

46:17

Memory Layers at Scale