RAGChecker: A Fine-grained Framework for Diagnosing Retrieval-Augmented Generation
16:28
ChatQA 2: Bridging the Gap to Proprietary LLMs in Long Context and RAG Capabilities
24:09
RetroLLM: Empowering Large Language Models to Retrieve Fine-grained Evidence within Generation
21:35
Alignment faking in large language models
40:44
The BrowserGym Ecosystem for Web Agent Research
17:37
How to Synthesize Text Data without Model Collapse?
42:03
ChatQA: Surpassing GPT-4 on Conversational QA and RAG
45:16
Phi-4 Technical Report
16:59