Overview
- Research teams released three arXiv studies on Thursday that broaden RAG design: full-document verification for long contexts, a concept-level fusion method for web Q&A, and the first in-depth look at fairness gaps across query groups.
- An adaptive-chunking paper posted Friday proposes five intrinsic metrics for choosing a chunking strategy per document and reports answer correctness rising to 72% from 62–64%, with more questions answered (65 versus 49).
- Microsoft’s Tech Community highlights PageIndex, a vectorless, tree-based workflow that lets an LLM navigate a document’s outline to find the right section in structured materials such as policies, filings, and manuals.
- The fairness study finds RAG can widen accuracy disparities between query groups and ties outcomes to three drivers in the pipeline: group exposure in retrieval, group utility for the generator, and group attribution in the final answer.
- The verification study details a real-time component that checks answers against entire documents of up to 32K tokens using adaptive inference, arguing that chunk-by-chunk checks often miss evidence and that long-context review better flags unsupported claims.