Overview
- A cluster of practitioner guides and arXiv papers published May 26–27 lays out production patterns for Retrieval-Augmented Generation, showing how RAG can be integrated into IDEs, Slack, and CLIs to surface up-to-date answers with source links.
- New domain-specific systems report strong but benchmark-limited gains: MimirRAG scored about 89.3% on FinanceBench, AstroRAG raised Mistral-7B to roughly 79.5% on AstroQA, and EfficientGraph-RAG reduced large-model token use while leading on LongBench subsets.
- Operational guidance from practitioners emphasizes careful chunking (roughly 500–800 tokens for many docs), continuous re-indexing on merges to avoid version drift, two-stage retrieval with reranking by recency and type, and explicit source citations so developers can verify answers.
- Reproducibility studies trace reader failures to semantic competition among retrieved passages and to evaluation choices: ordering, topic sampling, and retriever quality can each materially change reported RAG results.
- A new MEntA membership-inference attack can detect whether a document is in a RAG corpus with as few as five non-templated queries, creating an urgent need for privacy defenses such as ephemeral indexing, conservative citation/abstention policies, and retrieval-aware access controls.
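
The operational guidance above (bounded chunk sizes, two-stage retrieval with reranking by recency and type, explicit source citations, and abstention when nothing supports an answer) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not any of the cited systems' implementations. Whitespace-separated words stand in for tokens, plain word overlap stands in for a real first-stage retriever, and every name and weight here (`Chunk`, `TYPE_WEIGHT`, `retrieve`, the 0.5 factors, the 30-day decay) is hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Chunk:
    text: str
    source: str    # link or path shown to the reader as the citation
    doc_type: str  # hypothetical labels, e.g. "api_reference", "chat_log"
    updated: date  # last time the underlying document changed


def chunk_words(text: str, max_tokens: int = 600) -> list[str]:
    """Split text into pieces of at most max_tokens words.

    Assumption: words approximate tokens; a real pipeline would use
    the model's tokenizer and respect document structure.
    """
    words = text.split()
    return [" ".join(words[i:i + max_tokens])
            for i in range(0, len(words), max_tokens)]


# Hypothetical priorities: prefer authoritative document types.
TYPE_WEIGHT = {"api_reference": 1.0, "design_doc": 0.7, "chat_log": 0.3}


def lexical_score(query: str, chunk: Chunk) -> float:
    """Fraction of query words that appear in the chunk (stand-in retriever)."""
    q = set(query.lower().split())
    c = set(chunk.text.lower().split())
    return len(q & c) / len(q) if q else 0.0


def retrieve(query: str, chunks: list[Chunk], today: date,
             k_candidates: int = 20, k_final: int = 3) -> list[Chunk]:
    # Stage 1: cheap candidate retrieval, dropping chunks with no overlap.
    scored = [(lexical_score(query, c), c) for c in chunks]
    scored = [(s, c) for s, c in scored if s > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    candidates = scored[:k_candidates]

    # Stage 2: rerank candidates by relevance plus recency and type.
    def rerank_key(pair: tuple[float, Chunk]) -> float:
        score, c = pair
        age_days = max((today - c.updated).days, 0)
        recency = 1.0 / (1.0 + age_days / 30.0)  # decays over months
        return score + 0.5 * recency + 0.5 * TYPE_WEIGHT.get(c.doc_type, 0.1)

    candidates.sort(key=rerank_key, reverse=True)
    return [c for _, c in candidates[:k_final]]


def answer_with_citations(query: str, chunks: list[Chunk], today: date) -> str:
    """Return numbered source citations, or abstain if nothing was retrieved."""
    hits = retrieve(query, chunks, today)
    if not hits:
        return "No supporting source found."
    return "\n".join(
        f"[{i}] {c.source} (updated {c.updated.isoformat()})"
        for i, c in enumerate(hits, 1)
    )
```

In a setup like this, re-running `chunk_words` and rebuilding the `Chunk` list on every merge would be one way to limit version drift, and the abstention branch in `answer_with_citations` is the simplest form of the conservative citation policy mentioned in the last bullet.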