Particle News: RAG Moves From Labs to Production With New Domain Systems and a Low‑Cost Privacy Attack

Overview

A cluster of practitioner guides and arXiv papers published May 26–27 lays out production patterns for Retrieval-Augmented Generation (RAG), showing how it can be integrated into IDEs, Slack, and CLIs to surface up-to-date answers with source links.
New domain-specific systems report strong but benchmark-limited gains: MimirRAG scored about 89.3% on FinanceBench, AstroRAG raised Mistral-7B to roughly 79.5% on AstroQA, and EfficientGraph-RAG reduced large-model token use while leading on LongBench subsets.
Operational guidance from practitioners emphasizes careful chunking (roughly 500–800 tokens for many docs), continuous re-indexing on merges to avoid version drift, two-stage retrieval with reranking by recency and type, and explicit source citations so developers can verify answers.
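The chunking and two-stage retrieval guidance above can be sketched in a few lines of Python. This is a minimal illustration, not any guide's actual implementation: the `Chunk` fields, the word-overlap first stage (a stand-in for a real vector or BM25 retriever), the `TYPE_WEIGHT` table, and the 90-day recency half-life are all assumptions made for the example, and whitespace-separated words stand in for model tokens.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class Chunk:
    text: str
    source: str        # link or path shown to the user as the citation
    doc_type: str      # hypothetical labels: "reference", "tutorial", "changelog"
    updated: datetime  # refreshed whenever the index is rebuilt on merge


def chunk_words(text: str, size: int = 600, overlap: int = 50) -> list[str]:
    """Split text into overlapping windows; words approximate tokens here."""
    words = text.split()
    if not words:
        return []
    step = size - overlap
    stop = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + size]) for i in range(0, stop, step)]


def first_stage(query: str, chunks: list[Chunk], k: int = 20) -> list[tuple[float, Chunk]]:
    """Stage 1: cheap recall. Score = fraction of query words found in the chunk."""
    q = set(query.lower().split())
    scored = [(len(q & set(c.text.lower().split())) / (len(q) or 1), c) for c in chunks]
    scored = [(s, c) for s, c in scored if s > 0]
    scored.sort(key=lambda sc: sc[0], reverse=True)
    return scored[:k]


# Assumed preference order among document types; tune for a real corpus.
TYPE_WEIGHT = {"reference": 1.0, "tutorial": 0.8, "changelog": 0.5}


def rerank(candidates: list[tuple[float, Chunk]], now: datetime,
           half_life_days: float = 90.0, top_n: int = 3) -> list[Chunk]:
    """Stage 2: reorder candidates by relevance, recency, and document type."""
    def final_score(sc: tuple[float, Chunk]) -> float:
        score, c = sc
        age_days = (now - c.updated).total_seconds() / 86400
        recency = 0.5 ** (age_days / half_life_days)  # halves every half_life_days
        return score * (0.5 + 0.5 * recency) * TYPE_WEIGHT.get(c.doc_type, 0.6)

    ranked = sorted(candidates, key=final_score, reverse=True)
    return [c for _, c in ranked][:top_n]


if __name__ == "__main__":
    now = datetime(2024, 6, 1, tzinfo=timezone.utc)  # arbitrary example date
    docs = [
        Chunk("reset the api token", "docs/v1.md", "reference",
              datetime(2023, 1, 1, tzinfo=timezone.utc)),
        Chunk("reset the api token", "docs/v2.md", "reference",
              datetime(2024, 5, 1, tzinfo=timezone.utc)),
    ]
    for hit in rerank(first_stage("reset api token", docs), now):
        print(hit.source)  # each answer carries its source for verification
```

Keeping `source` on every chunk is what makes the citation step cheap: whatever passages survive reranking can be listed alongside the answer so developers can check them.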
Reproducibility studies trace reader failures to semantic competition among retrieved passages and to evaluation choices, showing that passage ordering, topic sampling, and retriever quality can materially change reported RAG results.
A new MEntA membership-inference attack can detect whether a document is in a RAG corpus with as few as five non-templated queries, creating an urgent need for privacy defenses such as ephemeral indexing, conservative citation/abstention policies, and retrieval-aware access controls.
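Two of the defenses named above, retrieval-aware access controls and conservative abstention, might look roughly like the following sketch. It is one interpretation, not a design taken from the MEntA paper: the `Passage` fields, the group-based permission model, and the `min_score` and `min_passages` thresholds are hypothetical, and nothing here is claimed to stop the attack on its own.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Passage:
    text: str
    source: str
    score: float                # retriever relevance score, assumed in [0, 1]
    allowed_groups: frozenset   # groups permitted to see this passage


def authorized(passages: list[Passage], user_groups: set) -> list[Passage]:
    """Drop passages the caller may not read before they reach the generator."""
    return [p for p in passages if p.allowed_groups & user_groups]


def answer_context(passages: list[Passage], user_groups: set,
                   min_score: float = 0.5, min_passages: int = 2):
    """Return the passages to cite, or None to signal that the system should abstain."""
    visible = [p for p in authorized(passages, user_groups) if p.score >= min_score]
    if len(visible) < min_passages:
        return None  # abstain rather than answer from thin or unauthorized evidence
    return visible


if __name__ == "__main__":
    retrieved = [
        Passage("a", "s1", 0.9, frozenset({"eng"})),
        Passage("b", "s2", 0.8, frozenset({"hr"})),
        Passage("c", "s3", 0.7, frozenset({"eng"})),
    ]
    context = answer_context(retrieved, {"eng"})
    print([p.source for p in context] if context else "abstain")
```

Filtering before generation means a user outside the permitted groups gets the same abstention whether or not the document is indexed, which is the property a membership probe depends on breaking.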