Particle.news

Enterprises Scale RAG in Production as New Studies Expose Recall Gaps and Offer Targeted Fixes

Research spotlights recall‑sensitive failures on distractor‑heavy corpora and reports measured gains from purpose‑built architectures.

Overview

  • New arXiv work defines recall‑critical “pluri‑hop” questions and introduces PluriHopWIND, a 48‑question dataset built from 191 wind industry reports, on which tested QA and RAG baselines stayed below 40% statement‑wise F1.
  • PluriHopRAG decomposes queries into document‑level subquestions and applies a cross‑encoder filter before LLM reasoning, yielding reported relative F1 improvements of 18–52% on PluriHopWIND.
  • Another preprint presents DEG‑RAG, which denoises LLM‑generated knowledge graphs via entity resolution and triple reflection, producing more compact graphs and consistent QA gains over unprocessed KGs.
  • Parallel enterprise reporting describes production RAG deployments at companies such as Uber, LINE, Asana, Linde Group, and Siemens for engineering search, support, and knowledge access.
  • A DEV Community case study cites secure on‑premise or VPC setups, mandatory source citations, rapid implementation timelines near 90 days, and ROI claims in the 300–500% range within a year.
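The PluriHopRAG design described above — decomposing a query into document‑level subquestions and filtering with a cross‑encoder before LLM reasoning — can be sketched roughly as follows. This is a hypothetical illustration, not the paper's implementation: the decomposer and the relevance scorer here are toy stand‑ins (the paper uses an LLM for decomposition and a cross‑encoder model for filtering), and all document names are invented.

```python
# Sketch of a decompose-then-filter retrieval step, under the assumptions above.

def decompose(question: str, doc_ids: list[str]) -> list[tuple[str, str]]:
    """Toy decomposer: one document-scoped subquestion per document."""
    return [(doc_id, f"{question} (restricted to {doc_id})") for doc_id in doc_ids]

def relevance_score(subquestion: str, doc_text: str) -> float:
    """Toy stand-in for a cross-encoder: token-overlap ratio."""
    q_tokens = set(subquestion.lower().split())
    d_tokens = set(doc_text.lower().split())
    return len(q_tokens & d_tokens) / max(len(q_tokens), 1)

def filter_documents(question: str, corpus: dict[str, str], threshold: float = 0.2) -> list[str]:
    """Keep only documents whose subquestion clears the relevance threshold;
    only these would be passed on to the LLM reasoning step."""
    kept = []
    for doc_id, subq in decompose(question, list(corpus)):
        if relevance_score(subq, corpus[doc_id]) >= threshold:
            kept.append(doc_id)
    return kept

corpus = {
    "report_a": "turbine blade failures reported across offshore sites",
    "report_b": "quarterly revenue summary for retail division",
}
print(filter_documents("which sites reported turbine blade failures", corpus))
# → ['report_a']
```

The point of the per‑document filter is that distractor documents are discarded before the LLM ever sees them, which is what the reported F1 gains on distractor‑heavy corpora hinge on.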
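The DEG‑RAG denoising step can likewise be sketched in miniature: resolve near‑duplicate entity names in LLM‑extracted triples and drop the duplicate edges that result, yielding a more compact graph. This is again a hedged stand‑in — real entity resolution and the paper's "triple reflection" involve embeddings and LLM review, while the canonicalizer below is just string normalization, and the sample triples are invented.

```python
# Sketch of KG denoising via toy entity resolution, under the assumptions above.

def canonical(entity: str) -> str:
    """Toy entity resolution: case, whitespace, and punctuation normalization."""
    return " ".join(entity.lower().replace(".", "").split())

def denoise(triples: list[tuple[str, str, str]]) -> list[tuple[str, str, str]]:
    """Map entities to canonical forms, then deduplicate the resulting triples."""
    seen, compact = set(), []
    for head, relation, tail in triples:
        resolved = (canonical(head), relation, canonical(tail))
        if resolved not in seen:
            seen.add(resolved)
            compact.append(resolved)
    return compact

raw = [
    ("Siemens", "deploys", "RAG search"),
    ("siemens", "deploys", "RAG  search"),  # same edge after resolution
    ("Linde Group", "uses", "RAG search"),
]
print(denoise(raw))
# → [('siemens', 'deploys', 'rag search'), ('linde group', 'uses', 'rag search')]
```

Collapsing aliased entities is what shrinks the graph: the three raw triples above become two edges, mirroring the "more compact graphs" the preprint reports.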