Overview
- A new arXiv preprint defines recall‑critical “pluri‑hop” questions and introduces PluriHopWIND, a 48‑question dataset built from 191 wind industry reports; every QA and RAG baseline tested on it stayed below 40% statement‑wise F1.
- PluriHopRAG decomposes queries into document‑level subquestions and applies a cross‑encoder filter before LLM reasoning, yielding reported relative F1 improvements of 18–52% on PluriHopWIND.
- Another preprint presents DEG‑RAG, which denoises LLM‑generated knowledge graphs via entity resolution and triple reflection, producing more compact graphs and consistent QA gains over unprocessed KGs.
- Separately, enterprise reporting describes production RAG deployments at companies such as Uber, LINE, Asana, Linde Group, and Siemens for engineering search, support, and knowledge access.
- A DEV Community case study cites secure on‑premises or VPC setups, mandatory source citations, implementation timelines of about 90 days, and claimed ROI of 300–500% within a year.
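
The decompose-then-filter flow described for PluriHopRAG above can be illustrated with a minimal Python sketch. It is not the authors' implementation: `overlap_score` is a toy word-overlap stand-in for a real cross-encoder, the subquestion template is hypothetical, and the `threshold` value is an arbitrary assumption. It shows only the shape of the idea: form one subquestion per document, score each document against the query, and pass only the survivors on to the LLM reasoning step.

```python
from typing import Dict, List, Tuple


def overlap_score(question: str, passage: str) -> float:
    """Toy relevance score: fraction of question words present in the passage.

    Assumption: this stands in for a trained cross-encoder, which would
    score the (question, passage) pair jointly with a neural model.
    """
    q_words = set(question.lower().split())
    p_words = set(passage.lower().split())
    return len(q_words & p_words) / len(q_words) if q_words else 0.0


def plan_subquestions(
    query: str,
    docs: Dict[str, str],
    threshold: float = 0.5,  # hypothetical cut-off, not taken from the paper
) -> List[Tuple[str, str]]:
    """Return (doc_id, subquestion) pairs for documents that pass the filter.

    Each surviving document gets its own document-level subquestion, which
    a downstream LLM would answer before the answers are aggregated.
    """
    kept = []
    for doc_id, text in docs.items():
        if overlap_score(query, text) >= threshold:
            # Hypothetical template for a document-level subquestion.
            kept.append((doc_id, f"What does {doc_id} say about: {query}"))
    return kept
```

With three short documents, two of which mention turbine downtime, `plan_subquestions("turbine downtime hours", docs)` would keep only those two. In practice the filter trades precision against recall, and for recall-critical questions the threshold would likely be set permissively.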
