Overview
- A new arXiv study introduces HybridRAG-Bench to evaluate retrieval-intensive, multi-hop reasoning over hybrid knowledge from recent literature while minimizing pretraining contamination.
- An enterprise-focused evaluation of natural language to SQL and API calls finds retrieval indispensable: exact-match accuracy is 0% without retrieval, while execution accuracy reaches up to 79.30% when retrieval is used.
- Across multi-turn conversational QA, straightforward techniques such as reranking, hybrid BM25-based retrieval, and HyDE (hypothetical document embeddings) outperform vanilla setups, while several more advanced methods can underperform even a no-retrieval baseline depending on dataset characteristics and dialogue length.
- In hybrid documentation scenarios, CoRAG shows statistically significant gains over standard pipelines for combined SQL/API tasks, highlighting the importance of retrieval-policy design under heterogeneous sources.
- Research on fairness reports that adding external context can reduce social bias, whereas integrating chain-of-thought reasoning increases overall bias even as accuracy improves.
- A separate study's ARGUS pipeline remedies retriever blind spots, with gains of +3.4 nDCG@5 and +4.5 nDCG@10.
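For readers unfamiliar with the nDCG@k metric behind the retrieval gains cited above, here is a minimal sketch of how it is computed; the relevance grades in the example are illustrative, not taken from any of the studies:

```python
import math

def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k ranked relevance grades."""
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))

def ndcg_at_k(relevances, k):
    """nDCG@k: DCG of the given ranking divided by the DCG of the ideal ranking."""
    ideal_dcg = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal_dcg if ideal_dcg > 0 else 0.0

# Graded relevance of documents in the order a hypothetical retriever returned them
ranked = [3, 2, 0, 1, 2]
print(round(ndcg_at_k(ranked, 5), 4))  # → 0.9602
```

Because nDCG is normalized to [0, 1], a reported "+3.4 nDCG@5" gain corresponds to an absolute improvement of 0.034 on this scale (papers often report it in points).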