Model Evaluation

Performance Metrics, Benchmarking, Performance Analysis, Performance Benchmarking, Bias in AI, Factuality and Truthfulness, Data Annotation, Environmental Assessment, Benchmarking Performance, Explainability

3 ARTICLES

3w ago

Scale Forces RAG to Become a Retrieval‑First Architecture

Microsoft’s guidance reframes production RAG as a system engineering problem where index design and retrieval practices matter more than changing the LLM.

3 ARTICLES

3mo ago

MIT Index Finds AI Agents Proliferating With High Autonomy and Thin Safety Disclosures

6 ARTICLES

5mo ago

South Korea Advances LG, SK Telecom and Upstage in Sovereign AI Race as Naver and NC Are Cut

14 ARTICLES

9mo ago

Anthropic Reports Real-World Criminal Use of Claude, Tightens Safeguards

17 ARTICLES

this yr.

Reasoning-Enabled AI Systems Emit Up to 50 Times More CO₂ Than Concise Models