UltraDomain and LongBench
The authors report efficiency gains alongside accuracy improvements, though these claims rest on early arXiv preprints.