Particle.news

HI-MoE Proposes Two-Stage Expert Routing for Object Detection

Early COCO tests report small-object gains over a DINO baseline in an unreviewed draft.

Overview

  • HI-MoE introduces a DETR-style detector that routes in two steps: a scene router first picks a small pool of experts, then an instance router assigns each object query to a few experts within that pool.
  • The authors report higher accuracy than both a dense DINO baseline and simpler routing variants, with the largest gains on small objects in the COCO benchmark.
  • The current draft focuses its experiments on COCO, adds early specialization analysis on LVIS, and includes ablations and visualizations of how the experts specialize.
  • The paper frames this design around Mixture-of-Experts, in which a gating network activates only a subset of specialized subnetworks, cutting compute while keeping model capacity high.
  • Background coverage notes that MoE systems often use top-k routing, typically activating two experts per input, and add noise to the top-k scores to spread load across experts, with Mixtral cited as a real-world example of this approach.
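The two-step routing described above can be sketched in a few lines of NumPy. This is an illustrative toy, not the paper's implementation: the router names, linear-projection gates, and all dimensions (8 experts, a pool of 4, 2 experts per query) are assumptions made for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def top_k(scores, k):
    # Indices of the k highest scores along the last axis.
    return np.argsort(scores, axis=-1)[..., ::-1][..., :k]

# Illustrative dimensions (assumptions, not from the paper).
num_experts, pool_size, experts_per_query = 8, 4, 2
d_model, num_queries = 16, 5

# Stage 1: a scene router scores all experts from a global scene
# feature and keeps only a small pool of them.
scene_feat = rng.standard_normal(d_model)
W_scene = rng.standard_normal((d_model, num_experts))
pool = top_k(scene_feat @ W_scene, pool_size)        # expert ids in the pool

# Stage 2: an instance router assigns each object query to a few
# experts *within* that pool.
queries = rng.standard_normal((num_queries, d_model))
W_inst = rng.standard_normal((d_model, num_experts))
inst_scores = queries @ W_inst                       # (num_queries, num_experts)
pool_scores = inst_scores[:, pool]                   # restrict scoring to the pool
local = top_k(pool_scores, experts_per_query)        # positions within the pool
chosen = pool[local]                                 # per-query expert ids
weights = softmax(np.take_along_axis(pool_scores, local, axis=-1), axis=-1)
```

Each query then dispatches only to its `chosen` experts, combining their outputs with `weights`; the scene-level cut means the instance router never considers experts outside the pool.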
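The noisy top-k gating mentioned in the background can be sketched as follows, in the style popularized by Shazeer et al.; the weight names and shapes here are illustrative assumptions, not Mixtral's actual configuration.

```python
import numpy as np

rng = np.random.default_rng(1)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def noisy_top_k_gate(x, W_gate, W_noise, k=2):
    """Noisy top-k gating sketch: add learned, input-dependent noise
    to the gate logits before the top-k cut, which spreads load
    across experts during training."""
    clean = x @ W_gate
    noise_std = np.log1p(np.exp(x @ W_noise))        # softplus keeps the std positive
    noisy = clean + rng.standard_normal(clean.shape) * noise_std
    topk = np.argsort(noisy, axis=-1)[..., ::-1][..., :k]
    gates = softmax(np.take_along_axis(noisy, topk, axis=-1), axis=-1)
    return topk, gates

# Illustrative setup: 4 tokens routed over 8 experts, 2 experts each.
d_model, num_experts = 16, 8
tokens = rng.standard_normal((4, d_model))
W_gate = rng.standard_normal((d_model, num_experts))
W_noise = rng.standard_normal((d_model, num_experts))
experts, gates = noisy_top_k_gate(tokens, W_gate, W_noise, k=2)
```

With k=2 each input activates only two of the eight expert subnetworks, which is the compute saving the Mixture-of-Experts framing refers to.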