Overview
- Apple published the full corpus on GitHub under a research-only license that permits academic use but restricts commercial reuse.
- Seed images were selected from the OpenImages collection to cover humans, objects, and scenes with text.
- Edits were produced with Google’s Gemini-2.5-Flash-Image (Nano-Banana) and scored by Gemini-2.5-Pro for instruction compliance and quality.
- The dataset spans 35 edit types in eight categories, comprising 258,000 single-edit examples, 56,000 preference pairs, and 72,000 multi-turn sequences.
- Apple’s analysis reports roughly 93% success on global style changes, but success rates below 60% on fine-grained spatial edits and text modification.
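The three subsets above (single edits, preference pairs, multi-turn sequences) suggest a natural record layout. The sketch below is a minimal illustration of one way to model them; the class and field names are assumptions for clarity, not the dataset's published schema.

```python
from dataclasses import dataclass, field
from typing import List

# NOTE: all names below are illustrative assumptions, not the official schema.

@dataclass
class EditExample:
    """One single-edit example: a seed image, an instruction, and the result."""
    source_image: str        # path/URL of the OpenImages seed image
    instruction: str         # natural-language edit instruction
    edited_image: str        # path/URL of the generated edit
    edit_type: str           # one of the 35 edit types
    category: str            # one of the eight categories
    compliance_score: float  # judge rating for instruction compliance
    quality_score: float     # judge rating for visual quality

@dataclass
class PreferencePair:
    """Two candidate edits for the same prompt, ranked by the judge."""
    example: EditExample     # shared source image + instruction
    preferred: str           # edit judged better
    rejected: str            # edit judged worse

@dataclass
class MultiTurnSequence:
    """An ordered chain of edits applied to the same starting image."""
    turns: List[EditExample] = field(default_factory=list)
```

Separating the three record types keeps each subset independently loadable while letting preference pairs and multi-turn sequences reuse the single-edit structure.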