Overview
- Arc Institute and collaborators released Evo 2's code, weights, training resources and an interpretability visualizer, with integration into NVIDIA's BioNeMo to support broad use.
- The model was trained on more than 128,000 genomes and metagenomes, making it the largest fully open-source AI biology model reported to date.
- A new StripedHyena 2 architecture and months of training on over 2,000 NVIDIA H100 GPUs enable reasoning over sequences of up to one million nucleotides.
- In benchmarking, Evo 2 exceeded 90% accuracy in classifying BRCA1 variants and produced genome-scale designs, including M. genitalium–inspired sequences, human mitochondrial DNA and a yeast chromosome.
- Developers excluded human pathogens and added response safeguards, while external experts caution that AI-designed genomes still require extensive synthesis and validation before they can function as living systems.