Particle.news

Evo 2, an Open DNA Foundation Model, Debuts in Nature

Trained on 9.3 trillion nucleotides, the system models the genetic code across species and is openly available for research.

Overview

  • Arc Institute and collaborators released code, weights, training resources and an interpretability visualizer, with integration into NVIDIA's BioNeMo to support broad use.
  • The model was trained on more than 128,000 genomes and metagenomes, making it the largest fully open-source AI biology model reported to date.
  • A new StripedHyena 2 architecture and months of training on over 2,000 NVIDIA H100 GPUs enable reasoning over sequences up to one million nucleotides.
  • In benchmarking, Evo 2 exceeded 90% accuracy in classifying BRCA1 variants and produced genome-scale designs, including M. genitalium–inspired sequences, human mitochondrial DNA and a yeast chromosome.
  • Developers excluded human pathogens and added response safeguards, while external experts caution that AI-designed genomes still require extensive synthesis and validation before they can function as living systems.