Particle.news

Ant Releases LLaDA2.0, Open-Sourcing What It Claims Is the First 100B-Parameter Diffusion Language Model

Weights and training code are available on Hugging Face for independent benchmarking.

Overview

  • Ant Technology Research Institute introduced LLaDA2.0, a discrete-diffusion LLM family with two mixture-of-experts (MoE) variants: a 16B-parameter "mini" and a 100B-parameter "flash".
  • The 100B model is described by Ant as the industry’s first diffusion language model at that parameter scale.
  • Model weights and training code are open-sourced on Hugging Face to facilitate independent evaluation.
  • Training follows a Warmup-Stable-Decay (WSD) schedule to reuse knowledge from autoregressive pretraining, complemented by confidence-aware parallel training and a diffusion-form of direct preference optimization (DPO).
  • Ant reports roughly 2.1× faster inference from parallel decoding, along with strong results on structured generation such as code; these figures await external verification.
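The Warmup-Stable-Decay schedule mentioned above is a piecewise learning-rate curve: a linear ramp-up, a long constant plateau, then an anneal toward a floor. A minimal sketch follows; the function name, step counts, and rates are illustrative assumptions, not values disclosed by Ant.

```python
def wsd_lr(step, peak_lr=3e-4, min_lr=3e-5,
           warmup_steps=1000, stable_steps=8000, decay_steps=1000):
    """Piecewise Warmup-Stable-Decay schedule: linear warmup,
    constant plateau, then linear decay to a floor.
    All hyperparameters here are illustrative placeholders."""
    if step < warmup_steps:
        # Warmup phase: ramp linearly from 0 to peak_lr.
        return peak_lr * step / warmup_steps
    if step < warmup_steps + stable_steps:
        # Stable phase: hold the peak learning rate.
        return peak_lr
    # Decay phase: anneal linearly toward min_lr, then hold it.
    into_decay = min(step - warmup_steps - stable_steps, decay_steps)
    frac = into_decay / decay_steps
    return peak_lr + frac * (min_lr - peak_lr)
```

The long stable phase is what lets a run branch off an existing checkpoint and keep training at full rate before the final anneal, which is how such schedules are typically used to repurpose autoregressive pretraining.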
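The reported speedup comes from parallel decoding: unlike an autoregressive model, a masked diffusion LM can score every still-masked position at once and commit several of them per step. A toy sketch of the confidence-thresholded variant, with a stand-in `fake_model` in place of the real network (all names and the threshold are assumptions for illustration):

```python
import numpy as np

MASK = -1  # sentinel for a not-yet-decoded position

def fake_model(seq, rng):
    """Stand-in for the network: for each masked position, return a
    (token, confidence) pair. A real model would produce these from
    its output distribution over the vocabulary."""
    return {i: (int(rng.integers(0, 100)), float(rng.random()))
            for i, t in enumerate(seq) if t == MASK}

def parallel_decode(length, threshold=0.5, seed=0):
    """Fill a fully masked sequence, committing every position whose
    confidence clears the threshold in the same step."""
    rng = np.random.default_rng(seed)
    seq = [MASK] * length
    steps = 0
    while MASK in seq:
        preds = fake_model(seq, rng)
        confident = [i for i, (_, c) in preds.items() if c >= threshold]
        if not confident:
            # Always commit at least the single most confident
            # position so the loop is guaranteed to make progress.
            confident = [max(preds, key=lambda i: preds[i][1])]
        for i in confident:
            seq[i] = preds[i][0]
        steps += 1
    return seq, steps
```

Because many positions clear the threshold per iteration, the number of forward passes is well below the sequence length, which is the mechanism behind the claimed ~2.1× inference speedup.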