Overview
- Bulbul V3, Sarvam's new text‑to‑speech model, offers 35+ voices across 11 Indian languages, with low‑latency streaming and consent‑based voice cloning for production use.
- A third‑party blind A/B listening study across 11 languages favored Bulbul V3 on 8 kHz telephony tests and found low rates of word skips and mispronunciations.
- Sarvam Vision, a 3B‑parameter vision‑language/OCR model, scored 84.3% on olmOCR‑Bench and 93.28% on OmniDocBench v1.5, with results reported as ahead of Gemini 3 Pro and recent OCR systems on India‑focused tasks.
- To spur adoption, Sarvam is offering unlimited API access to Bulbul V3 through February 28 and free use of its Document Intelligence API during February.
- The Bengaluru startup, one of 12 entities backed under the India AI Mission, plans to showcase its models at the India‑AI Impact Summit on February 16–20 and has drawn praise from officials including Ashwini Vaishnaw and Amitabh Kant.