Particle.news

Alibaba Launches Qwen3.5-Omni Multimodal AI With Audio and Video Claims

The launch underscores Alibaba's bid for leadership in real-time, multimodal AI.

Overview

  • The system accepts text, images, audio and video in a single model and generates fine-grained, time-stamped captions that turn long clips into searchable notes.
  • Alibaba says the Omni-Plus variant sets 215 state-of-the-art results in audio and video tasks and beats Gemini 3.1 Pro on many audio measures.
  • Live voice conversation gains interruption handling, voice cloning and voice control, which keep real-time dialogue smooth.
  • The model includes web search and function calling so it can pull live information and use tools to complete tasks.
  • Developers can access the API on Alibaba Cloud’s Bailian in Plus, Flash and Light sizes, and a separate Qwen 3.6 Plus preview is now available on OpenRouter.