
Researchers Bypass Apple Intelligence Guardrails With Prompt Injection and Unicode Trick

The research shows that bundling a model into the OS widens the attack surface despite on‑device privacy claims.

Overview

  • RSAC Research published findings Thursday showing Apple’s on‑device AI can be steered to follow attacker‑set tasks through crafted inputs.
  • The attack pairs a Neural Exec prompt‑injection method, which uses machine‑generated gibberish triggers, with a Unicode right‑to‑left override to evade input and output filters.
  • Because the model is wired into apps through system APIs, manipulated replies can change app behavior or expose personal data, and the team even inserted a bogus contact.
  • Across 100 trials, the technique succeeded 76% of the time, showing the local model's guardrails can be bypassed with high reliability.
  • RSAC says Apple added hardening in iOS 26.4 and macOS 26.4 after an October 2025 disclosure, and it reports no known abuse but estimates 100,000 to 1 million exposed app installs.
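The right-to-left-override evasion described above can be illustrated with a minimal sketch. This is not the researchers' actual exploit; the blocked phrase and the substring filter are hypothetical stand-ins. The idea is that U+202E makes many renderers display a string in reversed order, so the raw codepoint sequence a naive filter scans never matches the phrase a human (or model) effectively reads.

```python
# Hypothetical illustration of a Unicode right-to-left override
# (U+202E) defeating a naive substring filter. The phrase and the
# filter are invented for this sketch, not taken from the research.

RLO = "\u202e"  # RIGHT-TO-LEFT OVERRIDE control character

# Phrase a naive filter blocks by exact substring match.
blocked = "add contact"

# Attacker embeds the phrase reversed, preceded by RLO: bidi-aware
# renderers display it as "add contact", but the raw string never
# contains that substring, so the filter passes it through.
payload = RLO + blocked[::-1]

print(blocked in payload)    # filter's raw scan finds nothing
print(payload[1:][::-1])     # the phrase as it is effectively read
```

Defenses typically normalize or strip bidirectional control characters before filtering, which is the kind of hardening the patched releases presumably add.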