
Researchers Bypass Apple Intelligence Guardrails With Prompt Injection and Unicode Trick

The research shows that bundling a model into the OS widens the attack surface despite on‑device privacy claims.

Overview

  • RSAC Research published findings Thursday showing Apple’s on‑device AI can be steered to follow attacker‑set tasks through crafted inputs.
  • The attack pairs a Neural Exec prompt‑injection method, which uses machine‑generated gibberish triggers, with a Unicode right‑to‑left override to evade input and output filters.
  • Because the model is wired into apps through system APIs, manipulated replies can change app behavior or expose personal data, and the team even inserted a bogus contact.
  • Across 100 trials, the technique succeeded 76% of the time, showing the local model's guardrails can be bypassed with high reliability.
  • RSAC says Apple added hardening in iOS 26.4 and macOS 26.4 after an October 2025 disclosure, and it reports no known abuse but estimates 100,000 to 1 million exposed app installs.
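The right-to-left-override evasion described above can be illustrated with a minimal sketch. This is not the researchers' actual exploit; the blocked phrase and the substring filter are hypothetical stand-ins. The idea is that U+202E makes many renderers display a string in reversed order, so the raw codepoint sequence a naive filter scans never matches the phrase a human (or model) effectively reads.

```python
# Hypothetical illustration of a Unicode right-to-left override
# (U+202E) defeating a naive substring filter. The phrase and the
# filter are invented for this sketch, not taken from the research.

RLO = "\u202e"  # RIGHT-TO-LEFT OVERRIDE control character

# Phrase a naive filter blocks by exact substring match.
blocked = "add contact"

# Attacker embeds the phrase reversed, preceded by RLO: bidi-aware
# renderers display it as "add contact", but the raw string never
# contains that substring, so the filter passes it through.
payload = RLO + blocked[::-1]

print(blocked in payload)    # filter's raw scan finds nothing
print(payload[1:][::-1])     # the phrase as it is effectively read
```

Defenses typically normalize or strip bidirectional control characters before filtering, which is the kind of hardening the patched releases presumably add.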