Overview
- Publishing new guidance on December 8, the NCSC warned that prompt injection may never be fully mitigated because today’s large language models do not separate instructions from data.
- Officials cautioned against treating the issue like SQL injection, noting that unlike parameterized queries in databases, LLMs lack a true boundary that would allow similar technical fixes.
- The advisory frames LLMs as “inherently confusable” and recommends risk reduction through least‑privilege access, constrained tool use, secure design choices and rigorous logging and monitoring.
- Researchers have demonstrated practical exploit paths, including malicious prompts embedded in GitHub commits and pull requests, manipulation of AI browser agents and websites that feed different content to AI crawlers.
- Vendors acknowledge limitations, with OpenAI describing partial progress on hallucinations and Anthropic relying on external monitoring, while the NCSC warns of a potential wave of breaches as AI is embedded into more applications.