Researchers claim breakthrough in fight against AI’s frustrating security hole

April 23, 2025

Since 2022, prompt injection, a vulnerability in which malicious instructions override an AI system's intended behavior, has plagued large language models (LLMs), with no reliable solution in sight. Now Google DeepMind has introduced CaMeL (CApabilities for MachinE Learning), a novel approach that moves away from asking AI models to police themselves. Instead, CaMeL treats LLMs as untrusted components inside a secure software framework, applying established security principles such as Control Flow Integrity, Access Control, and Information Flow Control.

Prompt injections occur because LLMs cannot distinguish trusted user commands from malicious content in their context window, enabling exploits like misdirected emails or unauthorized actions. CaMeL addresses this with a dual-LLM architecture: a privileged LLM (P-LLM) generates code from the user's instructions, while a quarantined LLM (Q-LLM) parses untrusted data but has no execution privileges. This separation means malicious content cannot directly trigger actions. CaMeL converts each prompt into a restricted Python program whose execution is monitored by a custom interpreter that tracks the provenance of every value and enforces security policies, much like tracing contaminated water through a plumbing system.
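The provenance-tracking idea can be sketched in a few lines of Python. This is not CaMeL's actual implementation; the `Tagged` wrapper, `send_email` tool, and policy function are hypothetical, but they illustrate how an interpreter can tag every value with its data sources and refuse to run a sensitive action on untrusted data:

```python
from dataclasses import dataclass

# Hypothetical sketch: each value carries tags recording where it came from.
@dataclass(frozen=True)
class Tagged:
    value: str
    sources: frozenset  # provenance labels, e.g. {"user"} or {"untrusted_email"}

def send_email(recipient: Tagged, body: Tagged, policy_allows) -> str:
    """A tool call guarded by a security policy over data provenance."""
    if not policy_allows(recipient.sources | body.sources):
        raise PermissionError("policy: untrusted data cannot reach this action")
    return f"sent to {recipient.value}"

# Example policy: only act on data derived purely from the trusted user prompt.
policy = lambda sources: sources <= {"user"}

user_addr = Tagged("bob@example.com", frozenset({"user"}))
note = Tagged("Lunch at noon?", frozenset({"user"}))
print(send_email(user_addr, note, policy))  # allowed: all sources trusted

# An address the Q-LLM extracted from an attacker's email is tagged untrusted,
# so the interpreter blocks the action before any data leaves the system.
attacker_addr = Tagged("evil@attacker.test", frozenset({"untrusted_email"}))
try:
    send_email(attacker_addr, note, policy)
except PermissionError as err:
    print(err)
```

Because the check runs in ordinary code rather than inside a model, no cleverly worded injection can talk its way past it; the attack fails at the interpreter, not at the LLM.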

Tested on the AgentDojo benchmark, CaMeL resisted previously unsolvable attacks and showed potential to mitigate insider threats and data exfiltration. However, it requires users to define and maintain security policies, which could lead to user fatigue and approval complacency. While not perfect, CaMeL’s principled approach marks a significant step toward secure AI assistants, with hopes for future refinement to balance security and usability.
