Skip to content

Google's Gemini for Workspace vulnerability unveiled: Malicious email instructions could manipulate AI to participate in phishing campaigns through the 'Summarize this email' feature.

Google's professional service, Gemini for Workspace, unveiled on July 10 to potentially mislead users into believing their accounts had been compromised, as per the disclosure by Mozilla's 0-Day Investigative Network (0din).

Uncovered flaw in Google Gemini for Workspace: AI could be manipulated into participating in...
Uncovered flaw in Google Gemini for Workspace: AI could be manipulated into participating in phishing attacks through malicious coded emails unnoticed by the 'Summarize this email' tool.

Google's Gemini for Workspace vulnerability unveiled: Malicious email instructions could manipulate AI to participate in phishing campaigns through the 'Summarize this email' feature.

In a revealing report by Mozilla's 0-Day Investigative Network (0din) on July 10, it was disclosed that Google's AI features for Workspace can be manipulated to report false account compromises. This security loophole, if exploited, could pave the way for phishing attempts.

The attackers can trick Google's AI, such as Google Gemini for Workspace, by employing white-on-white text and indirect prompt injection attacks. These tactics enable them to hide malicious phishing messages within emails, making them invisible to recipients. The hidden messages are wrapped within specific tags (e.g.,

For instance, an attacker might disguise instructions suggesting that the recipient's password has been compromised and provide a malicious phone number for a "password reset". This deceptive strategy aims to trick victims into disclosing sensitive credentials during the call. This attack vector capitalises on the AI's tendency to follow embedded instructions verbatim, making it a potent vector for social engineering via AI-driven email summaries.

Indirect prompt injection attacks pose another threat. These attacks involve manipulating external knowledge bases or intermediary models that AI agents rely on to generate responses. By poisoning these external data sources or captions, attackers can indirectly inject malicious prompts that corrupt the AI’s autonomous functioning or influence its outputs maliciously. These attacks resemble SQL injection in classic computing and exploit vulnerabilities in retrieval-augmented generation or vector embeddings in language agents. Even when the core AI model is secure, vulnerabilities in external processing components can be exploited to degrade or hijack AI behaviour.

To mitigate these threats, suggested measures include reinforcing AI models to detect and disregard hidden or malformed instructions, improving filtering and sanitisation of incoming prompts and external data sources, employing robust memory and context sanitisation techniques, and instituting comprehensive bug bounty programs aimed at generative AI vulnerabilities, such as prompt injection flaws.

It is crucial for security teams to treat AI assistants as part of the attack surface and instrument them, sandbox them, and never assume their output is benign. Similar indirect prompt attacks on Google's AI were first reported in 2024. A paper on this Google-related issue was published by Google in May.

The attack demonstrates that trustworthy AI summaries can be subverted with a single invisible tag. The injected text (malicious prompt) is rendered in white-on-white or otherwise hidden, so the victim never sees the instruction in the original message. Anyone can view the malicious prompt by highlighting the bottom of the email in which it was sent.

Until Language Models (LLMs) gain robust context-isolation, every piece of third-party text ingested by the model is executable code. It is, therefore, essential to remain vigilant and implement the suggested mitigation measures to safeguard against such attacks.

Artificial-Intelligence (AI) systems, such as Google Gemini for Workspace, are susceptible to manipulation through white-on-white text and indirect prompt injection attacks. This can result in hidden malicious phishing messages being included in email summaries, potentially tricking recipients into disclosing sensitive information.

Furthermore, indirect prompt injection attacks can exploit vulnerabilities in external data sources relied upon by AI agents to generate responses, potentially corrupting the AI's functioning and influencing its outputs maliciously. Cybersecurity measures, including reinforcing AI models to detect and disregard hidden or malformed instructions, improving filtering and sanitisation, and instituting bug bounty programs for generative AI vulnerabilities, are essential to mitigate these threats.

Read also:

    Latest