Close Menu
  • Home
  • AI
  • Big Data
  • Cloud Computing
  • iOS Development
  • IoT
  • IT/ Cybersecurity
  • Tech
    • Nanotechnology
    • Green Technology
    • Apple
    • Software Development
    • Software Engineering

Subscribe to Updates

Get the latest technology news from Bigteetechhub about IT, Cybersecurity and Big Data.

    What's Hot

    Santa Claus doesn’t exist (according to AI) • Graham Cluley

    December 28, 2025

    ios – Background Assets Framework server connection problem

    December 27, 2025

    FaZe Clan’s future is uncertain after influencers depart

    December 27, 2025
    Facebook X (Twitter) Instagram
    Facebook X (Twitter) Instagram
    Big Tee Tech Hub
    • Home
    • AI
    • Big Data
    • Cloud Computing
    • iOS Development
    • IoT
    • IT/ Cybersecurity
    • Tech
      • Nanotechnology
      • Green Technology
      • Apple
      • Software Development
      • Software Engineering
    Big Tee Tech Hub
    Home»IT/ Cybersecurity»Mitigating prompt injection attacks with a layered defense strategy
    IT/ Cybersecurity

    Mitigating prompt injection attacks with a layered defense strategy

    big tee tech hubBy big tee tech hubJune 15, 2025017 Mins Read
    Share Facebook Twitter Pinterest Copy Link LinkedIn Tumblr Email Telegram WhatsApp
    Follow Us
    Google News Flipboard
    Mitigating prompt injection attacks with a layered defense strategy
    Share
    Facebook Twitter LinkedIn Pinterest Email Copy Link


    With the rapid adoption of generative AI, a new wave of threats is emerging across the industry with the aim of manipulating the AI systems themselves. One such emerging attack vector is indirect prompt injections. Unlike direct prompt injections, where an attacker directly inputs malicious commands into a prompt, indirect prompt injections involve hidden malicious instructions within external data sources. These may include emails, documents, or calendar invites that instruct AI to exfiltrate user data or execute other rogue actions. As more governments, businesses, and individuals adopt generative AI to get more done, this subtle yet potentially potent attack becomes increasingly pertinent across the industry, demanding immediate attention and robust security measures.

    At Google, our teams have a longstanding precedent of investing in a defense-in-depth strategy, including robust evaluation, threat analysis, AI security best practices, AI red-teaming, adversarial training, and model hardening for generative AI tools. This approach enables safer adoption of Gemini in Google Workspace and the Gemini app (we refer to both in this blog as “Gemini” for simplicity). Below we describe our prompt injection mitigation product strategy based on extensive research, development, and deployment of improved security mitigations.

    A layered security approach

    Google has taken a layered security approach introducing security measures designed for each stage of the prompt lifecycle. From Gemini 2.5 model hardening, to purpose-built machine learning (ML) models detecting malicious instructions, to system-level safeguards, we are meaningfully elevating the difficulty, expense, and complexity faced by an attacker. This approach compels adversaries to resort to methods that are either more easily identified or demand greater resources. 

    Our model training with adversarial data significantly enhanced our defenses against indirect prompt injection attacks in Gemini 2.5 models (technical details). This inherent model resilience is augmented with additional defenses that we built directly into Gemini, including: 

    1. Prompt injection content classifiers

    2. Security thought reinforcement

    3. Markdown sanitization and suspicious URL redaction

    4. User confirmation framework

    5. End-user security mitigation notifications

    This layered approach to our security strategy strengthens the overall security framework for Gemini – throughout the prompt lifecycle and across diverse attack techniques.

    1. Prompt injection content classifiers

    Through collaboration with leading AI security researchers via Google’s AI Vulnerability Reward Program (VRP), we’ve curated one of the world’s most advanced catalogs of generative AI vulnerabilities and adversarial data. Utilizing this resource, we built and are in the process of rolling out proprietary machine learning models that can detect malicious prompts and instructions within various formats, such as emails and files, drawing from real-world examples. Consequently, when users query Workspace data with Gemini, the content classifiers filter out harmful data containing malicious instructions, helping to ensure a secure end-to-end user experience by retaining only safe content. For example, if a user receives an email in Gmail that includes malicious instructions, our content classifiers help to detect and disregard malicious instructions, then generate a safe response for the user. This is in addition to built-in defenses in Gmail that automatically block more than 99.9% of spam, phishing attempts, and malware.

    AD 4nXePgxMV37fyKPlpp54xFHVtYLigC9Tt32y2B8dOEjRDzvL2xiwN9vx0 oNwPO19CMMNGHovtXa4dLS97sZSJICODbOrCyvKIOivcR oJzTwM2SFG8DdFIoZPzKOvrV05G7 0fMSQw?key=qBd QKEP GpIw65Q IOEgQ

    A diagram of Gemini’s actions based on the detection of the malicious instructions by content classifiers.

    2. Security thought reinforcement

    This technique adds targeted security instructions surrounding the prompt content to remind the large language model (LLM) to perform the user-directed task and ignore any adversarial instructions that could be present in the content. With this approach, we steer the LLM to stay focused on the task and ignore harmful or malicious requests added by a threat actor to execute indirect prompt injection attacks.

    AD 4nXfSfilVHvpF2aOIIc5I2S0mobMGoRwXb85XXRdyCq j2ylJ30oAaY7KrIYtpcHWs21MxkrOtT1 zpVy R4CSk3owlx896fckltob Fc7t7K2TYk7XZjVMWDCY O8RE4l9QbXxKK5w?key=qBd QKEP GpIw65Q IOEgQ

    A diagram of Gemini’s actions based on additional protection provided by the security thought reinforcement technique. 

    3. Markdown sanitization and suspicious URL redaction 

    Our markdown sanitizer identifies external image URLs and will not render them, making the “EchoLeak” 0-click image rendering exfiltration vulnerability not applicable to Gemini. From there, a key protection against prompt injection and data exfiltration attacks occurs at the URL level. With external data containing dynamic URLs, users may encounter unknown risks as these URLs may be designed for indirect prompt injections and data exfiltration attacks. Malicious instructions executed on a user’s behalf may also generate harmful URLs. With Gemini, our defense system includes suspicious URL detection based on Google Safe Browsing to differentiate between safe and unsafe links, providing a secure experience by helping to prevent URL-based attacks. For example, if a document contains malicious URLs and a user is summarizing the content with Gemini, the suspicious URLs will be redacted in Gemini’s response. 

    AD 4nXeOQ69HwjpDYi9jCb 8alM2rBNC09Kx124kotYR8xzNSXygaNI 2eMBVdMYiBJ aH 9tGFuXXphLbCg g74nzI9VsEuSxpLrBGuwkjH2vt2j WAgJz0dyD1EJt6qGugfBMtTHjPnqU?key=qBd QKEP GpIw65Q IOEgQ

    Gemini in Gmail provides a summary of an email thread. In the summary, there is an unsafe URL. That URL is redacted in the response and is replaced with the text “suspicious link removed”. 

    4. User confirmation framework

    Gemini also features a contextual user confirmation system. This framework enables Gemini to require user confirmation for certain actions, also known as “Human-In-The-Loop” (HITL), using these responses to bolster security and streamline the user experience. For example, potentially risky operations like deleting a calendar event may trigger an explicit user confirmation request, thereby helping to prevent undetected or immediate execution of the operation.

    AD 4nXegtOKYti58PeD2EGB9 tpYShPjKYJdHcv7I5u4s9WL0ZkzZy2P95ETc RkdU1YUZWtNat9 WhfF5aekjmUiKxLNlV5yUYXUsTiXXj RjmPd2Z50u4cUuwhFndXU5GslBudLDkGspk?key=qBd QKEP GpIw65Q IOEgQ

    The Gemini app with instructions to delete all events on Saturday. Gemini responds with the events found on Google Calendar and asks the user to confirm this action.

    5. End-user security mitigation notifications

    A key aspect to keeping our users safe is sharing details on attacks that we’ve stopped so users can watch out for similar attacks in the future. To that end, when security issues are mitigated with our built-in defenses, end users are provided with contextual information allowing them to learn more via dedicated help center articles. For example, if Gemini summarizes a file containing malicious instructions and one of Google’s prompt injection defenses mitigates the situation, a security notification with a “Learn more” link will be displayed for the user. Users are encouraged to become more familiar with our prompt injection defenses by reading the Help Center article. 

    AD 4nXfW5ATEo44XQK2kcE3wjUkVhF719xvF rNzKu 1rQNwWuToygqGObc1dlJGFHobh1 94FMUbpJKWeng3giTTfse17sZHnWnm0udVBafKOKmdzstST24BJnkALrWGyCz4qD08t Q w?key=qBd QKEP GpIw65Q IOEgQ

    Gemini in Docs with instructions to provide a summary of a file. Suspicious content was detected and a response was not provided. There is a yellow security notification banner for the user and a statement that Gemini’s response has been removed, with a “Learn more” link to a relevant Help Center article.

    Moving forward

    Our comprehensive prompt injection security strategy strengthens the overall security framework for Gemini. Beyond the techniques described above, it also involves rigorous testing through manual and automated red teams, generative AI security BugSWAT events, strong security standards like our Secure AI Framework (SAIF), and partnerships with both external researchers via the Google AI Vulnerability Reward Program (VRP) and industry peers via the Coalition for Secure AI (CoSAI). Our commitment to trust includes collaboration with the security community to responsibly disclose AI security vulnerabilities, share our latest threat intelligence on ways we see bad actors trying to leverage AI, and offering insights into our work to build stronger prompt injection defenses. 

    Working closely with industry partners is crucial to building stronger protections for all of our users. To that end, we’re fortunate to have strong collaborative partnerships with numerous researchers, such as Ben Nassi (Confidentiality), Stav Cohen (Technion), and Or Yair (SafeBreach), as well as other AI Security researchers participating in our BugSWAT events and AI VRP program. We appreciate the work of these researchers and others in the community to help us red team and refine our defenses.

    We continue working to make upcoming Gemini models inherently more resilient and add additional prompt injection defenses directly into Gemini later this year. To learn more about Google’s progress and research on generative AI threat actors, attack techniques, and vulnerabilities, take a look at the following resources:



    Source link

    Attacks Defense Injection layered Mitigating prompt Strategy
    Follow on Google News Follow on Flipboard
    Share. Facebook Twitter Pinterest LinkedIn Tumblr Email Copy Link
    tonirufai
    big tee tech hub
    • Website

    Related Posts

    Santa Claus doesn’t exist (according to AI) • Graham Cluley

    December 28, 2025

    Architecting Security for Agentic Capabilities in Chrome

    December 27, 2025

    Trust Wallet confirms extension hack led to $7 million crypto theft

    December 26, 2025
    Add A Comment
    Leave A Reply Cancel Reply

    Editors Picks

    Santa Claus doesn’t exist (according to AI) • Graham Cluley

    December 28, 2025

    ios – Background Assets Framework server connection problem

    December 27, 2025

    FaZe Clan’s future is uncertain after influencers depart

    December 27, 2025

    Airbus prepares tender for European sovereign cloud

    December 27, 2025
    About Us
    About Us

    Welcome To big tee tech hub. Big tee tech hub is a Professional seo tools Platform. Here we will provide you only interesting content, which you will like very much. We’re dedicated to providing you the best of seo tools, with a focus on dependability and tools. We’re working to turn our passion for seo tools into a booming online website. We hope you enjoy our seo tools as much as we enjoy offering them to you.

    Don't Miss!

    Santa Claus doesn’t exist (according to AI) • Graham Cluley

    December 28, 2025

    ios – Background Assets Framework server connection problem

    December 27, 2025

    Subscribe to Updates

    Get the latest technology news from Bigteetechhub about IT, Cybersecurity and Big Data.

      • About Us
      • Contact Us
      • Disclaimer
      • Privacy Policy
      • Terms and Conditions
      © 2025 bigteetechhub.All Right Reserved

      Type above and press Enter to search. Press Esc to cancel.