Home > Security and Compliance Blog > Cybersecurity Risk Management > Six AI Vulnerabilities. Three Failure Patterns. Most Organizations Are Fixing the Wrong One.

Six AI Vulnerabilities. Three Failure Patterns. Most Organizations Are Fixing the Wrong One.

by Patrick Spencer updated April 15, 2026 Cybersecurity Risk Management

Reading Time: 10 minutes

The Wave That Redefined AI Security Risk

Between June 2025 and April 2026, security researchers disclosed six critical AI vulnerabilities across platforms that most enterprises rely on daily. Taken individually, each disclosure prompted a patch and a news cycle. Taken together, they constitute the most significant body of evidence the industry has for a structural shift in how enterprise data gets stolen.

Key Takeaways

Six critical AI vulnerabilities disclosed in under a year. EchoLeak, Reprompt, GeminiJack, ForcedLeak, GrafanaGhost, and the OpenAI plugin supply chain attack targeted Microsoft Copilot, Salesforce, Google Gemini, Grafana, and the OpenAI ecosystem between mid-2025 and April 2026.
The industry is treating them as one problem — they are three. These six disclosures contain three distinct failure patterns: untrusted input processed by AI without validation, overly broad data access without per-operation enforcement, and back-end processes with functional scope they were never designed to use
Untrusted input is the most consistent failure. Every vulnerability in the series begins with external data entering a system through a legitimate channel and being processed by AI without validation. This is the pattern the industry is largely ignoring.
GrafanaGhost is architecturally different from the other five. Grafana has RBAC on user-facing data access. The attack never triggered it — because it operated through system-level back-end processes, not user sessions. Data access controls address the other five. They do not address GrafanaGhost.
Model-level guardrails failed in every case. Grafana’s defenses fell to one keyword. Salesforce’s CSP was bypassed for five dollars. Guardrails are configuration settings inside the system being attacked — they supplement real controls but substitute for none of them.

EchoLeak in Microsoft 365 Copilot was the first formally recognized zero-click AI vulnerability — CVSS 9.3, patched June 2025. ForcedLeak in Salesforce Agentforce followed in September 2025 — CVSS 9.4, exploitable with a five-dollar domain purchase. GeminiJack in Google Gemini Enterprise was a true zero-click attack that could exfiltrate years of Workspace data from a single poisoned document. Reprompt demonstrated single-click Copilot exfiltration through a crafted URL. GrafanaGhost turned trusted back-end processes into an invisible data courier. And a supply chain attack on the OpenAI plugin ecosystem ran undetected for six months across 47 enterprises using harvested agent credentials.

Table of Contents

Every vendor responded responsibly. Every platform was patched. And every attack exploited architectural gaps that patching individual platforms does not close.

The CrowdStrike 2026 Global Threat Report found that 82% of detections in 2025 were malware-free — adversaries are already operating through legitimate tools. These six vulnerabilities take that trend to its endpoint: The AI is the legitimate tool, the trusted data access channel is the exfiltration path, and the monitoring stack sees nothing unusual.

Six Vulnerabilities at a Glance

Vulnerability	Platform	Disclosed	How It Worked	Data at Risk
EchoLeak (CVE-2025-32711)	Microsoft 365 Copilot	June 2025	Crafted email ingested as Copilot context; data exfiltrated via image tag through trusted Microsoft domains	OneDrive, SharePoint, Teams — all content Copilot can access
ForcedLeak (CVSS 9.4)	Salesforce Agentforce	September 2025	Prompt injection in 42,000-char Web-to-Lead form field; exfiltration via PNG to $5 expired allowlisted domain	CRM records, lead data, attached documents
GeminiJack	Google Gemini Enterprise	December 2025	Poisoned Google Doc indexed by RAG; zero-click sweep across Gmail, Docs, Calendar	Years of Workspace data — email, documents, calendar, API keys
Reprompt (CVE-2026-24307)	Microsoft Copilot	January 2026	Prompt injection embedded in URL parameter; single-click exfiltration	Same as EchoLeak — OneDrive, SharePoint, Teams
GrafanaGhost	Grafana AI Components	April 2026	Prompts hidden in URL query parameters stored in event logs; back-end enrichment process with system-level privileges executed hidden instructions	Financial metrics, infrastructure telemetry, customer records
OpenAI Plugin Attack	OpenAI Plugin Ecosystem	2026	Compromised plugin harvested agent credentials; six months of access across 47 enterprises	Customer data, financial records, proprietary code

Pattern One: Untrusted Input Processed as Trusted AI Context

Every vulnerability in this series begins the same way. External data enters a system through a legitimate channel — an email, a shared document, a web form submission, URL query parameters, a compromised plugin — and an AI component later processes it without treating it as adversarial.

EchoLeak’s payload was a crafted email that Copilot ingested as context during a routine query. The user never opened it. GeminiJack’s was a poisoned Google Doc shared with anyone in the target organization, indexed by Gemini’s RAG system, and lying dormant until any employee’s search triggered it. ForcedLeak’s was text hidden in a 42,000-character Web-to-Lead form field — the AI could not distinguish the form data from the injected instructions. GrafanaGhost’s was URL query parameters stored in Grafana’s event monitoring logs — external web requests logged as routine traffic, later processed by AI-enabled back-end enrichment processes.

The principle that external input must be validated before any system processes it is foundational to web application security. Organizations build WAFs around it. Developers are trained on it. Nobody applied it to AI-processed data — because nobody thought of emails, shared documents, event logs, and form fields as input channels for AI prompt injection.

The Cyera 2025 State of AI Data Security Report found that 83% of enterprises already use AI in daily operations, but only 13% have strong visibility into how AI accesses their data. That 70-point gap is the attack surface these vulnerabilities exploit. The AI processes data from dozens of sources. Nobody is validating those sources for adversarial AI instructions.

This is the most consistent failure across all six vulnerabilities, and it is the one the industry is largely ignoring. Data access controls do not address it. Model-level guardrails do not address it. It requires input validation discipline extended to every data source AI touches.

Pattern Two: Overly Broad AI Data Access Without Per-Operation Enforcement

Five of the six vulnerabilities — EchoLeak, Reprompt, GeminiJack, ForcedLeak, and the OpenAI plugin attack — involve AI systems operating on behalf of a user with broad, implicit data access and no per-operation policy enforcement.

Microsoft 365 Copilot has pre-configured access to OneDrive, SharePoint, and Teams — the full productivity suite. Google Gemini Enterprise’s RAG has native access across Gmail, Docs, and Calendar. Salesforce Agentforce can query the entire CRM. In each case, the AI authenticated once at session or connection level, then accessed whatever it could reach. When injected instructions executed, the AI retrieved data far beyond what any user intended — and nothing evaluated each individual retrieval against policy.

The OpenAI plugin attack is a variation on this pattern: Compromised credentials operated as the agent’s identity, granting broad access across 47 environments for six months. The credentials were valid. The access looked normal. Nothing constrained what those credentials could do on each operation.

Per-operation access control — authenticating each request independently, evaluating attribute-based policy on every operation, isolating credentials from the AI’s accessible context, and logging every access with complete attribution — would have constrained the blast radius in each of these five cases.

The Kiteworks Data Security and Compliance Risk: 2026 Forecast Report found a 15–20 point gap between governance controls (monitoring, logging, human-in-the-loop) and containment controls (purpose binding, kill switches, network isolation). The per-operation enforcement gap is real and urgent — for the five vulnerabilities where it applies.

Pattern Three: Process Containment and Functional Scoping Failures

GrafanaGhost is architecturally different from the other five, and treating it as another data access control problem misreads the vulnerability.

Grafana has RBAC on user-facing data access. GrafanaGhost never triggered it. The attack never operated on behalf of any user. Instead, it operated through trusted back-end enrichment processes running with system-level privileges — processes designed to correlate, analyze, and prepare event data for dashboards.

When the enrichment process analyzed the attacker’s event (containing the hidden AI prompt in URL query parameters), the AI component executed the instructions within the process’s privileged context. It built a dashboard nobody requested, embedded sensitive data in image tags, and made them externally accessible. Noma’s researchers found that the keyword “INTENT” collapsed the AI’s guardrails entirely. A URL validation flaw disguised an external server as internal.

The back-end process needed broad data read access. That is defensible. What it did not need was the ability to call routines that render dashboards, generate image tags, or make outbound requests to external servers. Those are output capabilities the process was never designed to use — but nobody actively prevented it from accessing them.

The OpenAI plugin attack shares a pattern-three element: Agent credentials were stored where compromised plugin code could access them, because authentication tokens were not isolated from the AI’s accessible context.

Least privilege must apply to functional scope — which APIs, rendering routines, and output channels a process can invoke — and to credential storage, not just to data access. This is the containment gap, and it requires a different architectural response than data access governance.

Three Failure Patterns Mapped to Controls

Pattern	Vulnerabilities	Primary Failure	What Addresses It	What Does NOT Address It
1. Untrusted input as trusted AI context	All six	External data processed by AI without validation	Input validation for AI-processed data; zero-trust treatment of all data sources AI consumes	Data access controls (RBAC/ABAC); model-level guardrails
2. Overly broad AI data access	EchoLeak, Reprompt, GeminiJack, ForcedLeak, OpenAI plugin	AI authenticates once, then accesses everything in scope	Per-operation authentication; ABAC on every request; credential isolation; audit trails	Input validation (pattern 1); process containment (pattern 3)
3. Process containment / credential isolation	GrafanaGhost, OpenAI plugin	Back-end process with excessive functional scope; credentials accessible to compromised code	Functional scoping (least privilege on capabilities, not just data); credential isolation in OS keystore	Data access controls (wrong layer for GrafanaGhost); input validation alone

Model-Level Guardrails Failed Across the Board — But That Is the Symptom

Grafana’s AI guardrails were defeated by a single keyword. Salesforce’s Content Security Policy was bypassed with a five-dollar expired domain. Google Gemini’s RAG could not distinguish a poisoned document from a legitimate one. Microsoft Copilot’s safety features could not prevent a crafted email — or a crafted URL — from hijacking its context window.

Model-level guardrails are configuration settings inside the system being attacked. They can be overridden by prompt injection, circumvented by targeting trust boundaries, or neutralized by manipulating the context the AI processes. Every major LLM has been jailbroken at near-perfect success rates in controlled research. The Agents of Chaos study from February 2026 — conducted by 20 researchers from MIT, Harvard, Stanford, CMU, and others — documented AI agents destroying infrastructure, disclosing PII databases, and accepting identity spoofing in live environments.

Guardrails are a useful defense layer. They supplement real controls. They substitute for none of them. No regulator, auditor, or forensic investigator will accept “our model was instructed not to” as evidence of access control, input validation, or process containment.

How Kiteworks Addresses the Data Access Pattern — and Where the Challenge Extends Beyond It

Kiteworks provides a governed data layer between AI systems and enterprise data repositories through its Secure MCP Server and AI Data Gateway. Every AI data request — whether from an interactive assistant through MCP or a RAG pipeline through the API — is authenticated via OAuth 2.0 with credentials stored in the OS keychain (never exposed to the AI model), evaluated against RBAC and ABAC policies in real time on every operation, rate-limited to prevent bulk extraction, and logged in a tamper-evident audit trail fed to SIEM with complete attribution.

These controls directly address pattern two — the five vulnerabilities where AI systems operate on behalf of a user with broad implicit access and no per-operation enforcement. Per-operation ABAC constrains what the AI can access on each request. Credential isolation prevents harvesting. Audit trails enable detection and satisfy compliance requirements.

For pattern one (untrusted input), the principle Kiteworks implements — controls that operate independently of the AI model and outside the AI’s accessible context — extends to the input validation challenge. But validating whether content entering Salesforce forms, Google Docs, Microsoft emails, or Grafana event logs contains adversarial AI instructions is an application-layer responsibility that data access governance alone does not solve.

For pattern three (process containment), Kiteworks’ MCP implementation demonstrates the right architectural approach: OAuth tokens in the OS keychain, ABAC on every MCP operation, path traversal validation. Extending these principles to the functional scoping of third-party back-end processes — constraining what those processes can do, not just what data they can access — is the next frontier.

The honest assessment: Kiteworks reduces blast radius and enables detection for pattern two. Patterns one and three require additional architectural controls that the industry is still building.

What Security Leaders Should Do — All Three Patterns

First, audit input trust boundaries across every AI integration. Identify every data source AI processes — emails, shared documents, form submissions, event logs, API responses, metadata fields. If external data feeds into any system where an AI component processes it, treat that input as adversarial regardless of how deep inside the system it has been stored. Apply the same validation discipline you apply to web-facing user input.

Second, require per-operation data access enforcement for every AI system operating on behalf of a user. Authentication on every request, not just at connection time. ABAC evaluated on every operation. Credentials stored outside the AI’s accessible context. Tamper-evident audit trails with complete attribution feeding your SIEM. If any of these are missing, the AI integration has no data access control that survives prompt injection.

Third, scope back-end AI processes to only the functional capabilities they require. Broad data read access may be necessary. The ability to render content, generate outbound requests, build dashboards, or invoke output routines is not. Least privilege applies to what processes can do, not just what data they can access. This is the control GrafanaGhost was missing.

Fourth, stop treating model-level guardrails as compensating controls. They failed in every case in this series. They are a useful defense layer — and they substitute for none of the three patterns above.

Fifth, red-team AI integrations for all three patterns. Test for prompt injection through user-facing channels (pattern two) and through event data, log entries, metadata, and back-end data sources (patterns one and three). Every vulnerability in this series was discovered by researchers, not by the organizations running the platforms. If you are not testing, you are leaving the discovery to someone with different intentions.

The patches are in. The three architectural gaps are not. The next variant will exploit whichever pattern you left unaddressed.

Frequently Asked Questions

EchoLeak (Microsoft 365 Copilot), ForcedLeak (Salesforce Agentforce), GeminiJack (Google Gemini Enterprise), Reprompt (Microsoft Copilot), GrafanaGhost (Grafana), and a supply chain attack on the OpenAI plugin ecosystem. Analyzed together, they reveal three distinct architectural failure patterns — untrusted input, overly broad data access, and process containment failures — that patching individual platforms does not close. The CrowdStrike 2026 Global Threat Report found 82% of detections were malware-free, confirming that adversaries are operating through legitimate tools — exactly the pattern these AI vulnerabilities exploit.

GrafanaGhost operated through trusted back-end enrichment processes with system-level privileges — not through a user session. Grafana has RBAC on user-facing data access, and the attack never triggered it. The primary failures were untrusted input (URL parameters stored in event logs) processed without validation, and the back-end process having functional scope (rendering, outbound communication) it was never designed to use. Data access controls address the other five vulnerabilities. They do not address GrafanaGhost’s failure pattern.

Model-level guardrails are configuration settings inside the system being attacked. Noma Security’s researchers defeated Grafana’s guardrails with a single keyword. Salesforce’s CSP was bypassed with a five-dollar domain. Every major LLM has been jailbroken in controlled research. Guardrails supplement real controls — input validation, per-operation access enforcement, and process containment — but they substitute for none of them.

Standard RBAC evaluates access at the session or connection level — once authenticated, the AI accesses everything within scope. Per-operation access control evaluates each individual data request against policy: who is requesting, what data, for what purpose, under what policy. This is the difference between “Copilot can access SharePoint” and “this specific retrieval, right now, is authorized for this data category.” The Kiteworks 2026 Forecast Report found a 15–20 point gap between governance and containment controls — per-operation enforcement is the containment control most organizations lack.

Start with a comprehensive AI integration inventory — every tool with AI features that processes data from external sources or operates on behalf of users. Then assess each integration against all three patterns: Does external data reach the AI without validation (pattern one)? Does the AI have broad session-level access without per-operation enforcement (pattern two)? Do back-end AI processes have functional capabilities beyond their intended scope (pattern three)? The Agents of Chaos study from February 2026 documented both pattern-two and pattern-three failures in live environments. Red-team for all three.