Microsoft Named 7 Ways Your AI Agents Can Be Hacked. One Is Already Happening in Production.

by Patrick Spencer updated June 8, 2026 Cybersecurity Risk Management

At Microsoft Build 2026, the security research team extended its existing AI agent threat taxonomy with seven new categories reflecting what adversaries are doing — and what defenders are missing — in deployed agentic systems. The naming matters: “AI agents are risky” is a posture; seven specific, distinct attack vectors with distinct mitigations is an operational framework.

Agentic Supply Chain Compromise. Unlike traditional supply chain attacks relying on malicious code, this vector works through natural language. An agent’s behavior can be influenced by adversarial content embedded in prompts, retrieved documents, or instructions from upstream agents — without any code injection.

Goal Hijacking. An adversary embeds instructions appearing consistent with the agent’s legitimate task while silently redirecting its terminal objective. Particularly dangerous in multi-step agentic workflows where the agent exercises judgment autonomously.

Inter-Agent Trust Escalation. In multi-agent architectures, a compromised agent can assert false identity or inflated permissions to an orchestrating agent. The orchestrator, lacking cryptographic verification, accepts the claim and grants access it should not.

Computer Use Agent Visual Attack. Agents operating through graphical interfaces can be manipulated by adversarial content rendered on screen — hidden instructions embedded in documents, web pages, or UI elements that redirect the agent’s actions through the visual channel.

Session Context Contamination. An adversary introduces data that biases the agent’s reasoning across subsequent steps without triggering safety controls at any individual decision point. Contamination accumulates across a session, producing a compromised conclusion from individually innocuous inputs.

MCP/Plugin Abuse. This is the category the Asana incident, discussed below, belongs to. The Model Context Protocol creates a structured channel through which agents interact with external tools and data sources. Vulnerabilities in that channel, such as logic flaws, misconfigured permissions, or malicious tools, can expose data the agent should not reach, enable exfiltration, or allow pivoting across organizational boundaries.

Capability/Architecture Disclosure. An agent revealing internal implementation details — system prompt contents, tool capabilities, available integrations — provides the reconnaissance needed to design more targeted attacks against it.
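The Inter-Agent Trust Escalation entry above turns on an orchestrator accepting an identity or permission claim it cannot check. As a minimal sketch of the alternative, assume a hypothetical identity service that issues each agent a token whose scopes are bound by an HMAC-SHA256 tag. The names ISSUER_KEY, issue_token, and verify_token are illustrative and not part of any Microsoft or MCP API; a production system would more likely use asymmetric signatures and key rotation.

```python
import hashlib
import hmac
import json

# Assumption: this key is held only by the issuer and the orchestrator, never by agents.
ISSUER_KEY = b"demo-only-secret"


def _tag(payload: str) -> str:
    """HMAC-SHA256 over the serialized claim, as a hex string."""
    return hmac.new(ISSUER_KEY, payload.encode("utf-8"), hashlib.sha256).hexdigest()


def issue_token(agent_id: str, scopes: list) -> dict:
    """Issuer side: serialize the claim deterministically and attach its tag."""
    payload = json.dumps({"agent_id": agent_id, "scopes": sorted(scopes)}, sort_keys=True)
    return {"payload": payload, "tag": _tag(payload)}


def verify_token(token: dict) -> dict:
    """Orchestrator side: recompute the tag and reject any altered claim."""
    if not hmac.compare_digest(_tag(token["payload"]), token["tag"]):
        raise PermissionError("token tag does not match payload")
    return json.loads(token["payload"])
```

Because the agent never holds the issuer key, a compromised agent that edits its own payload to claim broader scopes produces a tag mismatch, and the orchestrator refuses the request instead of taking the assertion at face value.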

5 Key Takeaways

1. MCP/Plugin Abuse is already a production risk, not a theoretical one.

Asana’s 2025 MCP server incident exposed approximately 1,000 organizations when a logic flaw let data cross tenant boundaries: task data, project metadata, comments, and uploaded files became visible across organizations. Microsoft has now formally named this category in its updated AI agent threat taxonomy. The Kiteworks Secure MCP Server addresses it by enforcing policy at the protocol layer before data reaches the agent, which is the governance layer the Asana incident showed to be necessary.

2. Microsoft’s Build 2026 response provides enforcement infrastructure, not just policy guidance.

The Execution Container, Agent Control Specifications, and ASSERT toolset create a runtime governance layer security teams can deploy against specific threat scenarios. The Execution Container enforces explicit boundaries at runtime rather than trusting agents to self-limit. Agent Control Specifications are portable, auditable policy definitions that can be versioned independently of the agent. ASSERT operationalizes all seven failure modes through adversarial test scaffolding. This is an engineering discipline now — with named vectors and published countermeasures.

3. The seven failure modes give red teams a concrete checklist.

Agentic Supply Chain Compromise, Goal Hijacking, Inter-Agent Trust Escalation, Computer Use Agent Visual Attack, Session Context Contamination, MCP/Plugin Abuse, and Capability/Architecture Disclosure are now documented attack surfaces requiring specific test coverage. Microsoft recommends inventorying agent supply chains, verifying agent identity cryptographically rather than through assertion, and adding all seven to red-team coverage matrices.

4. Traditional IAM and DLP controls do not cover agentic behavior.

AI agents operate asynchronously, chain tool calls, and act on behalf of users in ways that existing perimeter and identity controls were not designed to govern. DLP tools inspect known channels; agents invoke external APIs and process content from sources outside those channels. A dedicated AI content governance layer — enforcing what sensitive content agents can retrieve, process, and transmit — is required.

5. Regulated industries face compounded exposure.

AI agents accessing CUI, PHI, or ITAR-controlled data through misconfigured MCP integrations produce both a security incident and a compliance violation. CMMC, HIPAA, and GDPR do not pause because the actor was an AI agent — access must be authorized, logged, and secured under the same frameworks that govern human access to regulated content.

MCP/Plugin Abuse: From Taxonomy Entry to Production Breach

Asana launched MCP server integration with LLM capabilities on May 1, 2025. A logic flaw in the implementation allowed users in one organization to see data from another organization’s Asana instance — task-level information, project metadata, team details, comments, discussions, and uploaded files. The exposure was bounded by each user’s access scope, but it was cross-tenant data visibility from an AI integration organizations had every reason to trust.

Asana discovered the flaw in June 2025, roughly a month after the MCP feature went live. Approximately 1,000 customers were affected. The lesson is not that MCP is inherently insecure — the protocol is a communication layer. The lesson is that the protocol requires a governance layer on top of it: explicit policies defining what each agent can access, under what conditions, with what scope. Without that layer, the MCP integration inherits whatever permissions the underlying service exposes — which in multi-tenant SaaS environments produces exactly the kind of cross-organizational exposure Asana experienced.
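To make "explicit policies defining what each agent can access, under what conditions, with what scope" concrete, here is a minimal sketch of a tenant-scoped check placed in front of a data store. It is not Asana's code or the MCP specification; Resource, AgentPolicy, and fetch_for_agent are hypothetical names, and the store is a plain dictionary standing in for a multi-tenant backend.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    resource_id: str
    tenant_id: str
    kind: str  # for example "task", "comment", or "file"


@dataclass(frozen=True)
class AgentPolicy:
    tenant_id: str
    allowed_kinds: frozenset


def fetch_for_agent(policy: AgentPolicy, store: dict, resource_id: str) -> Resource:
    """Return the resource only if it sits inside the agent's tenant and allowed kinds."""
    resource = store.get(resource_id)
    # Same error for "missing" and "forbidden" so callers cannot probe other tenants' IDs.
    if (
        resource is None
        or resource.tenant_id != policy.tenant_id
        or resource.kind not in policy.allowed_kinds
    ):
        raise PermissionError("resource not available to this agent")
    return resource
```

The point of the design is that the tenant comparison happens on every read, in one place, rather than relying on each integration to inherit the right permissions from the underlying service.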

Microsoft has now formally named this category of failure and placed it alongside six other vectors in a taxonomy security teams should use to structure their AI red-team programs. The AI data governance failures in MCP deployments are not exotic edge cases. They are the predictable result of deploying powerful integrations without sufficient policy enforcement between tenants.

Microsoft’s Runtime Response: Execution Container, ASSERT, and Agent Control Specifications

Three Build 2026 components work together as what may be the most complete enterprise-grade AI agent security framework a major vendor has published so far.

The Microsoft Execution Container enforces explicit boundaries at runtime rather than trusting agents to self-limit. Agent output, plugin calls, and tool invocations are treated as untrusted execution paths requiring policy evaluation before they proceed — shifting the security model from “configure the agent correctly and hope” to “enforce boundaries at the execution layer regardless of agent behavior.”

Agent Control Specifications are portable policy definitions describing what an agent is permitted to do, which tools it can invoke, and what external endpoints it can reach. Specifications are separate from agent implementation — auditable, versionable, and updatable without modifying the agent itself. This treats agent governance as a configuration management problem rather than an embedded-code problem.
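The article does not show what an Agent Control Specification looks like, so the JSON layout below is an assumption made purely for illustration. The sketch shows the two ideas described above: the policy lives in versioned data separate from the agent, and every tool invocation is evaluated against it before it runs. TOOLS, load_spec, and guarded_call are hypothetical names, and the tools are stubs that make no network calls.

```python
import json
from urllib.parse import urlparse

# Hypothetical specification document, versioned independently of the agent code.
SPEC_JSON = """
{
  "spec_version": "1.2.0",
  "agent": "support-triage",
  "allowed_tools": ["search_tickets", "http_get"],
  "allowed_hosts": ["api.example.com"]
}
"""

# Stub tools for the sketch; a real agent would register its own.
TOOLS = {
    "search_tickets": lambda query: ["ticket matching " + query],
    "http_get": lambda url: "GET " + url,
    "delete_ticket": lambda ticket_id: "deleted " + ticket_id,
}


class PolicyViolation(Exception):
    """Raised when a call falls outside the loaded specification."""


def load_spec(text: str) -> dict:
    raw = json.loads(text)
    return {
        "version": raw["spec_version"],
        "tools": set(raw["allowed_tools"]),
        "hosts": set(raw["allowed_hosts"]),
    }


def guarded_call(spec: dict, name: str, **kwargs):
    """Evaluate the call against the spec first, then run the tool."""
    if name not in spec["tools"]:
        raise PolicyViolation("tool not permitted: " + name)
    url = kwargs.get("url")
    if url is not None and urlparse(url).hostname not in spec["hosts"]:
        raise PolicyViolation("endpoint not permitted: " + url)
    return TOOLS[name](**kwargs)
```

Tightening the policy then means editing and re-versioning the specification text, not redeploying the agent, which is the configuration-management framing the paragraph above describes.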

ASSERT (Adversarial Stress and Security Evaluation for Resilient Thinking) operationalizes the seven failure modes through adversarial test scaffolding security teams can run against deployed agents to identify specific vulnerabilities before adversaries find them.
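The article does not describe ASSERT's interface, so the following is not an ASSERT example. It is a small, hypothetical sketch of the related recommendation to add all seven failure modes to a red-team coverage matrix: given a test plan keyed by failure mode, it reports which modes have no test case yet. The test-case identifiers are made up.

```python
# The seven failure modes, in the order the taxonomy lists them in this article.
FAILURE_MODES = (
    "Agentic Supply Chain Compromise",
    "Goal Hijacking",
    "Inter-Agent Trust Escalation",
    "Computer Use Agent Visual Attack",
    "Session Context Contamination",
    "MCP/Plugin Abuse",
    "Capability/Architecture Disclosure",
)


def coverage_gaps(test_plan: dict) -> list:
    """Return the failure modes that have no test case assigned, in taxonomy order."""
    return [mode for mode in FAILURE_MODES if not test_plan.get(mode)]


if __name__ == "__main__":
    plan = {
        "Goal Hijacking": ["RT-014"],
        "MCP/Plugin Abuse": ["RT-020", "RT-021"],
        "Session Context Contamination": [],
    }
    for mode in coverage_gaps(plan):
        print("no coverage:", mode)
```

A mode mapped to an empty list counts as uncovered, so a placeholder row in the matrix does not hide a gap.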

Why Traditional Security Controls Fall Short

A user logging into a SaaS platform presents credentials, passes MFA, and operates within a defined session. Audit logs record what files were accessed. An AI agent does none of these things cleanly. It operates asynchronously, executing hundreds of tool calls without human intervention at each step. It chains requests, retrieving data from one source to inform a query to another. It acts on behalf of a user whose session may have concluded hours earlier.

Zero-trust architecture provides the right conceptual frame — never trust, always verify — but the verification mechanisms built for human actors do not map cleanly onto asynchronous agents needing continuous authorization across multi-step tool chains. Microsoft’s Execution Container and Agent Control Specifications address the execution layer. The data layer — governing what sensitive content the agent is permitted to retrieve, process, and transmit — requires a purpose-built content governance framework that extends beyond runtime containment.
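One way to picture "continuous authorization across multi-step tool chains" is to re-check the user's delegation before every step instead of once at the start. The sketch below assumes a hypothetical Grant object with an expiry and a set of scopes; it is not how the Execution Container works internally, which the article does not detail.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class Grant:
    user: str
    scopes: frozenset
    expires_at: float  # Unix timestamp after which the delegation is void


def authorize_step(grant: Grant, required_scope: str, now: float) -> None:
    """Raise unless the delegation is still live and covers this step."""
    if now >= grant.expires_at:
        raise PermissionError("delegation expired")
    if required_scope not in grant.scopes:
        raise PermissionError("scope not granted: " + required_scope)


def run_chain(grant: Grant, steps: list, clock=time.time) -> list:
    """Run (required_scope, action) pairs, re-authorizing before each one."""
    results = []
    for required_scope, action in steps:
        authorize_step(grant, required_scope, clock())
        results.append(action())
    return results
```

With this shape, an agent still working hours after the user's session ended is stopped at its next tool call, rather than continuing on the strength of a check that passed at the beginning of the chain.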

The Compliance Stakes in Regulated Industries

The seven failure modes Microsoft named are not just security threats — they are potential compliance violations under frameworks governing how specific categories of sensitive content are handled.

For defense contractors under CMMC 2.0, an AI agent retrieving CUI from a project management system and transmitting it through an MCP integration to an external tool without proper controls is a CMMC compliance event. The same logic applies to HIPAA: PHI flowing through an AI agent’s context is subject to access control and audit requirements regardless of whether the actor is human or AI. GDPR adds data minimization: an agent retrieving more data than needed for the task at hand may violate GDPR’s minimization requirement simply by how it operates.
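The data minimization point can be illustrated with a small filter that strips a record down to the fields a given task needs before the record reaches the agent's context. The task names and field lists are invented for the example, and deciding which fields a task genuinely needs is a policy question the code cannot answer.

```python
# Hypothetical mapping from task to the only fields that task may see.
TASK_FIELDS = {
    "summarize_ticket": {"ticket_id", "subject", "body"},
    "route_ticket": {"ticket_id", "subject", "product"},
}


def minimize(record: dict, task: str) -> dict:
    """Keep only the fields allowed for this task; unknown tasks get nothing."""
    allowed = TASK_FIELDS.get(task, set())
    return {key: value for key, value in record.items() if key in allowed}


if __name__ == "__main__":
    ticket = {
        "ticket_id": "T-1001",
        "subject": "Login failure",
        "body": "Cannot sign in since Monday.",
        "product": "Portal",
        "customer_email": "pat@example.com",
        "date_of_birth": "1980-01-01",
    }
    print(minimize(ticket, "route_ticket"))
```

Defaulting to an empty field set for unrecognized tasks keeps the filter deny-by-default, so a new agent workflow sees no personal data until someone explicitly decides what it needs.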

Enterprises in regulated industries cannot treat AI agent governance as a separate program from existing compliance obligations. They need to extend the same data classification, access controls, audit logging, and transmission security that govern human access to regulated content to cover AI agent access as well.

Building a Zero-Trust Content Governance Layer for AI Agents

The Kiteworks Secure MCP Server wraps AI agent access to Kiteworks-governed content in a policy enforcement point. Rather than allowing agents to pull content directly from wherever they can reach, the Secure MCP Server evaluates each request against explicit access policies before data flows to the agent — enforcement at the protocol level, not at the agent implementation level.

The Kiteworks AI Data Gateway extends this to enterprise AI workflows broadly, providing a controlled channel for AI queries against sensitive content repositories with classification-aware access rules. CUI, PHI, and other regulated content is accessible only to AI agents explicitly authorized to process it. Every interaction is captured in a tamper-evident audit log that feeds directly into SIEM, which is the documentation CMMC assessors, HIPAA auditors, and GDPR supervisory authorities will look for when examining AI access to sensitive content.
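For readers unfamiliar with the term, "tamper-evident" logging is commonly built on a hash chain, where each entry commits to the one before it. The sketch below shows that general technique only; it is not a description of how Kiteworks implements its audit log, and the event fields are invented.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(event: dict, prev_hash: str) -> str:
    body = json.dumps({"event": event, "prev": prev_hash}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_entry(log: list, event: dict) -> dict:
    """Append an event whose hash covers both the event and the previous entry's hash."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    entry = {"event": event, "prev": prev_hash, "hash": _digest(event, prev_hash)}
    log.append(entry)
    return entry


def verify_chain(log: list) -> bool:
    """Recompute every hash in order; any edited or reordered entry breaks the chain."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash or entry["hash"] != _digest(entry["event"], prev_hash):
            return False
        prev_hash = entry["hash"]
    return True
```

A chain like this detects edits to earlier entries only if the latest hash is also stored somewhere the log's writer cannot alter, for example by forwarding it to a separate system, which is one reason such logs are typically shipped off-host.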

The Kiteworks Private Data Network extends this governance across email, file sharing, MFT, SFTP, web forms, and APIs under one policy engine and one consolidated audit log. Microsoft has named the threats and provided the execution-layer enforcement tools. The remaining question for each enterprise is whether the content governance layer is mature enough to match the risk.

To learn more about protecting your sensitive content from compromised AI agents, schedule a custom demo today.

Frequently Asked Questions

Microsoft’s taxonomy adds Agentic Supply Chain Compromise, Goal Hijacking, Inter-Agent Trust Escalation, Computer Use Agent Visual Attack, Session Context Contamination, MCP/Plugin Abuse, and Capability/Architecture Disclosure. Microsoft recommends adding all seven to red-team coverage matrices, inventorying agent supply chains, and verifying agent identity cryptographically. The Kiteworks Secure MCP Server specifically addresses MCP/Plugin Abuse by enforcing policy at the protocol layer before data reaches the agent.

In May 2025, Asana’s MCP server integration allowed users in one organization to see task data, project metadata, and files from other organizations. The leak was cross-tenant and came from the MCP integration itself, not from an external attacker. Around 1,000 customers were affected before the server was taken offline in June 2025. Microsoft has now formally named this failure mode (MCP/Plugin Abuse) as a top-priority attack vector, making the incident the clearest available evidence that AI data governance for MCP deployments is an operational concern, not a theoretical one.

The Execution Container enforces explicit boundaries on file system access, network connections, credential access, and tool invocations at runtime — regardless of how the agent is configured or instructed. Agent Control Specifications provide portable, auditable policy definitions describing permitted behavior, separate from the agent implementation. Together they address the execution layer. Content-layer governance — policies controlling what sensitive data agents can retrieve and process — requires additional controls like the Secure MCP Server and AI Data Gateway.

CMMC Level 2 access controls, audit logging, and transmission security requirements for CUI do not create an exception for AI agents. An agent retrieving CUI through an MCP integration and passing it to an external tool must be authorized, logged, and secured under the applicable NIST 800-171 controls. The same applies to HIPAA: PHI processed by an AI agent is still PHI, and access must be documented. Organizations must extend existing content governance programs to cover AI agent access explicitly.

The Secure MCP Server enforces access policies at the protocol layer before data flows to the requesting agent. The AI Data Gateway provides a controlled channel for AI queries against sensitive content repositories with classification-aware enforcement. Tamper-evident audit logs capture every agent interaction with regulated content for CMMC, HIPAA, and GDPR audit requirements — making zero-trust data protection for AI workflows operational rather than aspirational.

Additional Resources