Microsoft Copilot Read Your Confidential Emails for Weeks. Here’s What Broke and How to Fix It.
For weeks, Microsoft 365 Copilot was quietly reading, summarizing, and surfacing emails that organizations had explicitly marked as confidential. Legal memos. Business agreements. Government correspondence. Protected health information. All processed by an AI that was never supposed to touch it.
Key Takeaways
- Application-Layer AI Controls Are a Single Point of Failure. Microsoft’s sensitivity labels and DLP policies lived inside the same platform as Copilot. When a code error hit, every control failed simultaneously. Organizations had no independent defense layer to prevent the AI from processing confidential content—including legal communications, business agreements, and protected health information.
- Organizations Had Zero Independent Visibility Into What Copilot Accessed. From January 21 through early February 2026, Copilot read and summarized confidential emails without triggering any independent alert. Organizations discovered the breach when Microsoft disclosed it—weeks later. Without independent audit trails, security teams were blind to the unauthorized AI processing happening inside their own environments.
- Microsoft’s Response Missed the Real Compliance Question. Microsoft stated that users only accessed information they were already authorized to see. But the compliance question isn’t whether the user had clearance. It’s whether the AI was authorized to ingest, process, and summarize confidential content. Under HIPAA, GDPR, and the EU AI Act, that distinction matters enormously.
- Defense in Depth for AI Governance Is No Longer Optional. This incident validates what security architects have long argued: AI governance requires independent, data-centric controls that operate separately from the AI platform. Zero-trust architecture, purpose binding, and least-privilege AI access aren’t aspirational—they’re operational necessities. Kiteworks provides this independent governance layer, ensuring that when vendor controls fail, data-layer defenses remain intact.
- The Regulatory Exposure Is Real and Unresolved. If Copilot processed emails containing PHI, PII, or other regulated data, organizations may face breach notification obligations under HIPAA, GDPR Article 33, and state data breach laws. Microsoft has not disclosed how many organizations were affected, leaving compliance teams to assess exposure with incomplete information.
The bug—tracked as CW1226324—was first reported by customers on January 21, 2026. Microsoft began rolling out a fix in early February, but as of mid-February, remediation is still not complete across all affected tenants. The U.K.’s National Health Service flagged the issue internally. Microsoft has not disclosed how many organizations were caught in the blast radius.
Here’s the part that should stop every security leader mid-scroll: The sensitivity labels were in place. The data loss prevention (DLP) policies were configured correctly. Every box was checked. And none of it mattered.
This wasn’t a misconfiguration. It wasn’t an admin error. It was a code bug inside the platform—and it took down every protection that was supposed to keep the AI away from confidential data. That’s not a patch-and-move-on situation. That’s a fundamental architecture problem.
What Actually Happened
A code error in Copilot Chat’s “work tab” feature allowed the AI to pull emails from users’ Sent Items and Drafts folders—even when those emails carried confidentiality labels and had DLP rules explicitly configured to block AI processing. The labels said “hands off.” Copilot ignored them.
Microsoft confirmed the issue in a service advisory, stating that emails with a confidential label applied were being “incorrectly processed” by Microsoft 365 Copilot Chat. The company attributed the cause to a code issue and began deploying a server-side fix in early February.
Microsoft’s public response was carefully worded: Users only accessed information they were already authorized to see. On a narrow technical reading, that’s defensible—Copilot operates within the user’s own mailbox context. But it completely sidesteps the real question.
The question is not whether the user had clearance. It's whether the AI was authorized to ingest, process, and summarize confidential content. The sensitivity labels and DLP policies existed for exactly one reason: to prevent that from happening. When those controls failed, confidential data was processed by an AI system in ways the organization explicitly prohibited. That's the breach that matters for compliance purposes.
The Architectural Problem Nobody Wants to Talk About
This isn’t a story about one bug. Bugs happen. Code ships with errors. Patches roll out. That’s the life cycle.
The real story is structural. Every security control that was supposed to prevent unauthorized AI processing—sensitivity labels, DLP, access restrictions—lived inside the same platform as the AI itself. When the platform broke, everything broke. No second layer. No independent check. No backstop.
Think of it this way: Imagine a bank where the vault door, the alarm, and the security cameras all run on a single circuit breaker. One tripped wire and you have an open vault, no alarm, and no footage. That’s what happened here.
This is not a theoretical risk category. The World Economic Forum’s 2026 Global Cybersecurity Outlook found that data leaks through generative AI are now the number one cybersecurity concern among CEOs globally, cited by 30% of respondents. Among cybersecurity professionals more broadly, the concern rose from 21% in 2024 to 34% in 2026. Meanwhile, roughly one-third of organizations still have no process to validate AI security before deployment.
The Copilot incident is what that gap looks like when it hits production.
Why “Trust But Verify” Fails When the Verifier Is Also the Vendor
Microsoft is simultaneously the AI provider, the security control provider, and the entity responsible for auditing whether those controls are working. When the controls failed, organizations had no independent way to know. They found out when Microsoft told them—weeks after it started.
No independent audit trail captured Copilot’s access to confidential content. No anomaly detection flagged unusual processing patterns. No real-time alerting triggered when the AI suddenly began summarizing emails it had never been authorized to touch. The only signal was a service advisory, published after the fact.
This is the governance gap that Kiteworks is built to close. The argument—and the Copilot bug makes it difficult to dispute—is that AI governance controls must operate on a separate layer from the AI platform. Not as a policy toggle within the same ecosystem. As an independent control plane.
Here’s what that looks like in practice:
Independent data governance layer. Kiteworks operates as a separate control plane. AI platforms access data through Kiteworks APIs with enforced policies—not through direct access to email repositories or file systems. Even if an AI platform has a bug, it cannot bypass controls it doesn’t manage.
Purpose binding and least-privilege access. Restrict AI access to specific data classifications and use cases. Instead of giving Copilot access to an entire mailbox, purpose binding allows organizations to specify that AI can access general business emails but not emails labeled confidential, PHI, or CUI. Every request is evaluated against current policies—not “authenticate once, access everything forever.”
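The purpose-binding model described above can be sketched in a few lines of Python. This is a minimal illustration, not a Kiteworks API: the function name, purpose registry, and classification labels are all hypothetical, and a production policy engine would evaluate far richer context. The key property it demonstrates is deny-by-default: a request is permitted only if its purpose is registered and the data's classification is explicitly allowed for that purpose.

```python
# Hypothetical sketch of purpose binding. All names (evaluate_request,
# ALLOWED_PURPOSES) are illustrative, not a real product API.

# Classifications an AI agent may never touch, regardless of purpose.
BLOCKED_LABELS = {"confidential", "phi", "cui"}

# Each registered purpose maps to the classifications it may access.
ALLOWED_PURPOSES = {
    "summarize_general_email": {"public", "internal"},
}

def evaluate_request(purpose: str, label: str) -> bool:
    """Permit only if the purpose is registered AND the data's
    classification is explicitly allowed for it (deny by default)."""
    allowed = ALLOWED_PURPOSES.get(purpose, set())
    return label in allowed and label not in BLOCKED_LABELS

# A request to summarize a confidential email is denied even though
# the human user behind the session could read it manually:
print(evaluate_request("summarize_general_email", "internal"))      # True
print(evaluate_request("summarize_general_email", "confidential"))  # False
print(evaluate_request("unknown_purpose", "internal"))              # False
```

Note that every request is evaluated fresh against the current policy tables, which is what "authenticate once, access everything forever" architectures lack.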
Anomaly detection and real-time monitoring. Kiteworks identifies when AI agents exhibit unusual data access patterns. If an AI system suddenly begins processing large volumes of confidential content, automated alerts reach the security team in real time—not weeks later through a service advisory.
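One common way to implement the volume-based detection described above is a sliding-window counter: if an AI agent touches more sensitive items within a time window than a configured baseline allows, raise an alert. This is a simplified sketch under assumed thresholds, not a description of any vendor's detection logic.

```python
from collections import deque

class AccessAnomalyDetector:
    """Flag when an agent accesses more sensitive items inside a time
    window than a configured baseline. Thresholds are illustrative."""

    def __init__(self, max_events: int, window_seconds: float):
        self.max_events = max_events
        self.window = window_seconds
        self.events = deque()  # timestamps of recent sensitive accesses

    def record(self, timestamp: float) -> bool:
        """Record one sensitive-data access; return True if the access
        pushes the rolling count past the baseline (i.e., alert)."""
        self.events.append(timestamp)
        # Drop events that have aged out of the sliding window.
        while self.events and timestamp - self.events[0] > self.window:
            self.events.popleft()
        return len(self.events) > self.max_events

detector = AccessAnomalyDetector(max_events=5, window_seconds=60.0)
t0 = 1_000_000.0
alerts = [detector.record(t0 + i) for i in range(8)]
# The first 5 accesses fit the baseline; the 6th onward trigger alerts.
print(alerts)  # [False, False, False, False, False, True, True, True]
```

In the Copilot scenario, a detector like this sitting outside the platform would have flagged the sudden processing of labeled-confidential mail in real time, rather than weeks later.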
Immutable, independent audit trails. Comprehensive logs documenting what data AI accessed, when it accessed it, under whose authorization, and what actions it took. These logs are controlled by the organization—not the AI vendor. When a regulator asks what happened, the organization has its own evidence, independent of the vendor’s account.
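The "immutable" property of such audit trails is typically achieved with a hash chain: each log entry's hash covers the previous entry's hash, so altering any past record breaks every subsequent link. The sketch below shows the mechanism in plain Python; the record fields are illustrative, and a production system would also anchor the chain externally (e.g., in write-once storage).

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # sentinel hash for the first entry

def append_entry(log: list, entry: dict) -> dict:
    """Append an audit entry whose hash covers the previous entry's
    hash, so later tampering anywhere in the log is detectable."""
    prev_hash = log[-1]["hash"] if log else GENESIS_HASH
    payload = json.dumps(entry, sort_keys=True)
    entry_hash = hashlib.sha256((prev_hash + payload).encode()).hexdigest()
    record = {"entry": entry, "prev_hash": prev_hash, "hash": entry_hash}
    log.append(record)
    return record

def verify_chain(log: list) -> bool:
    """Recompute every link; return False if any entry was altered."""
    prev_hash = GENESIS_HASH
    for record in log:
        payload = json.dumps(record["entry"], sort_keys=True)
        expected = hashlib.sha256((prev_hash + payload).encode()).hexdigest()
        if record["prev_hash"] != prev_hash or record["hash"] != expected:
            return False
        prev_hash = record["hash"]
    return True

log = []
append_entry(log, {"actor": "ai_agent", "action": "read", "item": "email-123"})
append_entry(log, {"actor": "ai_agent", "action": "summarize", "item": "email-123"})
print(verify_chain(log))  # True
log[0]["entry"]["action"] = "none"  # simulate after-the-fact tampering
print(verify_chain(log))  # False
```

Because verification needs only the log itself, the organization (or a regulator) can check integrity without trusting the AI vendor's account of events.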
The Compliance Exposure Organizations Need to Assess Now
The regulatory implications of the Copilot bug extend beyond the technical fix. If Copilot processed emails containing protected health information, personal data, or other regulated content, organizations may face compliance obligations they haven’t yet considered.
HIPAA. Under § 164.308(b)(1), covered entities must have written contracts with business associates that establish permitted uses and disclosures of PHI. If Copilot processed PHI marked as confidential in ways not covered by the existing agreement, organizations may need to assess whether this constitutes a reportable breach. Microsoft’s assertion that users were authorized to see the data does not address whether the AI was authorized to process it—a distinction HIPAA regulators will scrutinize.
GDPR. Article 32 requires “appropriate technical and organizational measures” to ensure security of personal data processing. Organizations that relied solely on Microsoft’s sensitivity labels as their technical safeguard face a difficult argument when those safeguards failed for weeks. Article 33 requires notification to supervisory authorities within 72 hours of becoming aware of a personal data breach. If confidential emails contained EU personal data, that clock may have already started.
EU AI Act. Article 12 requires that high-risk AI systems maintain detailed records of their operations. Organizations using Copilot to process sensitive data may be classified under high-risk provisions. If their only operational records come from Microsoft’s own logs—the same vendor that had the failure—they lack the independent documentation the regulation contemplates.
State data privacy laws. Multiple U.S. states have notification requirements triggered by unauthorized access to personal information. If Copilot processed confidential emails containing information covered by state breach notification statutes, organizations may have obligations that Microsoft’s service advisory does not resolve.
Kiteworks addresses these compliance requirements directly. Independent audit trails provide the evidence organizations need for breach notification assessments. Exportable compliance reports demonstrate AI governance controls to regulators. Data processing agreements and business associate agreements are enforced at the data layer—not just through contractual promises that depend on the vendor’s controls working correctly.
What Organizations Should Do Now
Assess your exposure. Determine whether your organization was affected by CW1226324. Review Microsoft 365 admin center alerts and Copilot usage logs from January 21 through the date your tenant received remediation confirmation. Export metadata for any confidential emails in Sent Items and Drafts during this window. If Microsoft cannot provide access logs showing what Copilot processed, document that gap formally.
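The triage step above (isolating labeled-confidential items in Sent Items and Drafts inside the incident window) can be sketched as a simple filter over exported mailbox metadata. The field names below are illustrative placeholders, not a real Microsoft 365 export schema; only the January 21 window start comes from the incident timeline.

```python
from datetime import datetime, timezone

# Start of the incident window per the CW1226324 timeline.
WINDOW_START = datetime(2026, 1, 21, tzinfo=timezone.utc)

def in_exposure_window(record: dict, remediation: datetime) -> bool:
    """True if a labeled-confidential item in Sent Items or Drafts falls
    inside the incident window and should be preserved for legal review.
    Record fields (timestamp/label/folder) are a hypothetical schema."""
    ts = datetime.fromisoformat(record["timestamp"])
    return (
        record["label"] == "confidential"
        and record["folder"] in {"Sent Items", "Drafts"}
        and WINDOW_START <= ts <= remediation
    )

remediated = datetime(2026, 2, 10, tzinfo=timezone.utc)
records = [
    {"timestamp": "2026-01-25T09:00:00+00:00", "label": "confidential", "folder": "Drafts"},
    {"timestamp": "2026-01-10T09:00:00+00:00", "label": "confidential", "folder": "Drafts"},
    {"timestamp": "2026-01-25T09:00:00+00:00", "label": "internal", "folder": "Sent Items"},
]
flagged = [r for r in records if in_exposure_window(r, remediated)]
print(len(flagged))  # 1
```

The point of scripting the filter is repeatability: the same criteria can be rerun and handed to counsel as the documented basis for the exposure assessment.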
Evaluate breach notification obligations. If confidential emails contained PHI, PII, or other regulated data, consult legal counsel on potential obligations under HIPAA § 164.408, GDPR Article 33, or applicable state breach notification laws. Do not assume Microsoft’s characterization of the incident resolves your compliance obligations.
Implement independent AI governance. Do not rely solely on AI platform controls to protect sensitive data from AI processing. Deploy an independent data governance layer—like Kiteworks—that enforces access policies regardless of vendor bugs. Purpose binding, least-privilege access, and continuous verification should be enforced at the data layer, not the application layer.
Establish independent audit trails. Ensure your organization has logs of AI data access that are not solely controlled by the AI vendor. Kiteworks provides immutable audit trails that document every AI interaction with organizational data, giving security and compliance teams evidence that doesn’t depend on the vendor’s own reporting.
Review AI access architecture. Evaluate whether your AI tools have broad access to data repositories or whether access is restricted by purpose and classification. The Copilot bug affected Sent Items and Drafts because Copilot had access to the full mailbox context. Purpose binding and attribute-based access controls would have restricted processing to authorized classifications only—even when the platform’s own controls failed.
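Attribute-based access control, mentioned above, makes the decision a function of the caller's attributes, the data's attributes, and the action, rather than a blanket grant of mailbox context. A minimal sketch, with entirely hypothetical attribute names:

```python
# Minimal ABAC sketch: the decision combines attributes of the caller,
# the resource, and the action. All names are illustrative.

RESTRICTED = {"confidential", "phi", "cui"}

def abac_decide(subject: dict, resource: dict, action: str) -> str:
    # Deny AI agents any action on restricted classifications outright,
    # even when the human behind the session could read the item.
    if subject["type"] == "ai_agent" and resource["classification"] in RESTRICTED:
        return "deny"
    # The action must fall within the agent's declared permissions.
    if action not in subject.get("permitted_actions", set()):
        return "deny"
    return "permit"

agent = {"type": "ai_agent", "permitted_actions": {"summarize"}}
print(abac_decide(agent, {"classification": "internal"}, "summarize"))      # permit
print(abac_decide(agent, {"classification": "confidential"}, "summarize"))  # deny
print(abac_decide(agent, {"classification": "internal"}, "delete"))         # deny
```

Under this model, a platform bug that widened Copilot's mailbox reach would still hit the classification check, because that check lives outside the platform.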
Demand transparency from vendors. Request a post-incident report from Microsoft detailing the scope of impact, affected tenants, data retention for content Copilot processed during the incident, and the timeline for full remediation. If the vendor cannot provide this transparency, that itself is a governance signal that should inform your architecture decisions going forward.
The Lesson the Industry Needs to Learn
The Microsoft Copilot bug is not an isolated incident. It is a case study in what happens when organizations trust AI platforms to police themselves.
The WEF’s 2026 Global Cybersecurity Outlook warned that as AI agents become more widely adopted, managing their credentials, permissions, and interactions becomes as critical—and likely more complex—than managing those of human users. The report called for continuous verification, audit trails, and robust accountability structures grounded in zero-trust principles that treat every AI interaction as untrusted by default.
The Copilot bug demonstrates exactly why. The sensitivity labels were a trust mechanism: organizations trusted the platform to respect them. The platform didn't—not because of malice, but because of a code error. And without an independent governance layer, there was nothing to catch the failure.
Kiteworks provides that independent layer. Its zero-trust data exchange architecture ensures that AI platforms must authenticate through Kiteworks to access sensitive data, with policies enforced at the data layer—not the application layer. Purpose binding restricts what AI can access. Continuous verification evaluates every request. And comprehensive audit trails prove what happened, when, and under whose authorization.
The organizations that will navigate the AI era without becoming the next service advisory are the ones building independent governance now. Not committees. Not labels. Not vendor promises. Operational infrastructure that enforces policies at the data layer—where it cannot be bypassed by a code bug in the very platform it’s supposed to govern.
The labels were in place. The policies were configured. The AI read your confidential emails anyway. If that doesn’t change how you think about AI governance, what will?
Frequently Asked Questions
What happened with Microsoft 365 Copilot and confidential emails?

In January 2026, a code error (tracked as CW1226324) in Microsoft 365 Copilot Chat allowed the AI to read and summarize emails from users’ Sent Items and Drafts folders that were marked with confidentiality sensitivity labels. This occurred despite DLP policies being configured to block AI processing of those emails. The bug was first reported on January 21, 2026, and Microsoft began deploying a fix in early February, though full remediation was not yet complete as of mid-February. Affected content included business agreements, legal communications, governmental inquiries, and protected health information.
Why did sensitivity labels and DLP policies fail to stop Copilot?

A code error in Copilot Chat’s “work tab” feature caused the AI to process emails in Sent Items and Drafts folders regardless of whether confidentiality labels were applied. The sensitivity labels and DLP policies were correctly configured—the controls simply did not function as designed because the bug existed within the same platform that was supposed to enforce those controls. This is the core architectural vulnerability: When AI governance controls and the AI itself share the same platform, a single bug can defeat all protections simultaneously.
How did Microsoft respond, and why is that response disputed?

Microsoft stated that users only accessed information they were already authorized to see, since Copilot operates within the user’s mailbox context. However, the compliance concern is different from the access authorization concern. The AI was not authorized to process confidential content—that’s why the sensitivity labels and DLP policies existed. Whether the user could have read the email manually does not resolve whether the AI’s automated ingestion and summarization of confidential content constitutes a compliance violation under HIPAA, GDPR, or other regulatory frameworks.
Does the Copilot bug create HIPAA breach notification obligations?

If Copilot processed emails containing protected health information (PHI) that was marked confidential, healthcare organizations may need to assess whether this constitutes a reportable breach under HIPAA. The key question is whether the AI’s processing of PHI was authorized under the organization’s business associate agreement with Microsoft. Organizations should review Copilot usage logs from the affected period, identify any confidential emails containing PHI, and consult legal counsel on notification obligations under HIPAA § 164.408.
What is data-layer AI governance?

Data-layer AI governance means enforcing access controls, purpose binding, and audit logging at the data infrastructure level—independent of the AI platform. Instead of relying on the AI platform’s own controls (like Microsoft’s sensitivity labels), data-layer governance requires AI to authenticate through a separate control plane before accessing any data. This means a bug in the AI platform cannot bypass governance controls because those controls exist outside the platform. Kiteworks provides this independent governance layer through zero-trust data exchange, purpose binding, least-privilege access, and comprehensive audit trails.
How does Kiteworks prevent incidents like the Copilot bug?

Kiteworks operates as an independent data governance layer between AI platforms and sensitive organizational data. AI systems must authenticate through Kiteworks APIs and comply with enforced policies before accessing any content. Purpose binding restricts AI access to specific data classifications—preventing access to confidential content regardless of whether the AI platform’s own controls are functioning. Continuous verification evaluates every data request against current policies. Anomaly detection identifies unusual access patterns in real time. And immutable audit trails document every AI interaction, providing independent evidence that does not depend on the AI vendor’s logs.
What should affected organizations do now?

Affected organizations should take several immediate steps. First, review the Microsoft 365 admin center for alerts related to CW1226324 and examine Copilot usage logs from January 21 through their remediation date. Second, identify any confidential emails in Sent Items and Drafts that may have been processed during the affected period, preserving metadata for legal review. Third, request access logs from Microsoft showing Copilot’s processing activity during the incident window. Fourth, assess whether affected emails contained PHI, PII, or other regulated data that could trigger breach notification obligations. Finally, evaluate implementing independent data-layer governance to prevent similar incidents regardless of future vendor bugs.