Verizon DBIR 2026: Shadow AI Now a Top Insider Threat
The most consequential finding in the 2026 Verizon Data Breach Investigations Report has very little to do with attackers. It has to do with employees. In the twelve months covered by the report, the share of workers using AI tools on corporate devices jumped from 15% to 45%. Two-thirds of those workers are using personal accounts to do it. And the most common data type they are uploading to those services is not customer PII or marketing copy. It is source code.
Key Takeaways
- Shadow AI Is No Longer a Fringe Behavior. The 2026 Verizon DBIR found that 45% of employees are now regular AI users on corporate devices, up from 15% the year before. Shadow AI is the third most common non-malicious insider action in DLP data.
- Two-Thirds of AI Access Happens Outside Corporate Accounts. 67% of users are reaching public AI services from non-corporate accounts on their corporate devices. That is corporate data leaving through doors the enterprise does not own.
- Source Code Is the Number One Data Type Flowing Into Ungoverned LLMs. Across 858,440 DLP events targeting AI tools, source code led by a large margin, followed by images, structured data, and research and technical documentation.
- AI Browser Extensions Are the Quiet Second Egress Channel. The average company has more than 15% of users with unauthorized AI browser extensions installed — many of which silently retain the context of every page visited.
- Prohibition Has Empirically Failed. The data does not support a blanket AI ban; it supports sanctioned, governed AI access routed through a control plane that enforces policy, logs every interaction, and produces audit-ready evidence.
This is the empirical foundation that AI data governance arguments have lacked for two years. Until now, the case for governed AI data access leaned on hypothetical scenarios and survey-based estimates of employee behavior. The DBIR has converted those estimates into measured behavior at industry scale, drawn from a data loss prevention dataset spanning 858,440 events. The conclusion is uncomfortable: Shadow AI is not a future problem to plan for. It is a present problem operating at scale, inside almost every organization that has not built the controls to govern it.
What follows is what the data actually says, why it matters, and what the architecture of a working response looks like.
The Tripling: From 15% to 45% in Twelve Months
The headline finding deserves to be stated cleanly. In the 2025 DBIR dataset, 15% of employees were classified as regular AI users on their corporate devices. In the 2026 dataset, that figure reached 45%. That is a tripling in a single twelve-month window. For context, very few enterprise technology adoption curves move that quickly — the closest analog is the smartphone wave of the early 2010s, and even that transition took two to three years to reach this level of workplace penetration.
The 2026 DBIR also documents that Shadow AI is now the third most common non-malicious insider action in the data loss prevention dataset, a fourfold increase in percentage terms from the previous year. In practical terms: When DLP systems detect employees violating policy without malicious intent, the third most common thing they catch is employees moving corporate data into AI services the enterprise does not control.
The implication for security strategy is straightforward. Any AI governance program built around the assumption that AI use inside the organization is rare, exceptional, or confined to a few pilot teams has been built on a false premise for at least a year. The default state is now widespread, daily, and routine AI use on corporate devices, regardless of whether the organization has issued a policy.
The Account Problem: Two-Thirds of AI Access Bypasses the Enterprise
If 45% of employees using AI were doing so through corporate accounts on sanctioned enterprise platforms, this would still be a governance challenge, but a manageable one. They are not. The 2026 DBIR found that 67% of users are accessing AI services from non-corporate accounts on their corporate devices. Verizon’s framing is direct: These are unaccounted AI systems that contain corporate data operating outside the control of the organizations.
This is structurally the same problem as Shadow IT, but with a critical difference. With Shadow IT, an employee using a personal Dropbox to share a file is creating a discrete exposure event. With Shadow AI, every prompt is potentially a permanent training input or a logged interaction sitting on infrastructure the enterprise does not control. The data does not move once. It is ingested, processed, and retained by systems whose data handling practices the enterprise has not reviewed, cannot audit, and frequently cannot even identify.
The 2026 Thales Data Threat Report adds a useful adjacent finding: only 33% of organizations have complete knowledge of where their sensitive data resides. Combine that with the DBIR’s finding on non-corporate AI account use, and the picture becomes sharper. Two-thirds of organizations cannot fully account for where their sensitive data is. Two-thirds of AI users are reaching AI services through accounts the enterprise does not control. These two statistics describe the same governance gap from two different vantage points.
The Data Type Problem: Source Code, Not Spreadsheets
The 2026 DBIR analyzed 858,440 DLP events involving uploads to generative AI tools and ranked the data types submitted by frequency. Source code was first by a large margin, followed by images and structured data. In 3.2% of policy violations, research and technical documentation was uploaded to unauthorized AI systems. Verizon’s own commentary on the finding is uncharacteristically blunt: As if the source code part was not enough, you now have potential intellectual property walking out the door.
This finding reframes the entire AI data leakage conversation. Most early discussions of Shadow AI focused on PII and PHI exposure — customer records, healthcare data, the kinds of regulated data categories that carry breach disclosure obligations. Those exposures are real. But the DBIR data shows that the dominant exfiltration category is something different and arguably more damaging to long-term competitive position: the engineering, research, and proprietary work product that defines what the organization actually does.
The 2026 DTEX Insider Threat Report reinforces this from another angle. DTEX identifies Shadow AI as the top driver of negligent insider incidents and reports that 92% of organizations say generative AI has changed how employees share information, yet only 13% have integrated AI into their formal insider threat strategy. The behavior has changed. The strategy, in most cases, has not.
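Both the account split and the data-type ranking discussed above are simple aggregations over DLP events, so an organization can compute its own versions. The following is a minimal sketch, assuming each event has already been exported with a user identifier, the account signed in to the AI service, and a data-type label. The `DlpEvent` fields, the `example.com` corporate domain, and the sample events are all hypothetical and made up for illustration; real DLP exports will have different schemas.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical corporate identity domain; replace with your own.
CORPORATE_DOMAINS = {"example.com"}


@dataclass(frozen=True)
class DlpEvent:
    user: str           # employee identifier (assumed field)
    account_email: str  # account signed in to the AI service (assumed field)
    data_type: str      # label assigned by the DLP classifier (assumed field)


def is_corporate(account_email: str) -> bool:
    """True when the AI-service account belongs to a corporate domain."""
    domain = account_email.rsplit("@", 1)[-1].lower()
    return domain in CORPORATE_DOMAINS


def non_corporate_user_share(events: list[DlpEvent]) -> float:
    """Fraction of distinct users seen with at least one non-corporate account."""
    all_users = {e.user for e in events}
    personal_users = {e.user for e in events if not is_corporate(e.account_email)}
    return len(personal_users) / len(all_users) if all_users else 0.0


def data_type_ranking(events: list[DlpEvent]) -> list[tuple[str, int]]:
    """Data types ordered by number of events, most frequent first."""
    return Counter(e.data_type for e in events).most_common()


# Made-up sample events, for illustration only.
events = [
    DlpEvent("u1", "alice@example.com", "source_code"),
    DlpEvent("u2", "bob@gmail.com", "source_code"),
    DlpEvent("u3", "carol@outlook.com", "image"),
    DlpEvent("u2", "bob@gmail.com", "structured_data"),
    DlpEvent("u3", "carol@outlook.com", "source_code"),
]

share = non_corporate_user_share(events)
ranking = data_type_ranking(events)
print(f"{share:.0%} of AI users used a non-corporate account")
print(ranking)
```

Counting distinct users rather than raw events keeps one heavy user from dominating the account figure, while the data-type ranking is deliberately event-based because it describes what is actually being uploaded.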
The Second Egress Channel: AI Browser Extensions
The DBIR documents a quieter form of AI-related data leakage that most organizations are not measuring: AI browser extensions. The 2026 report found that the average company has more than 15% of users with unauthorized AI extensions installed on their browsers. Many of these extensions are designed to collect and retain the context of pages the user visits, in order to provide AI-assisted summaries, suggestions, or workflow automation. The DBIR puts the consequence plainly: If corporate users are browsing internal sites, some of that non-public data is getting vacuumed up.
This is a meaningful expansion of the AI egress story. The Shadow AI conversation has largely centered on the active behavior of pasting content into ChatGPT or Claude. Browser extensions are a passive collection vector. An employee who never deliberately uploads anything to an AI service can still leak the contents of internal portals, ticketing systems, document management interfaces, and SaaS dashboards simply by browsing them with an extension installed.
Traditional DLP tools, which are built to inspect outbound network traffic or block specific upload actions, frequently do not catch this pattern. The extension is part of the browser; the data movement looks like normal page rendering and telemetry. Governance has to move closer to the data layer to address this, not stay at the network or behavior layer.
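Because this channel is invisible to upload-focused DLP, the practical starting point is an inventory check. The sketch below assumes an endpoint management tool can export one row per installed extension, and that each row has already been tagged as AI-related or not by some reviewed catalog. The `InstalledExtension` fields, the extension IDs, and the allowlist are hypothetical placeholders, not real product identifiers.

```python
from dataclasses import dataclass

# Hypothetical allowlist of extension IDs the enterprise has sanctioned.
SANCTIONED_EXTENSION_IDS = {"sanctioned-ai-helper"}


@dataclass(frozen=True)
class InstalledExtension:
    user: str
    extension_id: str
    is_ai_related: bool  # assumed to be tagged upstream by a reviewed catalog


def unauthorized_ai_users(inventory: list[InstalledExtension]) -> set[str]:
    """Users with at least one AI extension that is not on the allowlist."""
    return {
        item.user
        for item in inventory
        if item.is_ai_related and item.extension_id not in SANCTIONED_EXTENSION_IDS
    }


def unauthorized_share(inventory: list[InstalledExtension], total_users: int) -> float:
    """Share of all users who have an unauthorized AI extension installed."""
    if total_users <= 0:
        raise ValueError("total_users must be positive")
    return len(unauthorized_ai_users(inventory)) / total_users


# Made-up inventory rows, for illustration only.
inventory = [
    InstalledExtension("u1", "sanctioned-ai-helper", True),
    InstalledExtension("u2", "page-summarizer-x", True),
    InstalledExtension("u3", "ad-blocker", False),
    InstalledExtension("u2", "another-ai-sidebar", True),
]

flagged = unauthorized_ai_users(inventory)
rate = unauthorized_share(inventory, total_users=10)
print(sorted(flagged), f"{rate:.0%}")
```

The resulting rate is directly comparable to the DBIR's "more than 15% of users" figure, which gives the number a benchmark.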
Threat Actors Are Using AI Across the Attack Chain — But Not in Novel Ways
There is a parallel finding in the 2026 DBIR worth pairing with the Shadow AI data. Through a collaboration with Anthropic, the report analyzed 793 malicious threat actors who were sanctioned for misusing AI platforms. The median actor sought AI assistance for around 15 distinct MITRE ATT&CK techniques. In extreme cases, actors queried for 40 to 50 techniques, effectively treating the AI platform as a co-developer across the full attack chain. Among AI-assisted initial access techniques, 44% mapped to phishing and 32% to vulnerability exploitation.
What the DBIR found, however, was that less than 2.5% of AI-assisted techniques involved rare or novel attack methods. The median technique already had 55 known existing malware examples implementing the same function. The implication is important: Attackers are using AI to do well-known things faster, not to unlock entirely new attack classes. The 2026 CrowdStrike Global Threat Report points the same way on volume, reporting an 89% year-over-year increase in AI-enabled adversary activity and a 42% increase in zero-day exploits.
Why does this matter for the Shadow AI conversation? Because it defeats the strongest argument for AI prohibition. If banning AI inside the enterprise meaningfully reduced the advantage AI gives threat actors, prohibition might be defensible. The DBIR data shows it does not. Attackers will use AI either way. A ban only surrenders the productivity gains on the defending side.
The Architectural Response: Governed AI Data Access, Not Prohibition
The structural lesson in the DBIR data is that policies forbidding the use of unsanctioned AI fail for the same reason policies forbidding the use of personal email accounts for work failed — because employees prioritize getting their work done. The 2026 DBIR documents this directly in its Privilege Misuse analysis: 60% of malicious-insider breaches in this year’s dataset were motivated by Convenience, with the canonical example being an employee emailing company data to a personal account to keep working from home. The Shadow AI version is the same employee pasting a contract into a public LLM to summarize it before a meeting.
Governance that succeeds in this environment has to operate at the data layer, not the user-behavior layer. That means three architectural commitments:
First, sanctioned AI access through governed pipes. Employees use AI either way. The question is whether they use it through enterprise infrastructure that enforces policy, logs interactions, and protects sensitive content, or through personal accounts on infrastructure the enterprise does not control. The Kiteworks Secure MCP Server and AI Data Gateway are designed for this layer: a governed bridge between AI systems and enterprise data that enforces attribute-based access control on every request, authenticates through OAuth 2.0, and produces a complete audit trail of what each AI system accessed, on whose behalf, and when.
Second, policy enforcement on every AI data request, not just user actions. Traditional DLP inspects what users do. Data-layer governance inspects what data moves — regardless of whether the request is human-initiated or agent-initiated. This is what closes the browser-extension gap and the agentic-AI gap simultaneously. If the data is governed at the layer where it lives, it is governed regardless of which integration pattern reaches for it.
Third, audit-ready evidence of every AI interaction. Regulators are not going to wait. HIPAA, GDPR, the EU AI Act, and emerging U.S. state AI laws all converge on the same expectation: Organizations using AI to process regulated data must be able to demonstrate what data the AI accessed, on whose authority, and with what protections. Tamper-evident logs of every AI request, streamed in real time to SIEM, are the foundation of that evidentiary record. Organizations relying on inferred logs from AI vendor consoles will discover, late, that the inference does not survive forensic review.
This is the model of governed AI access that the DBIR data argues for — not as a marketing position, but as the only architecture that matches the empirical pattern of how AI is actually being used inside organizations today.
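To make the data-layer idea concrete, here is a minimal sketch of an attribute-based access decision evaluated on every AI data request. It is an illustration under stated assumptions, not the Kiteworks API or any vendor's implementation: the `AiDataRequest` fields, the classification labels, the `enterprise-assistant` name, and the department rule are all hypothetical. The decision depends only on the attributes of the request and the data, so it is the same whether a human or an agent initiated the call.

```python
from dataclasses import dataclass

# Hypothetical set of AI systems the enterprise has sanctioned.
SANCTIONED_AI_SYSTEMS = {"enterprise-assistant"}


@dataclass(frozen=True)
class AiDataRequest:
    on_behalf_of: str             # human user the AI system acts for
    user_department: str          # attribute of that user
    ai_system: str                # requesting AI client or agent
    resource_classification: str  # e.g. "public", "internal", "restricted"
    resource_department: str      # attribute of the requested data


def decide(request: AiDataRequest) -> tuple[bool, str]:
    """Return (allowed, reason) for one request; anything unrecognized is denied."""
    if request.ai_system not in SANCTIONED_AI_SYSTEMS:
        return False, "AI system is not sanctioned"
    if request.resource_classification == "public":
        return True, "public data"
    if request.resource_classification == "internal":
        if request.user_department == request.resource_department:
            return True, "internal data, same department"
        return False, "internal data, different department"
    # "restricted" and any unknown label fall through to a denial.
    return False, "classification not releasable to AI"


# Illustrative requests with made-up names.
same_team = AiDataRequest(
    "alice", "engineering", "enterprise-assistant", "internal", "engineering"
)
personal_tool = AiDataRequest(
    "alice", "engineering", "personal-chatbot", "public", "engineering"
)

print(decide(same_team))
print(decide(personal_tool))
```

The deny-by-default fall-through is the design choice that matters here: a new data classification or a new integration pattern is blocked until someone writes a rule for it, instead of being allowed until someone notices.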
What Security Leaders Should Do This Quarter
The 2026 DBIR data should drive concrete action in the next ninety days. The following five steps are the minimum reasonable response.
First, measure the actual scale of Shadow AI inside your organization. Most DLP and CASB deployments can be configured to detect employee uploads to public AI services within a week. If you do not know whether your organization looks like the 45% baseline, the 67% non-corporate account baseline, or something better or worse, that visibility is the first deliverable.
Second, identify the data types moving. The 2026 DBIR finding that source code leads is the starting point, but every organization has its own distribution. A financial services firm may see contracts and trade documentation; a healthcare provider may see clinical notes; a manufacturer may see CAD files and supplier specifications. The remediation plan depends on knowing which data types are actually leaking.
Third, sanction one or more enterprise AI pathways. Prohibition will fail. The alternative is to provide governed AI access through enterprise accounts on platforms the organization has vetted, contracted with appropriate data handling terms, and integrated through a control plane that enforces policy at the data layer.
Fourth, audit your AI browser extension posture. This is the gap most organizations have not measured. If the average company has more than 15% of users with unauthorized AI extensions installed, your organization probably does too. Endpoint management tooling can inventory installed extensions; the policy question is which ones the enterprise will sanction and how.
Fifth, build the evidentiary trail before the regulator asks. Every AI interaction with regulated data needs a log entry: what data, accessed by what system, on whose behalf, with what protections. Building this after a regulator or auditor asks is significantly more expensive than building it before.
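The log entry described in the fifth step can be sketched as a hash-chained record, which is one common way to make a log tamper-evident. This is a minimal illustration, assuming SHA-256 chaining over JSON-serialized records; the field names and sample values are hypothetical, and it does not cover SIEM streaming, key management, or storage.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # fixed starting value for the first entry


def _digest(prev_hash: str, record: dict) -> str:
    """SHA-256 over the previous hash plus a canonical JSON form of the record."""
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_entry(log: list[dict], record: dict) -> None:
    """Append a record, linking it to the hash of the entry before it."""
    prev_hash = log[-1]["hash"] if log else GENESIS_HASH
    stored = dict(record)  # copy so later changes by the caller do not leak in
    log.append(
        {"record": stored, "prev_hash": prev_hash, "hash": _digest(prev_hash, stored)}
    )


def verify(log: list[dict]) -> bool:
    """Recompute every link; any edited, removed, or reordered entry breaks the chain."""
    prev_hash = GENESIS_HASH
    for entry in log:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != _digest(prev_hash, entry["record"]):
            return False
        prev_hash = entry["hash"]
    return True


# Made-up entries covering: what data, what system, on whose behalf, what protections.
audit_log: list[dict] = []
append_entry(audit_log, {
    "timestamp": "2026-01-15T09:30:00Z",
    "data": "contracts/msa-draft.docx",
    "ai_system": "enterprise-assistant",
    "on_behalf_of": "alice",
    "protections": ["abac-allow", "redaction"],
})
append_entry(audit_log, {
    "timestamp": "2026-01-15T09:31:12Z",
    "data": "repo/payments/service.py",
    "ai_system": "enterprise-assistant",
    "on_behalf_of": "bob",
    "protections": ["abac-deny"],
})

intact = verify(audit_log)
audit_log[0]["record"]["on_behalf_of"] = "someone-else"  # simulate tampering
tamper_detected = not verify(audit_log)
print(intact, tamper_detected)
```

A chain like this only proves integrity relative to its latest hash, so in practice that value has to be anchored somewhere the log writer cannot rewrite, which is one reason to stream entries to a separate SIEM in real time.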
The 2026 DBIR did not just publish a statistic. It published the end of the prohibition debate. The 45% figure is the baseline. The question now is which organizations build governance to match it, and which ones spend the next twelve months explaining to their boards why source code is showing up in places it should not.
Frequently Asked Questions
The 2026 Verizon DBIR found that regular AI use on corporate devices jumped from 15% to 45% in one year, and that 67% of users access AI services from non-corporate accounts. Shadow AI is now the third most common non-malicious insider action in DLP data, a fourfold year-over-year increase. The behavior is mainstream, not fringe.
You should be very concerned. The 2026 Verizon DBIR analyzed 858,440 DLP events involving AI tools and found source code was the most common data type uploaded by a large margin, followed by images, structured data, and research documentation. Source code in a public LLM is intellectual property in someone else’s training pipeline.
The 2026 Verizon DBIR found 3.2% of DLP policy violations involved research and technical documentation uploaded to unauthorized AI systems, alongside the dominant source code leakage. Healthcare-specific exposure was not broken out separately, but Healthcare’s Miscellaneous Errors pattern has been a top-three breach driver for over a decade — the Shadow AI vector compounds that risk.
No. The 2026 Verizon DBIR finding that 60% of malicious-insider breaches are now driven by Convenience — employees prioritizing getting their work done over policy — predicts that AI prohibition will fail the same way email-to-personal-account prohibition has. The Kiteworks approach is sanctioned AI access through a governed data layer, not prohibition.
Require five capabilities aligned with the 2026 Verizon DBIR data: governed AI data access that enforces policy at the data layer; OAuth 2.0 authentication for every AI session; attribute-based access control on every request; tamper-evident audit logs streamed to SIEM in real time; and content-aware controls that work for AI agents and human users alike. The Kiteworks Secure MCP Server and AI Data Gateway are designed for exactly this.