
When Your Vector Database Hands Out Pre-Auth RCE, RAG Has a Data-Layer Problem

by Patrick Spencer | updated May 29, 2026 | Cybersecurity Risk Management


On May 18, 2026, HiddenLayer published research on ChromaToast — formally CVE-2026-45829. CVSS score: 10.0. Attack vector: network. Privileges required: none. User interaction: none. The flaw lives in the ChromaDB Python FastAPI server’s create_collection handler: the server trusts client-supplied model identifiers and acts on them — including loading code from external HuggingFace repositories with trust_remote_code set to true — before the authentication check fires. An unauthenticated attacker sends an HTTP request, the server fetches and executes attacker-controlled model code, and the result is arbitrary code execution with access to API keys, environment variables, mounted secrets, and any file the server process can read.

HiddenLayer’s Shodan-based scan found roughly 73% of internet-accessible deployments exposed. HiddenLayer reports first contact with the Chroma project on February 17, 2026, with documented follow-ups on February 24, March 5, and April 16, none of which received a response. Independent researcher Azraelxuemo reported the same flaw in November 2025 and also received no response. The interim mitigation is network restriction. There is no patch.


5 Key Takeaways

1. ChromaToast is a CVSS 10.0 pre-auth RCE in the wild.

HiddenLayer disclosed CVE-2026-45829 on May 18, 2026, affecting all ChromaDB Python FastAPI server versions since 1.0.0. Roughly 73% of internet-accessible deployments are exploitable. No patch is available. Capital One and UnitedHealthcare are featured on Chroma’s homepage. This is not a fringe tool — it is infrastructure that runs RAG at scale, and the incident response for it starts with no fix path.

2. The bug exposes a deeper RAG architectural failure.

ChromaDB has 13 million monthly pip downloads and sits behind production RAG pipelines at Mintlify, Factory AI, and Weights & Biases. When the database holding embeddings, prompts, and retrieved documents is pre-auth exploitable, every secret the server process can read is in scope — API keys, environment variables, mounted credentials, and anything connected to them. This is not a vulnerability in the application layer. It is a vulnerability in the data infrastructure layer underneath it.

3. Vulnerability exploitation is now the dominant breach vector for the first time in 19 years.

The 2026 Verizon DBIR reports 31% of breaches from unpatched vulnerabilities versus 13% from credential abuse — the first time in the report’s history that exploitation has led. IBM X-Force reports exploitation of public-facing applications rose 44% year-over-year, with 56% of disclosed vulnerabilities requiring no authentication to exploit. RAG infrastructure is a credible target under these dynamics, not a hypothetical one.

4. Most enterprises lack a governed AI data layer.

Only 43% of organizations operate a centralized AI Data Gateway today. 57% are running fragmented, partial, or no controls. 90% of government organizations and 77% of healthcare organizations lack centralization entirely. AI deployment velocity is outpacing AI governance maturity — and ChromaToast is what that gap looks like in a production environment.

5. The architectural answer is governing data, not patching infrastructure.

When RAG pipelines reach enterprise data through a governed gateway with zero-trust access, ABAC enforcement, FIPS 140-3 encryption, and tamper-evident audit logging, a pre-auth RCE in any single component cannot translate to data compromise. The vulnerability becomes a containment problem rather than an exfiltration problem.


Why Pre-Auth RCE in a Vector Database Is a Different Problem

Pre-authentication RCE flaws happen. What makes this one architecturally different is where ChromaDB sits in the AI stack. A vector database in a RAG pipeline is co-located with the most sensitive material in the system: embeddings of enterprise documents, retrieved chunks grounding model responses, prompts that may contain regulated data, and application secrets needed to reach upstream data sources. When the vector database is compromised, the blast radius is the entire RAG pipeline: embeddings reveal document content, stored prompts can include PII, PHI, or CUI, and stolen API keys open whatever systems those keys can reach.

The deeper failure is architectural. The trust assumption in ChromaDB’s Python server — that it is acceptable to fetch and execute model code from an external registry before checking who is asking — is the same trust assumption running through most RAG deployments today. The retrieval layer is treated as infrastructure plumbing rather than as governed access to enterprise data. When the plumbing has a pre-auth RCE, governance built only at the application layer above it has no way to compensate. AI infrastructure is data infrastructure, and the same controls governing who can read a sensitive folder need to govern who — or what — can query an embedding store, mount a model artifact, or invoke a tool through an agent runtime.
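The ordering problem described above can be shown in a few lines. The following is a minimal sketch, not ChromaDB's actual handler: the names handle_create_collection, ALLOWED_MODELS, VALID_TOKENS, and load_model are all hypothetical, and the token check is a placeholder for whatever identity provider a real service would use. The point is only the sequence: authenticate, then validate the client-supplied identifier against an allowlist, and only then perform the side effect.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical allowlist of embedding models the operator has vetted.
ALLOWED_MODELS = frozenset({"all-MiniLM-L6-v2"})
# Hypothetical token store; a real service would verify tokens with its identity provider.
VALID_TOKENS = frozenset({"example-token"})


class Unauthorized(Exception):
    """Raised when the caller has not proven who they are."""


class RejectedModel(Exception):
    """Raised when the requested model is not on the allowlist."""


@dataclass
class CreateCollectionRequest:
    token: Optional[str]
    name: str
    model_id: str


def load_model(model_id: str, loaded: List[str]) -> None:
    # Stand-in for the risky side effect (fetching and running model code).
    loaded.append(model_id)


def handle_create_collection(req: CreateCollectionRequest, loaded: List[str]) -> dict:
    # 1. Authenticate before touching any client-supplied configuration.
    if req.token not in VALID_TOKENS:
        raise Unauthorized("authentication required")
    # 2. Validate the identifier against an allowlist instead of trusting it.
    if req.model_id not in ALLOWED_MODELS:
        raise RejectedModel(req.model_id)
    # 3. Only now perform the side effect.
    load_model(req.model_id, loaded)
    return {"collection": req.name, "model": req.model_id}
```

In this sketch an unauthenticated request naming an attacker-controlled repository fails at step 1, so load_model is never reached. Reversing steps 1 and 3 reproduces the class of flaw the article describes.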

Vulnerability Exploitation Is Now the Breach Vector to Worry About

The 2026 Verizon DBIR reports vulnerability exploitation overtook credential theft for the first time in 19 years. Unpatched vulnerabilities accounted for 31% of breaches; credential abuse dropped to 13%. Organizations patched only 26% of CISA’s Known Exploited Vulnerabilities catalog in 2025, down from 38% in 2024. Median time to full patching rose to 43 days from 32. The defender’s remediation pace is slowing at the exact moment the attacker’s exploitation pace is accelerating.

The IBM X-Force 2026 Threat Intelligence Index adds the public-facing application dimension: exploitation of public-facing applications rose 44% year-over-year, vulnerability exploitation accounted for 40% of incidents observed, and 56% of disclosed vulnerabilities required no authentication to exploit. Vector databases are public-facing applications when self-hosted with a network-reachable port — which is exactly the deployment pattern HiddenLayer’s 73% exposure figure measures.

AI Adversaries Are Faster Than Patches

A second trend running parallel to the vulnerability-exploitation shift makes the ChromaToast pattern worse. The UK AI Security Institute reports the difficulty of cyber tasks that frontier AI models can complete was doubling every eight months in late 2025 and every 4.7 months by February 2026. Stanford’s 2026 AI Index reports unguided solve rates on the Cybench cybersecurity benchmark rose from 15% in 2024 to 93% in 2025.

The Anthropic GTG-1002 disclosure from November 2025 makes the abstract concrete: a Chinese state-sponsored group used Claude Code plus MCP tools to orchestrate 80–90% of the tactical work of a multi-target cyber-espionage campaign against approximately 30 entities — reconnaissance, vulnerability discovery, exploitation, lateral movement, credential harvesting — all AI-executed. Stack this against the ChromaDB disclosure timeline: HiddenLayer first contacted Chroma on February 17, 2026, followed up three more times through April, and received no response. The vulnerability is public, the proof-of-concept logic is documented, no patch exists. The defender’s clock is not moving. The attacker’s clock is now measured in compute cycles.

Where Most Enterprises Sit on AI Data Governance

Only 43% of organizations operate a centralized AI Data Gateway. 27% rely on distributed controls with policies. 19% have partial or ad hoc controls. 7% have no dedicated AI controls at all. Government is at 90% without centralization; healthcare at 77%; financial services at 60%. These numbers describe the population exposed to the ChromaToast pattern. An organization with a centralized gateway has one architectural point at which to enforce authentication, authorization, encryption, and audit for every AI data interaction. An organization running partial or no controls has a sprawl of individual vector databases, embedding stores, and agent runtimes — each of which is its own pre-auth surface waiting to be discovered.

The Architectural Answer: Govern the Data Layer, Not the Component

The architectural alternative is treating AI data access the way enterprise security has been moving for a decade: zero-trust at the data layer, with every request authenticated, authorized against policy, and audited regardless of which component brought the request.

The Kiteworks Secure MCP Server and AI Data Gateway implement this pattern. RAG pipelines query enterprise data through a governed bridge. AI assistants reach files through the Model Context Protocol with OAuth 2.0 authentication — tokens in OS keychain, never reaching the model. ABAC policies evaluate every operation in real time. Rate limiting prevents bulk extraction even when an upstream component is compromised. FIPS 140-3 validated encryption, TLS validation, and a hardened virtual appliance underlie the entire path. Every interaction is logged to SIEM in real time via a tamper-evident audit trail.

The architectural test is simple: if the vector database were compromised tomorrow, what is the blast radius? In a governed-gateway architecture, the answer is bounded by ABAC policy, not by what credentials the compromised component happened to hold. The Kiteworks Private Data Network extends this across email, file sharing, MFT, SFTP, web forms, APIs, and AI integrations under one policy engine and one consolidated audit log.
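To make the blast-radius test concrete, here is a minimal sketch of a gateway decision that combines an attribute-based check with a rate limit. It is an illustration under stated assumptions, not the Kiteworks implementation: the attribute names, the LEVELS ordering, and the token-bucket parameters are all hypothetical. Time is passed in explicitly so the behavior is easy to reason about.

```python
from dataclasses import dataclass

# Hypothetical sensitivity ordering shared by subjects and resources.
LEVELS = {"public": 0, "internal": 1, "restricted": 2}


@dataclass(frozen=True)
class AccessRequest:
    subject_clearance: str
    resource_sensitivity: str
    action: str


def is_allowed(req: AccessRequest) -> bool:
    """Default-deny attribute check: read-only, and clearance must cover sensitivity."""
    if req.action != "read":
        return False
    try:
        return LEVELS[req.subject_clearance] >= LEVELS[req.resource_sensitivity]
    except KeyError:
        return False  # unknown attribute values are denied


class TokenBucket:
    """Simple token bucket; each allowed request spends one token."""

    def __init__(self, capacity: float, refill_per_second: float, now: float = 0.0):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.tokens = float(capacity)
        self.updated = now

    def allow(self, now: float) -> bool:
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


def gateway_decision(req: AccessRequest, bucket: TokenBucket, now: float) -> str:
    # Policy is checked first, so denied requests do not consume rate budget.
    if not is_allowed(req):
        return "deny:policy"
    if not bucket.allow(now):
        return "deny:rate"
    return "allow"
```

Under this design a compromised component holding an "internal" clearance cannot read "restricted" material at all, and what it can read is throttled, which is the sense in which policy rather than held credentials bounds the damage.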

What Security Leaders Should Do This Quarter

First, inventory every vector database, embedding store, and agent runtime touching enterprise data. Map each component’s authentication model, network exposure, and the credentials it can reach. Most security teams have an incomplete picture because AI infrastructure was stood up by data science teams without a CMDB entry.
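An inventory like this can start as a very small data structure. The sketch below is one possible shape, with hypothetical field names and a deliberately crude exposure test (any non-loopback bind address counts as network-exposed). It ranks components so that exposed, unauthenticated ones holding the most secrets come first.

```python
import ipaddress
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class AIComponent:
    name: str
    kind: str                      # e.g. "vector-db", "embedding-store", "agent-runtime"
    bind_address: str              # IP address the service listens on
    requires_auth: bool
    reachable_secrets: Tuple[str, ...]


def is_network_exposed(component: AIComponent) -> bool:
    # Assumption: anything not bound to loopback is reachable from the network.
    return not ipaddress.ip_address(component.bind_address).is_loopback


def triage(components: List[AIComponent]) -> List[str]:
    """Return component names ordered from highest to lowest risk."""
    def risk(c: AIComponent):
        exposed = is_network_exposed(c)
        return (exposed and not c.requires_auth, exposed, len(c.reachable_secrets))
    return [c.name for c in sorted(components, key=risk, reverse=True)]
```

A real inventory would also record ownership, software version, and data classification; the ranking here only illustrates why authentication model, network exposure, and reachable credentials are the three fields worth capturing first.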

Second, treat AI infrastructure as production infrastructure for vulnerability management. Apply the same patch SLAs, exposure management discipline, and KEV-driven prioritization. The 2026 DBIR’s finding that organizations patched only 26% of KEV-listed vulnerabilities applies to AI components the same way it applies to traditional infrastructure.

Third, move RAG and agent data access through a governed AI data gateway. Only 43% of organizations have done this. The remaining 57% face the ChromaToast pattern multiplied across every AI workload they run. Centralization at the data layer turns dozens of component-level attack surfaces into one governed control plane.

Fourth, require zero-trust data access for AI agents the same way you would for human users. Every AI data request should be authenticated, authorized against policy, rate-limited, encrypted, and logged with full attribution. Stanford’s 2026 AI Index reports that security and risk concerns are the top barrier to scaling agentic AI, cited by 62% of organizations. Zero-trust data access is the controllable variable.

Fifth, instrument the AI data layer for SIEM visibility. A pre-auth RCE produces no auth log entries by definition. Visibility has to come from upstream — from the gateway mediating every data interaction. Real-time SIEM feeds of AI data access are the forensic foundation when the next ChromaToast-class disclosure hits.
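One common way to make an audit trail tamper-evident is a hash chain, where each entry commits to the one before it. The sketch below illustrates that general technique using only the Python standard library; it is not a description of any vendor's logging format, and the function names and event fields are hypothetical.

```python
import hashlib
import json
from typing import Dict, List

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash: str, event: Dict) -> str:
    # Canonical JSON so the same event always hashes the same way.
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_event(log: List[Dict], event: Dict) -> None:
    """Append an event whose hash covers both the event and the previous hash."""
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"event": event, "prev": prev, "hash": _digest(prev, event)})


def verify(log: List[Dict]) -> bool:
    """Recompute the chain; any edited, reordered, or removed middle entry breaks it."""
    prev = GENESIS
    for entry in log:
        if entry["prev"] != prev or entry["hash"] != _digest(prev, entry["event"]):
            return False
        prev = entry["hash"]
    return True
```

A chain like this detects edits to past entries but not truncation of the most recent ones, so in practice the latest hash would need to be anchored somewhere the compromised component cannot write, such as the SIEM itself.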


Frequently Asked Questions

Restrict network access to trusted clients only and treat any exposed instance as potentially already compromised pending investigation. Rotate any secrets the server process could read — API keys, mounted credentials, environment variables. Move RAG data access behind a governed AI Data Gateway as the long-term posture; network restriction is triage, not architecture.

Materially exposed if any RAG component is internet-reachable and ungoverned. 77% of healthcare organizations lack a centralized AI data gateway and 14% have no dedicated AI controls per the Kiteworks 2026 Forecast. HIPAA’s Security Rule requires access controls and audit trails that ungoverned RAG components cannot produce. A pre-auth RCE on a component holding PHI embeddings is a reportable event.

Pre-auth RCE in an AI component touching CUI is a CMMC and DFARS reporting event. 90% of government organizations lack a centralized AI data gateway per the Kiteworks 2026 Forecast. CMMC AC, AU, and IA control families require enforced authorization and audit logging for every data access, including by AI components. Data-layer governance with ABAC and tamper-evident logs satisfies all three families simultaneously.

Patching addresses one vulnerability in one component. Governing AI data access establishes who or what can reach enterprise data at all, regardless of which downstream component is compromised. Centralized gateways become the architectural standard precisely because patching cannot keep pace with AI infrastructure vulnerability discovery rates — the ChromaToast disclosure timeline (months of non-response, no patch) demonstrates why.

Yes — it is the headcount-efficient choice. One governed control plane replaces dozens of component-level controls. The Kiteworks Secure MCP Server and AI Data Gateway provide that single architectural enforcement point — small security teams gain more from one architectural control than from patching every vector database, embedding store, and agent runtime individually.
