Claude Code, Copilot and Codex all got hacked. Every attacker went for the credential, not the model.
Our take

The recent exploits targeting AI coding assistants like Codex, Claude Code, and Copilot underscore a significant and growing vulnerability within enterprise AI systems. As detailed in the article, attackers have consistently aimed for credentials rather than the models themselves, exploiting gaps in security that have persisted despite the industry's rapid advancements. This troubling trend reflects a broader issue within AI development, where the rush to innovate often overshadows essential security practices. For those keen on understanding the implications of these incidents, the parallels with previous security failures are stark, particularly highlighted in articles such as One command turns any open-source repo into an AI agent backdoor. OpenClaw proved no supply-chain scanner has a detection category for it and 5,000 vibe-coded apps just proved shadow AI is the new S3 bucket crisis.
What makes these breaches particularly alarming is the methodical nature of the exploits. Each incident followed a similar pattern: an AI coding agent executed an action using credentials it held, bypassing security measures that should have been in place. For instance, in the case of Codex, a crafted GitHub branch name was able to exfiltrate an OAuth token due to a lack of input sanitization. This not only demonstrates a technical oversight but also raises critical questions about how organizations are vetting and managing the AI tools they integrate into their workflows. As Merritt Baer, CSO at Enkrypt AI, aptly pointed out, many enterprises are mistakenly approving interfaces rather than scrutinizing the underlying systems that support them.
The implications of these vulnerabilities extend beyond the immediate risk of credential theft. They highlight a systemic governance gap in how organizations oversee AI identities and permissions. Currently, most companies have robust measures for managing human identities but lack equivalent oversight for AI agents. This disparity puts enterprises at risk of not only credential exploitation but also potential data breaches that could arise from unchecked AI behavior. The need for comprehensive identity management frameworks that include AI agents is more urgent than ever. As emphasized in another relevant piece, Anthropic Skill scanners passed every check. The malicious code rode in on a test file, reliance on automated security assessments without human oversight can lead to catastrophic failures.
Looking ahead, the industry must confront the reality that security in AI systems cannot be an afterthought. Organizations need to establish rigorous governance protocols for AI agents akin to those they have for human users. This includes routine audits of OAuth scopes, credential management, and ensuring that agents operate on a principle of least privilege. As we navigate this new landscape, a fundamental question emerges: How can enterprises effectively balance the speed of technological advancement with the necessity for robust security measures? The answers will define the future of AI integration in business and dictate the level of risk organizations are willing to accept.
On March 30, BeyondTrust proved that a crafted GitHub branch name could steal Codex’s OAuth token in cleartext. OpenAI classified it Critical P1. Two days later, Anthropic’s Claude Code source code spilled onto the public npm registry, and within hours, Adversa found Claude Code silently ignored its own deny rules once a command exceeded 50 subcommands. These were not isolated bugs. They were the latest in a nine-month run: six research teams disclosed exploits against Codex, Claude Code, Copilot, and Vertex AI, and every exploit followed the same pattern. An AI coding agent held a credential, executed an action, and authenticated to a production system without a human session anchoring the request.
The attack surface was first demonstrated at Black Hat USA 2025, when Zenity CTO Michael Bargury hijacked ChatGPT, Microsoft Copilot Studio, Google Gemini, Salesforce Einstein and Cursor with Jira MCP on stage with zero clicks. Nine months later, those credentials are what attackers reached.
Merritt Baer, CSO at Enkrypt AI and former Deputy CISO at AWS, named the failure in an exclusive VentureBeat interview. “Enterprises believe they’ve ‘approved’ AI vendors, but what they’ve actually approved is an interface, not the underlying system.” The credentials underneath the interface are the breach.
Codex, where a branch name stole GitHub tokens
BeyondTrust researcher Tyler Jespersen, with Fletcher Davis and Simon Stewart, found Codex cloned repositories using a GitHub OAuth token embedded in the git remote URL. During cloning, the branch name parameter flowed unsanitized into the setup script. A semicolon and a backtick subshell turned the branch name into an exfiltration payload.
Stewart added the stealth. By appending 94 Ideographic Space characters (Unicode U+3000) after “main,” the malicious branch looked identical to the standard main branch in the Codex web portal. A developer sees “main.” The shell sees curl exfiltrating their token. OpenAI classified it Critical P1 and shipped full remediation by February 5, 2026.
Claude Code, where two CVEs and a 50-subcommand bypass broke the sandbox
CVE-2026-25723 hit Claude Code’s file-write restrictions. Piped sed and echo commands escaped the project sandbox because command chaining was not validated. Patched in 2.0.55. CVE-2026-33068 was subtler. Claude Code resolved permission modes from .claude/settings.json before showing the workspace trust dialog. A malicious repo set permissions.defaultMode to bypassPermissions. The trust prompt never appeared. Patched in 2.1.53.
The 50-subcommand bypass landed last. Adversa found that Claude Code silently dropped deny-rule enforcement once a command exceeded 50 subcommands. Anthropic’s engineers had traded security for speed and stopped checking after the fiftieth. Patched in 2.1.90.
“A significant vulnerability in enterprise AI is broken access control, where the flat authorization plane of an LLM fails to respect user permissions,” wrote Carter Rees, VP of AI and Machine Learning at Reputation and a member of the Utah AI Commission. The repository decided what permissions the agent had. The token budget decided which deny rules survived.
Copilot, where a pull request description and a GitHub issue both became root
Johann Rehberger demonstrated CVE-2025-53773 against GitHub Copilot with Markus Vervier of Persistent Security as co-discoverer. Hidden instructions in PR descriptions triggered Copilot to flip auto-approve mode in .vscode/settings.json. That disabled all confirmations and granted unrestricted shell execution across Windows, macOS, and Linux. Microsoft patched it in the August 2025 Patch Tuesday release.
Then, Orca Security cracked Copilot inside GitHub Codespaces. Hidden instructions in a GitHub issue manipulated Copilot into checking out a malicious PR with a symbolic link to /workspaces/.codespaces/shared/user-secrets-envs.json. A crafted JSON $schema URL exfiltrated the privileged GITHUB_TOKEN. Full repository takeover. Zero user interaction beyond opening the issue.
Mike Riemer, CTO at Ivanti, framed the speed dimension in a VentureBeat interview: “Threat actors are reverse engineering patches within 72 hours. If a customer doesn’t patch within 72 hours of release, they’re open to exploit.” Agents compress that window to seconds.
Vertex AI, where default scopes reached Gmail, Drive and Google’s own supply chain
Unit 42 researcher Ofir Shaty found that the default Google service identity attached to every Vertex AI agent had excessive permissions. Stolen P4SA credentials granted unrestricted read access to every Cloud Storage bucket in the project and reached restricted, Google-owned Artifact Registry repositories at the core of the Vertex AI Reasoning Engine. Shaty described the compromised P4SA as functioning like a "double agent," with access to both user data and Google's own infrastructure.
VentureBeat defense grid
Security requirement | Defense shipped | Exploit path | The gap |
Sandbox AI agent execution | Codex runs tasks in cloud containers; token scrubbed during agent runtime. | Token present during cloning. Branch-name command injection executed before cleanup. | No input sanitization on container setup parameters. |
Restrict file system access | Claude Code sandboxes writes via accept-edits mode. | Piped sed/echo escaped sandbox (CVE-2026-25723). Settings.json bypassed trust dialog (CVE-2026-33068). 50-subcommand chain dropped deny-rule enforcement. | Command chaining not validated. Settings loaded before trust. Deny rules truncated for performance. |
Block prompt injection in code context | Copilot filters PR descriptions for known injection patterns. | Hidden injections in PRs, README files, and GitHub issues triggered RCE (CVE-2025-53773 + Orca RoguePilot). | Static pattern matching loses to embedded prompts in legitimate review and Codespaces flows. |
Scope agent credentials to least privilege | Vertex AI Agent Engine uses P4SA service agent with OAuth scopes. | Default scopes reached Gmail, Calendar, Drive. P4SA credentials read every Cloud Storage bucket and Google’s Artifact Registry. | OAuth scopes non-editable by default. Least privilege violated by design. |
Inventory and govern agent identities | No major AI coding agent vendor ships agent identity discovery or lifecycle management. | Not attempted. Enterprises do not inventory AI coding agents, their credentials, or their permission scopes. | AI coding agents are invisible to IAM, CMDB, and asset inventory. Zero governance exists. |
Detect credential exfiltration from agent runtime | Codex obscures tokens in web portal view. Claude Code logs subcommands. | Tokens visible in cleartext inside containers. Unicode obfuscation hid exfil payloads. Subcommand chaining hid intent. | No runtime monitoring of agent network calls. Log truncation hid the bypass. |
Audit AI-generated code for security flaws | Anthropic launched Claude Code Security (Feb 2026). OpenAI launched Codex Security (March 2026). | Both scan generated code. Neither scans the agent’s own execution environment or credential handling. | Code-output security is not agent-runtime security. The agent itself is the attack surface. |
Every exploit targeted runtime credentials, not model output
Every vendor shipped a defense. Every defense was bypassed.
The Sonar 2026 State of Code Developer Survey found 25% of developers use AI agents regularly, and 64% have started using them. Veracode tested more than 100 LLMs and found 45% of generated code samples introduced OWASP Top 10 flaws, a separate failure that compounds the runtime credential gap.
CrowdStrike CTO Elia Zaitsev framed the rule in an exclusive VentureBeat interview at RSAC 2026: collapse agent identities back to the human, because an agent acting on your behalf should never have more privileges than you do. Codex held a GitHub OAuth token scoped to every repository the developer authorized. Vertex AI’s P4SA read every Cloud Storage bucket in the project. Claude Code traded deny-rule enforcement for token budget.
Kayne McGladrey, an IEEE Senior Member who advises enterprises on identity risk, made the same diagnosis in an exclusive interview with VentureBeat. "It uses far more permissions than it should have, more than a human would, because of the speed of scale and intent."
Riemer drew the operational line in an exclusive VentureBeat interview. "It becomes, I don't know you until I validate you." The branch name talked to the shell before validation. The GitHub issue talked to Copilot before anyone read it.
Security director action plan
Inventory every AI coding agent (CIEM). Codex, Claude Code, Copilot, Cursor, Gemini Code Assist, Windsurf. List the credentials and OAuth scopes each received at setup. If your CMDB has no category for AI agent identities, create one.
Audit OAuth scopes and patch levels. Upgrade Claude Code to 2.1.90 or later. Verify Copilot's August 2025 patch. Migrate Vertex AI to the bring-your-own-service-account model.
Treat branch names, pull request descriptions, GitHub issues, and repo configuration as untrusted input. Monitor for Unicode obfuscation (U+3000), command chaining over 50 subcommands, and changes to .vscode/settings.json or .claude/settings.json that flip permission modes.
Govern agent identities the way you govern human privileged identities (PAM/IGA). Credential rotation. Least-privilege scoping. Separation of duties between the agent that writes code and the agent that deploys it. CyberArk, Delinea, and any PAM platform that accepts non-human identities can onboard agent OAuth credentials today; Gravitee's 2026 survey found only 21.9% of teams have done it.
Validate before you communicate. "As long as we trust and we check and we validate, I'm fine with letting AI maintain it," Riemer said. Before any AI coding agent authenticates to GitHub, Gmail, or an internal repository, verify the agent's identity, scope, and the human session it is bound to.
Ask each vendor in writing before your next renewal. "Show me the identity lifecycle management controls for the AI agent running in my environment, including credential scope, rotation policy, and permission audit trail." If the vendor cannot answer, that is the audit finding.
The governance gap in three sentences
Most CISOs inventory every human identity and have zero inventory of the AI agents running with equivalent credentials. No IAM framework governs human privilege escalation and agent privilege escalation with the same rigor. Most scanners track every CVE but cannot alert when a branch name exfiltrates a GitHub token through a container that developers trust by default.
Zaitsev's advice to RSAC 2026 attendees was blunt: you already know what to do. Agents just made the cost of not doing it catastrophic.
Read on the original site
Open the publisher's page for the full experience
Related Articles
- One command turns any open-source repo into an AI agent backdoor. OpenClaw proved no supply-chain scanner has a detection category for itJust two months ago, researchers at the Data Intelligence Lab at the University of Hong Kong introduced CLI-Anything, a new state-of-the-art tool that analyzes any repo’s source code and generates a structured command line interface (CLI) that AI coding agents can operate with a single command. Claude Code, Codex, OpenClaw, Cursor, and GitHub Copilot CLI are all supported, and since its launch in March, CLI‑Anything has climbed to more than 30,000 GitHub stars. But the same mechanism that makes software agent-native opens the door to agent-level poisoning. The attack community is already discussing the implications on X and security forums, translating CLI-Anything's architecture into offensive playbooks. The security problem is not what CLI-Anything does. It is what CLI-Anything represents. CLI-Anything generates SKILL.md files, the same instruction-layer artifacts that Snyk’s ToxicSkills research found laced with 76 confirmed malicious payloads across ClawHub and skills.sh in February 2026. A poisoned skill definition does not trigger a CVE and never appears in a software bill of materials (SBOM). No mainstream security scanner has a detection category for malicious instructions embedded in agent skill definitions, because the category simply did not exist eighteen months ago. Cisco confirmed the gap in April. “Traditional application security tools were not designed for this,” Cisco’s engineering team wrote in a blog post announcing its AI Agent Security Scanner for IDEs. “SAST [static application security testing] scanners analyze source code syntax. SCA [software composition analysis] tools check dependency versions. Neither understands the semantic layer where MCP [Model Context Protocol] tool descriptions, agent prompts, and skill definitions operate.” Merritt Baer, CSO of Enkrypt AI and former Deputy CISO at Amazon Web Services (AWS), told VentureBeat in an exclusive interview: “SAST and SCA were built for code and dependencies. They don’t inspect instructions.” This is not a single-vendor vulnerability. It is a structural gap in how the entire security industry monitors software supply chains. This is the pre-exploitation window. CLI-Anything is live, the attack community is discussing it, and security directors who act now get ahead of the first incident report. The integration layer no stack can see Traditional supply-chain security operates on two layers. The code layer is where SAST works, scanning source files for insecure patterns, injection flaws, and hardcoded secrets. The dependency layer is where SCA works, checking package versions against known vulnerabilities, generating SBOMs, and flagging outdated libraries. Agent bridge tools like CLI-Anything, MCP connectors, Cursor rules files, and Claude Code skills operate on a third layer between the other two. Call it the agent integration layer: configuration files, skill definitions, and natural-language instruction sets tell an AI agent what software can do and how to operate it. None of it looks like code. All of it executes like code. Carter Rees, VP of AI at Reputation, told VentureBeat in an exclusive interview: “Modern LLMs [large language models] rely on third-party plugins, introducing supply chain vulnerabilities where compromised tools can inject malicious data into the conversation flow, bypassing internal safety training.” Researchers at Griffith University, Nanyang Technological University, the University of New South Wales, and the University of Tokyo documented the attack chain in an April paper, “Supply-Chain Poisoning Attacks Against LLM Coding Agent Skill Ecosystems.” The team introduced Document-Driven Implicit Payload Execution (DDIPE), a technique that embeds malicious logic inside code examples within skill documentation. Across four agent frameworks and five large language models, DDIPE achieved bypass rates between 11.6% and 33.5%. Static analysis caught most samples, but 2.5% evaded all four detection layers. Responsible disclosure led to four confirmed vulnerabilities and two vendor fixes. The kill chain security leaders need to audit Here's the anatomy of the kill chain: An attacker submits a SKILL.md file to an open-source project containing setup instructions, code examples, and configuration templates. It looks like standard documentation. A code reviewer would wave it through because none of it is executable. But the code examples contain embedded instructions that an agent will parse as operational directives. A developer uses an agent bridge tool to connect their coding agent to the repository. The agent ingests the skill definition and trusts it, because no verification layer exists to distinguish benign from malicious intent at the instruction level. The agent executes the embedded instruction using its own legitimate credentials. Endpoint detection and response (EDR) sees an approved API call from an authorized process and passes it. Data exfiltration, configuration changes, and credential harvesting are all moving through channels that the monitoring stack considers normal traffic. Rees identified the structural flaw that makes this chain lethal. “A significant vulnerability in enterprise AI is broken access control, where the flat authorization plane of an LLM fails to respect user permissions,” he told VentureBeat. A compromised skill definition riding that flat authorization plane does not need to escalate privileges. It already has them. Every link in that chain is invisible to the current security stack. Pillar Security demonstrated a variant of this chain against Cursor in January 2026 (CVE-2026-22708). Implicitly trusted shell built-in commands could be poisoned through indirect prompt injection, converting benign developer commands into arbitrary code execution vectors. Users saw only the final command. The poisoning happened through other commands the IDE never surfaced for approval. The evidence is already in production In a documented attack chain from April 2026, a crafted GitHub issue title triggered an AI triage bot wired into Cline. The bot exfiltrated a GITHUB_TOKEN, which the attacker used to publish a compromised npm dependency that installed a second agent on roughly 4,000 developer machines for eight hours. There was just one issue title. Attackers had eight hours of access. No human approved the action. Snyk’s ToxicSkills audit scanned 3,984 agent skills from ClawHub, the public marketplace for the OpenClaw agent framework, and skills.sh in February 2026. The results: 13.4% of all skills contained at least one critical security issue. Daily skill submissions jumped from less than 50 in mid-January to more than 500 by early February. The barrier to publishing was a SKILL.md markdown file and a GitHub account one week old. No code signing. No security review. No sandbox. OpenClaw is not an outlier. It is the pattern. “The bar to entry is extremely low,” Baer said. “Adding a skill can be as simple as uploading a Word doc or lightweight config file. That’s a radically different risk profile than compiled code.” She pointed to projects like ClawPatrol that have started cataloging and scanning for malicious skills, evidence the ecosystem is moving faster than enterprise defenses. The ClawHavoc campaign, first reported by Koi Security in late January 2026, initially identified 341 malicious skills on ClawHub. A follow-up analysis by Antiy CERT expanded the count to 1,184 compromised packages across the platform. The campaign delivered Atomic Stealer (AMOS) through skill definitions with professional documentation. Skills named solana-wallet-tracker and polymarket-trader matched what developers actively searched for. The MCP protocol layer carries similar exposure. OX Security reported in April that researchers poisoned nine out of 11 MCP marketplaces using proof-of-concept servers. Trend Micro initially found 492 MCP servers exposed to the internet with zero authentication; by April, that number had grown to 1,467. As The Register reported, the root issue lies in Anthropic’s MCP software development kit (SDK) transport mechanism. Any developer using the official SDK inherits the vulnerability class. VentureBeat Prescriptive Matrix: Three-layer agent supply-chain audit VentureBeat developed a Prescriptive Matrix by mapping the three attack layers documented in the research and incident reports above against the detection capabilities of current SAST, SCA, and agent-layer tools. Each row identifies what security teams should verify and where no scanner has coverage today. Layer Threat Current detection Why it misses Recommended action 1. Code Prompt injection in AI-generated code SAST scanners Most SAST tools have no detection category for prompt injection in AI-generated code Confirm that SAST scans AI-generated code for prompt injection. If not, have an open vendor conversation this quarter. 2. Dependencies Malicious MCP servers, agent skills, plugin registries SCA tools SCA generates no AI-specific bill of materials. Agent-layer dependencies are invisible. Confirm SCA includes MCP servers, agent skills, and plugin registries in the dependency inventory. 3. Agent integration Poisoned SKILL.md files, malicious instruction sets, adversarial rules files None until April 2026 No tool inspects the semantic meaning of agent instruction files. Baer: “We’re not inspecting intent.” Deploy Cisco Skill Scanner or Snyk mcp-scan. Assign a team to own this layer. Baer’s diagnosis of Layer 3 applies across the entire matrix: “Current scanners look for known bad artifacts, not adversarial instructions embedded in otherwise valid skills.” Cisco’s open-source Skill Scanner and Snyk’s mcp-scan represent the first tools purpose-built for this layer. Security director action plan Here's how security leaders can get ahead of the problem. Inventory every agent bridge tool in the environment. This includes CLI-Anything, MCP connectors, Cursor rules files, Claude Code skills, GitHub Copilot extensions. If the development team is using agent bridge tools that have not been inventoried, the risk cannot be assessed. Audit agent skill sources the same way package registries get audited. Baer’s framing is precise: “A skill is effectively untrusted executable intent, even if it’s just text.” Shut off ungoverned ingestion paths until controls are in place. Stand up a review and allowlisting process for skills. The OWASP Agentic Skills Top 10 (AST01: Malicious Skills) provides the procurement framework to align controls against. Deploy agent-layer scanning. Evaluate Cisco’s open-source Skill Scanner and Snyk’s mcp-scan for behavioral analysis of agent instruction files. If dedicated tooling is unavailable, require a second engineer to read every SKILL.md before installation. Restrict agent execution privileges and instrument runtime. AI coding agents should not run with the same credential scope as the developer who invoked them. Rees confirmed the structural flaw: The flat authorization plane means a compromised skill does not need to escalate privileges. Baer’s prescription: “Instrument runtime observability. What data is the agent accessing, what actions is it taking, and are those aligned with expected behavior?” Assign ownership for the gap between layers. The most dangerous attacks succeed because they fall between detection categories. Assign a team to own the agent integration layer. Review every SKILL.md, MCP config, and rules file before it enters the environment. The gap that already has a name Baer underscored the dangers of this new attack vector. “This feels very similar to early container security, but we’re still in the ‘we’ll get to it’ phase across most orgs," she said. She added that, at AWS, it took a few high-profile wake-up calls before container security became table stakes. The difference this time is speed. “There’s no build pipeline, no compilation barrier. Just content," she said. CLI-Anything is not the threat. It is the proof case that the agent integration layer exists, that it is growing fast, and that the attacker community has already found it. The 33,000 developers who starred the repository are telling security teams where software development is heading. Eighteen months ago, the detection category for agent-integration-layer poisoning did not exist. Cisco and Snyk shipped the first tools for it in April. The window between those two facts is closing. Security directors who have not begun inventory are already behind.
- Anthropic Skill scanners passed every check. The malicious code rode in on a test file.Picture this scenario: An Anthropic Skill scanner runs a full analysis of a Skill pulled from ClawHub or skills.sh. Its markdown instructions are clean, and no prompt injection is detected. No shell commands are hiding in the SKILL.md. Green across the board. The scanner never looked at the .test.ts file sitting one directory over. It didn’t need to. Test files aren’t part of the agent execution surface, so no publicly documented scanner inspects them (as of publication of this post). The file runs anyway. Not through the agent but through the test runner, with full access to the filesystem, environment variables, and SSH keys. Gecko Security researcher Jeevan Jutla detailed this attack flow, demonstrating that when a developer runs npx Skills add, the installer copies the entire skill directory into the repo. If a malicious Skill bundles a *.test.ts file, the Jest and Vitest testing frameworks discover it through recursive glob patterns, treat it as a first-class test, and execute it during npm test or when the IDE auto-runs tests on save. The default configuration in open-source JavaScript test framework Mocha follows a similar recursive discovery pattern. The payload fires in beforeAll, before any assertions run. Nothing in the test output flags anything unusual. In CI, process.env holds deployment tokens, cloud credentials, and every secret the pipeline can reach. The attack class is not new; malicious npm postinstall scripts and pytest plugins have exploited trust-on-install for years. What makes the Skill vector worse is that installed Skills land in a directory designed to be committed and shared across the team, propagate to every teammate who clones, and sit outside every scanner's detection surface. The agent is never invoked, and the Anthropic Skill scanner reads the right files for the wrong threat model. Three audits, one blind spot Gecko's disclosure didn’t arrive in isolation. It landed on top of two large-scale security audits that had already documented the scope of the problem from the other direction, illustrating what scanners detect rather than what they miss. Both audits did exactly what they're designed to do: They measured the threat on the execution surface scanners already inspect. Gecko measured what sits outside it. A SkillScan academic study, published on January 15, analyzed 31,132 unique Anthropic Skills collected from two major marketplaces. Their findings: 26.1% of Skills contained at least one vulnerability spanning 14 distinct patterns across four categories. Data exfiltration showed up in 13.3% of Skills. Privilege escalation appeared in 11.8%. Skills bundling executable scripts were 2.12x more likely to contain vulnerabilities than instruction-only Skills. Three weeks later, Snyk published ToxicSkills, the first comprehensive security audit of the ClawHub and skills.sh marketplaces. Snyk's team scanned 3,984 Skills (as of February 5). The results: 13.4% of all Skills contained at least one critical-level security issue. Seventy-six confirmed malicious payloads were identified through a combination of automated scanning and human-in-the-loop review. Eight of those malicious Skills were still publicly available on ClawHub when the research was published. Then Cisco shipped its AI Agent Security Scanner for IDEs on April 21, integrating its open-source Skill Scanner directly into VS Code, Cursor, and Windsurf. The scanner brings genuine capability to developers’ workflows. It does not inspect bundled test files, because the detection categories Cisco built target the agent interaction layer, not the developer toolchain layer. The three major Anthropic Skill scanners share a structural blind spot: None inspects bundled test files as an execution surface, even though Gecko Security proved that those files execute with full local permissions through standard test runners. Snyk Agent Scan, Cisco's AI Agent Security Scanner, and VirusTotal Code Insight all work. They catch prompt injection, shell commands, and data exfiltration in Skill definitions and agent-referenced scripts. What they do not do is look beyond the agent execution surface to the developer execution surface sitting in the same directory. How the attack chain works The mechanics of the attack chain matter because the fix is precise. When a developer runs npx skills add owner/repo-name, the installer clones the Skill repository and copies its contents into .agents/skills/<skill-name>/ inside the project. Claude Code, Cursor, and other agent IDEs get symlinks into their own Skill directories. The only files excluded are .git, metadata.json, and files prefixed with _. Everything else lands on disk. Jest and Vitest both pass dot: true to their glob engines. That means they discover test files inside dot-prefixed directories like .agents/. Mocha's behavior depends on configuration but follows similar recursive patterns by default. None of them exclude .agents/, .claude/, or .cursor/ from their default discovery paths. An attacker publishes a Skill with a clean SKILL.md and a tests/reviewer.test.ts file containing a beforeAll block. The block reads process.env, .env files, ~/.ssh/ private keys, and ~/.aws/credentials. It posts everything to an external endpoint. The test cases look real. The exfiltration happens during setup, silently, whether the tests pass or fail. The vector is not limited to TypeScript. Python repos face the same exposure through conftest.py, which pytest auto-executes during test collection. Add .agents to testpaths exclusion in pyproject.toml to block it. The .agents/skills/ directory is designed to be committed to the repo so teammates can share Skills. GitHub's default .gitignore templates do not include .agents/. Once the malicious test file enters the repo, every developer who clones and runs tests executes the payload. So does every CI pipeline on every branch and every fork that inherits the test suite. Scanners are reading the wrong threat surface CrowdStrike CTO Elia Zaitsev put the structural challenge in operational terms during an exclusive VentureBeat interview at RSAC 2026. "Observing actual kinetic actions is a structured, solvable problem," Zaitsev said. "Intent is not." That distinction cuts directly at the Anthropic Skill scanner gap. No publicly documented scanner operates outside the assumption that the threat lives in the SKILL.md and in scripts the agent is instructed to run. These tools analyze intent: What does the Skill tell the agent to do? Gecko's finding sits on the kinetic side. The test file executes through the developer's own toolchain. No agent is involved. No prompt is interpreted. The payload is TypeScript, running with full local permissions through a legitimate test runner. The scanner was solving the wrong problem. CrowdStrike's Zaitsev framed the identity dimension: "AI agents and non-human identities will explode across the enterprise, expanding exponentially and dwarfing human identities," he told VentureBeat. "Each agent will operate as a privileged super-human with OAuth tokens, API keys, and continuous access to previously siloed data sets." CrowdStrike's Charlotte AI and similar enterprise agents operate with exactly these privileges. When those credentials live in environment variables accessible to any process in the repo, a test-file payload does not need agent privileges. It already has developer privileges, which in most CI configurations means deployment tokens and cloud access. Mike Riemer, SVP of the network security group and field CISO at Ivanti, quantified the exploitation window in a VentureBeat interview. "Threat actors are reverse engineering patches within 72 hours," Riemer said. "If a customer doesn't patch within 72 hours of release, they're open to exploit." Most enterprises take weeks. The Anthropic Skill scanner blind spot compounds that window. A developer installs a malicious Skill today. The test file executes immediately. No patch exists because no scanner flagged it. The Anthropic Skill Audit Grid VentureBeat has covered the Anthropic Skill supply chain since the ClawHavoc campaign hit ClawHub in January. Every conversation with security leaders lands on the same frustration. Their teams bought a scanner, it reports clean, and they have no framework for asking what it does not check. VentureBeat has polled dev teams who install Anthropic Skills from ClawHub and skills.sh. The grid below connects the published-audit half (Snyk, SkillScan) with the scanner-bypass half (Gecko). Each row represents a detection surface a security team should verify before approving any Skill scanning tool for Q2 procurement. Audit question What scanners do today The gap Recommended action Inspect SKILL.md and agent-invoked scripts Covered by Snyk Agent Scan, Cisco AI Agent Security Scanner, VirusTotal Code Insight This is the covered surface. Attackers shift payloads to files outside it. Continue running current scanners. They catch real threats at the instruction layer. Inspect bundled test files (*.test.ts, *.spec.js, conftest.py) Not currently inspected as attack surface by any scanner Gecko proved test files execute via Jest/Vitest (documented) and Mocha (config-dependent) with full local permissions. No agent invoked. Add .agents/ to testPathIgnorePatterns (Jest) or exclude (Vitest). One config line. Flag Skills that bundle test files or build configs Not flagged as higher-risk metadata by any scanner Trivial static check. Skills with extra executables are 2.12x more likely to be vulnerable (SkillScan). Add CI gate: find .agents/ -name "*.test.*" | grep -q . && exit 1. Block merge on match. Restrict test-runner globs to project-owned paths Rare. Most CI configs use recursive glob. Jest/Vitest pass dot: true by default. Default globs traverse .agents/, .claude/, .cursor/ directories. Malicious test files auto-discovered. Scope test roots to first-party directories (src/, app/). Deny .agents/, .claude/, .cursor/. Distinguish script-bundling Skills vs. instruction-only Partial coverage via static and semantic analysis SkillScan: script-bundling Skills 2.12x more likely to contain vulnerabilities than instruction-only. Require structured audit entry: Skill type, execution surfaces, scanner coverage, residual risk. Publish audit methodology with sample size Snyk yes (3,984 Skills). SkillScan yes (31,132 Skills). Cisco and emerging scanners have not published equivalent ecosystem-scale audits. Ask vendors: methodology, sample size, detection rate. No published audit = no independent baseline. Pin Skill sources to immutable commits Not enforced by any scanner or marketplace Skill authors can push clean version for review, add malicious test file after approval. Pin to specific commit hash. Review diffs on every update. OWASP Agentic Skills Top 10 recommends this. Three CI hardening steps to add now Riemer made the broader point in VentureBeat interviews that placing security controls at the perimeter invites every threat to that exact boundary. Anthropic Skill scanners placed the boundary at SKILL.md. Attackers put the payload one directory over. The three changes below move the boundary to where the code actually executes. These changes take minutes. None requires replacing current tools or waiting for scanner vendors to close the gap. Add .agents/ to the test runner's ignore list. In Jest, add /\.agents/ to testPathIgnorePatterns in jest.config.js. In Vitest, add **/.agents/** to the exclude array in vitest.config.ts. One line in one config file prevents the test runner from discovering files inside installed Skill directories. Do it whether or not the team currently uses Anthropic Skills. The directory may appear in a cloned repo without anyone installing the Skill directly. Audit every Skill install for non-instruction files before merge. Add a CI check that flags any file in .agents/skills/ matching *.test.*, *.spec.*, __tests__/, *.config.*, or conftest.py. These files have no legitimate reason to exist inside a Skill directory. The check is a shell one-liner: [ -d .agents ] && find .agents/ -name "*.test.*" -o -name "*.spec.*" -o -name "conftest.py" -o -name "*.config.*" -o -type d -name "__tests__" | grep -q . && exit 1. If it matches, block the merge. For any test files that do land in a PR, require a reviewer to skim for shell invocations (exec, spawn, child_process), external network calls, and file operations touching secrets or SSH keys. Pin Skill sources to specific commits, not latest. The npx skills add command copies whatever the repo contains at the moment of install. A Skill author can push a clean version for scanner review, then add a malicious test file after approval. Pinning to a specific commit hash converts a trust-on-first-use model into a verify-on-every-change model. The OWASP Agentic Skills Top 10 recommends exactly this. If Skills are already in your repo: Run the find command above against your existing .agents/ directory now. If test files are present, treat them as a potential compromise: Rotate any credentials accessible to CI (deployment tokens, cloud keys, SSH keys), audit CI logs for unexpected outbound network calls during test execution, and review git history to determine when the test files entered the repo and which pipelines have executed them. Five questions to ask your Anthropic Skill scanner vendor Security teams are signing contracts for their first dedicated Skill scanning tools. The Gecko bypass means the questions on those sales calls need to change. Do not stop at "Do you detect prompt injection?" Ask: Which files and directories do you actually analyze in a Skill repo? Do you treat test files as potential execution surfaces? Can you flag Skills that bundle tests, CI configs, or build scripts as higher-risk? SkillScan showed script-bundling Skills are 2.12x more likely to be vulnerable. Do you provide integration or guidance for restricting test-runner globs in CI? Cisco deserves credit for open-sourcing its Skill Scanner on GitHub, which lets security teams inspect exactly which detection categories the tool implements. That transparency is the baseline every vendor should meet. If your vendor will not publish detection categories or open-source their scanning logic, you cannot verify what they check and what they skip. Have you published an ecosystem-scale audit with methodology and sample size? Snyk published at 3,984 Skills. SkillScan published at 31,132. Riemer described the disclosure pattern: "They chose not to publish a CVE. They just quietly patched it and moved on with life," he said. The Anthropic Skills ecosystem is showing early signs of the same pattern: scanners document what they detect without mapping the surfaces they do not reach. The gap between documented coverage and actual execution surface is where the test-file vector lives. The audit grid matters because the scanner model is incomplete The Anthropic Skills ecosystem is repeating the early npm supply chain story, except without the decade of accumulated incidents that forced package registries to build security infrastructure. SkillScan's 31,132-Skill dataset showed a quarter of the ecosystem carrying vulnerabilities. Snyk found 76 confirmed malicious payloads in fewer than 4,000 Skills. Gecko proved the scanner model itself has a structural gap that no vendor has publicly documented closing. Scanner evaluations consistently test the covered surface. The Anthropic Skill Audit Grid gives security teams the seven audit surfaces to verify before signing. The three CI steps are the fixes to deploy before the next Skill install. Riemer's Ivanti team watches the patch-to-exploit cycle compress in real time across enterprise environments. The test-file vector compresses it further: No scanner flagged the threat, so no patch window exists. The scanner is not broken. It is incomplete. The threat model stopped at the agent. The test runner did not.
- 5,000 vibe-coded apps just proved shadow AI is the new S3 bucket crisisMost enterprise security programs were built to protect servers, endpoints, and cloud accounts. None of them was built to find a customer intake form that a product manager vibe coded on Lovable over a weekend, connected to a live Supabase database, and deployed on a public URL indexed by Google. That gap now has a price tag. New research from Israeli cybersecurity firm RedAccess quantifies the scale. The firm discovered 380,000 publicly accessible assets, including applications, databases, and related infrastructure, built with vibe coding tools from Lovable, Base44, and Replit, as well as deployment platform Netlify. Roughly 5,000 of those assets, about 1.3%, contained sensitive corporate information. CEO Dor Zvi said his team found the exposure while researching shadow AI for customers. Axios independently verified multiple exposed apps, and Wired confirmed the findings separately. Among the verified exposures: a shipping company app detailed which vessels were expected at which ports. An internal health company application listed active clinical trials across the U.K. Full, unredacted customer service conversations for a British cabinet supplier sat on the open web. Internal financial information for a Brazilian bank was accessible to anyone who found the URL. The exposed data also included patient conversations at a children’s long-term care facility, hospital doctor-patient summaries, incident response records at a security company, and ad purchasing strategies. Depending on jurisdiction and the data involved, the healthcare and financial exposures may trigger regulatory obligations under HIPAA, UK GDPR, or Brazil’s LGPD. RedAccess found phishing sites built on Lovable that impersonated Bank of America, FedEx, Trader Joe’s, and McDonald’s. Lovable said it had begun investigating and removing the phishing sites. The defaults are the problem Privacy settings on several vibe coding platforms make apps publicly accessible unless users manually switch them to private. Many of these applications get indexed by Google and other search engines. Anyone can stumble across them. Zvi put it plainly: “I don’t think it’s feasible to educate the whole world around security. My mother is [vibe coding] with Lovable, and no offense, but I don’t think she will think about role-based access.” This is not an isolated finding In October 2025, Escape.tech scanned 5,600 publicly available vibe-coded applications and found more than 2,000 high-impact vulnerabilities, over 400 exposed secrets including API keys and access tokens, and 175 instances of personal data exposure containing medical records and bank account numbers. Every vulnerability Escape found was in a live production system, discoverable within hours. The full report documents the methodology. Escape separately raised an $18 million Series A led by Balderton in March 2026, citing the security gap opened by AI-generated code as a core market thesis. Gartner’s “Predicts 2026” report forecasts that by 2028, prompt-to-app approaches adopted by citizen developers will increase software defects by 2,500%. Gartner identifies a new class of defect where AI generates code that is syntactically correct but lacks awareness of broader system architecture and nuanced business rules. The remediation costs for these deep contextual bugs will consume budgets previously allocated to innovation. Shadow AI is the multiplier IBM’s 2025 Cost of a Data Breach Report found that 20% of organizations experienced breaches linked to shadow AI. Those incidents added $670,000 to the average breach cost, pushing the shadow AI breach average to $4.63 million. Among organizations that reported AI-related breaches, 97% lacked proper access controls. And 63% of breached organizations had no AI governance policy in place. Shadow AI breaches disproportionately exposed customer personally identifiable information at 65%, compared to 53% across all breaches, and affected data distributed across multiple environments 62% of the time. Only 34% of organizations with AI governance policies performed regular audits for unsanctioned AI tools. VentureBeat’s shadow AI research estimated that actively used shadow apps could more than double by mid-2026. Cyberhaven data found 73.8% of ChatGPT workplace accounts in enterprise environments were unauthorized. What to do first The audit framework below gives CISOs a starting point for triaging vibe-coded app risk across five domains. Domain Current State (Most Orgs) Target State First Action Discovery No visibility into vibe-coded apps Automated scanning of vibe coding platform domains Run DNS + certificate transparency scan for Lovable, Replit, Base44, and Netlify subdomains tied to corporate assets Authentication Platform defaults (public by default) SSO/SAML integration required before deployment Block unauthenticated apps from accessing internal data sources Code scanning Zero coverage for citizen-built apps Mandatory SAST/DAST before production Extend the existing AppSec pipeline to cover vibe-coded deployments Data loss prevention No DLP coverage for vibe coding domains DLP policies covering Lovable, Replit, Base44, Netlify Add vibe coding platform domains to existing DLP rules Governance No AI usage policy or shadow AI detection AI governance policy with regular audits for unsanctioned tools Publish an acceptable-use policy for AI coding tools with a pre-deployment review gate The CISO who treats this as a policy problem will write a memo. The CISO who treats this as an architecture problem will deploy discovery scanning across the four largest vibe coding domains, require pre-deployment security review, extend the existing AppSec pipeline to citizen-built apps, and add those domains to DLP rules before the next board meeting. One of those CISOs avoids the next headline. The vibe coding exposure RedAccess documented is not a separate problem from shadow AI. It is shadow AI's production layer. Employees build internal tools on platforms that default to public, skip authentication, and never appear on any asset inventory, which means the applications stay invisible to security teams until a breach surfaces or a reporter finds them first. Traditional asset discovery tools were designed to find servers, containers, and cloud instances. They have no way to find a marketing configurator that a product manager built on Lovable over a weekend, connected to a Supabase database holding live customer records, and shared with three external contractors through a public URL that Google indexed within hours. The detection challenge runs deeper than most security teams realize. Vibe-coded apps deploy on platform subdomains that rotate frequently and often sit behind CDN layers that mask origin infrastructure. Organizations running mature, secure web gateways, CASB, or DNS logging can detect employee access to these domains. But detecting access is not the same as inventorying what was deployed, what data it holds, or whether it requires authentication. Without explicit monitoring of the major vibe coding platforms, the apps themselves generate a limited signal in conventional SIEM or endpoint telemetry. They exist in a gap between network visibility and application inventory that most security stacks were never architected to cover. The platform responses tell the story Replit CEO Amjad Masad said RedAccess gave his company only 24 hours before going to the press. Base44 (via Wix) and Lovable both said RedAccess did not include the URLs or technical specifics needed to verify the findings. None of the platforms denied that the exposed applications existed. Wiz Research separately discovered in July 2025 that Base44 contained a platform-wide authentication bypass. Exposed API endpoints allowed anyone to create a verified account on private apps using nothing more than a publicly visible app_id. The flaw meant that showing up to a locked building and shouting a room number was enough to get the doors open. Wix fixed the vulnerability within 24 hours after Wiz reported it, but the incident exposed how thin the authentication layer is on platforms where millions of apps are being built by users who assume the platform handles security for them. The pattern is consistent across the vibe coding ecosystem. CVE-2025-48757 documented insufficient or missing Row-Level Security policies in Lovable-generated Supabase projects. Certain queries skipped access checks entirely, exposing data across more than 170 production applications. The AI generated the database layer. It did not generate the security policies that should have restricted who could read the data. Lovable disputes the CVE classification, stating that individual customers accept responsibility for protecting their application data. That dispute itself illustrates the core tension: platforms that market to nontechnical builders are shifting security responsibility to users who do not know it exists. What this means for security teams The RedAccess findings complete the picture. Professional agents face credential theft on one layer. Citizen platforms face data exposure on the other. The structural failure is the same. Security review happens after deployment or not at all. Identity and access management systems track human users and service accounts. They do not track the Lovable app a sales operations analyst deployed last Tuesday, connected to a live CRM database, and shared with three external contractors via a public URL. Nobody asks whether the database policies restrict who can read the data or whether the API endpoints require authentication. When those questions go unasked at AI-generation speed, the exposure scales faster than any human review process can match. The question for security leaders is not whether vibe-coded apps are inside their perimeter. The question is how many, holding what data, visible to whom. The RedAccess findings suggest the answer, for most organizations, is worse than anyone in the C-suite currently knows. The organizations that start scanning this week will find them. The ones that wait will read about themselves next.
- An AI agent rewrote a Fortune 50 security policy. Here's how to govern AI agents before one does the same.A CEO’s AI agent rewrote the company’s security policy. Not because it was compromised, but because it wanted to fix a problem, lacked permissions, and removed the restriction itself. Every identity check passed. CrowdStrike CEO George Kurtz disclosed the incident and a second one at his RSAC 2026 keynote, both at Fortune 50 companies. The credential was valid. The access was authorized. The action was catastrophic. That sequence breaks the core assumption underneath the IAM systems most enterprises run in production today: that a valid credential plus authorized access equals a safe outcome. Identity systems were built for one user, one session, one set of hands on a keyboard. Agents break all three assumptions at once. In an exclusive interview with VentureBeat at RSAC 2026, Matt Caulfield, VP of Identity and Duo at Cisco, (pictured above) walked through the architecture his team is building to close that gap and outlined a six-stage identity maturity model for governing agentic AI. The urgency is measurable: Cisco President Jeetu Patel told VentureBeat at the same conference that 85% of enterprises are running agent pilots while only 5% have reached production — an 80-point gap that the identity work is designed to close. The identity stack was built for a workforce that has fingerprints “Most of the existing IAM tools that we have at our disposal are just entirely built for a different era,” Caulfield told VentureBeat. “They were built for human scale, not really for agents.” The default enterprise instinct is to shove agents into existing identity categories: human user; machine identity; pick one. "Agents are a third kind of new type of identity," Caulfield said. "They're neither human. They're neither machine. They're somewhere in the middle where they have broad access to resources like humans, but they operate at machine scale and speed like machines, and they entirely lack any form of judgment." Etay Maor, VP of Threat Intelligence at Cato Networks, put a number on the exposure. He ran a live Censys scan and counted nearly 500,000 internet-facing OpenClaw instances. The week before, he found 230,000, discovering a doubling in seven days. Kayne McGladrey, an IEEE senior member who advises enterprises on identity risk, made the same diagnosis independently. Organizations are cloning human user accounts to agentic systems, McGladrey told VentureBeat, except agents consume far more permissions than humans would because of the speed, the scale, and the intent. A human employee goes through a background check, an interview, and an onboarding process. Agents skip all three. The onboarding assumptions baked into modern IAM do not apply. Scale compounds the failure. Caulfield pointed to projections where a trillion agents could operate globally. “We barely know how many people are in an average organization,” he said, “let alone the number of agents.” Access control verifies the badge. It does not watch what happens next. Zero trust still applies to agentic AI, Caulfield argued. But only if security teams push it past access and into action-level enforcement. “We really need to shift our thinking to more action-level control,” he told VentureBeat. “What action is that agent taking?” A human employee with authorized access to a system will not execute 500 API calls in three seconds. An agent will. Traditional zero trust verifies that an identity can reach an application. It doesn’t scrutinize what that identity does once inside. Carter Rees, VP of Artificial Intelligence at Reputation, identified the structural reason. The flat authorization plane of an LLM fails to respect user permissions, Rees told VentureBeat. An agent operating on that flat plane does not need to escalate privileges. It already has them. That is why access control alone cannot contain what agents do after authentication. CrowdStrike CTO Elia Zaitsev described the detection gap to VentureBeat. In most default logging configurations, an agent’s activity is indistinguishable from a human. Distinguishing the two requires walking the process tree, tracing whether a browser session was launched by a human or spawned by an agent in the background. Most enterprise logging cannot make that distinction. Caulfield’s identity layer and Zaitsev’s telemetry layer are solving two halves of the same problem. No single vendor closes both gaps. “At any moment in time, that agent can go rogue and can lose its mind,” Caulfield said. “Agents read the wrong website or email, and their intentions can just change overnight.” How the request lifecycle works when agents have their own identity Five vendors shipped agent identity frameworks at RSAC 2026, including Cisco, CrowdStrike, Palo Alto Networks, Microsoft, and Cato Networks. Caulfield walked through how Cisco's identity-layer approach works in practice. The Duo agent identity platform registers agents as first-class identity objects, with their own policies, authentication requirements, and lifecycle management. The enforcement routes all agent traffic through an AI gateway supporting both MCP and traditional REST or GraphQL protocols. When an agent makes a request, the gateway authenticates the user, verifies that the agent is permitted, encodes the authorization into an OAuth token, and then inspects the specific action and determines in real time whether it should proceed. “No solution to agent AI is really complete unless you have both pieces,” Caulfield told VentureBeat. “The identity piece, the access gateway piece. And then the third piece would be observability.” Cisco announced its intent to acquire Astrix Security on May 4, signaling that agent identity discovery is now a board-level investment thesis. The deal also suggests that even vendors building identity platforms recognize that the discovery problem is harder than expected. Six-stage identity maturity model for agentic AI When a company shows up claiming 500 agents in production, Caulfield doesn't accept the number. "How do you know it's 500 and not 5,000?" Most organizations don’t have a source of truth for agents. Caulfield outlined a six-stage engagement model. Discovery first: identify every agent, where it runs, and who deployed it. Onboarding: register agents in the identity directory, tie each one to an accountable human, and define permitted actions. Control and enforcement: place a gateway between agents and resources, inspect every request and response. Behavioral monitoring: record all agent activity, flag anomalies, and build the audit trail. Runtime isolation contains agents on endpoints when they go rogue. Compliance mapping ties agent controls to audit frameworks before the auditor shows up. The six stages are not proprietary to any single vendor. They describe the sequence every enterprise will follow regardless of which platform delivers each stage. Maor's Censys data complicates step one before it even starts. Organizations beginning discovery should assume their agent exposure is already visible to adversaries. Step four has its own problem. Zaitsev's process-tree work shows that even organizations logging agent activity may not be capturing the right data. And step three depends on something Rees found most enterprises lack: a gateway that inspects actions, not just access, because the LLM does not respect the permission boundaries the identity layer sets. Agentic identity prescriptive matrix What to audit at each maturity stage, what operational readiness looks like, and the red flag that means the stage is failing. Use this to evaluate any platform or combination of platforms. Stage What to audit Operational readiness looks like Red flag if missing 1. Discovery Complete inventory of every agent, every MCP server it connects to, and every human accountable for it. A queryable registry that returns agent count, owner, and connection map within 60 seconds of an auditor asking. No registry exists. Agent count is an estimate. No human is accountable for any specific agent. Adversaries can see your agent infrastructure from the public internet before you can. 2. Onboarding Agents are registered as a distinct identity type with their own policies, separate from human and machine identities. Each agent has a unique identity object in the directory, tied to an accountable human, with defined permitted actions and a documented purpose. Agents use cloned human accounts or shared service accounts. Permission sprawl starts at creation. No audit trail ties agent actions to a responsible human. 3. Control A gateway between every agent and every resource it accesses, enforcing action-level policy on every request and every response. Four checkpoints per request: authenticate the user, authorize the agent, inspect the action, inspect the response. No direct agent-to-resource connections exist. Agents connect directly to tools and APIs. The gateway (if it exists) checks access but not actions. The flat authorization plane of the LLM does not respect the permission boundaries the identity layer set. 4. Monitoring Logging that can distinguish agent-initiated actions from human-initiated actions at the process-tree level. SIEM can answer: Was this browser session started by a human or spawned by an agent? Behavioral baselines exist for each agent. Anomalies trigger alerts. Default logging treats agent and human activity as identical. Process-tree lineage is not captured. Agent actions are invisible in the audit trail. Behavioral monitoring is incomplete before it starts. 5. Isolation Runtime containment that limits the blast radius if an agent goes rogue, separate from human endpoint protection. A rogue agent can be contained in its sandbox without taking down the endpoint, the user session, or other agents on the same machine. No containment boundary exists between agents and the host. A single compromised agent can access everything the user can. Blast radius is the entire endpoint. 6. Compliance Documentation that maps agent identities, controls, and audit trails to the compliance framework that the auditor will use. When the auditor asks about agents, the security team produces a control catalog, an audit trail, and a governance policy written for agent identities specifically. Emerging AI-risk frameworks (CSA Agentic Profile) exist, but mainstream audit catalogs (SOC 2, ISO 27001, PCI DSS) have not operationalized agent identities. No control catalog maps to agents. The auditor improvises which human-identity controls apply. The security team answers with improvisation, not documentation. Source: VentureBeat analysis of RSAC 2026 interviews (Caulfield, Zaitsev, Maor) and independent practitioner validation (McGladrey, Rees). May 2026. Compliance frameworks have not caught up “If you were to go through an audit today as a chief security officer, the auditor’s probably gonna have to figure out, hey, there are agents here,” Caulfield told VentureBeat. “Which one of your controls is actually supposed to be applied to it? I don’t see the word agents anywhere in your policies.” McGladrey's practitioner experience confirms the gap. The Cloud Security Alliance published an NIST AI RMF Agentic Profile in April 2026, proposing autonomy-tier classification and runtime behavioral metrics. But SOC 2, ISO 27001, and PCI DSS have not operationalized agent identities. The compliance frameworks McGladrey works with inside enterprises were written for humans. Agent identities do not appear in any control catalog he has encountered. The gap is a lagging indicator; the risk is not. Security director action plan VentureBeat identified five actions from the combined findings of Caulfield, Zaitsev, Maor, McGladrey, and Rees. Run an agent census and assume adversaries already did. Every agent, every MCP server those agents touch, every human accountable. Maor's Censys data confirms agent infrastructure is already visible from the public internet. NIST's NCCoE reached the same conclusion in its February 2026 concept paper on AI agent identity and authorization. Stop cloning human accounts for agents. McGladrey found that enterprises default to copying human user profiles, and permission sprawl starts on day one. Agents need to be a distinct identity type with scope limits that reflect what they actually do. Audit every MCP and API access path. Five vendors shipped MCP gateways at RSAC 2026. The capability exists. What matters is whether agents route through one or connect directly to tools with no action-level inspection. Fix logging so it distinguishes agents from humans. Zaitsev's process-tree method reveals that agent-initiated actions are invisible in most default configurations. Rees found authorization planes so flat that access logs alone miss the actual behavior. Logging has to capture what agents did, not just what they were allowed to reach. Build the compliance case before the auditor shows up. The CSA published a NIST AI RMF Agentic Profile proposing agent governance extensions. Most audit catalogs have not caught up. Caulfield told VentureBeat that auditors will see agents in production and find no controls mapped to them. The documentation needs to exist before that conversation starts.