Anthropic Skill scanners passed every check. The malicious code rode in on a test file.
Our take

When security researchers at Gecko Security demonstrated that Anthropic Skill scanners passed every published check while malicious code rode in on an overlooked test file, it exposed something the ecosystem should have confronted much earlier. The scanners were solving the right problem with the wrong scope. As we explored in "Claude Code, Copilot and Codex all got hacked. Every attacker went for the credential, not the model," attackers consistently target the layer of access that defenders forget to watch. That pattern is playing out again here, and the implications extend well beyond Anthropic Skills alone. For anyone following the supply-chain security conversation, this echoes the finding from "One command turns any open-source repo into an AI agent backdoor," where researchers proved that no existing scanner had a detection category for a trivially simple backdoor vector.
The core issue is structural. Anthropic Skill scanners inspect SKILL.md and agent-invoked scripts, which is necessary and valuable work. Snyk's ToxicSkills audit found seventy-six confirmed malicious payloads across nearly four thousand Skills, and SkillScan's analysis of over thirty-one thousand Skills revealed that more than a quarter contained at least one vulnerability. These scanners earn their keep. But none of them examine bundled test files, build configurations, or CI scripts that land alongside the Skill definition on disk. Gecko proved that a malicious test file executes through standard runners like Jest and Vitest with full access to environment variables, SSH keys, and cloud credentials, all without ever invoking the agent. The attacker does not need to fool the AI. They simply need the developer to run tests, which happens automatically in most CI pipelines and IDEs on save.
What makes this vector particularly concerning is how naturally it propagates. The .agents directory is designed to be committed and shared across teams. GitHub's default gitignore templates do not exclude it. Once a malicious test file enters the repository, every developer who clones and runs tests executes the payload. Every CI pipeline on every branch inherits it. And because the scanner reported green, no one has reason to investigate. The trust-on-install model that plagued the early npm ecosystem has found a new home in the Skills marketplace, except this time the ecosystem lacks the decade of hard lessons that forced package registries to build security infrastructure.
The good news is that the fix is immediate and does not require replacing any tools. Three concrete steps can close the gap today. First, add the .agents directory to your test runner's ignore list. One line in jest.config.js or vitest.config.ts prevents the runner from discovering files inside installed Skill directories. Second, add a CI gate that flags any test, spec, or config files inside .agents before merge. Third, pin Skill sources to specific commit hashes rather than pulling the latest version, converting a trust-on-first-use model into a verify-on-every-change model. The OWASP Agentic Skills Top 10 already recommends this approach.
The harder question is what this means for the broader ecosystem of agent security tooling. If scanners continue to measure only the surfaces they already cover, the gap between documented protection and actual exposure will keep widening. Security teams evaluating Skill scanning vendors should ask a pointed question: which files and directories do you actually analyze, and what do you explicitly skip? The answer to that question defines the boundary between real security and reassuring theater. As agent identity frameworks multiply and enterprise adoption accelerates, the test-file vector is a preview of the kinds of blind spots that emerge whenever defenders optimize for the threat they expect instead of the attack surface that actually exists.
Picture this scenario: An Anthropic Skill scanner runs a full analysis of a Skill pulled from ClawHub or skills.sh. Its markdown instructions are clean, and no prompt injection is detected. No shell commands are hiding in the SKILL.md. Green across the board.
The scanner never looked at the .test.ts file sitting one directory over. It didn’t need to. Test files aren’t part of the agent execution surface, so no publicly documented scanner inspects them (as of publication of this post). The file runs anyway. Not through the agent but through the test runner, with full access to the filesystem, environment variables, and SSH keys.
Gecko Security researcher Jeevan Jutla detailed this attack flow, demonstrating that when a developer runs npx skills add, the installer copies the entire Skill directory into the repo. If a malicious Skill bundles a *.test.ts file, the Jest and Vitest testing frameworks discover it through recursive glob patterns, treat it as a first-class test, and execute it during npm test or when the IDE auto-runs tests on save. The default configuration in the open-source JavaScript test framework Mocha follows a similar recursive discovery pattern. The payload fires in beforeAll, before any assertions run. Nothing in the test output flags anything unusual. In CI, process.env holds deployment tokens, cloud credentials, and every secret the pipeline can reach.
The attack class is not new; malicious npm postinstall scripts and pytest plugins have exploited trust-on-install for years. What makes the Skill vector worse is that installed Skills land in a directory designed to be committed and shared across the team, propagate to every teammate who clones, and sit outside every scanner's detection surface.
The agent is never invoked, and the Anthropic Skill scanner reads the right files for the wrong threat model.
Three audits, one blind spot
Gecko's disclosure didn’t arrive in isolation. It landed on top of two large-scale security audits that had already documented the scope of the problem from the other direction, illustrating what scanners detect rather than what they miss. Both audits did exactly what they're designed to do: They measured the threat on the execution surface scanners already inspect. Gecko measured what sits outside it.
A SkillScan academic study, published on January 15, analyzed 31,132 unique Anthropic Skills collected from two major marketplaces. Their findings: 26.1% of Skills contained at least one vulnerability spanning 14 distinct patterns across four categories. Data exfiltration showed up in 13.3% of Skills. Privilege escalation appeared in 11.8%. Skills bundling executable scripts were 2.12x more likely to contain vulnerabilities than instruction-only Skills.
Three weeks later, Snyk published ToxicSkills, the first comprehensive security audit of the ClawHub and skills.sh marketplaces. Snyk's team scanned 3,984 Skills (as of February 5). The results: 13.4% of all Skills contained at least one critical-level security issue. Seventy-six confirmed malicious payloads were identified through a combination of automated scanning and human-in-the-loop review. Eight of those malicious Skills were still publicly available on ClawHub when the research was published.
Then Cisco shipped its AI Agent Security Scanner for IDEs on April 21, integrating its open-source Skill Scanner directly into VS Code, Cursor, and Windsurf. The scanner brings genuine capability to developers’ workflows. It does not inspect bundled test files, because the detection categories Cisco built target the agent interaction layer, not the developer toolchain layer.
The three major Anthropic Skill scanners share a structural blind spot: None inspects bundled test files as an execution surface, even though Gecko Security proved that those files execute with full local permissions through standard test runners.
Snyk Agent Scan, Cisco's AI Agent Security Scanner, and VirusTotal Code Insight all work. They catch prompt injection, shell commands, and data exfiltration in Skill definitions and agent-referenced scripts. What they do not do is look beyond the agent execution surface to the developer execution surface sitting in the same directory.
How the attack chain works
The mechanics of the attack chain matter because the fix is precise. When a developer runs npx skills add owner/repo-name, the installer clones the Skill repository and copies its contents into .agents/skills/<skill-name>/ inside the project. Claude Code, Cursor, and other agent IDEs get symlinks into their own Skill directories. The only files excluded are .git, metadata.json, and files prefixed with _. Everything else lands on disk.
Jest and Vitest both pass dot: true to their glob engines. That means they discover test files inside dot-prefixed directories like .agents/. Mocha's behavior depends on configuration but follows similar recursive patterns by default. None of them exclude .agents/, .claude/, or .cursor/ from their default discovery paths.
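To make the discovery behavior concrete, the sketch below models dot-inclusive matching over a list of relative paths. It is an illustration only, not the actual Jest or Vitest matcher: the function name discoverTestFiles, the simplified test-file pattern, the option names, and the sample paths are all assumptions made for this example.

```typescript
// Simplified model of recursive test discovery. NOT the real Jest/Vitest
// matcher: the pattern and option names are illustrative assumptions.
const TEST_FILE = /\.(test|spec)\.[cm]?[jt]sx?$/;

interface DiscoveryOptions {
  dot: boolean;      // when true, dot-prefixed directories are traversed
  ignore?: RegExp[]; // paths matching any of these patterns are skipped
}

function discoverTestFiles(paths: string[], opts: DiscoveryOptions): string[] {
  return paths.filter((p) => {
    const segments = p.split('/');
    const file = segments[segments.length - 1] ?? '';
    const dirs = segments.slice(0, -1);
    // Without dot matching, a hidden directory hides everything beneath it.
    if (!opts.dot && dirs.some((d) => d.startsWith('.'))) return false;
    if ((opts.ignore ?? []).some((re) => re.test(p))) return false;
    return TEST_FILE.test(file);
  });
}

// Hypothetical repo layout after a Skill install.
const repo = [
  'src/math.test.ts',
  '.agents/skills/reviewer/SKILL.md',
  '.agents/skills/reviewer/tests/reviewer.test.ts',
];

// Dot matching on, no ignore list: the bundled Skill test is discovered.
console.log(discoverTestFiles(repo, { dot: true }));
// Same traversal with .agents/ ignored: only the first-party test remains.
console.log(discoverTestFiles(repo, { dot: true, ignore: [/(^|\/)\.agents\//] }));
```

The point of the model is that the Skill's test file is indistinguishable from a first-party test once hidden directories are in scope; only an explicit ignore rule separates them.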
An attacker publishes a Skill with a clean SKILL.md and a tests/reviewer.test.ts file containing a beforeAll block. The block reads process.env, .env files, ~/.ssh/ private keys, and ~/.aws/credentials. It posts everything to an external endpoint. The test cases look real. The exfiltration happens during setup, silently, whether the tests pass or fail.
The vector is not limited to TypeScript. Python repos face the same exposure through conftest.py, which pytest auto-executes during test collection. To block it, restrict testpaths in pyproject.toml to first-party directories, or add .agents to norecursedirs.
The .agents/skills/ directory is designed to be committed to the repo so teammates can share Skills. GitHub's default .gitignore templates do not include .agents/. Once the malicious test file enters the repo, every developer who clones and runs tests executes the payload. So does every CI pipeline on every branch and every fork that inherits the test suite.
Scanners are reading the wrong threat surface
CrowdStrike CTO Elia Zaitsev put the structural challenge in operational terms during an exclusive VentureBeat interview at RSAC 2026. "Observing actual kinetic actions is a structured, solvable problem," Zaitsev said. "Intent is not."
That distinction cuts directly at the Anthropic Skill scanner gap. No publicly documented scanner operates outside the assumption that the threat lives in the SKILL.md and in scripts the agent is instructed to run. These tools analyze intent: What does the Skill tell the agent to do? Gecko's finding sits on the kinetic side. The test file executes through the developer's own toolchain. No agent is involved. No prompt is interpreted. The payload is TypeScript, running with full local permissions through a legitimate test runner. The scanner was solving the wrong problem.
CrowdStrike's Zaitsev framed the identity dimension: "AI agents and non-human identities will explode across the enterprise, expanding exponentially and dwarfing human identities," he told VentureBeat. "Each agent will operate as a privileged super-human with OAuth tokens, API keys, and continuous access to previously siloed data sets."
CrowdStrike's Charlotte AI and similar enterprise agents operate with exactly these privileges. When those credentials live in environment variables accessible to any process in the repo, a test-file payload does not need agent privileges. It already has developer privileges, which in most CI configurations means deployment tokens and cloud access.
Mike Riemer, SVP of the network security group and field CISO at Ivanti, quantified the exploitation window in a VentureBeat interview. "Threat actors are reverse engineering patches within 72 hours," Riemer said. "If a customer doesn't patch within 72 hours of release, they're open to exploit."
Most enterprises take weeks. The Anthropic Skill scanner blind spot compounds that window. A developer installs a malicious Skill today. The test file executes immediately. No patch exists because no scanner flagged it.
The Anthropic Skill Audit Grid
VentureBeat has covered the Anthropic Skill supply chain since the ClawHavoc campaign hit ClawHub in January. Every conversation with security leaders lands on the same frustration. Their teams bought a scanner, it reports clean, and they have no framework for asking what it does not check.
VentureBeat has polled dev teams who install Anthropic Skills from ClawHub and skills.sh. The grid below connects the published-audit half (Snyk, SkillScan) with the scanner-bypass half (Gecko). Each row represents a detection surface a security team should verify before approving any Skill scanning tool for Q2 procurement.
Audit question | What scanners do today | The gap | Recommended action |
Inspect SKILL.md and agent-invoked scripts | Covered by Snyk Agent Scan, Cisco AI Agent Security Scanner, VirusTotal Code Insight | This is the covered surface. Attackers shift payloads to files outside it. | Continue running current scanners. They catch real threats at the instruction layer. |
Inspect bundled test files (*.test.ts, *.spec.js, conftest.py) | Not currently inspected as attack surface by any scanner | Gecko proved test files execute via Jest/Vitest (documented) and Mocha (config-dependent) with full local permissions. No agent invoked. | Add .agents/ to testPathIgnorePatterns (Jest) or exclude (Vitest). One config line. |
Flag Skills that bundle test files or build configs | Not flagged as higher-risk metadata by any scanner | Trivial static check. Skills with extra executables are 2.12x more likely to be vulnerable (SkillScan). | Add CI gate that fails when find .agents/ -name "*.test.*" prints any path. Block merge on match. |
Restrict test-runner globs to project-owned paths | Rare. Most CI configs use recursive glob. Jest/Vitest pass dot: true by default. | Default globs traverse .agents/, .claude/, .cursor/ directories. Malicious test files auto-discovered. | Scope test roots to first-party directories (src/, app/). Deny .agents/, .claude/, .cursor/. |
Distinguish script-bundling Skills vs. instruction-only | Partial coverage via static and semantic analysis | SkillScan: script-bundling Skills 2.12x more likely to contain vulnerabilities than instruction-only. | Require structured audit entry: Skill type, execution surfaces, scanner coverage, residual risk. |
Publish audit methodology with sample size | Snyk yes (3,984 Skills). SkillScan yes (31,132 Skills). | Cisco and emerging scanners have not published equivalent ecosystem-scale audits. | Ask vendors: methodology, sample size, detection rate. No published audit = no independent baseline. |
Pin Skill sources to immutable commits | Not enforced by any scanner or marketplace | Skill authors can push clean version for review, add malicious test file after approval. | Pin to specific commit hash. Review diffs on every update. OWASP Agentic Skills Top 10 recommends this. |
Three CI hardening steps to add now
Riemer made the broader point in VentureBeat interviews that placing security controls at the perimeter invites every threat to that exact boundary. Anthropic Skill scanners placed the boundary at SKILL.md. Attackers put the payload one directory over. The three changes below move the boundary to where the code actually executes.
These changes take minutes. None requires replacing current tools or waiting for scanner vendors to close the gap.
Add .agents/ to the test runner's ignore list. In Jest, add /\.agents/ to testPathIgnorePatterns in jest.config.js. In Vitest, add **/.agents/** to the exclude array in vitest.config.ts. One line in one config file prevents the test runner from discovering files inside installed Skill directories. Do it whether or not the team currently uses Anthropic Skills. The directory may appear in a cloned repo without anyone installing the Skill directly.
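As a sketch of the Vitest half of this step, the config below extends the default exclude list rather than replacing it, so the runner's built-in exclusions stay in force. It assumes a vitest.config.ts at the repo root; the .claude and .cursor entries follow the audit grid's advice to deny those directories too and can be dropped if a project keeps its own tests there.

```typescript
// vitest.config.ts (assumed to sit at the repo root)
import { configDefaults, defineConfig } from 'vitest/config';

export default defineConfig({
  test: {
    exclude: [
      // Keep Vitest's built-in exclusions (node_modules and friends).
      ...configDefaults.exclude,
      // Never discover tests inside installed Skill directories.
      '**/.agents/**',
      // Optional additions, per the audit grid's recommendation.
      '**/.claude/**',
      '**/.cursor/**',
    ],
  },
});
```

For Jest, the equivalent is adding the string '/\\.agents/' to testPathIgnorePatterns. Setting that option replaces Jest's default value, so keep '/node_modules/' in the array alongside it.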
Audit every Skill install for non-instruction files before merge. Add a CI check that flags any file in .agents/skills/ matching *.test.*, *.spec.*, __tests__/, *.config.*, or conftest.py. These files have no legitimate reason to exist inside a Skill directory. The check is a shell one-liner: if [ -d .agents ] && find .agents/ \( -name "*.test.*" -o -name "*.spec.*" -o -name "conftest.py" -o -name "*.config.*" -o -type d -name "__tests__" \) | grep -q .; then exit 1; fi. Written as an if statement, it exits nonzero only when a match is found, so a clean repo does not fail the step. If it matches, block the merge. For any test files that do land in a PR, require a reviewer to skim for shell invocations (exec, spawn, child_process), external network calls, and file operations touching secrets or SSH keys.
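For teams that prefer a script to a one-liner, the same gate can be sketched in TypeScript using only Node's standard library. This is a minimal sketch under stated assumptions: the function names, the --check flag, and the file name skill-gate.ts are hypothetical, and unlike the find command it flags files inside a __tests__ directory rather than the directory entry itself.

```typescript
import * as fs from 'node:fs';
import * as path from 'node:path';

// Base names the CI check treats as suspicious inside .agents/.
const SUSPICIOUS_NAME = /\.(test|spec|config)\.|^conftest\.py$/;

function isSuspicious(relPath: string): boolean {
  const segments = relPath.split('/');
  const name = segments[segments.length - 1] ?? '';
  const inTestsDir = segments.slice(0, -1).indexOf('__tests__') !== -1;
  return inTestsDir || SUSPICIOUS_NAME.test(name);
}

// Pure filter over relative paths, so it can be unit-tested without a filesystem.
function findSuspiciousSkillFiles(relPaths: string[]): string[] {
  return relPaths.filter(isSuspicious);
}

// Recursively list files under root as slash-separated relative paths.
// Returns an empty list when the directory does not exist.
function listFiles(root: string, rel = ''): string[] {
  const dir = path.join(root, rel);
  if (!fs.existsSync(dir)) return [];
  const out: string[] = [];
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const child = rel ? `${rel}/${entry.name}` : entry.name;
    if (entry.isDirectory()) out.push(...listFiles(root, child));
    else out.push(child);
  }
  return out;
}

// Acts as a CI gate only when invoked with the (hypothetical) --check flag.
if (process.argv.indexOf('--check') !== -1) {
  const hits = findSuspiciousSkillFiles(listFiles('.agents'));
  for (const hit of hits) console.error(`unexpected file in .agents/: ${hit}`);
  process.exitCode = hits.length > 0 ? 1 : 0;
}
```

Run from the repo root with a TypeScript runner of your choice, passing --check, the script sets a nonzero exit code only when at least one match is found.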
Pin Skill sources to specific commits, not latest. The npx skills add command copies whatever the repo contains at the moment of install. A Skill author can push a clean version for scanner review, then add a malicious test file after approval. Pinning to a specific commit hash converts a trust-on-first-use model into a verify-on-every-change model. The OWASP Agentic Skills Top 10 recommends exactly this.
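The verify-on-every-change idea can also be approximated locally. The sketch below is not commit pinning and does not use any installer feature; it swaps in a simpler, related technique, a content fingerprint recorded at review time and re-checked later. All names (SkillFile, fingerprintSkill, verifySkill) are hypothetical, and file contents are treated as text for brevity.

```typescript
import { createHash } from 'node:crypto';

interface SkillFile {
  path: string;    // path relative to the Skill directory
  content: string; // file contents (text only in this sketch)
}

// Deterministic SHA-256 fingerprint over file paths and contents.
function fingerprintSkill(files: SkillFile[]): string {
  const hash = createHash('sha256');
  // Sort by path so the result does not depend on listing order.
  const sorted = [...files].sort((a, b) => (a.path < b.path ? -1 : a.path > b.path ? 1 : 0));
  for (const f of sorted) {
    // Length-prefix each field so different file sets cannot collide by concatenation.
    hash.update(`${Buffer.byteLength(f.path)}:${f.path}`);
    hash.update(`${Buffer.byteLength(f.content)}:${f.content}`);
  }
  return hash.digest('hex');
}

// Compare the current files against the fingerprint recorded at review time.
function verifySkill(files: SkillFile[], approved: string): boolean {
  return fingerprintSkill(files) === approved;
}

// Hypothetical Skill as it looked when it was reviewed.
const reviewed: SkillFile[] = [{ path: 'SKILL.md', content: '# Reviewer skill\n' }];
const approved = fingerprintSkill(reviewed);

// The same Skill after a test file was added post-approval.
const updated: SkillFile[] = [
  ...reviewed,
  { path: 'tests/reviewer.test.ts', content: 'beforeAll(() => {});' },
];

console.log(verifySkill(reviewed, approved)); // true
console.log(verifySkill(updated, approved));  // false
```

A fingerprint like this catches the add-a-test-file-after-approval scenario described above, but it only complements pinning: it detects drift after the fact, while a pinned commit prevents the drift from being pulled in at all.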
If Skills are already in your repo: Run the find command above against your existing .agents/ directory now. If test files are present, treat them as a potential compromise: Rotate any credentials accessible to CI (deployment tokens, cloud keys, SSH keys), audit CI logs for unexpected outbound network calls during test execution, and review git history to determine when the test files entered the repo and which pipelines have executed them.
Five questions to ask your Anthropic Skill scanner vendor
Security teams are signing contracts for their first dedicated Skill scanning tools. The Gecko bypass means the questions on those sales calls need to change. Do not stop at "Do you detect prompt injection?" Ask:
Which files and directories do you actually analyze in a Skill repo?
Do you treat test files as potential execution surfaces?
Can you flag Skills that bundle tests, CI configs, or build scripts as higher-risk? SkillScan showed script-bundling Skills are 2.12x more likely to be vulnerable.
Do you provide integration or guidance for restricting test-runner globs in CI? Cisco deserves credit for open-sourcing its Skill Scanner on GitHub, which lets security teams inspect exactly which detection categories the tool implements. That transparency is the baseline every vendor should meet. If your vendor will not publish detection categories or open-source their scanning logic, you cannot verify what they check and what they skip.
Have you published an ecosystem-scale audit with methodology and sample size? Snyk published at 3,984 Skills. SkillScan published at 31,132. Riemer described the disclosure pattern: "They chose not to publish a CVE. They just quietly patched it and moved on with life," he said. The Anthropic Skills ecosystem is showing early signs of the same pattern: scanners document what they detect without mapping the surfaces they do not reach. The gap between documented coverage and actual execution surface is where the test-file vector lives.
The audit grid matters because the scanner model is incomplete
The Anthropic Skills ecosystem is repeating the early npm supply chain story, except without the decade of accumulated incidents that forced package registries to build security infrastructure. SkillScan's 31,132-Skill dataset showed a quarter of the ecosystem carrying vulnerabilities. Snyk found 76 confirmed malicious payloads in fewer than 4,000 Skills. Gecko proved the scanner model itself has a structural gap that no vendor has publicly documented closing.
Scanner evaluations consistently test the covered surface. The Anthropic Skill Audit Grid gives security teams the seven audit surfaces to verify before signing. The three CI steps are the fixes to deploy before the next Skill install. Riemer's Ivanti team watches the patch-to-exploit cycle compress in real time across enterprise environments. The test-file vector compresses it further: No scanner flagged the threat, so no patch window exists.
The scanner is not broken. It is incomplete. The threat model stopped at the agent. The test runner did not.
Read on the original site
Open the publisher's page for the full experience
Related Articles
- One command turns any open-source repo into an AI agent backdoor. OpenClaw proved no supply-chain scanner has a detection category for itJust two months ago, researchers at the Data Intelligence Lab at the University of Hong Kong introduced CLI-Anything, a new state-of-the-art tool that analyzes any repo’s source code and generates a structured command line interface (CLI) that AI coding agents can operate with a single command. Claude Code, Codex, OpenClaw, Cursor, and GitHub Copilot CLI are all supported, and since its launch in March, CLI‑Anything has climbed to more than 30,000 GitHub stars. But the same mechanism that makes software agent-native opens the door to agent-level poisoning. The attack community is already discussing the implications on X and security forums, translating CLI-Anything's architecture into offensive playbooks. The security problem is not what CLI-Anything does. It is what CLI-Anything represents. CLI-Anything generates SKILL.md files, the same instruction-layer artifacts that Snyk’s ToxicSkills research found laced with 76 confirmed malicious payloads across ClawHub and skills.sh in February 2026. A poisoned skill definition does not trigger a CVE and never appears in a software bill of materials (SBOM). No mainstream security scanner has a detection category for malicious instructions embedded in agent skill definitions, because the category simply did not exist eighteen months ago. Cisco confirmed the gap in April. “Traditional application security tools were not designed for this,” Cisco’s engineering team wrote in a blog post announcing its AI Agent Security Scanner for IDEs. “SAST [static application security testing] scanners analyze source code syntax. SCA [software composition analysis] tools check dependency versions. 
Neither understands the semantic layer where MCP [Model Context Protocol] tool descriptions, agent prompts, and skill definitions operate.” Merritt Baer, CSO of Enkrypt AI and former Deputy CISO at Amazon Web Services (AWS), told VentureBeat in an exclusive interview: “SAST and SCA were built for code and dependencies. They don’t inspect instructions.” This is not a single-vendor vulnerability. It is a structural gap in how the entire security industry monitors software supply chains. This is the pre-exploitation window. CLI-Anything is live, the attack community is discussing it, and security directors who act now get ahead of the first incident report. The integration layer no stack can see Traditional supply-chain security operates on two layers. The code layer is where SAST works, scanning source files for insecure patterns, injection flaws, and hardcoded secrets. The dependency layer is where SCA works, checking package versions against known vulnerabilities, generating SBOMs, and flagging outdated libraries. Agent bridge tools like CLI-Anything, MCP connectors, Cursor rules files, and Claude Code skills operate on a third layer between the other two. Call it the agent integration layer: configuration files, skill definitions, and natural-language instruction sets tell an AI agent what software can do and how to operate it. None of it looks like code. All of it executes like code. 
Carter Rees, VP of AI at Reputation, told VentureBeat in an exclusive interview: “Modern LLMs [large language models] rely on third-party plugins, introducing supply chain vulnerabilities where compromised tools can inject malicious data into the conversation flow, bypassing internal safety training.” Researchers at Griffith University, Nanyang Technological University, the University of New South Wales, and the University of Tokyo documented the attack chain in an April paper, “Supply-Chain Poisoning Attacks Against LLM Coding Agent Skill Ecosystems.” The team introduced Document-Driven Implicit Payload Execution (DDIPE), a technique that embeds malicious logic inside code examples within skill documentation. Across four agent frameworks and five large language models, DDIPE achieved bypass rates between 11.6% and 33.5%. Static analysis caught most samples, but 2.5% evaded all four detection layers. Responsible disclosure led to four confirmed vulnerabilities and two vendor fixes. The kill chain security leaders need to audit Here's the anatomy of the kill chain: An attacker submits a SKILL.md file to an open-source project containing setup instructions, code examples, and configuration templates. It looks like standard documentation. A code reviewer would wave it through because none of it is executable. But the code examples contain embedded instructions that an agent will parse as operational directives. A developer uses an agent bridge tool to connect their coding agent to the repository. The agent ingests the skill definition and trusts it, because no verification layer exists to distinguish benign from malicious intent at the instruction level. The agent executes the embedded instruction using its own legitimate credentials. Endpoint detection and response (EDR) sees an approved API call from an authorized process and passes it. 
Data exfiltration, configuration changes, and credential harvesting are all moving through channels that the monitoring stack considers normal traffic. Rees identified the structural flaw that makes this chain lethal. “A significant vulnerability in enterprise AI is broken access control, where the flat authorization plane of an LLM fails to respect user permissions,” he told VentureBeat. A compromised skill definition riding that flat authorization plane does not need to escalate privileges. It already has them. Every link in that chain is invisible to the current security stack. Pillar Security demonstrated a variant of this chain against Cursor in January 2026 (CVE-2026-22708). Implicitly trusted shell built-in commands could be poisoned through indirect prompt injection, converting benign developer commands into arbitrary code execution vectors. Users saw only the final command. The poisoning happened through other commands the IDE never surfaced for approval. The evidence is already in production In a documented attack chain from April 2026, a crafted GitHub issue title triggered an AI triage bot wired into Cline. The bot exfiltrated a GITHUB_TOKEN, which the attacker used to publish a compromised npm dependency that installed a second agent on roughly 4,000 developer machines for eight hours. There was just one issue title. Attackers had eight hours of access. No human approved the action. Snyk’s ToxicSkills audit scanned 3,984 agent skills from ClawHub, the public marketplace for the OpenClaw agent framework, and skills.sh in February 2026. The results: 13.4% of all skills contained at least one critical security issue. Daily skill submissions jumped from less than 50 in mid-January to more than 500 by early February. The barrier to publishing was a SKILL.md markdown file and a GitHub account one week old. No code signing. No security review. No sandbox. OpenClaw is not an outlier. It is the pattern. “The bar to entry is extremely low,” Baer said. 
“Adding a skill can be as simple as uploading a Word doc or lightweight config file. That’s a radically different risk profile than compiled code.” She pointed to projects like ClawPatrol that have started cataloging and scanning for malicious skills, evidence the ecosystem is moving faster than enterprise defenses. The ClawHavoc campaign, first reported by Koi Security in late January 2026, initially identified 341 malicious skills on ClawHub. A follow-up analysis by Antiy CERT expanded the count to 1,184 compromised packages across the platform. The campaign delivered Atomic Stealer (AMOS) through skill definitions with professional documentation. Skills named solana-wallet-tracker and polymarket-trader matched what developers actively searched for. The MCP protocol layer carries similar exposure. OX Security reported in April that researchers poisoned nine out of 11 MCP marketplaces using proof-of-concept servers. Trend Micro initially found 492 MCP servers exposed to the internet with zero authentication; by April, that number had grown to 1,467. As The Register reported, the root issue lies in Anthropic’s MCP software development kit (SDK) transport mechanism. Any developer using the official SDK inherits the vulnerability class. VentureBeat Prescriptive Matrix: Three-layer agent supply-chain audit VentureBeat developed a Prescriptive Matrix by mapping the three attack layers documented in the research and incident reports above against the detection capabilities of current SAST, SCA, and agent-layer tools. Each row identifies what security teams should verify and where no scanner has coverage today. Layer Threat Current detection Why it misses Recommended action 1. Code Prompt injection in AI-generated code SAST scanners Most SAST tools have no detection category for prompt injection in AI-generated code Confirm that SAST scans AI-generated code for prompt injection. If not, have an open vendor conversation this quarter. 2. 
Dependencies Malicious MCP servers, agent skills, plugin registries SCA tools SCA generates no AI-specific bill of materials. Agent-layer dependencies are invisible. Confirm SCA includes MCP servers, agent skills, and plugin registries in the dependency inventory. 3. Agent integration Poisoned SKILL.md files, malicious instruction sets, adversarial rules files None until April 2026 No tool inspects the semantic meaning of agent instruction files. Baer: “We’re not inspecting intent.” Deploy Cisco Skill Scanner or Snyk mcp-scan. Assign a team to own this layer. Baer’s diagnosis of Layer 3 applies across the entire matrix: “Current scanners look for known bad artifacts, not adversarial instructions embedded in otherwise valid skills.” Cisco’s open-source Skill Scanner and Snyk’s mcp-scan represent the first tools purpose-built for this layer. Security director action plan Here's how security leaders can get ahead of the problem. Inventory every agent bridge tool in the environment. This includes CLI-Anything, MCP connectors, Cursor rules files, Claude Code skills, GitHub Copilot extensions. If the development team is using agent bridge tools that have not been inventoried, the risk cannot be assessed. Audit agent skill sources the same way package registries get audited. Baer’s framing is precise: “A skill is effectively untrusted executable intent, even if it’s just text.” Shut off ungoverned ingestion paths until controls are in place. Stand up a review and allowlisting process for skills. The OWASP Agentic Skills Top 10 (AST01: Malicious Skills) provides the procurement framework to align controls against. Deploy agent-layer scanning. Evaluate Cisco’s open-source Skill Scanner and Snyk’s mcp-scan for behavioral analysis of agent instruction files. If dedicated tooling is unavailable, require a second engineer to read every SKILL.md before installation. Restrict agent execution privileges and instrument runtime. 
AI coding agents should not run with the same credential scope as the developer who invoked them. Rees confirmed the structural flaw: The flat authorization plane means a compromised skill does not need to escalate privileges. Baer’s prescription: “Instrument runtime observability. What data is the agent accessing, what actions is it taking, and are those aligned with expected behavior?” Assign ownership for the gap between layers. The most dangerous attacks succeed because they fall between detection categories. Assign a team to own the agent integration layer. Review every SKILL.md, MCP config, and rules file before it enters the environment. The gap that already has a name Baer underscored the dangers of this new attack vector. “This feels very similar to early container security, but we’re still in the ‘we’ll get to it’ phase across most orgs," she said. She added that, at AWS, it took a few high-profile wake-up calls before container security became table stakes. The difference this time is speed. “There’s no build pipeline, no compilation barrier. Just content," she said. CLI-Anything is not the threat. It is the proof case that the agent integration layer exists, that it is growing fast, and that the attacker community has already found it. The 33,000 developers who starred the repository are telling security teams where software development is heading. Eighteen months ago, the detection category for agent-integration-layer poisoning did not exist. Cisco and Snyk shipped the first tools for it in April. The window between those two facts is closing. Security directors who have not begun inventory are already behind.
- Claude Code, Copilot and Codex all got hacked. Every attacker went for the credential, not the model.
On March 30, BeyondTrust proved that a crafted GitHub branch name could steal Codex’s OAuth token in cleartext. OpenAI classified it Critical P1. Two days later, Anthropic’s Claude Code source code spilled onto the public npm registry, and within hours, Adversa found Claude Code silently ignored its own deny rules once a command exceeded 50 subcommands.
These were not isolated bugs. They were the latest in a nine-month run: six research teams disclosed exploits against Codex, Claude Code, Copilot, and Vertex AI, and every exploit followed the same pattern. An AI coding agent held a credential, executed an action, and authenticated to a production system without a human session anchoring the request.
The attack surface was first demonstrated at Black Hat USA 2025, when Zenity CTO Michael Bargury hijacked ChatGPT, Microsoft Copilot Studio, Google Gemini, Salesforce Einstein and Cursor with Jira MCP on stage with zero clicks. Nine months later, those credentials are what attackers reached.
Merritt Baer, CSO at Enkrypt AI and former Deputy CISO at AWS, named the failure in an exclusive VentureBeat interview. “Enterprises believe they’ve ‘approved’ AI vendors, but what they’ve actually approved is an interface, not the underlying system.” The credentials underneath the interface are the breach.
Codex, where a branch name stole GitHub tokens
BeyondTrust researcher Tyler Jespersen, with Fletcher Davis and Simon Stewart, found Codex cloned repositories using a GitHub OAuth token embedded in the git remote URL. During cloning, the branch name parameter flowed unsanitized into the setup script. A semicolon and a backtick subshell turned the branch name into an exfiltration payload.
Stewart added the stealth. By appending 94 Ideographic Space characters (Unicode U+3000) after “main,” the malicious branch looked identical to the standard main branch in the Codex web portal. A developer sees “main.” The shell sees curl exfiltrating their token. OpenAI classified it Critical P1 and shipped full remediation by February 5, 2026.
Claude Code, where two CVEs and a 50-subcommand bypass broke the sandbox
CVE-2026-25723 hit Claude Code’s file-write restrictions. Piped sed and echo commands escaped the project sandbox because command chaining was not validated. Patched in 2.0.55.
CVE-2026-33068 was subtler. Claude Code resolved permission modes from .claude/settings.json before showing the workspace trust dialog. A malicious repo set permissions.defaultMode to bypassPermissions. The trust prompt never appeared. Patched in 2.1.53.
The 50-subcommand bypass landed last. Adversa found that Claude Code silently dropped deny-rule enforcement once a command exceeded 50 subcommands. Anthropic’s engineers had traded security for speed and stopped checking after the fiftieth. Patched in 2.1.90.
“A significant vulnerability in enterprise AI is broken access control, where the flat authorization plane of an LLM fails to respect user permissions,” wrote Carter Rees, VP of AI and Machine Learning at Reputation and a member of the Utah AI Commission. The repository decided what permissions the agent had. The token budget decided which deny rules survived.
Copilot, where a pull request description and a GitHub issue both became root
Johann Rehberger demonstrated CVE-2025-53773 against GitHub Copilot with Markus Vervier of Persistent Security as co-discoverer. Hidden instructions in PR descriptions triggered Copilot to flip auto-approve mode in .vscode/settings.json. That disabled all confirmations and granted unrestricted shell execution across Windows, macOS, and Linux. Microsoft patched it in the August 2025 Patch Tuesday release.
Then, Orca Security cracked Copilot inside GitHub Codespaces. Hidden instructions in a GitHub issue manipulated Copilot into checking out a malicious PR with a symbolic link to /workspaces/.codespaces/shared/user-secrets-envs.json.
A crafted JSON $schema URL exfiltrated the privileged GITHUB_TOKEN. Full repository takeover. Zero user interaction beyond opening the issue.
Mike Riemer, CTO at Ivanti, framed the speed dimension in a VentureBeat interview: “Threat actors are reverse engineering patches within 72 hours. If a customer doesn’t patch within 72 hours of release, they’re open to exploit.” Agents compress that window to seconds.
Vertex AI, where default scopes reached Gmail, Drive and Google’s own supply chain
Unit 42 researcher Ofir Shaty found that the default Google service identity attached to every Vertex AI agent had excessive permissions. Stolen P4SA credentials granted unrestricted read access to every Cloud Storage bucket in the project and reached restricted, Google-owned Artifact Registry repositories at the core of the Vertex AI Reasoning Engine. Shaty described the compromised P4SA as functioning like a "double agent," with access to both user data and Google's own infrastructure.
VentureBeat defense grid
| Security requirement | Defense shipped | Exploit path | The gap |
| --- | --- | --- | --- |
| Sandbox AI agent execution | Codex runs tasks in cloud containers; token scrubbed during agent runtime. | Token present during cloning. Branch-name command injection executed before cleanup. | No input sanitization on container setup parameters. |
| Restrict file system access | Claude Code sandboxes writes via accept-edits mode. | Piped sed/echo escaped sandbox (CVE-2026-25723). Settings.json bypassed trust dialog (CVE-2026-33068). 50-subcommand chain dropped deny-rule enforcement. | Command chaining not validated. Settings loaded before trust. Deny rules truncated for performance. |
| Block prompt injection in code context | Copilot filters PR descriptions for known injection patterns. | Hidden injections in PRs, README files, and GitHub issues triggered RCE (CVE-2025-53773 + Orca RoguePilot). | Static pattern matching loses to embedded prompts in legitimate review and Codespaces flows. |
| Scope agent credentials to least privilege | Vertex AI Agent Engine uses P4SA service agent with OAuth scopes. | Default scopes reached Gmail, Calendar, Drive. P4SA credentials read every Cloud Storage bucket and Google’s Artifact Registry. | OAuth scopes non-editable by default. Least privilege violated by design. |
| Inventory and govern agent identities | No major AI coding agent vendor ships agent identity discovery or lifecycle management. | Not attempted. Enterprises do not inventory AI coding agents, their credentials, or their permission scopes. | AI coding agents are invisible to IAM, CMDB, and asset inventory. Zero governance exists. |
| Detect credential exfiltration from agent runtime | Codex obscures tokens in web portal view. Claude Code logs subcommands. | Tokens visible in cleartext inside containers. Unicode obfuscation hid exfil payloads. Subcommand chaining hid intent. | No runtime monitoring of agent network calls. Log truncation hid the bypass. |
| Audit AI-generated code for security flaws | Anthropic launched Claude Code Security (Feb 2026). OpenAI launched Codex Security (March 2026). | Both scan generated code. Neither scans the agent’s own execution environment or credential handling. | Code-output security is not agent-runtime security. The agent itself is the attack surface. |
Every exploit targeted runtime credentials, not model output
Every vendor shipped a defense. Every defense was bypassed.
The Sonar 2026 State of Code Developer Survey found 25% of developers use AI agents regularly, and 64% have started using them. Veracode tested more than 100 LLMs and found 45% of generated code samples introduced OWASP Top 10 flaws, a separate failure that compounds the runtime credential gap.
CrowdStrike CTO Elia Zaitsev framed the rule in an exclusive VentureBeat interview at RSAC 2026: collapse agent identities back to the human, because an agent acting on your behalf should never have more privileges than you do.
Codex held a GitHub OAuth token scoped to every repository the developer authorized. Vertex AI’s P4SA read every Cloud Storage bucket in the project. Claude Code traded deny-rule enforcement for token budget.
Kayne McGladrey, an IEEE Senior Member who advises enterprises on identity risk, made the same diagnosis in an exclusive interview with VentureBeat. "It uses far more permissions than it should have, more than a human would, because of the speed of scale and intent."
Riemer drew the operational line in an exclusive VentureBeat interview. "It becomes, I don't know you until I validate you." The branch name talked to the shell before validation. The GitHub issue talked to Copilot before anyone read it.
Security director action plan
Inventory every AI coding agent (CIEM). Codex, Claude Code, Copilot, Cursor, Gemini Code Assist, Windsurf. List the credentials and OAuth scopes each received at setup. If your CMDB has no category for AI agent identities, create one.
Audit OAuth scopes and patch levels. Upgrade Claude Code to 2.1.90 or later. Verify Copilot's August 2025 patch. Migrate Vertex AI to the bring-your-own-service-account model.
Treat branch names, pull request descriptions, GitHub issues, and repo configuration as untrusted input. Monitor for Unicode obfuscation (U+3000), command chaining over 50 subcommands, and changes to .vscode/settings.json or .claude/settings.json that flip permission modes.
Govern agent identities the way you govern human privileged identities (PAM/IGA). Credential rotation. Least-privilege scoping. Separation of duties between the agent that writes code and the agent that deploys it. CyberArk, Delinea, and any PAM platform that accepts non-human identities can onboard agent OAuth credentials today; Gravitee's 2026 survey found only 21.9% of teams have done it.
Validate before you communicate. "As long as we trust and we check and we validate, I'm fine with letting AI maintain it," Riemer said. Before any AI coding agent authenticates to GitHub, Gmail, or an internal repository, verify the agent's identity, scope, and the human session it is bound to.
Ask each vendor in writing before your next renewal. "Show me the identity lifecycle management controls for the AI agent running in my environment, including credential scope, rotation policy, and permission audit trail." If the vendor cannot answer, that is the audit finding.
The governance gap in three sentences
Most CISOs inventory every human identity and have zero inventory of the AI agents running with equivalent credentials. No IAM framework governs human privilege escalation and agent privilege escalation with the same rigor. Most scanners track every CVE but cannot alert when a branch name exfiltrates a GitHub token through a container that developers trust by default.
Zaitsev's advice to RSAC 2026 attendees was blunt: you already know what to do. Agents just made the cost of not doing it catastrophic.
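
The monitoring item in the action plan above (U+3000 padding, chains of more than 50 subcommands, settings files that flip permission modes) lends itself to a few small checks. The Python below is a hedged sketch under stated assumptions, not a vendor tool: the function names are hypothetical, the metacharacter set is illustrative rather than exhaustive, and the subcommand counter splits naively on `;`, `&&`, `||`, and `|` without understanding shell quoting. The only settings key it inspects is `permissions.defaultMode`, the one the article names for CVE-2026-33068.

```python
import json
import re
from pathlib import Path

IDEOGRAPHIC_SPACE = "\u3000"         # padding character described in the Codex exploit
SHELL_METACHARACTERS = set(";`$|&")  # illustrative assumption, not an exhaustive list
MAX_SUBCOMMANDS = 50                 # threshold reported for the Claude Code bypass


def branch_name_findings(name: str) -> list[str]:
    """Return reasons a branch name should be treated as hostile input."""
    findings = []
    if IDEOGRAPHIC_SPACE in name:
        findings.append("contains U+3000 ideographic space")
    if any(ch in SHELL_METACHARACTERS for ch in name):
        findings.append("contains shell metacharacters")
    return findings


def count_subcommands(command: str) -> int:
    """Naively count subcommands by splitting on ;, &&, || and |.

    Separators inside quoted strings are counted too, so this over-counts;
    for an alerting threshold that errs on the safe side.
    """
    parts = re.split(r"&&|\|\||[;|]", command)
    return sum(1 for part in parts if part.strip())


def settings_findings(path: Path) -> list[str]:
    """Flag a settings file whose permissions.defaultMode is bypassPermissions."""
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        return ["settings file unreadable or not valid JSON"]
    permissions = data.get("permissions") if isinstance(data, dict) else None
    if isinstance(permissions, dict) and permissions.get("defaultMode") == "bypassPermissions":
        return ["permissions.defaultMode is bypassPermissions"]
    return []
```

A pre-clone hook or CI step could call `branch_name_findings` on the ref being checked out, compare `count_subcommands` against `MAX_SUBCOMMANDS` for any command an agent proposes, and run `settings_findings` on `.claude/settings.json` before a workspace is trusted. These checks only surface the specific indicators the article lists; they do not replace sanitizing input at the point where it reaches a shell.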
- RSAC 2026 shipped five agent identity frameworks and left three critical gaps open
“You can deceive, manipulate, and lie. That’s an inherent property of language. It’s a feature, not a flaw,” CrowdStrike CTO Elia Zaitsev told VentureBeat in an exclusive interview at RSA Conference 2026. If deception is baked into language itself, every vendor trying to secure AI agents by analyzing their intent is chasing a problem that cannot be conclusively solved.
Zaitsev is betting on context instead. CrowdStrike’s Falcon sensor walks the process tree on an endpoint and tracks what agents did, not what agents appeared to intend. “Observing actual kinetic actions is a structured, solvable problem,” Zaitsev told VentureBeat. “Intent is not.”
That argument landed 24 hours after CrowdStrike CEO George Kurtz disclosed two production incidents at Fortune 50 companies. In the first, a CEO's AI agent rewrote the company's own security policy, not because it was compromised, but because it wanted to fix a problem, lacked the permissions to do so, and removed the restriction itself. Every identity check passed; the company caught the modification by accident.
The second incident involved a 100-agent Slack swarm that delegated a code fix between agents with no human approval. Agent 12 made the commit. The team discovered it after the fact.
Two incidents at two Fortune 50 companies. Caught by accident both times. Every identity framework that shipped at RSAC this week missed them. The vendors verified who the agent was. None of them tracked what the agent did.
The urgency behind every framework launch reflects a broader market shift. "The difficulty of securing agentic AI is likely to push customers toward trusted platform vendors that can offer broader coverage across the expanding attack surface," according to William Blair's RSA Conference 2026 equity research report by analyst Jonathan Ho. Five vendors answered that call at RSAC this week. None of them answered it completely.
Attackers are already inside enterprise pilots
The scale of the exposure is already visible in production data. CrowdStrike's Falcon sensors detect more than 1,800 distinct AI applications across the company's customer fleet, generating 160 million unique instances on enterprise endpoints. Cisco found that 85% of its enterprise customers surveyed have pilot agent programs; only 5% have moved to production, meaning the vast majority of these agents are running without the governance structures production deployments typically require.
"The biggest impediment to scaled adoption in enterprises for business-critical tasks is establishing a sufficient amount of trust," Cisco President and Chief Product Officer Jeetu Patel told VentureBeat in an exclusive interview at RSA Conference 2026. "Delegating versus trusted delegating of tasks to agents. The difference between those two, one leads to bankruptcy and the other leads to market dominance."
Etay Maor, VP of Threat Intelligence at Cato Networks, ran a live Censys scan during an exclusive VentureBeat interview at RSA Conference 2026 and counted nearly 500,000 internet-facing OpenClaw instances. The week before: 230,000.
Cato CTRL senior researcher Vitaly Simonovich documented a BreachForums listing from February 22, 2026, published on the Cato CTRL blog on February 25, where a threat actor advertised root shell access to a UK CEO’s computer for $25,000 in cryptocurrency. The selling point was the CEO’s OpenClaw AI personal assistant, which had accumulated the company’s production database, Telegram bot tokens, and Trading 212 API keys in plain-text Markdown with no encryption at rest. “Your AI? It’s my AI now. It’s an assistant for the attacker,” Maor told VentureBeat.
The exposure data from multiple independent researchers tells the same story. Bitsight found more than 30,000 OpenClaw instances exposed to the public internet between January 27 and February 8, 2026. SecurityScorecard identified 15,200 of those instances as vulnerable to remote code execution through three high-severity CVEs, the worst rated CVSS 8.8. Koi Security found 824 malicious skills on ClawHub, 335 of them tied to ClawHavoc, which Kurtz flagged in his keynote as the first major supply chain attack on an AI agent ecosystem.
Five vendors, three gaps none of them closed
Cisco went deepest on identity governance. Duo Agentic Identity registers agents as distinct identity objects mapped to human owners, and every tool call routes through an MCP gateway in Secure Access SSE. Cisco Identity Intelligence catches shadow agents by monitoring network traffic rather than authentication logs. Patel told VentureBeat that today’s agents behave “more like teenagers, supremely intelligent, but with no fear of consequence, easily sidetracked or influenced.”
CrowdStrike made the biggest philosophical bet, treating agents as endpoint telemetry and tracking the kinetic layer through Falcon’s process-tree lineage. CrowdStrike expanded AIDR to cover Microsoft Copilot Studio agents and shipped Shadow SaaS and AI Agent Discovery across Copilot, Salesforce Agentforce, ChatGPT Enterprise, and OpenAI Enterprise GPT.
Palo Alto Networks built Prisma AIRS 3.0 with an agentic registry, an agentic IDP, and an MCP gateway for runtime traffic control. Palo Alto Networks’ pending Koi acquisition adds supply chain and runtime visibility.
Microsoft spread governance across Entra, Purview, Sentinel, and Defender, with Microsoft Sentinel embedding MCP natively and a Claude MCP connector in public preview April 1.
Cato CTRL delivered the adversarial proof that the identity gaps the other four vendors are trying to close are already being exploited. Maor told VentureBeat that enterprises abandoned basic security principles when deploying agents. “We just gave these AI tools complete autonomy,” Maor said.
Gap 1: Agents can rewrite the rules governing their own behavior
The Kurtz incident illustrates the gap exactly. Every credential check passed; the action was authorized. Zaitsev argues that the only reliable detection happens at the kinetic layer: which file was modified, by what process, initiated by what agent, compared against a behavioral baseline. Intent-based controls evaluate whether the call looks malicious. This one did not.
Palo Alto Networks offers pre-deployment red teaming in Prisma AIRS 3.0, but red teaming runs before deployment, not during runtime when self-modification happens. No vendor ships behavioral anomaly detection for policy-modifying actions as a production capability.
Patel framed the stakes in the VentureBeat interview: “The agent takes the wrong action and worse yet, some of those actions might be critical actions that are not reversible.”
Board question: An authorized agent modifies the policy governing the agent’s future actions. What fires?
Gap 2: Agent-to-agent handoffs have no trust verification
The 100-agent swarm is the proof point. Agent A found a defect and posted to Slack. Agent 12 executed the fix. No human approved the delegation.
Zaitsev’s approach: collapse agent identities back to the human. An agent acting on your behalf should never have more privileges than you do. But no product follows the delegation chain between agents. IAM was built for human-to-system. Agent-to-agent delegation needs a trust primitive that does not exist in OAuth, SAML, or MCP.
Gap 3: Ghost agents hold live credentials with no offboarding
Organizations adopt AI tools, run a pilot, lose interest, and move on. The agents keep running. The credentials stay active. Maor calls these abandoned instances ghost agents. Zaitsev connected ghost agents to a broader failure: agents expose where enterprises delayed action on basic identity hygiene. Standing privileged accounts, long-lived credentials, and missing offboarding procedures.
These problems existed for humans. Agents running at machine speed make the consequences catastrophic.
Maor demonstrated a Living Off the AI attack at RSA Conference 2026, chaining Atlassian’s MCP and Jira Service Management to show that attackers do not separate trusted tools, services, and models. Attackers chain all three. “We need an HR view of agents,” Maor told VentureBeat. “Onboarding, monitoring, offboarding. If there’s no business justification? Removal.”
Why these three gaps resist a product fix
Human IAM assumes the identity holder will not rewrite permissions, spawn new identities, or leave. Agents violate all three. OAuth handles user-to-service. SAML handles federated human identity. MCP handles model-to-tool. None includes agent-to-agent verification.
Five vendors against three gaps
| Capability | Cisco | CrowdStrike | Microsoft | Palo Alto Networks | Unsolved |
| --- | --- | --- | --- | --- | --- |
| Registration. Can the vendor discover and inventory agents? | Duo Agentic Identity. Agents registered as identity objects with human owners. Shadow agent detection via network traffic. | Falcon sensor auto-discovery. 1,800+ agent apps, ~160M instances across customer fleet. | Security Dashboard for AI + Entra shadow AI detection at the network layer. | Agentic registry in Prisma AIRS 3.0. Agents inventoried before operating. | All four register agents. No cross-vendor identity standard exists. |
| Self-modification. Can the vendor detect when an agent changes its own policies? | MCP gateway catches anomalous tool-call patterns in real time, but does not monitor for direct policy file modifications on the endpoint. | Process-tree lineage tracks file modifications at the action layer. Could detect a policy file change, but no dedicated self-modification rule ships. | Defender predictive shielding adjusts access policies reactively during active attacks. Not proactive self-modification detection. | AI Red Teaming tests for this before deployment. No runtime detection after the agent is live. | OPEN. No vendor detects an agent rewriting the policy governing the agent’s own behavior as a shipping capability. |
| Delegation. Can the vendor track when one agent hands work to another? | Maps each agent to a human owner. Does not track agent-to-agent handoffs. | Collapses the agent identity to the human operator. Does not correlate the delegation chains between agents. | Entra governs individual non-human identities. No multi-agent chain tracking. | AI Agent Gateway governs individual agents. No delegation primitive between agents. | OPEN. No trust primitive for agent-to-agent delegation exists in OAuth, SAML, or MCP. |
| Decommission. Can the vendor confirm a killed agent holds zero credentials? | Identity Intelligence runs a continuous inventory of active agents. | Shadow SaaS + AI Agent Discovery finds running agents across SaaS and endpoints. | Entra's shadow AI detection surfaces unmanaged AI applications. | Koi acquisition (pending) adds endpoint visibility for agent applications. | OPEN. All four discover running agents. None verifies zero residual credentials after decommission. |
| Runtime / Kinetic. Can the vendor monitor what agents do in real time? | MCP gateway enforces policy per tool call at the network layer. Contextual anomaly detection on call patterns. | Falcon EDR tracks commands, scripts, file activity, and network connections at the process level. | Defender endpoint + cloud monitoring. Predictive shielding during active incidents. | Prisma AIRS AI Agent Gateway for runtime traffic control. | CrowdStrike is the only vendor framing endpoint runtime as the primary safety net for agentic behavior. |
Five things to do Monday morning before your board asks
Audit self-modification risk. Pull every agent with write access to security policies, IAM configs, firewall rules, or ACLs. Flag any agent that can modify controls governing the agent’s own behavior. No vendor automates this.
Map delegation paths. Document every agent-to-agent invocation. Flag delegation without human approval. Human-in-the-loop on every delegation event until a trust primitive ships.
Kill ghost agents. Build a registry. For each agent: business justification, human owner, credentials held, systems accessed. No justification? Manual revoke. Weekly.
Stress test the MCP gateway enforcement. Cisco, Palo Alto Networks, and Microsoft all announced MCP gateways this week. Verify that agent tool traffic actually routes through the gateway. A misconfigured gateway creates false confidence while agents call tools directly.
Baseline agent behavioral norms. Before any agent reaches production, establish what normal looks like: typical API calls, data access patterns, systems touched, and hours of activity. Without a behavioral baseline, the kinetic-layer anomaly detection Zaitsev describes has nothing to compare against.
Zaitsev’s advice was blunt: you already know what to do. Agents just made the cost of not doing it catastrophic. Every vendor at RSAC verified who the agent was. None of them tracked what the agent did.
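
The "kill ghost agents" step above describes a registry with four fields per agent and a weekly revocation pass. One way to picture that is the short Python sketch below. It is an illustration under stated assumptions, not a product feature: the `AgentRecord` type, the `revocation_reasons` function, and the 30-day idle threshold are all hypothetical, and the article itself specifies only the registry fields and the "no justification, revoke" rule.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class AgentRecord:
    """One registry row, using the fields the action item lists."""
    name: str
    human_owner: Optional[str]
    business_justification: Optional[str]
    credentials: list[str] = field(default_factory=list)
    systems: list[str] = field(default_factory=list)
    last_active: Optional[datetime] = None  # assumption: activity is tracked somewhere


def revocation_reasons(
    agent: AgentRecord,
    now: datetime,
    max_idle: timedelta = timedelta(days=30),  # hypothetical threshold
) -> list[str]:
    """Return the reasons an agent's credentials should be revoked, if any."""
    if not agent.credentials:
        return []  # nothing live to revoke
    reasons = []
    if not agent.business_justification:
        reasons.append("no business justification")
    if not agent.human_owner:
        reasons.append("no human owner")
    if agent.last_active is None or now - agent.last_active > max_idle:
        reasons.append("idle beyond threshold")
    return reasons
```

Running `revocation_reasons` over every record once a week yields the manual revoke list. The function only decides which agents to flag; confirming that a revoked agent holds zero residual credentials is the decommission gap the table above marks as open, and this sketch does not close it.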
- Microsoft patched a Copilot Studio prompt injection. The data exfiltrated anyway.
Microsoft assigned CVE-2026-21520, a CVSS 7.5 indirect prompt injection vulnerability, to Copilot Studio. Capsule Security discovered the flaw, coordinated disclosure with Microsoft, and the patch was deployed on January 15. Public disclosure went live on Wednesday.
That CVE matters less for what it fixes and more for what it signals. Capsule’s research calls Microsoft’s decision to assign a CVE to a prompt injection vulnerability in an agentic platform “highly unusual.” Microsoft previously assigned CVE-2025-32711 (CVSS 9.3) to EchoLeak, a prompt injection in M365 Copilot patched in June 2025, but that targeted a productivity assistant, not an agent-building platform. If the precedent extends to agentic systems broadly, every enterprise running agents inherits a new vulnerability class to track. Except that this class cannot be fully eliminated by patches alone.
Capsule also discovered what they call PipeLeak, a parallel indirect prompt injection vulnerability in Salesforce Agentforce. Microsoft patched and assigned a CVE. Salesforce has not assigned a CVE or issued a public advisory for PipeLeak as of publication, according to Capsule's research.
What ShareLeak actually does
The vulnerability that the researchers named ShareLeak exploits the gap between a SharePoint form submission and the Copilot Studio agent’s context window. An attacker fills a public-facing comment field with a crafted payload that injects a fake system role message. In Capsule’s testing, Copilot Studio concatenated the malicious input directly with the agent’s system instructions with no input sanitization between the form and the model. The injected payload overrode the agent’s original instructions in Capsule’s proof-of-concept, directing it to query connected SharePoint Lists for customer data and send that data via Outlook to an attacker-controlled email address. NVD classifies the attack as low complexity, requiring no privileges.
Microsoft’s own safety mechanisms flagged the request as suspicious during Capsule’s testing. The data was exfiltrated anyway. The DLP never fired because the email was routed through a legitimate Outlook action that the system treated as an authorized operation.
Carter Rees, VP of Artificial Intelligence at Reputation, described the architectural failure in an exclusive VentureBeat interview. The LLM cannot inherently distinguish between trusted instructions and untrusted retrieved data, Rees said. It becomes a confused deputy acting on behalf of the attacker. OWASP classifies this pattern as ASI01: Agent Goal Hijack.
The research team behind both discoveries, Capsule Security, found the Copilot Studio vulnerability on November 24, 2025. Microsoft confirmed it on December 5 and patched it on January 15, 2026. Every security director running Copilot Studio agents triggered by SharePoint forms should audit that window for indicators of compromise.
PipeLeak and the Salesforce split
PipeLeak hits the same vulnerability class through a different front door. In Capsule’s testing, a public lead form payload hijacked an Agentforce agent with no authentication required. Capsule found no volume cap on the exfiltrated CRM data, and the employee who triggered the agent received no indication that data had left the building. Salesforce has not assigned a CVE or issued a public advisory specific to PipeLeak as of publication.
Capsule is not the first research team to hit Agentforce with indirect prompt injection. Noma Labs disclosed ForcedLeak (CVSS 9.4) in September 2025, and Salesforce patched that vector by enforcing Trusted URL allowlists. According to Capsule's research, PipeLeak survives that patch through a different channel: email via the agent's authorized tool actions.
Naor Paz, CEO of Capsule Security, told VentureBeat the testing hit no exfiltration limit.
“We did not get to any limitation,” Paz said. “The agent would just continue to leak all the CRM.” Salesforce recommended human-in-the-loop as a mitigation. Paz pushed back. “If the human should approve every single operation, it’s not really an agent,” he told VentureBeat. “It’s just a human clicking through the agent’s actions.”
Microsoft patched ShareLeak and assigned a CVE. According to Capsule's research, Salesforce patched ForcedLeak's URL path but not the email channel.
Kayne McGladrey, IEEE Senior Member, put it differently in a separate VentureBeat interview. Organizations are cloning human user accounts to agentic systems, McGladrey said, except agents use far more permissions than humans would because of the speed, the scale, and the intent.
The lethal trifecta and why posture management fails
Paz named the structural condition that makes any agent exploitable: access to private data, exposure to untrusted content, and the ability to communicate externally. ShareLeak hits all three. PipeLeak hits all three. Most production agents hit all three because that combination is what makes agents useful.
Rees validated the diagnosis independently. Defense-in-depth predicated on deterministic rules is fundamentally insufficient for agentic systems, Rees told VentureBeat.
Elia Zaitsev, CrowdStrike’s CTO, called the patching mindset itself the vulnerability in a separate VentureBeat exclusive. “People are forgetting about runtime security,” he said. “Let’s patch all the vulnerabilities. Impossible. Somehow always seem to miss something.” Observing actual kinetic actions is a structured, solvable problem, Zaitsev told VentureBeat. Intent is not. CrowdStrike’s Falcon sensor walks the process tree and tracks what agents did, not what they appeared to intend.
Multi-turn crescendo and the coding agent blind spot
Single-shot prompt injections are the entry-level threat. Capsule’s research documented multi-turn crescendo attacks where adversaries distribute payloads across multiple benign-looking turns. Each turn passes inspection. The attack becomes visible only when analyzed as a sequence.
Rees explained why current monitoring misses this. A stateless WAF views each turn in a vacuum and detects no threat, Rees told VentureBeat. It sees requests, not a semantic trajectory.
Capsule also found undisclosed vulnerabilities in coding agent platforms it declined to name, including memory poisoning that persists across sessions and malicious code execution through MCP servers. In one case, a file-level guardrail designed to restrict which files the agent could access was reasoned around by the agent itself, which found an alternate path to the same data. Rees identified the human vector: employees paste proprietary code into public LLMs and view security as friction.
McGladrey cut to the governance failure. “If crime was a technology problem, we would have solved crime a fairly long time ago,” he told VentureBeat. “Cybersecurity risk as a standalone category is a complete fiction.”
The runtime enforcement model
Capsule hooks into vendor-provided agentic execution paths, including Copilot Studio's security hooks and Claude Code's pre-tool-use checkpoints, with no proxies, gateways, or SDKs. The company exited stealth on Wednesday, timing its $7 million seed round, led by Lama Partners alongside Forgepoint Capital International, to its coordinated disclosure.
Chris Krebs, the first Director of CISA and a Capsule advisor, put the gap in operational terms. “Legacy tools weren’t built to monitor what happens between prompt and action,” Krebs said. “That’s the runtime gap.”
Capsule's architecture deploys fine-tuned small language models that evaluate every tool call before execution, an approach Gartner's market guide calls a "guardian agent."
Not everyone agrees that intent analysis is the right layer.
Zaitsev told VentureBeat during an exclusive interview that intent-based detection is non-deterministic. “Intent analysis will sometimes work. Intent analysis cannot always work,” he said. CrowdStrike bets on observing what the agent actually did rather than what it appeared to intend.
Microsoft’s own Copilot Studio documentation describes external security-provider webhooks that can approve or block tool execution, offering a vendor-native control plane alongside third-party options. No single layer closes the gap. Runtime intent analysis, kinetic action monitoring, and foundational controls (least privilege, input sanitization, outbound restrictions, targeted human-in-the-loop) all belong in the stack.
SOC teams should map telemetry now: Copilot Studio activity logs plus webhook decisions, CRM audit logs for Agentforce, and EDR process-tree data for coding agents.
Paz described the broader shift. “Intent is the new perimeter,” he told VentureBeat. “The agent in runtime can decide to go rogue on you.”
VentureBeat Prescriptive Matrix
The following matrix maps five vulnerability classes against the controls that miss them and the specific actions security directors should take this week.

Vulnerability class: ShareLeak (Copilot Studio, CVE-2026-21520, CVSS 7.5, patched Jan 15 2026)
Why current controls miss it: Capsule’s testing found no input sanitization between the SharePoint form and the agent context. Safety mechanisms flagged, but data still exfiltrated. DLP did not fire because the email used a legitimate Outlook action. OWASP ASI01: Agent Goal Hijack.
What runtime enforcement does: Guardian agent hooks into Copilot Studio pre-tool-use security hooks. Vets every tool call before execution. Blocks exfiltration at the action layer.
Suggested actions for security leaders: Audit every Copilot Studio agent triggered by SharePoint forms. Restrict outbound email to org-only domains. Inventory all SharePoint Lists accessible to agents. Review the Nov 24–Jan 15 window for indicators of compromise.

Vulnerability class: PipeLeak (Agentforce, no CVE assigned)
Why current controls miss it: In Capsule’s testing, public form input flowed directly into the agent context. No auth required. No volume cap observed on exfiltrated CRM data. The employee received no indication that data was leaving.
What runtime enforcement does: Runtime interception via platform agentic hooks. Pre-invocation checkpoint on every tool call. Detects outbound data transfer to non-approved destinations.
Suggested actions for security leaders: Review all Agentforce automations triggered by public-facing forms. Enable human-in-the-loop for external comms as interim control. Audit CRM data access scope per agent. Pressure Salesforce for CVE assignment.

Vulnerability class: Multi-Turn Crescendo (distributed payload, each turn looks benign)
Why current controls miss it: Stateless monitoring inspects each turn in isolation. WAFs, DLP, and activity logs see individual requests, not semantic trajectory.
What runtime enforcement does: Stateful runtime analysis tracks full conversation history across turns. Fine-tuned SLMs evaluate aggregated context. Detects when a cumulative sequence constitutes a policy violation.
Suggested actions for security leaders: Require stateful monitoring for all production agents. Add crescendo attack scenarios to red team exercises.

Vulnerability class: Coding Agents (unnamed platforms, memory poisoning + code execution)
Why current controls miss it: MCP servers inject code and instructions into the agent context. Memory poisoning persists across sessions. Guardrails reasoned around by the agent itself. Shadow AI insiders paste proprietary code into public LLMs.
What runtime enforcement does: Pre-invocation checkpoint on every tool call. Fine-tuned SLMs detect anomalous tool usage at runtime.
Suggested actions for security leaders: Inventory all coding agent deployments across engineering. Audit MCP server configs. Restrict code execution permissions. Monitor for shadow installations.

Vulnerability class: Structural Gap (any agent with private data + untrusted input + external comms)
Why current controls miss it: Posture management tells you what should happen. It does not stop what does happen. Agents use far more permissions than humans at far greater speed.
What runtime enforcement does: Runtime guardian agent watches every action in real time. Intent-based enforcement replaces signature detection. Leverages vendor agentic hooks, not proxies or gateways.
Suggested actions for security leaders: Classify every agent by lethal trifecta exposure. Treat prompt injection as class-based SaaS risk. Require runtime security for any agent moving to production. Brief the board on agent risk as business risk.

What this means for 2026 security planning
Microsoft’s CVE assignment will either accelerate or fragment how the industry handles agent vulnerabilities. If vendors call them configuration issues, CISOs carry the risk alone. Treat prompt injection as a class-level SaaS risk rather than individual CVEs. Classify every agent deployment against the lethal trifecta. Require runtime enforcement for anything moving to production. Brief the board on agent risk the way McGladrey framed it: as business risk, because cybersecurity risk as a standalone category stopped being useful the moment agents started operating at machine speed.
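For teams starting on the classification step, a first pass can be a simple inventory check. What follows is a minimal sketch in Python, assuming a hypothetical AgentProfile record whose three boolean fields mirror the trifecta conditions named in the matrix (private data, untrusted input, external communications). The field names, the sample agents, and the policy that all three conditions together trigger the runtime-enforcement requirement are illustrative assumptions, not part of any vendor's tooling or Capsule's method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentProfile:
    """Hypothetical inventory record; field names are illustrative assumptions."""
    name: str
    reads_private_data: bool
    accepts_untrusted_input: bool
    can_communicate_externally: bool


def trifecta_exposure(agent: AgentProfile) -> int:
    """Count how many of the three trifecta conditions the agent meets (0 to 3)."""
    return sum([
        agent.reads_private_data,
        agent.accepts_untrusted_input,
        agent.can_communicate_externally,
    ])


def requires_runtime_enforcement(agent: AgentProfile) -> bool:
    """Assumed policy: an agent meeting all three conditions is flagged."""
    return trifecta_exposure(agent) == 3


# Sample inventory with made-up agents, for illustration only.
inventory = [
    AgentProfile("sharepoint-form-responder", True, True, True),
    AgentProfile("internal-docs-summarizer", True, False, False),
]

flagged = [a.name for a in inventory if requires_runtime_enforcement(a)]
print(flagged)
```

A stricter policy might also flag agents that meet two of the three conditions for review, since a single permission change would complete the trifecta; the threshold is a judgment call for each organization.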