OpenClaw has 500,000 instances and no enterprise kill switch
Our take

“Your AI? It’s my AI now.” The line came from Etay Maor, VP of Threat Intelligence at Cato Networks, in an exclusive interview with VentureBeat at RSAC 2026 — and it describes exactly what happened to a U.K. CEO whose OpenClaw instance ended up for sale on BreachForums. Maor's argument is that the industry handed AI agents the kind of autonomy it would never extend to a human employee, discarding zero trust, least privilege, and assume-breach in the process.
The proof arrived on BreachForums three weeks before Maor’s interview. On February 22, a threat actor using the handle “fluffyduck” posted a listing advertising root shell access to the CEO’s computer for $25,000 in Monero or Litecoin. The shell was not the selling point. The CEO’s OpenClaw AI personal assistant was. The buyer would get every conversation the CEO had with the AI, the company’s full production database, Telegram bot tokens, Trading 212 API keys, and personal details the CEO disclosed to the assistant about family and finances. The threat actor noted the CEO was actively interacting with OpenClaw in real time, making the listing a live intelligence feed rather than a static data dump.
Cato CTRL senior security researcher Vitaly Simonovich documented the listing on February 25. The CEO’s OpenClaw instance stored everything in plain-text Markdown files under ~/.openclaw/workspace/ with no encryption at rest. The threat actor didn't need to exfiltrate anything; the CEO had already assembled it. When the security team discovered the breach, there was no native enterprise kill switch, no management console, and no way to inventory how many other instances were running across the organization.
OpenClaw runs locally with direct access to the host machine’s file system, network connections, browser sessions, and installed applications. The coverage to date has tracked its velocity, but what it hasn't mapped is the threat surface. The four vendors who used RSAC 2026 to ship responses still haven't produced the one control enterprises need most: a native kill switch.
The threat surface by the numbers
Metric | Numbers | Source |
Internet-facing instances | ~500,000 (March 24 live check) | Etay Maor, Cato Networks (exclusive RSAC 2026 interview) |
Exposed instances with security risks | 30,000+ observed during scan window | |
Exploitable via known RCE | 15,200 instances | |
High-severity CVEs | 3 (highest CVSS: 8.8) | |
Malicious skills on ClawHub | 341 in Koi audit (335 from ClawHavoc); 824 by mid-Feb | |
ClawHub skills with critical flaws | 13.4% of 3,984 analyzed | |
API tokens exposed (Moltbook) | 1.5 million |
Maor ran a live Censys check during an exclusive VentureBeat interview at RSAC 2026. “The first week it came out, there were about 6,300 instances. Last week, I checked: 230,000 instances. Let’s check now… almost half a million. Almost doubled in one week,” Maor said. Three high-severity CVEs define the attack surface: CVE-2026-24763 (CVSS 8.8, command injection via Docker PATH handling), CVE-2026-25157 (CVSS 7.7, OS command injection), and CVE-2026-25253 (CVSS 8.8, token exfiltration to full gateway compromise). All three CVEs have been patched, but OpenClaw has no enterprise management plane, no centralized patching mechanism, and no fleet-wide kill switch. Individual administrators must update each instance manually, and most have not.
The defender-side telemetry is just as alarming. CrowdStrike's Falcon sensors already detect more than 1,800 distinct AI applications across its customer fleet — from ChatGPT to Copilot to OpenClaw — generating around 160 million unique instances on enterprise endpoints. ClawHavoc, a malicious skill distributed through the ClawHub marketplace, became the primary case study in the OWASP Agentic Skills Top 10. CrowdStrike CEO George Kurtz flagged it in his RSAC 2026 keynote as the first major supply chain attack on an AI agent ecosystem.
AI agents got root access. Security got nothing.
Maor framed the visibility failure through the OODA loop (observe, orient, decide, act) during the RSAC 2026 interview. Most organizations are failing at the first step: security teams can't see which AI tools are running on their networks, which means the productivity tools employees bring in quietly become shadow AI that attackers exploit. The BreachForums listing proved the end state. The CEO’s OpenClaw instance became a centralized intelligence hub with SSO sessions, credential stores, and communication history aggregated into one location. “The CEO’s assistant can be your assistant if you buy access to this computer,” Maor told VentureBeat. “It’s an assistant for the attacker.”
Ghost agents amplify the exposure. Organizations adopt AI tools, run a pilot, lose interest, and move on — leaving agents running with credentials intact. “We need an HR view of agents. Onboarding, monitoring, offboarding. If there’s no business justification? Removal,” Maor told VentureBeat. “We’re not left with any ghost agents on our network, because that’s already happening.”
Cisco moved toward an OpenClaw kill switch
Cisco President and Chief Product Officer Jeetu Patel framed the stakes during an exclusive VentureBeat interview at RSAC 2026. “I think of them more like teenagers. They’re supremely intelligent, but they have no fear of consequence,” Patel said of AI agents. “The difference between delegating and trusted delegating of tasks to an agent … one of them leads to bankruptcy. The other one leads to market dominance.”
Cisco launched three free, open-source security tools for OpenClaw at RSAC 2026. DefenseClaw packages Skills Scanner, MCP Scanner, AI BoM, and CodeGuard into a single open-source framework running inside NVIDIA’s OpenShell runtime, which NVIDIA launched at GTC the week before RSAC. “Every single time you actually activate an agent in an Open Shell container, you can now automatically instantiate all the security services that we have built through Defense Claw,” Patel told VentureBeat. AI Defense Explorer Edition is a free, self-serve version of Cisco’s algorithmic red-teaming engine, testing any AI model or agent for prompt injection and jailbreaks across more than 200 risk subcategories. The LLM Security Leaderboard ranks foundation models by adversarial resilience rather than performance benchmarks. Cisco also shipped Duo Agentic Identity to register agents as identity objects with time-bound permissions, Identity Intelligence to discover shadow agents through network monitoring, and the Agent Runtime SDK to embed policy enforcement at build time.
Palo Alto made agentic endpoints a security category of their own
Palo Alto Networks CEO Nikesh Arora characterized OpenClaw-class tools as creating a new supply chain running through unregulated, unsecured marketplaces during an exclusive March 18 pre-RSA briefing with VentureBeat. Koi found 341 malicious skills on ClawHub in its initial audit, with the total growing to 824 as the registry expanded. Snyk found 13.4% of analyzed skills contained critical security flaws. Palo Alto Networks built Prisma AIRS 3.0 around a new agentic registry that requires every agent to be logged before operating, with credential validation, MCP gateway traffic control, agent red-teaming, and runtime monitoring for memory poisoning. The pending Koi acquisition adds supply chain visibility specifically for agentic endpoints.
Cato CTRL delivered the adversarial proof
Cato Networks’ threat intelligence arm Cato CTRL presented two sessions at RSAC 2026. The 2026 Cato CTRL Threat Report, published separately, includes a proof-of-concept “Living Off AI” attack targeting Atlassian’s MCP and Jira Service Management. Maor’s research provides the independent adversarial validation that vendor product announcements cannot deliver on their own. The platform vendors are building governance for sanctioned agents. Cato CTRL documented what happens when the unsanctioned agent on the CEO’s laptop gets sold on the dark web.
Monday morning action list
Regardless of vendor stack, four controls apply immediately: bind OpenClaw to localhost only and block external port exposure, enforce application allowlisting through MDM to prevent unauthorized installations, rotate every credential on machines where OpenClaw has been running, and apply least-privilege access to any account an AI agent has touched.
Discover the install base. CrowdStrike’s Falcon sensor, Cato’s SASE platform, and Cisco Identity Intelligence all detect shadow AI. For teams without premium tooling, query endpoints for the ~/.openclaw/ directory using native EDR or MDM file-search policies. If the enterprise has no endpoint visibility at all, run Shodan and Censys queries against corporate IP ranges.
Patch or isolate. Check every discovered instance against CVE-2026-24763, CVE-2026-25157, and CVE-2026-25253. Instances that cannot be patched should be network-isolated. There is no fleet-wide patching mechanism.
Audit skill installations. Review installed skills against Cisco’s Skills Scanner or the Snyk and Koi research. Any skill from an unverified source should be removed immediately.
Enforce DLP and ZTNA controls. Cato’s ZTNA controls restrict unapproved AI applications. Cisco Secure Access SSE enforces policy on MCP tool calls. Palo Alto’s Prisma Access Browser controls data flow at the browser layer.
Kill ghost agents. Build a registry of every AI agent running. Document business justification, human owner, credentials held, and systems accessed. Revoke credentials for agents with no justification. Repeat weekly.
Deploy DefenseClaw for sanctioned use. Run OpenClaw inside NVIDIA’s OpenShell runtime with Cisco’s DefenseClaw to scan skills, verify MCP servers, and instrument runtime behavior automatically.
Red-team before deploying. Use Cisco AI Defense Explorer Edition (free) or Palo Alto Networks’ agent red-teaming in Prisma AIRS 3.0. Test the workflow, not just the model.
The OWASP Agentic Skills Top 10, published using ClawHavoc as its primary case study, provides a standards-grade framework for evaluating these risks. Four vendors shipped responses at RSAC 2026. None of them is a native enterprise kill switch for unsanctioned OpenClaw deployments. Until one exists, the Monday morning action list above is the closest thing to one.
Read on the original site
Open the publisher's page for the full experience
Related Articles
- One command turns any open-source repo into an AI agent backdoor. OpenClaw proved no supply-chain scanner has a detection category for itJust two months ago, researchers at the Data Intelligence Lab at the University of Hong Kong introduced CLI-Anything, a new state-of-the-art tool that analyzes any repo’s source code and generates a structured command line interface (CLI) that AI coding agents can operate with a single command. Claude Code, Codex, OpenClaw, Cursor, and GitHub Copilot CLI are all supported, and since its launch in March, CLI‑Anything has climbed to more than 30,000 GitHub stars. But the same mechanism that makes software agent-native opens the door to agent-level poisoning. The attack community is already discussing the implications on X and security forums, translating CLI-Anything's architecture into offensive playbooks. The security problem is not what CLI-Anything does. It is what CLI-Anything represents. CLI-Anything generates SKILL.md files, the same instruction-layer artifacts that Snyk’s ToxicSkills research found laced with 76 confirmed malicious payloads across ClawHub and skills.sh in February 2026. A poisoned skill definition does not trigger a CVE and never appears in a software bill of materials (SBOM). No mainstream security scanner has a detection category for malicious instructions embedded in agent skill definitions, because the category simply did not exist eighteen months ago. Cisco confirmed the gap in April. “Traditional application security tools were not designed for this,” Cisco’s engineering team wrote in a blog post announcing its AI Agent Security Scanner for IDEs. “SAST [static application security testing] scanners analyze source code syntax. SCA [software composition analysis] tools check dependency versions. Neither understands the semantic layer where MCP [Model Context Protocol] tool descriptions, agent prompts, and skill definitions operate.” Merritt Baer, CSO of Enkrypt AI and former Deputy CISO at Amazon Web Services (AWS), told VentureBeat in an exclusive interview: “SAST and SCA were built for code and dependencies. They don’t inspect instructions.” This is not a single-vendor vulnerability. It is a structural gap in how the entire security industry monitors software supply chains. This is the pre-exploitation window. CLI-Anything is live, the attack community is discussing it, and security directors who act now get ahead of the first incident report. The integration layer no stack can see Traditional supply-chain security operates on two layers. The code layer is where SAST works, scanning source files for insecure patterns, injection flaws, and hardcoded secrets. The dependency layer is where SCA works, checking package versions against known vulnerabilities, generating SBOMs, and flagging outdated libraries. Agent bridge tools like CLI-Anything, MCP connectors, Cursor rules files, and Claude Code skills operate on a third layer between the other two. Call it the agent integration layer: configuration files, skill definitions, and natural-language instruction sets tell an AI agent what software can do and how to operate it. None of it looks like code. All of it executes like code. Carter Rees, VP of AI at Reputation, told VentureBeat in an exclusive interview: “Modern LLMs [large language models] rely on third-party plugins, introducing supply chain vulnerabilities where compromised tools can inject malicious data into the conversation flow, bypassing internal safety training.” Researchers at Griffith University, Nanyang Technological University, the University of New South Wales, and the University of Tokyo documented the attack chain in an April paper, “Supply-Chain Poisoning Attacks Against LLM Coding Agent Skill Ecosystems.” The team introduced Document-Driven Implicit Payload Execution (DDIPE), a technique that embeds malicious logic inside code examples within skill documentation. Across four agent frameworks and five large language models, DDIPE achieved bypass rates between 11.6% and 33.5%. Static analysis caught most samples, but 2.5% evaded all four detection layers. Responsible disclosure led to four confirmed vulnerabilities and two vendor fixes. The kill chain security leaders need to audit Here's the anatomy of the kill chain: An attacker submits a SKILL.md file to an open-source project containing setup instructions, code examples, and configuration templates. It looks like standard documentation. A code reviewer would wave it through because none of it is executable. But the code examples contain embedded instructions that an agent will parse as operational directives. A developer uses an agent bridge tool to connect their coding agent to the repository. The agent ingests the skill definition and trusts it, because no verification layer exists to distinguish benign from malicious intent at the instruction level. The agent executes the embedded instruction using its own legitimate credentials. Endpoint detection and response (EDR) sees an approved API call from an authorized process and passes it. Data exfiltration, configuration changes, and credential harvesting are all moving through channels that the monitoring stack considers normal traffic. Rees identified the structural flaw that makes this chain lethal. “A significant vulnerability in enterprise AI is broken access control, where the flat authorization plane of an LLM fails to respect user permissions,” he told VentureBeat. A compromised skill definition riding that flat authorization plane does not need to escalate privileges. It already has them. Every link in that chain is invisible to the current security stack. Pillar Security demonstrated a variant of this chain against Cursor in January 2026 (CVE-2026-22708). Implicitly trusted shell built-in commands could be poisoned through indirect prompt injection, converting benign developer commands into arbitrary code execution vectors. Users saw only the final command. The poisoning happened through other commands the IDE never surfaced for approval. The evidence is already in production In a documented attack chain from April 2026, a crafted GitHub issue title triggered an AI triage bot wired into Cline. The bot exfiltrated a GITHUB_TOKEN, which the attacker used to publish a compromised npm dependency that installed a second agent on roughly 4,000 developer machines for eight hours. There was just one issue title. Attackers had eight hours of access. No human approved the action. Snyk’s ToxicSkills audit scanned 3,984 agent skills from ClawHub, the public marketplace for the OpenClaw agent framework, and skills.sh in February 2026. The results: 13.4% of all skills contained at least one critical security issue. Daily skill submissions jumped from less than 50 in mid-January to more than 500 by early February. The barrier to publishing was a SKILL.md markdown file and a GitHub account one week old. No code signing. No security review. No sandbox. OpenClaw is not an outlier. It is the pattern. “The bar to entry is extremely low,” Baer said. “Adding a skill can be as simple as uploading a Word doc or lightweight config file. That’s a radically different risk profile than compiled code.” She pointed to projects like ClawPatrol that have started cataloging and scanning for malicious skills, evidence the ecosystem is moving faster than enterprise defenses. The ClawHavoc campaign, first reported by Koi Security in late January 2026, initially identified 341 malicious skills on ClawHub. A follow-up analysis by Antiy CERT expanded the count to 1,184 compromised packages across the platform. The campaign delivered Atomic Stealer (AMOS) through skill definitions with professional documentation. Skills named solana-wallet-tracker and polymarket-trader matched what developers actively searched for. The MCP protocol layer carries similar exposure. OX Security reported in April that researchers poisoned nine out of 11 MCP marketplaces using proof-of-concept servers. Trend Micro initially found 492 MCP servers exposed to the internet with zero authentication; by April, that number had grown to 1,467. As The Register reported, the root issue lies in Anthropic’s MCP software development kit (SDK) transport mechanism. Any developer using the official SDK inherits the vulnerability class. VentureBeat Prescriptive Matrix: Three-layer agent supply-chain audit VentureBeat developed a Prescriptive Matrix by mapping the three attack layers documented in the research and incident reports above against the detection capabilities of current SAST, SCA, and agent-layer tools. Each row identifies what security teams should verify and where no scanner has coverage today. Layer Threat Current detection Why it misses Recommended action 1. Code Prompt injection in AI-generated code SAST scanners Most SAST tools have no detection category for prompt injection in AI-generated code Confirm that SAST scans AI-generated code for prompt injection. If not, have an open vendor conversation this quarter. 2. Dependencies Malicious MCP servers, agent skills, plugin registries SCA tools SCA generates no AI-specific bill of materials. Agent-layer dependencies are invisible. Confirm SCA includes MCP servers, agent skills, and plugin registries in the dependency inventory. 3. Agent integration Poisoned SKILL.md files, malicious instruction sets, adversarial rules files None until April 2026 No tool inspects the semantic meaning of agent instruction files. Baer: “We’re not inspecting intent.” Deploy Cisco Skill Scanner or Snyk mcp-scan. Assign a team to own this layer. Baer’s diagnosis of Layer 3 applies across the entire matrix: “Current scanners look for known bad artifacts, not adversarial instructions embedded in otherwise valid skills.” Cisco’s open-source Skill Scanner and Snyk’s mcp-scan represent the first tools purpose-built for this layer. Security director action plan Here's how security leaders can get ahead of the problem. Inventory every agent bridge tool in the environment. This includes CLI-Anything, MCP connectors, Cursor rules files, Claude Code skills, GitHub Copilot extensions. If the development team is using agent bridge tools that have not been inventoried, the risk cannot be assessed. Audit agent skill sources the same way package registries get audited. Baer’s framing is precise: “A skill is effectively untrusted executable intent, even if it’s just text.” Shut off ungoverned ingestion paths until controls are in place. Stand up a review and allowlisting process for skills. The OWASP Agentic Skills Top 10 (AST01: Malicious Skills) provides the procurement framework to align controls against. Deploy agent-layer scanning. Evaluate Cisco’s open-source Skill Scanner and Snyk’s mcp-scan for behavioral analysis of agent instruction files. If dedicated tooling is unavailable, require a second engineer to read every SKILL.md before installation. Restrict agent execution privileges and instrument runtime. AI coding agents should not run with the same credential scope as the developer who invoked them. Rees confirmed the structural flaw: The flat authorization plane means a compromised skill does not need to escalate privileges. Baer’s prescription: “Instrument runtime observability. What data is the agent accessing, what actions is it taking, and are those aligned with expected behavior?” Assign ownership for the gap between layers. The most dangerous attacks succeed because they fall between detection categories. Assign a team to own the agent integration layer. Review every SKILL.md, MCP config, and rules file before it enters the environment. The gap that already has a name Baer underscored the dangers of this new attack vector. “This feels very similar to early container security, but we’re still in the ‘we’ll get to it’ phase across most orgs," she said. She added that, at AWS, it took a few high-profile wake-up calls before container security became table stakes. The difference this time is speed. “There’s no build pipeline, no compilation barrier. Just content," she said. CLI-Anything is not the threat. It is the proof case that the agent integration layer exists, that it is growing fast, and that the attacker community has already found it. The 33,000 developers who starred the repository are telling security teams where software development is heading. Eighteen months ago, the detection category for agent-integration-layer poisoning did not exist. Cisco and Snyk shipped the first tools for it in April. The window between those two facts is closing. Security directors who have not begun inventory are already behind.
- Claude Code, Copilot and Codex all got hacked. Every attacker went for the credential, not the model.On March 30, BeyondTrust proved that a crafted GitHub branch name could steal Codex’s OAuth token in cleartext. OpenAI classified it Critical P1. Two days later, Anthropic’s Claude Code source code spilled onto the public npm registry, and within hours, Adversa found Claude Code silently ignored its own deny rules once a command exceeded 50 subcommands. These were not isolated bugs. They were the latest in a nine-month run: six research teams disclosed exploits against Codex, Claude Code, Copilot, and Vertex AI, and every exploit followed the same pattern. An AI coding agent held a credential, executed an action, and authenticated to a production system without a human session anchoring the request. The attack surface was first demonstrated at Black Hat USA 2025, when Zenity CTO Michael Bargury hijacked ChatGPT, Microsoft Copilot Studio, Google Gemini, Salesforce Einstein and Cursor with Jira MCP on stage with zero clicks. Nine months later, those credentials are what attackers reached. Merritt Baer, CSO at Enkrypt AI and former Deputy CISO at AWS, named the failure in an exclusive VentureBeat interview. “Enterprises believe they’ve ‘approved’ AI vendors, but what they’ve actually approved is an interface, not the underlying system.” The credentials underneath the interface are the breach. Codex, where a branch name stole GitHub tokens BeyondTrust researcher Tyler Jespersen, with Fletcher Davis and Simon Stewart, found Codex cloned repositories using a GitHub OAuth token embedded in the git remote URL. During cloning, the branch name parameter flowed unsanitized into the setup script. A semicolon and a backtick subshell turned the branch name into an exfiltration payload. Stewart added the stealth. By appending 94 Ideographic Space characters (Unicode U+3000) after “main,” the malicious branch looked identical to the standard main branch in the Codex web portal. A developer sees “main.” The shell sees curl exfiltrating their token. OpenAI classified it Critical P1 and shipped full remediation by February 5, 2026. Claude Code, where two CVEs and a 50-subcommand bypass broke the sandbox CVE-2026-25723 hit Claude Code’s file-write restrictions. Piped sed and echo commands escaped the project sandbox because command chaining was not validated. Patched in 2.0.55. CVE-2026-33068 was subtler. Claude Code resolved permission modes from .claude/settings.json before showing the workspace trust dialog. A malicious repo set permissions.defaultMode to bypassPermissions. The trust prompt never appeared. Patched in 2.1.53. The 50-subcommand bypass landed last. Adversa found that Claude Code silently dropped deny-rule enforcement once a command exceeded 50 subcommands. Anthropic’s engineers had traded security for speed and stopped checking after the fiftieth. Patched in 2.1.90. “A significant vulnerability in enterprise AI is broken access control, where the flat authorization plane of an LLM fails to respect user permissions,” wrote Carter Rees, VP of AI and Machine Learning at Reputation and a member of the Utah AI Commission. The repository decided what permissions the agent had. The token budget decided which deny rules survived. Copilot, where a pull request description and a GitHub issue both became root Johann Rehberger demonstrated CVE-2025-53773 against GitHub Copilot with Markus Vervier of Persistent Security as co-discoverer. Hidden instructions in PR descriptions triggered Copilot to flip auto-approve mode in .vscode/settings.json. That disabled all confirmations and granted unrestricted shell execution across Windows, macOS, and Linux. Microsoft patched it in the August 2025 Patch Tuesday release. Then, Orca Security cracked Copilot inside GitHub Codespaces. Hidden instructions in a GitHub issue manipulated Copilot into checking out a malicious PR with a symbolic link to /workspaces/.codespaces/shared/user-secrets-envs.json. A crafted JSON $schema URL exfiltrated the privileged GITHUB_TOKEN. Full repository takeover. Zero user interaction beyond opening the issue. Mike Riemer, CTO at Ivanti, framed the speed dimension in a VentureBeat interview: “Threat actors are reverse engineering patches within 72 hours. If a customer doesn’t patch within 72 hours of release, they’re open to exploit.” Agents compress that window to seconds. Vertex AI, where default scopes reached Gmail, Drive and Google’s own supply chain Unit 42 researcher Ofir Shaty found that the default Google service identity attached to every Vertex AI agent had excessive permissions. Stolen P4SA credentials granted unrestricted read access to every Cloud Storage bucket in the project and reached restricted, Google-owned Artifact Registry repositories at the core of the Vertex AI Reasoning Engine. Shaty described the compromised P4SA as functioning like a "double agent," with access to both user data and Google's own infrastructure. VentureBeat defense grid Security requirement Defense shipped Exploit path The gap Sandbox AI agent execution Codex runs tasks in cloud containers; token scrubbed during agent runtime. Token present during cloning. Branch-name command injection executed before cleanup. No input sanitization on container setup parameters. Restrict file system access Claude Code sandboxes writes via accept-edits mode. Piped sed/echo escaped sandbox (CVE-2026-25723). Settings.json bypassed trust dialog (CVE-2026-33068). 50-subcommand chain dropped deny-rule enforcement. Command chaining not validated. Settings loaded before trust. Deny rules truncated for performance. Block prompt injection in code context Copilot filters PR descriptions for known injection patterns. Hidden injections in PRs, README files, and GitHub issues triggered RCE (CVE-2025-53773 + Orca RoguePilot). Static pattern matching loses to embedded prompts in legitimate review and Codespaces flows. Scope agent credentials to least privilege Vertex AI Agent Engine uses P4SA service agent with OAuth scopes. Default scopes reached Gmail, Calendar, Drive. P4SA credentials read every Cloud Storage bucket and Google’s Artifact Registry. OAuth scopes non-editable by default. Least privilege violated by design. Inventory and govern agent identities No major AI coding agent vendor ships agent identity discovery or lifecycle management. Not attempted. Enterprises do not inventory AI coding agents, their credentials, or their permission scopes. AI coding agents are invisible to IAM, CMDB, and asset inventory. Zero governance exists. Detect credential exfiltration from agent runtime Codex obscures tokens in web portal view. Claude Code logs subcommands. Tokens visible in cleartext inside containers. Unicode obfuscation hid exfil payloads. Subcommand chaining hid intent. No runtime monitoring of agent network calls. Log truncation hid the bypass. Audit AI-generated code for security flaws Anthropic launched Claude Code Security (Feb 2026). OpenAI launched Codex Security (March 2026). Both scan generated code. Neither scans the agent’s own execution environment or credential handling. Code-output security is not agent-runtime security. The agent itself is the attack surface. Every exploit targeted runtime credentials, not model output Every vendor shipped a defense. Every defense was bypassed. The Sonar 2026 State of Code Developer Survey found 25% of developers use AI agents regularly, and 64% have started using them. Veracode tested more than 100 LLMs and found 45% of generated code samples introduced OWASP Top 10 flaws, a separate failure that compounds the runtime credential gap. CrowdStrike CTO Elia Zaitsev framed the rule in an exclusive VentureBeat interview at RSAC 2026: collapse agent identities back to the human, because an agent acting on your behalf should never have more privileges than you do. Codex held a GitHub OAuth token scoped to every repository the developer authorized. Vertex AI’s P4SA read every Cloud Storage bucket in the project. Claude Code traded deny-rule enforcement for token budget. Kayne McGladrey, an IEEE Senior Member who advises enterprises on identity risk, made the same diagnosis in an exclusive interview with VentureBeat. "It uses far more permissions than it should have, more than a human would, because of the speed of scale and intent." Riemer drew the operational line in an exclusive VentureBeat interview. "It becomes, I don't know you until I validate you." The branch name talked to the shell before validation. The GitHub issue talked to Copilot before anyone read it. Security director action plan Inventory every AI coding agent (CIEM). Codex, Claude Code, Copilot, Cursor, Gemini Code Assist, Windsurf. List the credentials and OAuth scopes each received at setup. If your CMDB has no category for AI agent identities, create one. Audit OAuth scopes and patch levels. Upgrade Claude Code to 2.1.90 or later. Verify Copilot's August 2025 patch. Migrate Vertex AI to the bring-your-own-service-account model. Treat branch names, pull request descriptions, GitHub issues, and repo configuration as untrusted input. Monitor for Unicode obfuscation (U+3000), command chaining over 50 subcommands, and changes to .vscode/settings.json or .claude/settings.json that flip permission modes. Govern agent identities the way you govern human privileged identities (PAM/IGA). Credential rotation. Least-privilege scoping. Separation of duties between the agent that writes code and the agent that deploys it. CyberArk, Delinea, and any PAM platform that accepts non-human identities can onboard agent OAuth credentials today; Gravitee's 2026 survey found only 21.9% of teams have done it. Validate before you communicate. "As long as we trust and we check and we validate, I'm fine with letting AI maintain it," Riemer said. Before any AI coding agent authenticates to GitHub, Gmail, or an internal repository, verify the agent's identity, scope, and the human session it is bound to. Ask each vendor in writing before your next renewal. "Show me the identity lifecycle management controls for the AI agent running in my environment, including credential scope, rotation policy, and permission audit trail." If the vendor cannot answer, that is the audit finding. The governance gap in three sentences Most CISOs inventory every human identity and have zero inventory of the AI agents running with equivalent credentials. No IAM framework governs human privilege escalation and agent privilege escalation with the same rigor. Most scanners track every CVE but cannot alert when a branch name exfiltrates a GitHub token through a container that developers trust by default. Zaitsev's advice to RSAC 2026 attendees was blunt: you already know what to do. Agents just made the cost of not doing it catastrophic.
- CrowdStrike, Cisco and Palo Alto Networks all shipped agentic SOC tools at RSAC 2026 — the agent behavioral baseline gap survived all threeCrowdStrike CEO George Kurtz highlighted in his RSA Conference 2026 keynote that the fastest recorded adversary breakout time has dropped to 27 seconds. The average is now 29 minutes, down from 48 minutes in 2024. That is how much time defenders have before a threat spreads. Now CrowdStrike sensors detect more than 1,800 distinct AI applications running on enterprise endpoints, representing nearly 160 million unique application instances. Every one generates detection events, identity events, and data access logs flowing into SIEM systems architected for human-speed workflows. Cisco found that 85% of surveyed enterprise customers have AI agent pilots underway. Only 5% moved agents into production, according to Cisco President and Chief Product Officer Jeetu Patel in his RSAC blog post. That 80-point gap exists because security teams cannot answer the basic questions agents force. Which agents are running, what are they authorized to do, and who is accountable when one goes wrong. “The number one threat is security complexity. But we’re running towards that direction in AI as well,” Etay Maor, VP of Threat Intelligence at Cato Networks, told VentureBeat at RSAC 2026. Maor has attended the conference for 16 consecutive years. “We’re going with multiple point solutions for AI. And now you’re creating the next wave of security complexity.” Agents look identical to humans in your logs In most default logging configurations, agent-initiated activity looks identical to human-initiated activity in security logs. “It looks indistinguishable if an agent runs Louis’s web browser versus if Louis runs his browser,” Elia Zaitsev, CTO of CrowdStrike, told VentureBeat in an exclusive interview at RSAC 2026. Distinguishing the two requires walking the process tree. “I can actually walk up that process tree and say, this Chrome process was launched by Louis from the desktop. This Chrome process was launched from Louis’s Claude Cowork or ChatGPT application. Thus, it’s agentically controlled.” Without that depth of endpoint visibility, a compromised agent executing a sanctioned API call with valid credentials fires zero alerts. The exploit surface is already being tested. During his keynote, Kurtz described ClawHavoc, the first major supply chain attack on an AI agent ecosystem, targeting ClawHub, OpenClaw's public skills registry. Koi Security's February audit found 341 malicious skills out of 2,857; a follow-up analysis by Antiy CERT identified 1,184 compromised packages historically across the platform. Kurtz noted ClawHub now hosts 13,000 skills in its registry. The infected skills contained backdoors, reverse shells, and credential harvesters; Kurtz said in his keynote that some erased their own memory after installation and could remain latent before activating. "The frontier AI creators will not secure itself," Kurtz said. "The frontier labs are following the same playbook. They're building it. They're not securing it." Two agentic SOC architectures, one shared blind spot Approach A: AI agents inside the SIEM. Cisco and Splunk announced six specialized AI agents for Splunk Enterprise Security: Detection Builder, Triage, Guided Response, Standard Operating Procedures (SOP), Malware Threat Reversing, and Automation Builder. Malware Threat Reversing is currently available in Splunk Attack Analyzer and Detection Studio is generally available as a unified workspace; the remaining five agents are in alpha or prerelease through June 2026. Exposure Analytics and Federated Search follow the same timeline. Upstream of the SOC, Cisco's DefenseClaw framework scans OpenClaw skills and MCP servers before deployment, while new Duo IAM capabilities extend zero trust to agents with verified identities and time-bound permissions. “The biggest impediment to scaled adoption in enterprises for business-critical tasks is establishing a sufficient amount of trust,” Patel told VentureBeat. “Delegating and trusted delegating, the difference between those two, one leads to bankruptcy. The other leads to market dominance.” Approach B: Upstream pipeline detection. CrowdStrike pushed analytics into the data ingestion pipeline itself, integrating its Onum acquisition natively into Falcon’s ingestion system for real-time analytics, detection, and enrichment before events reach the analyst’s queue. Falcon Next-Gen SIEM now ingests Microsoft Defender for Endpoint telemetry natively, so Defender shops do not need additional sensors. CrowdStrike also introduced federated search across third-party data stores and a Query Translation Agent that converts legacy Splunk queries to accelerate SIEM migration. Falcon Data Security for the Agentic Enterprise applies cross-domain data loss prevention to data agents' access at runtime. CrowdStrike’s adversary-informed cloud risk prioritization connects agent activity in cloud workloads to the same detection pipeline. Agentic MDR through Falcon Complete adds machine-speed managed detection for teams that cannot build the capability internally. “The agentic SOC is all about, how do we keep up?” Zaitsev said. “There’s almost no conceivable way they can do it if they don’t have their own agentic assistance.” CrowdStrike opened its platform to external AI providers through Charlotte AI AgentWorks, announced at RSAC 2026, letting customers build custom security agents on Falcon using frontier AI models. Launch partners include Accenture, Anthropic, AWS, Deloitte, Kroll, NVIDIA, OpenAI, Salesforce, and Telefónica Tech. IBM validated buyer demand through a collaboration integrating Charlotte AI with its Autonomous Threat Operations Machine for coordinated, machine-speed investigation and containment. The ecosystem contenders. Palo Alto Networks, in an exclusive pre-RSAC briefing with VentureBeat, outlined Prisma AIRS 3.0, extending its AI security platform to agents with artifact scanning, agent red teaming, and a runtime that catches memory poisoning and excessive permissions. The company introduced an agentic identity provider for agent discovery and credential validation. Once Palo Alto Networks closes its proposed acquisition of Koi, the company adds agentic endpoint security. Cortex delivers agentic security orchestration across its customer base. Intel announced that CrowdStrike’s Falcon platform is being optimized for Intel-powered AI PCs, leveraging neural processing units and silicon-level telemetry to detect agent behavior on the device. Kurtz framed AIDR, AI Detection and Response, as the next category beyond EDR, tracking agent-speed activity across endpoints, SaaS, cloud, and AI pipelines. He said that “humans are going to have 90 agents that work for them on average” as adoption scales but did not specify a timeline. The gap no vendor closed What security leaders need Approach A: agents inside the SIEM (Cisco/Splunk) Approach B: upstream pipeline detection (CrowdStrike) Gap neither closes Triage at agent volume Six AI agents handle triage, detection, and response inside Splunk ES Onum-powered pipeline detects and enriches threats before the analyst sees them Neither baselines normal agent behavior before flagging anomalies Agent vs. human differentiation Duo IAM tracks agent identities but does not differentiate agent from human activity in SOC telemetry Process tree lineage distinguishes at runtime. AIDR extends to agent-specific detection No vendor’s announced capabilities include an out-of-the-box agent behavioral baseline 27-second response window Guided Response Agent executes containment at machine speed In-pipeline detection reduces queue volume. Agentic MDR adds managed response Human-in-the-loop governance has not been reconciled with machine-speed response in either approach Legacy SIEM portability Native Splunk integration preserves existing workflows Query Translation Agent converts Splunk queries. Native Defender ingestion lets Microsoft shops migrate Neither addresses teams running multiple SIEMs during migration Agent supply chain DefenseClaw scans skills and MCP servers pre-deployment. Explorer Edition red-teams agents EDR AI Runtime Protection catches compromised skills post-deployment. Charlotte AI AgentWorks enables custom agents Neither covers the full lifecycle. Pre-deployment scanning misses runtime exploits and vice versa The matrix makes one thing visible that the keynotes did not. No vendor shipped an agent behavioral baseline. Both approaches automate triage and accelerate detection. Based on VentureBeat's review of announced capabilities, neither defines what normal agent behavior looks like in a given enterprise environment. Teams running Microsoft Sentinel and Copilot for Security represent a third architecture not formally announced as a competing approach at RSAC this week, but CISOs in Microsoft-heavy environments need to test whether Sentinel's native agent telemetry ingestion and Copilot's automated triage close the same gaps identified above. Maor cautioned that the vendor response recycles a pattern he has tracked for 16 years. “I hope we don’t have to go through this whole cycle,” he told VentureBeat. “I hope we learned from the past. It doesn’t really look like it.” Zaitsev’s advice was blunt. “You already know what to do. You’ve known what to do for five, ten, fifteen years. It’s time to finally go do it.” Five things to do Monday morning These steps apply regardless of your SOC platform. None requires ripping and replacing current tools. Start with visibility, then layer in controls as agent volume grows. Inventory every agent on your endpoints. CrowdStrike detects 1,800 AI applications across enterprise devices. Cisco’s Duo Identity Intelligence discovers agentic identities. Palo Alto Networks’ agentic IDP catalogs agents and maps them to human owners. If you run a different platform, start with an EDR query for known agent directories and binaries. You cannot set policy for agents you do not know exist. Determine whether your SOC stack can differentiate agent from human activity. CrowdStrike’s Falcon sensor and AIDR do this through process tree lineage. Palo Alto Networks’ agent runtime catches memory poisoning at execution. If your tools cannot make this distinction, your triage rules are applying the wrong behavioral models. Match the architectural approach to your current SIEM. Splunk shops gain agent capabilities through Approach A. Teams evaluating migration get pipeline detection with Splunk query translation and native Defender ingestion through Approach B. Palo Alto Networks’ Cortex delivers a third option. Teams on Microsoft Sentinel, Google Chronicle, Elastic, or other platforms should evaluate whether their SIEM can ingest agent-specific telemetry at this volume. Build an agent behavioral baseline before your next board meeting. No vendor ships one. Define what your agents are authorized to do: which APIs, which data stores, which actions, at which times. Create detection rules for anything outside that scope. Pressure-test your agent supply chain. Cisco’s DefenseClaw and Explorer Edition scan and red-team agents before deployment. CrowdStrike’s runtime detection catches compromised agents post-deployment. Both layers are necessary. Kurtz said in his keynote that ClawHavoc compromised over a thousand ClawHub skills with malware that erased its own memory after installation. If your playbook does not account for an authorized agent executing unauthorized actions at machine speed, rewrite it. The SOC was built to protect humans using machines. It now protects machines using machines. The response window shrank from 48 minutes to 27 seconds. Any agent generating an alert is now a suspect, not just a sensor. The decisions security leaders make in the next 90 days will determine whether their SOC operates in this new reality or gets buried under it.
- AI agent credentials live in the same box as untrusted code. Two new architectures show where the blast radius actually stops.Four separate RSAC 2026 keynotes arrived at the same conclusion without coordinating. Microsoft's Vasu Jakkal told attendees that zero trust must extend to AI. Cisco's Jeetu Patel called for a shift from access control to action control, saying in an exclusive interview with VentureBeat that agents behave "more like teenagers, supremely intelligent, but with no fear of consequence." CrowdStrike's George Kurtz identified AI governance as the biggest gap in enterprise technology. Splunk's John Morgan called for an agentic trust and governance model. Four companies. Four stages. One problem. Matt Caulfield, VP of Product for Identity and Duo at Cisco, put it bluntly in an exclusive VentureBeat interview at RSAC. "While the concept of zero trust is good, we need to take it a step further," Caulfield said. "It's not just about authenticating once and then letting the agent run wild. It's about continuously verifying and scrutinizing every single action the agent's trying to take, because at any moment, that agent can go rogue." Seventy-nine percent of organizations already use AI agents, according to PwC's 2025 AI Agent Survey. Only 14.4% reported full security approval for their entire agent fleet, per the Gravitee State of AI Agent Security 2026 report of 919 organizations in February 2026. A CSA survey presented at RSAC found that only 26% have AI governance policies. CSA's Agentic Trust Framework describes the resulting gap between deployment velocity and security readiness as a governance emergency. Cybersecurity leaders and industry executives at RSAC agreed on the problem. Then two companies shipped architectures that answer the question differently. The gap between their designs reveals where the real risk sits. The monolithic agent problem that security teams are inheriting The default enterprise agent pattern is a monolithic container. The model reasons, calls tools, executes generated code, and holds credentials in one process. Every component trusts every other component. OAuth tokens, API keys, and git credentials sit in the same environment where the agent runs code it wrote seconds ago. A prompt injection gives the attacker everything. Tokens are exfiltrable. Sessions are spawnable. The blast radius is not the agent. It is the entire container and every connected service. The CSA and Aembit survey of 228 IT and security professionals quantifies how common this remains: 43% use shared service accounts for agents, 52% rely on workload identities rather than agent-specific credentials, and 68% cannot distinguish agent activity from human activity in their logs. No single function claimed ownership of AI agent access. Security said it was a developer's responsibility. Developers said it was a security responsibility. Nobody owned it. CrowdStrike CTO Elia Zaitsev, in an exclusive VentureBeat interview, said the pattern should look familiar. "A lot of what securing agents look like would be very similar to what it looks like to secure highly privileged users. They have identities, they have access to underlying systems, they reason, they take action," Zaitsev said. "There's rarely going to be one single solution that is the silver bullet. It's a defense in depth strategy." CrowdStrike CEO George Kurtz highlighted ClawHavoc (a supply chain campaign targeting the OpenClaw agentic framework) at RSAC during his keynote. Koi Security named the campaign on February 1, 2026. Antiy CERT confirmed 1,184 malicious skills tied to 12 publisher accounts, according to multiple independent analyses of the campaign. Snyk's ToxicSkills research found that 36.8% of the 3,984 ClawHub skills scanned contain security flaws at any severity level, with 13.4% rated critical. Average breakout time has dropped to 29 minutes. Fastest observed: 27 seconds. (CrowdStrike 2026 Global Threat Report) Anthropic separates the brain from the hands Anthropic's Managed Agents, launched April 8 in public beta, split every agent into three components that do not trust each other: a brain (Claude and the harness routing its decisions), hands (disposable Linux containers where code executes), and a session (an append-only event log outside both). Separating instructions from execution is one of the oldest patterns in software. Microservices, serverless functions, and message queues. Credentials never enter the sandbox. Anthropic stores OAuth tokens in an external vault. When the agent needs to call an MCP tool, it sends a session-bound token to a dedicated proxy. The proxy fetches real credentials from the vault, makes the external call, and returns the result. The agent never sees the actual token. Git tokens get wired into the local remote at sandbox initialization. Push and pull work without the agent touching the credential. For security directors, this means a compromised sandbox yields nothing an attacker can reuse. The security gain arrived as a side effect of a performance fix. Anthropic decoupled the brain from the hands so inference could start before the container booted. Median time to first token dropped roughly 60%. The zero-trust design is also the fastest design. That kills the enterprise objection that security adds latency. Session durability is the third structural gain. A container crash in the monolithic pattern means total state loss. In Managed Agents, the session log persists outside both brain and hands. If the harness crashes, a new one boots, reads the event log, and resumes. No state lost turns into a productivity gain over time. Managed Agents include built-in session tracing through the Claude Console. Pricing: $0.08 per session-hour of active runtime, idle time excluded, plus standard API token costs. Security directors can now model agent compromise cost per session-hour against the cost of the architectural controls. Nvidia locks the sandbox down and monitors everything inside it Nvidia's NemoClaw, released March 16 in early preview, takes the opposite approach. It does not separate the agent from its execution environment. It wraps the entire agent inside four stacked security layers and watches every move. Anthropic and Nvidia are the only two vendors to have shipped zero-trust agent architectures publicly as of this writing; others are in development. NemoClaw stacks five enforcement layers between the agent and the host. Sandboxed execution uses Landlock, seccomp, and network namespace isolation at the kernel level. Default-deny outbound networking forces every external connection through explicit operator approval via YAML-based policy. Access runs with minimal privileges. A privacy router directs sensitive queries to locally-running Nemotron models, cutting token cost and data leakage to zero. The layer that matters most to security teams is intent verification: OpenShell's policy engine intercepts every agent action before it touches the host. The trade-off for organizations evaluating NemoClaw is straightforward. Stronger runtime visibility costs more operator staffing. The agent does not know it is inside NemoClaw. In-policy actions return normally. Out-of-policy actions get a configurable denial. Observability is the strongest layer. A real-time Terminal User Interface logs every action, every network request, every blocked connection. The audit trail is complete. The problem is cost: operator load scales linearly with agent activity. Every new endpoint requires manual approval. Observation quality is high. Autonomy is low. That ratio gets expensive fast in production environments running dozens of agents. Durability is the gap nobody's talking about. Agent state persists as files inside the sandbox. If the sandbox fails, the state goes with it. No external session recovery mechanism exists. Long-running agent tasks carry a durability risk that security teams need to price into deployment planning before they hit production. The credential proximity gap Both architectures are a real step up from the monolithic default. Where they diverge is the question that matters most to security teams: how close do credentials sit to the execution environment? Anthropic removes credentials from the blast radius entirely. If an attacker compromises the sandbox through prompt injection, they get a disposable container with no tokens and no persistent state. Exfiltrating credentials requires a two-hop attack: influence the brain's reasoning, then convince it to act through a container that holds nothing worth stealing. Single-hop exfiltration is structurally eliminated. NemoClaw constrains the blast radius and monitors every action inside it. Four security layers limit lateral movement. Default-deny networking blocks unauthorized connections. But the agent and generated code share the same sandbox. Nvidia's privacy router keeps inference credentials on the host, outside the sandbox. But messaging and integration tokens (Telegram, Slack, Discord) are injected into the sandbox as runtime environment variables. Inference API keys are proxied through the privacy router and not passed into the sandbox directly. The exposure varies by credential type. Credentials are policy-gated, not structurally removed. That distinction matters most for indirect prompt injection, where an adversary embeds instructions in content the agent queries as part of legitimate work. A poisoned web page. A manipulated API response. The intent verification layer evaluates what the agent proposes to do, not the content of data returned by external tools. Injected instructions enter the reasoning chain as trusted context. With proximity to execution. In the Anthropic architecture, indirect injection can influence reasoning but cannot reach the credential vault. In the NemoClaw architecture, injected context sits next to both reasoning and execution inside the shared sandbox. That is the widest gap between the two designs. NCC Group's David Brauchler, Technical Director and Head of AI/ML Security, advocates for gated agent architectures built on trust segmentation principles where AI systems inherit the trust level of the data they process. Untrusted input, restricted capabilities. Both Anthropic and Nvidia move in this direction. Neither fully arrives. The zero-trust architecture audit for AI agents The audit grid covers three vendor patterns across six security dimensions, five actions per row. It distills to five priorities: Audit every deployed agent for the monolithic pattern. Flag any agent holding OAuth tokens in its execution environment. The CSA data shows 43% use shared service accounts. Those are the first targets. Require credential isolation in agent deployment RFPs. Specify whether the vendor removes credentials structurally or gates them through policy. Both reduce risk. They reduce it by different amounts with different failure modes. Test session recovery before production. Kill a sandbox mid-task. Verify state survives. If it does not, long-horizon work carries a data-loss risk that compounds with task duration. Staff for the observability model. Anthropic's console tracing integrates with existing observability workflows. NemoClaw's TUI requires an operator-in-the-loop. The staffing math is different. Track indirect prompt injection roadmaps. Neither architecture fully resolves this vector. Anthropic limits the blast radius of a successful injection. NemoClaw catches malicious proposed actions but not malicious returned data. Require vendor roadmap commitments on this specific gap. Zero trust for AI agents stopped being a research topic the moment two architectures shipped. The monolithic default is a liability. The 65-point gap between deployment velocity and security approval is where the next class of breaches will start.