Claude’s next enterprise battle is not models: it’s the agent control plane
Our take
The next significant battle in enterprise AI isn't just about which model performs best; it's about who controls the infrastructure where AI agents operate. Recent VB Pulse data shows Microsoft and OpenAI leading in enterprise agent orchestration, while Anthropic has made its first measurable entry into the space. As enterprises shift their focus from model quality to the orchestration layer, the stakes rise: it's no longer just about chatbots, but about who governs the agent control plane, and what that means for enterprise productivity and security.
The findings suggest that enterprises are not merely choosing AI chatbots based on performance but are making strategic decisions about where the operational machinery of AI resides. This emerging focus on orchestration reflects a deeper understanding of the complexities involved in deploying AI agents across various workflows. As organizations grapple with ensuring security and compliance, the ability to govern agent actions becomes paramount. The shift toward prioritizing security, permissions, and governance underscores the need for a robust control layer that can manage these complexities, which is crucial for organizations transitioning to AI-driven workflows.
Anthropic's recent uptick in orchestration usage, while not yet significant, signals its potential to disrupt the established players. The transition from model deployment to agent orchestration reflects a broader trend: enterprises are looking for integrated solutions that provide not just AI capabilities but also the necessary oversight and control mechanisms. As Tom Findling noted, models and agent frameworks have matured to the point where enterprises must think critically about the governance of AI agents, a shift that could redefine how businesses approach their data management strategies.
Looking ahead, the question remains: how will enterprises balance the need for innovative AI solutions with the complexities of governance and security? As more organizations adopt AI agents, the emphasis on a unified identity layer becomes crucial. Without it, companies risk fragmentation and potential security breaches. The growing recognition that orchestration must extend beyond the model to encompass the entirety of an agent's operational scope presents an opportunity for emerging technologies to fill this gap. Companies like Anthropic, with their focus on providing a comprehensive agent runtime, may find themselves at the forefront of this transformation, especially as enterprises seek to avoid lock-in with a single provider.
As we move into 2026, the dynamics of agent orchestration will likely evolve, prompting enterprises to reassess their strategies for deploying AI agents across diverse workflows. The integration of security, governance, and operational oversight will be critical in determining which providers not only survive but thrive in this competitive landscape. The potential for cross-vendor collaboration could reshape the future of AI orchestration, driving innovation while ensuring that the complexities of governance are adequately addressed. This evolution presents a compelling narrative for businesses looking to leverage AI in a responsible and effective manner.

New VB Pulse data shows Microsoft and OpenAI leading enterprise agent orchestration, but Anthropic’s first measurable foothold points to a larger fight over who controls the infrastructure where AI agents run.
For the last two years, the enterprise AI race has mostly been framed as a model war: OpenAI’s GPT series versus Anthropic’s Claude versus Google’s Gemini, with smaller and open-source alternatives also coming in from the U.S. and China.
But the next strategic fight may not be over which model answers a prompt best. It may be over who controls the layer where agents plan, call tools, access data, run workflows and prove to security teams that they did not do anything they were not supposed to do.
New VB Pulse survey data suggests the category is already taking shape. Our independent Enterprise Agentic Orchestration tracker, a survey that records at regular intervals the preferences of qualified, verified technical decision-makers at enterprises, found that Microsoft Copilot Studio and Azure AI Studio led with 38.6% primary-platform adoption in February, up from 35.7% in January.
OpenAI’s Assistants and Responses API held second place, rising from 23.2% to 25.7%.
Anthropic remained far smaller, but it made its first appearance in the tracker: moving from 0% in January to 5.7% in February for Anthropic tool use and workflows.
The underlying move is small — four respondents out of a total of 70 in this cohort, with more to come — but strategically interesting because it marks the first sign in this tracker of Claude usage moving from the model layer into native orchestration.
That distinction matters. Enterprises are not merely choosing chatbots. They are deciding where the live operational machinery of AI work will sit: inside Microsoft’s stack, inside OpenAI’s API layer, inside Anthropic’s managed runtime, inside an open framework, or across a hybrid mix of all of them.
“This is the convergence moment for enterprise AI,” said Tom Findling, CEO and cofounder of AI cybersecurity startup Conifers, in a statement to VentureBeat. “Models and agent frameworks have matured enough together that enterprises are now shifting focus beyond model quality to the control plane around it. In security operations, we’re seeing the competitive advantage move toward platforms that can orchestrate agents, leverage enterprise context, and provide governance and auditability across customer environments.”
Anthropic’s number is still small to start — but the increase is not
The Anthropic number, by itself, should not be overread. A move from zero to 5.7% is not a juggernaut. It is not proof that Anthropic has captured enterprise orchestration.
It is not even enough to say Anthropic has a durable lead in any part of this market. Microsoft owns the early enterprise distribution advantage, and OpenAI has a much larger installed base in orchestration than Anthropic.
But small numbers can matter when they appear at the start of a new market structure. Anthropic’s emergence in orchestration comes as the broader VB Pulse data shows Claude also gaining massive enterprise adoption at the model layer.
In our VB Pulse Q1 Foundation Models and Intelligence Platforms tracker, Anthropic rose from 23.9% in January to 28.6% in February and then even more dramatically to 56.2% in March among qualified enterprise respondents, with the March reading flagged as directional only, because the sample was only 16 respondents.
The story, then, is not that Anthropic is winning orchestration today. It is that Anthropic’s model momentum may be starting to spill into the orchestration layer.
That is where the strategic stakes get higher.
A model is easier to swap than an agent runtime
A model is relatively easy to swap, at least in theory. A company can route one workload to Claude, another to GPT, another to Gemini and another to a smaller open model.
In fact, the VB Pulse Foundation Models tracker over the same Q1 period shows that multi-model strategy is the enterprise consensus: respondents increasingly report adopting multiple models and building orchestration layers that route across them by task, cost and risk profile.
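The routing pattern respondents describe can be sketched in a few lines of Python. This is a hypothetical illustration: the model names, task categories and thresholds below are invented for the example, not drawn from the survey data or any vendor's actual pricing.

```python
from dataclasses import dataclass

@dataclass
class Task:
    kind: str        # e.g. "code", "summarize", "analysis"
    risk: str        # "low" or "high"
    est_tokens: int  # rough size of the workload

def route(task: Task) -> str:
    """Pick a model per task by risk, type and cost, as a
    multi-model orchestration layer might. Rules are illustrative."""
    if task.risk == "high":
        return "claude"            # sensitive work goes to the trusted model
    if task.kind == "code":
        return "gpt"
    if task.est_tokens > 50_000:
        return "small-open-model"  # cheap model for bulk, low-risk volume
    return "gemini"

print(route(Task("analysis", "high", 2_000)))  # → claude
```

The point of the pattern is that the routing table, not any single model, becomes the enterprise's policy surface: changing vendors is a one-line edit here, which is exactly the optionality the survey respondents report wanting to preserve.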
An agent runtime is different. Once a company’s workflows, tool permissions, credentials, audit logs, memory, sandboxed execution and operational monitoring live inside one provider’s environment, switching providers becomes less like changing models and more like changing infrastructure.
That is the real reason Anthropic’s 5.7% foothold is worth watching
Anthropic has already made clear that it wants to provide more than the model. Its Claude Managed Agents documentation describes a public beta for a managed agent harness with secure sandboxing, built-in tools and API-run sessions, while Anthropic’s engineering post frames the architecture around decoupling the model from the surrounding agent machinery: the session, the harness and the sandbox.
In plain English, Anthropic is trying to host the environment where Claude agents remember context, use tools, run code, operate inside sandboxes and persist across long-running workflows. That is no longer just inference. That is operational infrastructure.
The pitch is obvious: most enterprises do not want to stitch together their own agent stack from scratch. They want agents that can act, but they also want permission boundaries, audit trails, workflow reliability and ways to stop the system when something goes wrong.
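A rough sketch shows why the surrounding machinery, rather than the model call itself, is where the stickiness lives. The class and function names here are invented for illustration and are not Anthropic's actual API; the structure simply mirrors the session/harness/sandbox decoupling described above.

```python
class Sandbox:
    """Executes tool calls in an isolated scope and records what ran."""
    def __init__(self):
        self.log = []  # audit trail of (tool, argument) pairs

    def run(self, tool, arg):
        self.log.append((tool.__name__, arg))
        return tool(arg)

class Harness:
    """Owns session state and permissions; the model never touches tools directly."""
    def __init__(self, allowed_tools, sandbox):
        self.allowed = {t.__name__: t for t in allowed_tools}
        self.sandbox = sandbox
        self.memory = []  # session state that persists across steps

    def step(self, proposed_tool: str, arg):
        if proposed_tool not in self.allowed:  # permission boundary
            raise PermissionError(f"{proposed_tool} not permitted")
        result = self.sandbox.run(self.allowed[proposed_tool], arg)
        self.memory.append(result)
        return result

def word_count(text):
    return len(text.split())

sb = Sandbox()
h = Harness([word_count], sb)
print(h.step("word_count", "agents need guardrails"))  # → 3
```

Swap the model behind this loop and the harness, logs, permissions and memory all survive; move the harness to another provider and all of that state has to be rebuilt. That asymmetry is the switching cost.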
Security is becoming the buying criterion
The VB Pulse orchestration tracker shows that buyers are prioritizing exactly those concerns. Security and permissions ranked as the top orchestration platform selection criterion in both January and February, at 39.3% and 37.1%.
Control over agent execution rose from 17.9% to 22.9%, while flexibility across models and tools fell from 35.7% to 25.7%. The market appears to be shifting from optionality toward governance.
That shift is not surprising. A chatbot can be wrong and still remain mostly contained. An agent that can send emails, modify documents, query databases, call APIs or execute workflows has a much larger blast radius. The enterprise question is not only whether the agent is smart enough.
It is who gave it permission, what it touched, what it changed, whether those actions were logged, and whether the company can unwind the damage if something goes wrong.
Ev Kontsevoy, cofounder and CEO of Teleport, an identity and digital infrastructure solutions company, argues that the industry is still putting too much emphasis on orchestration itself and not enough on identity: “The race to own the agent orchestration layer is real,” Kontsevoy said. “It’s also solving the wrong problem first. Orchestration without identity only multiplies chaos. Without identity, you don’t know what an agent can access, what it actually did, or how to revoke its access when it operates outside policy. A unified identity layer is a prerequisite to deploying agents — one or many — in infrastructure.”
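Kontsevoy's three questions — what an agent can access, what it actually did, and how to revoke it — map onto a small identity object. The sketch below is illustrative only; the names and structure are invented and do not represent Teleport's product or any real API.

```python
from datetime import datetime, timezone

class AgentIdentity:
    """Minimal agent identity: scoped permissions, audit log, kill switch."""
    def __init__(self, name, scopes):
        self.name = name
        self.scopes = set(scopes)  # e.g. {"crm:read", "email:send"}
        self.audit = []            # append-only record of every attempt
        self.revoked = False

    def act(self, scope, action):
        allowed = not self.revoked and scope in self.scopes
        self.audit.append((datetime.now(timezone.utc), scope,
                           "OK" if allowed else "DENIED"))
        if not allowed:
            raise PermissionError(f"{self.name} may not {scope}")
        return action()

    def revoke(self):
        # Central kill switch: every subsequent action is denied and logged.
        self.revoked = True

agent = AgentIdentity("billing-bot", ["crm:read"])
print(agent.act("crm:read", lambda: "fetched account"))  # allowed, audited
agent.revoke()  # from here on, even "crm:read" raises PermissionError
```

The key property is that the audit log and the revocation live outside the agent's own logic, so policy answers do not depend on the agent cooperating.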
Syam Nair, Chief Product Officer at the intelligent data infrastructure company NetApp, believes data management is key in all cases to secure AI agent orchestration across the enterprise. As he said in a statement to VentureBeat: "Effective agent management requires built-in intelligence and a continuously updated understanding of both data and, critically, its metadata. This visibility allows organizations to define and enforce clear policies so data is used only by the right agents, for the right purposes. Making this work at scale is a cross-functional effort. Security, storage, and data science teams must work together to implement policies that safeguard company data, while creating a strong data foundation for AI."
He continued: "The CIOs and technology leaders that are successful are the ones who take the input, policies, and vision from all these teams into account as they build a data infrastructure that minimizes risk and drives business value."
Microsoft has the distribution edge
That is why Microsoft’s early lead makes sense. Copilot Studio and Azure AI Studio sit inside an enterprise stack many companies already use: Microsoft 365, Teams, Entra ID, Azure and existing procurement relationships.
The VB Pulse Orchestration Tracker for Q1 2026 describes Microsoft as the enterprise default, with no other platform within 13 percentage points in February.
David Weston, CVP, AI Security, Microsoft, provided some insight on why, writing in a statement to VentureBeat: "Without a unified control layer, you start to see fragmentation – agents operating in silos, inconsistent governance, and gaps in security. What customers are asking for is a way to bring order to that complexity. With Agent 365, we’re providing a single control plane to observe, govern, and secure agents across Microsoft, partner, and third-party ecosystems, all grounded in enterprise data and identity."
OpenAI’s second-place position is also unsurprising. Its Assistants and Responses API gave developers an early way to build agent-like systems using OpenAI’s models and tooling. In the orchestration tracker, OpenAI is not surging, but it is still ticking up steadily: 23.2% in January to 25.7% in February.
Anthropic is the newcomer at the orchestration layer. But its timing may be favorable. The VB Pulse Foundation Models tracker for Q1 2026 suggests enterprises increasingly see Claude as a fit for higher-stakes workloads where safety, instruction following, long context and governance matter.
The orchestration tracker suggests those same buyers are now moving from agent experiments toward production workflows, where security, permissions and task reliability become the gating issues.
That creates a possible path for Anthropic: not to beat Microsoft as the default enterprise platform, at least not immediately, but to become the agent runtime for companies that already trust Claude for sensitive or complex workloads.
The risk is lock-in
The risk for enterprises is lock-in.
The orchestration tracker found that a hybrid control plane — combining provider-native orchestration with external orchestration — was the leading expected architecture, holding around 35% to 36% across the two substantive waves.
Provider-managed-only approaches grew modestly but remained a minority. The report’s conclusion is blunt: enterprises are not willing to give full orchestration control to any single provider.
It makes sense as enterprises seek to leverage "best-of-breed" models, harnesses, and tools from multiple vendors, especially as their needs differ widely across sector, business, and size.
"Most enterprises will operate in a multi-model, multi-agent environment, which makes an independent control plane essential," agreed Felix Van de Maele, CEO of Collibra, a unified data governance startup for AI, in a statement to VentureBeat. "That is why we built AI Command Center: to give organizations the visibility, governance, and real-time oversight needed to manage AI systems and agents across the full lifecycle."
That caution shows up in the risk data. When asked about risks if agent control lives inside a model provider platform, respondents cited security and permissioning limitations as the top concern. Vendor lock-in was the second-largest concern and the only one that increased from January to February, rising from 23.2% to 25.7%.
This is the tension at the heart of the agent market. Enterprises want managed infrastructure because building reliable agents is hard. But the more a provider manages, the more it may own.
Dr. Rania Khalaf, chief AI officer at WSO2 — the subsidiary of EQT that offers open source, customizable AI stacks for enterprises — said enterprises will need an agent control plane that sits apart from individual frameworks, harnesses and runtimes because agents combine the unpredictability of LLMs with the ability to take actions that have consequences.
“Teams want the freedom to use the best model and framework for each job — Claude for coding, Gemini for writing, LangGraph or CrewAI for dynamic modular behavior — and that heterogeneity makes consistent governance untenable in integrated platforms that lock into one ecosystem,” Khalaf said.
From LLMOps to Agent Ops
Khalaf said the industry is also moving from MLOps to LLMOps to “Agent Ops,” where governance has to cover the whole agent, not just the model call.
“A guardrail on an LLM call can catch hallucination or toxic output, but it will not catch an agent thrashing in an unbreakable, costly loop, which is why governance now has to extend out from the LLM interaction to the scope of the agent,” she said.
The practical implication is that enterprises need to separate policy and control from the agent logic itself. Khalaf pointed to the recent example of an agent deleting a production database despite being told not to, arguing that the failure showed the limits of relying on prompt-level instructions where hard identity and access controls are needed.
“Pulling guardrails, evals, policies, bindings, and agent identity out of the core agent logic allows them to be configured per deployment and per environment, owned by the appropriate teams in security, product, and compliance, without fragmenting the governance layer as different teams choose different models and frameworks,” Khalaf said.
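Khalaf's two points — guardrails enforced at the scope of the whole agent run, with limits configured per deployment rather than hard-coded into the agent — can be sketched together. Everything here is illustrative: the policy keys, numbers and function names are invented for the example.

```python
class BudgetExceeded(Exception):
    pass

def run_agent(next_step, policy):
    """Drive an agent loop under an externally supplied policy dict.
    The policy is owned by security/compliance, not by the agent author."""
    steps, cost = 0, 0.0
    while True:
        action = next_step()
        if action == "done":
            return steps
        steps += 1
        cost += policy["cost_per_step"]
        # A per-call output guardrail never sees this; the run-scope budget does.
        if steps > policy["max_steps"] or cost > policy["max_cost"]:
            raise BudgetExceeded(f"halted after {steps} steps (${cost:.2f})")

# Per-environment configuration, separate from the agent logic itself.
prod_policy = {"max_steps": 25, "max_cost": 5.00, "cost_per_step": 0.50}

# An agent thrashing in a loop that never returns "done" is stopped by
# the budget, even though each individual step looks harmless on its own.
try:
    run_agent(lambda: "retry", prod_policy)
except BudgetExceeded as e:
    print(e)
```

Because the policy dict is external, different teams can tighten or loosen it per deployment without touching, or trusting, the prompt-level instructions inside the agent.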
MCP is open. The runtime may still be sticky
That is where Anthropic’s Model Context Protocol, or MCP, complicates the story. MCP is not a walled garden; Anthropic introduced it as an open standard for connecting AI systems to data and tools, and Anthropic’s documentation describes MCP as an open-source standard for connecting AI applications to external systems.
But openness at the protocol layer does not automatically eliminate lock-in at the runtime layer. An enterprise could use an open protocol to connect tools while still becoming dependent on a provider’s managed sessions, logs, sandboxes, permissions model, workflow state and deployment environment. In other words, MCP may reduce integration friction, while managed agent infrastructure could still increase switching costs.
Khalaf said Microsoft’s lead likely reflects its M365 and Azure distribution, while Anthropic’s emerging foothold could reflect a different architectural bet around open protocols such as MCP. But she argued the long-term direction is not a single-provider stack.
“Enterprises serious about running agents in production will end up multi-vendor across these layers,” Khalaf said, “which is why the open and interoperable control plane matters more than the current percentages might suggest.”
The next cycle may be cross-vendor collaboration
That same tension — between provider-native convenience and cross-vendor reality — is where Arick Goomanovsky, CEO and cofounder of universal AI agent orchestrator startup BAND, sees the next competitive cycle forming.
“Enterprises now run agents everywhere: individual assistants and coding agents, multi-agent systems in production, agents embedded in Agentforce and ServiceNow, and third-party agents consumed as agent-as-a-service,” Goomanovsky said. “None of them collaborate across those boundaries by default.”
Goomanovsky argues that the missing layer is not just orchestration inside a single model provider, but a cross-vendor collaboration layer that lets agents from different ecosystems act together.
“What’s emerging in parallel is demand for an agentic collaboration harness, an interaction layer that lets agents from Microsoft, OpenAI, Anthropic, and internal teams operate as one workforce,” he said. “Orchestration inside any single vendor is still a walled garden, so the next competitive cycle is cross-vendor agent collaboration.”
Independent frameworks face an enterprise packaging problem
There is also a warning sign for independent orchestration frameworks. LangChain and LangGraph fell from 5.4% to 1.4% as the primary orchestration platform in the qualified enterprise sample.
External orchestration abstracted entirely from model providers also fell from 8.9% to 2.9%.
Scott Likens, Global Chief AI Engineer at professional services giant PwC, has a front row seat to this trend as the company spearheads and assists clients with their AI transformations.
As he told VentureBeat in a statement: "Right now, most enterprises are still operating in fragmented environments, with orchestration spread across platforms, business applications, and internally developed tooling. Over time, the market will likely move toward more unified orchestration models, but interoperability, governance and security will remain critical because enterprises are unlikely to standardize on a single agent ecosystem."
The report argues that fully independent orchestration frameworks may not yet have the enterprise packaging — security certifications, support, compliance documentation and vendor accountability — that procurement teams require.
That does not mean open frameworks are irrelevant. It does suggest that enterprise buyers may increasingly consume open or developer-first orchestration through managed products, cloud-provider partnerships or internal control planes rather than as standalone frameworks.
The agent market starts to look like cloud infrastructure
This is where the agent market starts to look less like the early chatbot market and more like enterprise cloud infrastructure. The winning vendors will not only have capable models. They will have identity integration, permission controls, audit logs, observability, workflow tooling, sandboxing, evaluation and a credible answer to who owns the control plane.
Indeed, the orchestration layer is but one part of the stack that the enterprise must fill in, and enterprises may actually decide to have different orchestration layers for agents working in different departments and functions.
As Nithya Lakshmanan, Chief Product Officer at revenue team AI orchestration startup Outreach.ai wrote in a statement to VentureBeat: "General-purpose orchestration platforms coordinate agent activity well, but they don't carry the workflow-specific context that determines whether an agent's action is correct for a given situation. In revenue workflows, an agent acting on incomplete deal history or missing buyer context will underperform and erode trust with users. The teams getting the most out of multi-agent systems are treating domain-specific data as the governance layer, with orchestration sitting on top. Most enterprises have chosen their orchestration stack, and what they're now figuring out is how those platforms get access to the workflow context they need to make agents useful inside specific business functions."
That is why Anthropic — which is increasingly launching its own domain-specific agents for finance and design, among other categories — is worth following closely. The company does not need to win the entire orchestration market tomorrow for its strategy to matter. It only needs to persuade a growing set of Claude enterprise customers to let Anthropic handle more of the surrounding machinery: tools, workflows, memory, execution and governance.
If it succeeds, Claude becomes more than a model in a multi-model portfolio. It becomes part of the infrastructure where enterprise work gets done.
That would put Anthropic in a more direct fight with OpenAI and Microsoft — not just over model quality, but over the operating layer of AI agents.
The narrow but important read
The safe interpretation of the VB Pulse data is narrow but important: Anthropic is not yet a major enterprise orchestration platform. Microsoft is. OpenAI is much closer. But Anthropic has registered its first measurable foothold at the orchestration layer, just as the market is deciding who should control agent execution.
For enterprise buyers, that may be the question that matters most in 2026. Not which model is best, but which provider gets to run the agent — and how hard it will be to leave once the agent is running.
Related Articles
- Anthropic’s Claude Managed Agents gives enterprises a new one-stop shop but raises vendor 'lock-in' risk: Anthropic announced a new platform last week, Claude Managed Agents, which aims to cut out the more complex parts of AI agent deployment for enterprises and competes with existing orchestration frameworks. Claude Managed Agents is also an architectural shift: enterprises, already burdened with orchestrating an increasing number of agents, can now choose to embed the orchestration logic in the AI model layer. While this comes with some potential advantages, such as speed (Anthropic proposes its customers can deploy agents in days instead of weeks or months), it also hands more control over the enterprise's AI agent deployments and operations to the model provider — in this case, Anthropic — potentially resulting in greater "lock-in" for the enterprise customer, leaving it more subject to Anthropic's terms, conditions, and any subsequent platform changes. But maybe that is worth it for your enterprise: Anthropic claims its platform “handles the complexity” by letting users define agent tasks, tools and guardrails with a built-in orchestration harness, without having to build sandboxed code execution, checkpointing, credential management, scoped permissions and end-to-end tracing themselves. The framework manages state, execution graphs and routing, and runs managed agents in a vendor-controlled runtime loop. Even before the release of Claude Managed Agents, new directional VentureBeat research showed that Anthropic was gaining traction at the orchestration level as enterprises adopted its native tooling. Claude Managed Agents represents a new attempt by the firm to widen its footprint as the orchestration method of choice for organizations. Anthropic is surging in orchestration interest: orchestration has emerged as an important segment for enterprises to address as they scale AI systems and deploy agentic workflows.
VentureBeat directional research of several dozen firms for the first quarter of 2026 found that enterprises mostly chose existing frameworks, such as Microsoft’s Copilot Studio/Azure AI Studio, with 38.6% of respondents in February reporting using Microsoft’s platform. VentureBeat surveyed 56 organizations with more than 100 employees in January and 70 in February. OpenAI followed closely at 25.7%. Both showed strong growth between the first two months of the year. Anthropic, driven by increased interest in offerings such as Claude Code over the past year, is putting up a fight: adoption of its tool-use and workflows API increased from 0% to 5.7% between January and February. This tracks closely with the growing adoption of Anthropic’s foundation models, suggesting that enterprises using Claude turn to the company’s native orchestration tooling instead of adding a third-party framework. While VentureBeat's survey predates the launch of Claude Managed Agents, we can extrapolate that the new tool will build on that growth, especially if it delivers a more straightforward way to deploy agents. Collapsing the external orchestration layer: enterprises may find a streamlined, internal harness for agents compelling, but it does mean giving up certain controls. Session data is stored in a database managed by Anthropic, increasing the risk that enterprises become locked into a system run by a single company. That may be unattractive to firms hoping AI will help them move away from the locked-in software-as-a-service (SaaS) applications in their current stacks. The specter of vendor lock-in means agent execution becomes model-driven rather than directed by the organization, happens in an environment enterprises don’t fully control, and becomes harder to guarantee.
It also opens the possibility of giving agents conflicting instructions, especially if the only way for users to exert control over agents is to prompt them with more context. Agents could end up with two control planes: one defined by the enterprise’s orchestration system through instructions and the other an embedded skill from the Claude runtime. This could pose an issue for highly sensitive and regulated workflows, such as financial analysis or customer-facing tasks. Pricing, control and the competitive set: balancing control with ease is one thing; enterprises must also consider the cost structure of Claude Managed Agents, which introduces a hybrid pricing model blending token-based billing with a usage-based runtime fee. That makes Managed Agents more dynamic, though less predictable, when determining cost structures. Enterprises will be charged a standard runtime rate of $0.08 per hour when agents are actively running, on top of token costs, so a job such as processing 10,000 support tickets could vary widely in cost depending on how long each agent runs and how many steps it takes to complete a task. Microsoft, currently the leader according to VentureBeat's directional survey, offers several orchestration offerings. Copilot Studio uses a capacity-based billing structure, so enterprises pay for blocks of interactions between users and agents rather than for the number of steps an agent takes. Microsoft's approach tends to be more predictable than Anthropic's pricing plan: Copilot Studio starts at $200 per month for 25,000 messages. Compared with competitors like OpenAI's Agents SDK, the picture becomes murkier. The Agents SDK is technically free to use as an open-source project; however, OpenAI bills for the underlying API usage. Agents built and orchestrated with the Agents SDK using GPT-5.4, for example, will cost $2.50 per 1 million input tokens and $15 per 1 million output tokens.
The enterprise decision: Claude Managed Agents offers a reprieve to enterprises that find deploying production agents too complicated, reducing engineering overhead while adding speed and simplicity in a fast-changing enterprise environment. But that comes with a choice: lose control, observability and portability, and risk further vendor lock-in. Anthropic just made a case for why its ecosystem is becoming not just the foundation model of choice for enterprises, but also the orchestration infrastructure. Enterprises must now weigh that ease against the control they give up.
- Anthropic wants to own your agent's memory, evals, and orchestration — and that should make enterprises nervous: Just a few weeks after announcing Claude Managed Agents, Anthropic has updated the platform with three new capabilities that collapse infrastructure layers like memory, evaluation and multi-agent orchestration into a single runtime. This move could threaten the standalone tools that many enterprises cobble together. The new capabilities — 'Dreaming,' 'Outcomes,' and 'Multi-Agent Orchestration' — aim to make agents inside Claude Managed Agents “more capable at handling complex tasks with minimal steering,” Anthropic said in a press release. Dreaming deals with memory: agents “reflect” on their many sessions and curate memories so they learn and surface previously unknown patterns. Outcomes lets teams define specific rubrics to measure an agent's success, while Multi-Agent Orchestration breaks jobs down so a lead agent can delegate to other agents. Claude Managed Agents ideally provides enterprises with a simpler path to deploy agents and embeds orchestration logic in the model layer. It’s an end-to-end platform to manage state, execution graphs and routing. With the addition of Dreaming, Outcomes and Multi-Agent Orchestration, Claude Managed Agents expands its capabilities even further and directly competes with tools like LangGraph or CrewAI, as well as external evaluation frameworks, RAG memory architectures and QA loops. An integration threat: enterprises must now ask whether to ditch a flexible, modular system in favor of an agent platform that brings almost everything in-house. Anthropic designed Claude Managed Agents to share context, state and traceability in one place, meaning the platform sees every decision agents make, rather than enterprises having to wire separate systems together. It sounds practical to have one platform that does everything. But not all enterprises want a full-service system.
Claude Managed Agents already faces criticism that it encourages vendor lock-in because it owns most of the architecture and tools that govern agents. In the current paradigm, an organization may run Managed Agents but keep multi-agent orchestration, memory, or evaluations in a separate space, which ensures flexibility. The platform offers a fully hosted runtime, which means memory and orchestration run on infrastructure the enterprise does not own. This can become a compliance nightmare for organizations that have to prove data residency. Another problem to consider is that enterprises already in the middle of large-scale AI transformations must cobble together workarounds to deal with the constraints of their tech stack. Not every workflow is easily replaceable by switching to Claude Managed Agents.

Dreaming and Outcomes against current tools

Most enterprises have a fragmented approach to AI deployment. For example, they may use LangGraph or CrewAI for agent routing and workflow management, Pinecone as a vector database for long-term memory, DeepEval for external evaluation, and a human-in-the-loop quality assurance process to review some tasks. Anthropic hopes to do away with all of that. With Dreaming, Anthropic approaches memory by allowing users to actively rewrite it between sessions, so the agent essentially learns from its mistakes. Anthropic says this capability is useful for long-running states and orchestration. Current systems often handle memory persistence by storing embeddings, retrieving relevant context, and adding more state over time. Outcomes addresses the evaluation portion by detailing expectations for agents. Instead of external quality checks, which are often done by a team of humans, Anthropic is bringing evaluation into the orchestration layer rather than layering it on top. But it’s the Multi-Agent Orchestration capability that pits Claude Managed Agents against orchestration frameworks from Microsoft, LangChain, CrewAI, and others.
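The memory-persistence pattern described above — embed text, store the vectors, retrieve relevant context at the start of a new session — can be sketched in a few lines. This is a toy illustration only: real stacks use a vector database such as Pinecone and a learned embedding model, whereas the character-frequency "embedding" and all names below are stand-ins.

```python
from math import sqrt

def embed(text: str) -> dict:
    # Hypothetical embedding: a simple letter-frequency vector,
    # standing in for a learned embedding model.
    vec = {}
    for ch in text.lower():
        if ch.isalpha():
            vec[ch] = vec.get(ch, 0.0) + 1.0
    return vec

def cosine(a: dict, b: dict) -> float:
    # Cosine similarity between two sparse vectors.
    dot = sum(a[k] * b[k] for k in a if k in b)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class MemoryStore:
    def __init__(self):
        self._items = []  # list of (embedding, original text) pairs

    def add(self, text: str) -> None:
        # Persist the embedding alongside the original memory.
        self._items.append((embed(text), text))

    def retrieve(self, query: str, k: int = 2) -> list:
        # Rank stored memories by similarity to the new session's query.
        q = embed(query)
        ranked = sorted(self._items, key=lambda it: cosine(q, it[0]),
                        reverse=True)
        return [text for _, text in ranked[:k]]

store = MemoryStore()
store.add("customer prefers email over phone")
store.add("invoice 4521 was disputed in March")
context = store.retrieve("how should we contact the customer?", k=1)
```

Dreaming's pitch is that the model curates and rewrites this store itself between sessions, rather than leaving the enterprise to maintain the database and retrieval logic.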
Model providers like Anthropic and OpenAI have already begun pushing aggressively into this space, arguing that bringing this to the model layer gives teams better control.

Big decisions to make

Enterprises face a big decision, and this one could depend on where they are in agent maturity. If an organization is still experimenting with agents and has not deployed many in production, it may find moving to Claude Managed Agents and configuring Dreaming and Outcomes to its needs much easier. This is the stage of development where, even if enterprises are using a third-party orchestrator like LangChain, they’re still customizing it. But for those already further along in the process, the calculation becomes trickier: it means running parallel evaluations and developing a clearer understanding of their own processes. Businesses, though, will face the same decision even if they don’t intend to use Claude Managed Agents. Anthropic has signaled that other model and platform providers will likely shift their product roadmaps to a similar model that keeps everything locked in the same system — because models may become interchangeable, but the tooling and orchestration infrastructure will not.
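The lead-agent delegation pattern at the heart of Multi-Agent Orchestration and frameworks like LangGraph or CrewAI can be sketched as follows. The workers are stubbed functions standing in for model calls, and every name here is illustrative, not any vendor's actual API.

```python
def research_worker(task: str) -> str:
    # Stub for a research agent's model call.
    return f"findings for {task!r}"

def writing_worker(task: str, notes: str) -> str:
    # Stub for a writing agent that consumes the researcher's output.
    return f"draft of {task!r} based on {notes}"

class LeadAgent:
    """Toy lead agent: splits a job into subtasks and delegates each."""

    def __init__(self, workers: dict):
        self.workers = workers

    def run(self, job: str) -> str:
        # Delegate research first, then pass the result to the writer.
        notes = self.workers["research"](job)
        return self.workers["write"](job, notes)

lead = LeadAgent({"research": research_worker, "write": writing_worker})
result = lead.run("Q1 security report")
```

The platform question is where this routing logic lives: in code the enterprise owns, as above, or inside a provider's hosted runtime, where every delegation decision is visible only to the provider.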
- The AI governance mirage: Why 72% of enterprises don’t have the control and security they think they do

Decision makers at 72% of organizations claim to have two or more AI platforms that they identify as their "primary" layer, according to a survey of 40 enterprise companies conducted by VentureBeat last month, revealing real gaps in security and control. For enterprise management and technical leaders, and especially security leaders, these multiple AI platforms extend the attack surfaces of most enterprises at a time when AI-driven attacks have become increasingly potent. The multiple platforms — which include offerings from hyperscalers and AI labs like Microsoft Azure, Google, OpenAI and Anthropic, or big application companies like Epic, Workday and ServiceNow — reflect a state of sprawl that has emerged as these big software providers rush to offer their own AI to their enterprise customers. Those customers, in their own rush to scale AI, are finding they aren’t building a singular strategy — in fact, they may be building a collection of contradictions.

The strategic paradox: why leading enterprises are building around their vendors

For example, take the strategic paradox faced by the Mass General Brigham (MGB) hospital system, which has 90,000 employees and is the largest employer in Massachusetts. The hospital system last year had to shut down an uncontrolled number of internal proof-of-concept projects that had sprouted up as employees got carried away with AI, said CTO Nallan “Sri” Sriraman at the VentureBeat AI Impact event in Boston on March 26, which focused on the challenges of scaling AI. Instead, the company decided it was better to wait for the software giants it already uses to deliver on their AI roadmaps. Since these companies have so many resources, and were making AI a top priority themselves, it made no sense for MGB to try to build its own AI layer that would be duplicative, he said. "Why are we building it ourselves?" he asked.
"Leverage it." Yet, even then, Sriraman’s team has been forced to build workarounds, where those companies haven’t done enough. For example, MGB has just completed a “full-scaled” custom build around Microsoft’s Copilot — to get essentially everything offered by that tool — by putting a "skin" around Copilot to handle the safety and data privacy concerns the major model providers haven't yet mastered. Specifically, MGB needed a way for employees to prompt the AI and not have their protected health information (PHI) leaked back to the Copilot LLM provider, OpenAI. The new secure platform, which can support up to 30,000 users, is really the ultimate contradiction: Even though the company has a mandate to leverage the AI provided by the bigger companies, it needs to build around its failures. The contradiction goes even further. These software vendors used by MGB — which also include Epic, Workday and ServiceNow — are all now building agents for their AI, all operating differently. So MGB has to invest in building a “control plane that coordinates and orchestrates all of these agents,” Sriraman said. “That’s where our investment is going to be.” He noted that companies like his are “discovering and experimenting as the landscape keeps shifting." The marketplace is "still nascent," he said, which makes decisions difficult. The "six blind men" problem Sriraman explained the current vendor landscape with an analogy: "When you ask six blind men to touch an elephant and say, what does this elephant look like?" Sriraman said. "You're gonna get six different answers." What emerges from the research VentureBeat conducted in the first quarter, along with conversations like the one in Boston, is a situation that we at VentureBeat are calling a “governance mirage.” While many enterprises say they have adequate governance, in reality they haven’t created clear accountability or specific guardrails, evaluations or security processes to ensure that governance. 
The data disconnect: confidence vs. systematic oversight

The research comes from surveys across January, February and March by VentureBeat of enterprise companies with 100 or more employees, with 40 to 70 qualified respondents per topic area — covering agentic orchestration, AI security, RAG and governance. The data lacks statistical significance in many areas and should be treated as directional. The research on governance found that a majority, or 56%, of respondents said they are “very confident” that they’d detect a misbehaving AI model, suggesting that most decision-makers believe they have sufficient basic governance at their companies. However, nearly a third of respondents have no systematic mechanism to detect AI misbehavior until it surfaces through users or audits. In a world where telemetry leakage accounts for 34% of GenAI incidents (Wiz), and the global average breach cost has hit $4.4M (IBM 2025 Cost of a Data Breach), finding out after the damage is done is the default for too many companies. Moreover, 43% of respondents say a central team owns AI governance. That sounds reassuring — until you look at what’s happening everywhere else. Twenty-three percent say governance is unclear or actively contested between teams. Twenty percent say each platform team governs independently. Six percent say no one has formally addressed it. The rest said they were unsure who owned it. More telling is the barrier data. When asked about the single biggest obstacle to governing AI across platforms, “no single owner or accountable team” ranked second at 29% — just behind vendor opacity. Missing accountability structures and lack of vendor transparency are the two dominant failure modes, and they compound each other: Without a central owner, no one has the mandate to demand transparency from the vendors.
The day-two bill: managing sprawl, creep, and lock-in

The scaling trap: Red Hat’s warning

Brian Gracely, Senior Director at Red Hat, who also spoke at the VentureBeat Boston event last month, addressed the infrastructure side of this sprawl, warning that many enterprises are falling into a trap of deceptive initial wins. Gracely noted that the barrier to entry is almost nonexistent at the start, with nearly anyone able to spin up a project using a credit card and an API key. "Day zero is very, very easy," Gracely said. "Day two is when the bill comes due." Red Hat is positioning its software layer (OpenShift AI) as the necessary buffer to prevent enterprises from getting buried in a single provider's proprietary ecosystem. Gracely’s point is direct: If your control system is built entirely inside one cloud provider’s toolset, you are effectively "renting a cage." The illusion of speed in the early pilot phase often hides technical debt that becomes obvious the moment you try to move your AI work to a different platform. Gracely illustrated this with a recent example. A senior leader from Red Hat’s centralized CTO office spent part of her vacation contributing to an open-source agent project called OpenClaw, which became widely popular in the first quarter. Within days of her name appearing as a project maintainer, Red Hat was fielding calls from major New York banks. Their problem was immediate: They realized they already had upwards of 10,000 employees bringing "claws" — agent-based tools — into their infrastructure with zero centralized oversight. Breaches caused by employees working with these sorts of unapproved technologies are costly. These so-called “shadow AI” incidents cost on average $670K more than standard incidents, according to IBM.
Red Hat’s Gracely noted that while organizations can try to shut down these unapproved ports, they eventually have to figure out how to make them productive and secure — a task that requires a serious investment in an orchestration or platform layer.

The dynamic defensive: MassMutual’s refusal to bet

While some enterprise companies seek an "AI operating system" that oversees all of their AI technologies and apps, others are simply refusing to sign the check. Sears Merritt, CIO and head of enterprise technology at MassMutual, is managing the governance conundrum by intentionally staying in a state of high-velocity flexibility. "Things are so dynamic, it’s hard to know which of the AI vendors will end up on top," Merritt said at the Boston event. For that reason, MassMutual is refusing to enter any long-term contracts with AI vendors. Merritt’s strategy of “dynamic defensive” highlights a core finding of our research: Vendor popularity is changing radically month to month. Anthropic, for example, went from 0% in January to nearly 6% in February among respondents reporting which agent orchestration technology they were using. Again, the sample size was small, at 70 respondents. Still, even if directional, the dynamic landscape suggests picking a "primary" winner today is a fool’s errand. The January figure likely reflects survey composition: Respondents represent the broader enterprise market, not the developer community where Anthropic has seen its strongest early traction. Until recently, most organizations had signed up early with leaders like Microsoft and OpenAI as their main orchestration providers, due to their early lead with Copilot. Our finding that Anthropic is just now pushing into enterprise agent orchestration may be a confirmation of the recent excitement around that platform.
One possible explanation is that enterprises already using Claude for model inference are now routing through Anthropic's native tooling rather than third-party frameworks — though the sample is too small to draw firm conclusions.

The rise of “platform creep”

The leading providers are also shifting toward "managed agents," as reflected by Anthropic’s recent announcement. This offering suggests possible continued platform creep, whereby providers like OpenAI and Anthropic take over more and more of the AI infrastructure — most specifically, in this case, the memory of agentic session details. And there the trap is set. Once your session data and orchestration live inside a provider's proprietary database, you aren't just using a model; you are living in its ecosystem. Moreover, persistent agent memory is a prime target for memory poisoning via injected instructions that influence every future interaction. And when that memory lives in a provider's database, you lose your own forensic capability.

The security irony: the fox guarding the hen house

We are seeing this platform creep in our data as well. The most jarring finding in our Q1 data is what we call the "security irony": the fact that the providers most responsible for creating enterprise AI risk are the same ones enterprises are using to manage it. Respondents said the top selection criterion for AI orchestration platforms was “security and permissions generally” (37.1%), beating out other criteria like cost, flexibility, control and ease of development. Yet the market is choosing convenience over sovereignty. According to our survey, 26% of enterprises in February were using OpenAI as their primary security solution — the very same provider whose models create the risks they are trying to secure. That trend only seemed to strengthen in March, though, as stated before, we want to be careful: Our sample size is small, and this data should only be taken as directional.
It’s not clear whether enterprises are deliberately choosing OpenAI as a security solution, or simply relying on the built-in security features offered by Microsoft Azure (which partnered with OpenAI when it pushed its Copilot solution aggressively in 2024) because customers were already on that platform. Beyond the data, there are anecdotal signs that OpenAI's enterprise position may be shifting. Anthropic's Claude Code drew significant attention among developers early this year alongside the Claude 4.6 model. The subsequent announcement of Mythos, its security-focused model, prompted interest from enterprise security teams given its ability to identify vulnerabilities. OpenAI has also announced a security-focused model, GPT-5.4-Cyber. Our data may also point to a drop in OpenAI’s relative position in a few enterprise AI categories. One area was data retrieval, where OpenAI again leads among third-party providers, but we saw an increase in the number of respondents using in-house solutions for retrieval instead — perhaps a sign that AI models and agents are getting better at natively using tools to query companies’ existing databases directly, and that custom code is often how companies are building this in. However, here again we feel our data is at best directional for now. We are asking the fox to guard the hen house. Hyperscaler security features (like those from OpenAI, Azure, and Google) are winning because they are already integrated into the platforms enterprises are using. But this creates a single-provider dependency. As agents gain the power to modify documents, call APIs and access databases, the “governance mirage" suggests we have control, while the data shows we are simply clicking "I agree" on whatever the hyperscalers offer. The resulting risks include content injection, privilege escalation and data exfiltration.

The path forward: toward a unified control plane

The search for the "Dynatrace for AI"

So, what is the way out?
Sriraman argued that the industry desperately needs a "central observability platform" — a "Dynatrace for AI" — that provides full end-to-end visibility, including model drift and safety prompting, agent behavior analytics, privilege escalation alerts, and forensic logging. He is currently working with a number of potential providers to deliver on this.

The “swivel chair” warning

Sriraman warned that without a unified control plane, enterprises risk sliding back into a fragmented "swivel chair" world — reminiscent of the early, inefficient days of robotic process automation (RPA) — where employees are forced to constantly jump between different siloed AI tools to finish a single workflow. "We don’t want to create a world where you have to switch to do something here and then go back to the platform to do something else," he said. But that desire for a single control plane conflicts with the desire to avoid lock-in. Our data shows the market has settled on the “hybrid control plane.” In other words, the most popular arrangement among our respondents (at 34.3%) was to use model provider-native solutions like Copilot Studio or OpenAI assistants for some workflows, while also running external options like LangGraph or custom orchestration for others. Smaller numbers of companies reported being more dogmatic here, whether that meant deliberately removing the model provider from the orchestration layer entirely, relying only on custom orchestration tools, or relying only on the model provider’s technology. Enterprises trust no single provider enough to give them full control, yet they lack the engineering capacity to build entirely from scratch.

The bottom line: the “big red button”

Visibility and integration are only half the battle. In a high-stakes industry like healthcare, Sriraman argues that any legitimate control plane must also offer a hard-stop capability. "We need a big red button," he said. "Kill it.
We should be able to have that … without that, don't put anything in the operational setting." In fact, such a kill switch was formally called for by the security community group OWASP as part of a recommended security framework. The “governance mirage” is the belief that you can scale AI without deciding who owns the control and security plane. If you are one of the 72% of organizations claiming multiple "primary" platforms, be careful because you may not have a strategy; you may have a conflict of interest. It suggests that the winner of the war between the AI behemoths — OpenAI, Anthropic, Google, Microsoft, etc. — won’t necessarily be the one with the best model, but the one that manages to sit above the models and help enterprises enforce a single version of the truth. That may be difficult to achieve, though, given that companies won’t want lock-in with a single player. The data suggests enterprises are already resisting that outcome — and may need to formalize that resistance. Enterprises arguably need to own their control plane with independent security instrumentation, not wait for a vendor to win that role for them.
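The "big red button" idea can be sketched as a control-plane gate that every agent action must pass through and that can be tripped once to hard-stop everything. This is a minimal illustration of the concept only; OWASP's recommendation concerns the capability, not this particular design, and all names below are hypothetical.

```python
import threading

class KillSwitch:
    """Toy hard-stop gate for agent actions."""

    def __init__(self):
        self._tripped = threading.Event()

    def trip(self) -> None:
        # The big red button: once set, it stays set for the run.
        self._tripped.set()

    def guard(self, action, *args):
        # Every agent action is routed through this gate; once the
        # switch is tripped, all further actions are refused.
        if self._tripped.is_set():
            raise RuntimeError("kill switch tripped: agent halted")
        return action(*args)

switch = KillSwitch()
log = []
switch.guard(log.append, "update record 17")      # allowed
switch.trip()
try:
    switch.guard(log.append, "update record 18")  # refused
except RuntimeError:
    log.append("blocked")
```

The design point is that the gate sits outside the agent and outside the model provider: if the only kill switch lives inside a vendor's runtime, the enterprise is trusting the vendor to press it.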
- Google and AWS split the AI agent stack between control and execution

The era of enterprises stitching together prompt chains and shadow agents is nearing its end as more options for orchestrating complex multi-agent systems emerge. As organizations move AI agents into production, the question remains: "How will we manage them?" Google and Amazon Web Services offer fundamentally different answers, illustrating a split in the AI stack. Google’s approach is to run agentic management at the system layer, while AWS’s harness method operates in the execution layer. The debate over how to manage and control agents gained new energy this past month as competing companies released or updated their agent builder platforms — Anthropic with the new Claude Managed Agents and OpenAI with enhancements to the Agents SDK — giving developer teams options for managing agents. AWS, with new capabilities added to Bedrock AgentCore, is optimizing for velocity — relying on harnesses to bring agents to production faster — while still offering identity and tool management. Meanwhile, Google’s Gemini Enterprise adopts a governance-focused approach using a Kubernetes-style control plane. Each method offers a glimpse into how agents move from short-burst task helpers to longer-running entities within a workflow.

Upgrades and umbrellas

To understand where each company stands, here’s what’s actually new. Google released a new version of Gemini Enterprise, bringing its enterprise AI agent offerings — Gemini Enterprise Platform and Gemini Enterprise Application — under one umbrella. The company has rebranded Vertex AI as Gemini Enterprise Platform, though it insists that, aside from the name change and new features, it’s still fundamentally the same interface. “We want to provide a platform and a front door for companies to have access to all the AI systems and tools that Google provides,” Maryam Gholami, senior director of product management for Gemini Enterprise, told VentureBeat in an interview.
“The way you can think about it is that the Gemini Enterprise Application is built on top of the Gemini Enterprise Agent Platform, and the security and governance tools are all provided for free as part of the Gemini Enterprise Application subscription.” On the other hand, AWS added a new managed agent harness to Bedrock AgentCore. The company said in a press release shared with VentureBeat that the harness “replaces upfront build with a config-based starting point powered by Strands Agents, AWS’s open source agent framework.” Users define what the agent does, the model it uses and the tools it calls, and AgentCore does the work of stitching all of that together to run the agent.

Agents are now becoming systems

The shift toward stateful, long-running autonomous agents has forced a rethink of how AI systems behave. As agents move from short-lived tasks to long-running workflows, a new class of failure is emerging: state drift. As agents continue operating, they accumulate state — memory, tool responses and evolving context. Over time, that state becomes outdated: Data sources change, or tools return conflicting responses, and the agent becomes more vulnerable to inconsistencies and less truthful. Agent reliability becomes a systems problem, and managing that drift may require more than faster execution; it may require visibility and control. It’s this failure point that platforms like Gemini Enterprise and AgentCore try to prevent. Though this shift is already happening, Gholami acknowledged that customers will dictate how they want to run and control any long-running agent. “We are going to learn a lot from customers where they would be using long-running agents, where they just assign a task to these autonomous agents to just go ahead and do,” Gholami said.
“Of course, there are tricks and balances to get right, and the agent may come back and ask for more input.”

The new AI stack

What’s becoming increasingly clear is that the AI stack is separating into distinct layers solving different problems. AWS and, to a certain extent, Anthropic and OpenAI, optimize for faster deployment. Claude Managed Agents abstracts much of the backend work for standing up an agent, while the Agents SDK now includes support for sandboxes and a ready-made harness. These approaches aim to lower the barrier to getting agents up and running. Google offers a centralized control plane to manage identity, enforce policies and monitor long-running behaviors. Enterprises likely need both. As some practitioners see it, their businesses have to have a serious conversation about how much risk they are willing to take. “The main takeaway for enterprise technology leaders considering these technologies at the moment may be formulated this way: while the agent harness vs. runtime question is often perceived as build vs. buy, this is primarily a matter of risk management. If you can afford to run your agents through a third-party runtime because they do not affect your revenue streams, that is okay. On the contrary, in the context of more critical processes, the latter option will be the only one to consider from a business perspective,” Rafael Sarim Oezdemir, head of growth at EZContacts, told VentureBeat in an email. Iterating quickly lets teams experiment and discover what agents can do, while centralized control adds a layer of trust. What enterprises need is to ensure they are not locked into systems designed purely for a single way of executing agents.
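The config-first harness idea AWS describes — the team declares what the agent does, which model it uses and which tools it can call, and the harness wires the pieces together — can be sketched as below. The field names and runner are entirely hypothetical and are not the real AgentCore or Strands Agents API.

```python
def lookup_order(order_id: str) -> str:
    # Example tool the agent is allowed to call.
    return f"order {order_id}: shipped"

# Declarative agent definition: purpose, model, and permitted tools.
AGENT_CONFIG = {
    "purpose": "answer order-status questions",
    "model": "example-model-v1",            # placeholder model id
    "tools": {"lookup_order": lookup_order},
}

def run_agent(config: dict, tool_name: str, tool_arg: str) -> str:
    # A real harness would let the model choose the tool from the
    # config; here the routing is hard-coded to keep the sketch
    # self-contained and deterministic.
    tool = config["tools"][tool_name]
    return f"[{config['model']}] {tool(tool_arg)}"

reply = run_agent(AGENT_CONFIG, "lookup_order", "4521")
```

The trade-off the article describes is visible even in this toy: the config is portable and auditable, but everything between the config and the running agent belongs to whoever operates the harness.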