Google's Managed Agents API promises one-call deployment at the cost of execution layer control
Our take

At Google I/O, the unveiling of Managed Agents in the Gemini API marks a pivotal moment in the evolution of AI-driven applications. This service promises to simplify the intricacies of agent deployment into a single API call, potentially saving weeks of groundwork that traditionally bogs down development teams. Such progress is particularly notable in a time when companies like AWS are enhancing their offerings with platforms like Bedrock AgentCore. As seen with OpenAI barrels toward IPO that may happen in September and Airbnb gets into hotels, expands AI for host onboarding and customer support, the competitive landscape is increasingly focused on delivering streamlined, integrated solutions that enhance productivity and user experience.
The introduction of Managed Agents suggests that Google is confident in its ecosystem's capacity to manage the full execution layer, integrating aspects of infrastructure that were previously considered separate. This shift reflects a broader trend toward vertical integration in technology platforms, where companies aim to control more of the user experience by merging various operational layers. While competitors like Anthropic are positioning their models as orchestration layers, Google's approach seeks to absorb these functions into the model itself, allowing developers to concentrate on refining agent behavior rather than getting bogged down in the technicalities of setup and infrastructure.
However, this all-in-one model brings with it a set of architectural questions. Should agent management remain at the execution layer, or is it more prudent to maintain some separation to ensure flexibility and control? The consolidation of responsibilities could lead to a more seamless product experience, yet it also risks creating dependency on a single platform. Concerns have been raised by industry leaders, such as Arie Trouw, regarding the potential downsides of shifting from deterministic to probabilistic services. This change could lead to unpredictable outcomes, raising legitimate questions about the reliability of automated systems as they become more complex.
As the industry moves toward adopting more integrated solutions like Google’s Managed Agents, it’s crucial to consider the implications for developers and businesses. While the promise of reduced complexity and increased speed to market is appealing, organizations must weigh these benefits against the risks of over-reliance on a single provider. The transformations we are witnessing in AI and machine learning infrastructure are not merely technical evolutions; they are reshaping how teams conceptualize and implement solutions.
Looking forward, the challenge will be maintaining a balance between innovation and control. As enterprises begin to adopt these new tools, they must ask themselves how much autonomy they are willing to relinquish in exchange for ease of use. The direction Google is taking with Managed Agents could very well redefine the standards for agent deployment, but it will also necessitate a thoughtful examination of the trade-offs involved. Will businesses embrace this shift toward a more integrated model, or will they seek to retain a degree of independence in their operational strategies? As this landscape continues to evolve, the answers to these questions will be critical for shaping the future of AI in enterprise solutions.
At Google I/O, the company unveiled Managed Agents in its Gemini API — a service that promises to collapse weeks of agent deployment work into a single API call. It's also a sign that Google believes its ecosystem, including the newly launched Antigravity CLI, is ready to own the execution layer end-to-end.
Before a single agent is written, teams are already spending days on the unglamorous work: standing up execution environments, managing sandboxes, wiring tool call infrastructure. Model providers like Anthropic have launched platforms to handle much of that work — but Google's approach is different.
Google said in a blog post that Managed Agents in the Gemini API abstracts “away the complexity so that you can focus on your product experience and agent behavior.” The service is available in preview via new custom templates in Google AI Studio.
The growth has introduced a real architectural question: should agent management live at the execution layer — embedded in the model or its harness — or at the infrastructure layer, as a separate runtime?
Comparing Google’s approach
Until recently, agent orchestration relied on frameworks that sat above the model, directing agents and letting teams control routing and execution separately. That layer is now being absorbed by the platforms themselves.
Recent platforms like Claude Managed Agents embed orchestration at the model layer rather than on a separate runtime platform. The idea is that the model owns the reasoning and orchestration layers, and enterprises have control over execution.
AWS, through new capabilities on Bedrock AgentCore, adds managed harnesses that stitch together the upfront tasks for deploying agents. Google's approach goes further, optimizing the model, harness, and sandbox together and running everything in secure Google-managed environments.
René Sultan of Ramp, cited in Google's announcement, said the shift is concrete: "The real shift with Gemini Managed Agents is that the agent runtime moves into the platform. With the sandbox, infrastructure and execution loop managed for you, developers can focus on productizing the agent's domain-specific behavior and iterating at a completely different pace."
The new orchestration reality
Enterprises starting fresh with agents could find the platform offerings from Anthropic and Google strong, especially since they remove much of the difficulty of deploying agents while still maintaining some control. Google, however, is pushing for a more vertically integrated system, while Anthropic is betting on the model layer as an orchestration plane, and AWS focuses on authorization.
But this also brings some risks, according to XYO founder and chief executive Arie Trouw.
“An additional risk is that developers will switch out what previously were deterministic services for what will now be probabilistic services, which can introduce unpredictable outcomes for the users at best, or data corruption at worst,” Trouw told VentureBeat in an email. “This is the classic example of having an amazing hammer and everything starting to look like nails. I've seen this pattern repeatedly as a developer and business founder myself in the past few decades.”
Read on the original site
Open the publisher's page for the full experience
Related Articles
- Google and AWS split the AI agent stack between control and executionThe era of enterprises stitching together prompt chains and shadow agents is nearing its end as more options for orchestrating complex multi-agent systems emerge. As organizations move AI agents into production, the question remains: "how will we manage them?" Google and Amazon Web Services offer fundamentally different answers, illustrating a split in the AI stack. Google’s approach is to run agentic management on the system layer, while AWS’s harness method sets up in the execution layer. The debate on how to manage and control gained new energy this past month as competing companies released or updated their agent builder platforms—Anthropic with the new Claude Managed Agents and OpenAI with enhancements to the Agents SDK—giving developer teams options for managing agents. AWS with new capabilities added to Bedrock AgentCore is optimizing for velocity—relying on harnesses to bring agents to product faster—while still offering identity and tool management. Meanwhile, Google’s Gemini Enterprise adopts a governance-focused approach using a Kubernetes-style control plane. Each method offers a glimpse into how agents move from short-burst task helpers to longer-running entities within a workflow. Upgrades and umbrellas To understand where each company stands, here’s what’s actually new. Google released a new version of Gemini Enterprise, bringing its enterprise AI agent offerings—Gemini Enterprise Platform and Gemini Enterprise Application—under one umbrella. The company has rebranded Vertex AI as Gemini Enterprise Platform, though it insists that, aside from the name change and new features, it’s still fundamentally the same interface. “We want to provide a platform and a front door for companies to have access to all the AI systems and tools that Google provides,” Maryam Gholami, senior director, product management for Gemini Enterprise, told VentureBeat in an interview. “The way you can think about it is that the Gemini Enterprise Application is built on top of the Gemini Enterprise Agent Platform, and the security and governance tools are all provided for free as part of Gemini Enterprise Application subscription.” On the other hand, AWS added a new managed agent harness to Bedrock Agentcore. The company said in a press release shared with VentureBeat that the harness “replaces upfront build with a config-based starting point powered by Strands Agents, AWS’s open source agent framework.” Users define what the agent does, the model it uses and the tools it calls, and AgentCore does the work to stitch all of that together to run the agent. Agents are now becoming systems The shift toward stateful, long-running autonomous agents has forced a rethink of how AI systems behave. As agents move from short-lived tasks to long-running workflows, a new class of failure is emerging: state drift. As agents continue operating, they accumulate state—memory, too, responses and evolving context. Over time, that state becomes outdated. Data sources change, or tools can return conflicting responses. But the agent becomes more vulnerable to inconsistencies and becomes less truthful. Agent reliability becomes a systems problem, and managing that drift may need more than faster execution; it may require visibility and control. It’s this failure point that platforms like Gemini Enterprise and AgentCore try to prevent. Though this shift is already happening, Gholami admitted that customers will dictate how they want to run and control any long-running agent. “We are going to learn a lot from customers where they would be using long-running agents, where they just assign a task to these autonomous agents to just go ahead and do,” Gholami said. “Of course, there are tricks and balances to get right and the agent may come back and ask for more input.” The new AI stack What’s becoming increasingly clear is that the AI stack is separating into distinct layers, solving different problems. AWS and, to a certain extent, Anthropic and OpenAI, optimize for faster deployment. Claude Managed Agents abstracts much of the backend work for standing up an agent, while the Agents SDK now includes support for sandboxes and a ready-made harness. These approaches aim to lower the barrier to getting agents up and running. Google offers a centralized control panel to manage identity, enforce policies and monitor long-running behaviors. Enterprises likely need both. As some practitioners see it, their businesses have to have a serious conversation on how much risk they are willing to take. “The main takeaway for enterprise technology leaders considering these technologies at the moment may be formulated this way: while the agent harness vs. runtime question is often perceived as build vs. buy, this is primarily a matter of risk management. If you can afford to run your agents through a third-party runtime because they do not affect your revenue streams, that is okay. On the contrary, in the context of more critical processes, the latter option will be the only one to consider from a business perspective,” Rafael Sarim Oezdemir, head of growth at EZContacts, told VentureBeat in an email. Iterating quickly lets teams experiment and discover what agents can do, while centralized control adds a layer of trust. What enterprises need is to ensure they are not locked into systems designed purely for a single way of executing agents.
- Anthropic’s Claude Managed Agents gives enterprises a new one-stop shop but raises vendor 'lock-in' riskAnthropic announced a new platform last week, Claude Managed Agents, aiming to cut out the more complex parts of AI agent deployment for enterprises and competes with existing orchestration frameworks. Claude Managed Agents is also an architectural shift: enterprises, already burdened with orchestrating an increasing number of agents, can now choose to embed the orchestration logic in the AI model layer. While this comes with some potential advantages, such as speed (Anthropic proposes its customers can deploy agents in days instead of weeks or months), it also, of course, then also turns more control over the enterprise's AI agent deployments and operations to the model provider — in this case, Anthropic — potentially resulting in greater "lock in" for the enterprise customer, leaving them more subject to Anthropic's terms, conditions, and any subsequent platform changes. But maybe that is worth it for your enterprise, as Anthropic further claims that its platform “handles the complexity” by letting users define agent tasks, tools and guardrails with a built-in orchestration harness, all without the need for sandboxing code execution, checkpointing, credential management, scoped permissions and end-to-end tracing. The framework manages state, execution graphs and routing and brings managed agents to a vendor-controlled runtime loop. Even before the release of Claude Managed Agents, new directional VentureBeat research showed that Anthropic was gaining traction at the orchestration level as enterprises adopted its native tooling. Claude Managed Agents represents a new attempt by the firm to widen its footprint as the orchestration method of choice for organizations. Anthropic is surging in orchestration interest Orchestration has emerged as an important segment for enterprises to address as they scale AI systems and deploy agentic workflows. VentureBeat directional research of several dozen firms for the first quarter of 2026 found that enterprises mostly chose existing frameworks, such as Microsoft’s Copilot Studio/Azure AI Studio, with 38.6% of respondents in February reporting using Microsoft’s platform. VentureBeat surveyed 56 organizations with more than 100 employees in January and 70 in February. OpenAI closely followed at 25.7%. Both showed strong growth between the first two months of the year. Anthropic, driven by increased interest in its offerings, such as Claude Code, over the past year, is putting up a fight. Adoption of the Anthropic tool-use and workflows API increased from 0% to 5.7% between January and February. This tracks closely with the growing adoption of Anthropic’s foundation models, showing that enterprises using Claude turn to the company’s native orchestration tooling instead of adding a third-party framework. While VentureBeat surveyed before the launch of Claude Managed Agents, we can extrapolate that the new tool will build on that growth, especially if it promises a more straightforward way to deploy agents. Collapsing the external orchestration layer Enterprises may find that a streamlined, internal harness for agents compelling, but it does mean giving up certain controls. Session data is stored in a database managed by Anthropic, increasing the risk that enterprises become locked into a system run by a single company. This may be less desirable for some firms and compete with their desires to move away from the locked-in software-as-a-service (SaaS) applications in the current stacks, which many hope that AI will facilitate. The specter of vendor lock-in means agent execution becomes more model-driven rather than direct by the organization, happens in an environment enterprises don’t fully control, and behavior becomes harder to guarantee. It also opens the possibility of giving agents conflicting instructions, especially if the only way for users to exert any control over agents is to prompt them with more context. Agents could have two control planes: one defined by the enterprises’ orchestration system through instructions and the other as an embedded skill from the Claude runtime. This could pose an issue for highly sensitive and regulated workflows, such as financial analysis or customer-facing tasks. Pricing, control and competitive set Balancing control with ease is one thing; enterprises also consider the cost structure of Claude Managed Agents. Claude Managed Agents introduces a hybrid pricing model that blends token-based billing with a usage-based runtime fee. This makes Managed Agets more dynamic, though less predictable, when determining cost structures. Enterprises will be charged a standard rate of $0.08 per hour when agents are actively running. For example, at $0.70 per hour, a one-hour session could cost up to $37 to process 10,000 support tickets, depending on how long each agent runs and how many steps it takes to complete a task. Microsoft, currently the leader according to VentureBeat's directional survey, offers several orchestration offerings. Copilot Studio uses a capacity-based billing structure, so enterprises pay for blocks of interactions between users and agents rather than the number of steps an agent takes. Microsoft's approach tends to be more predictable than Anthropic's pricing plan: Copilot Studio starts at $200 per month for 25,000 messages. Compared to similar competitors like OpenAI's Agents SDK, the picture becomes murky. Agents SDK is technically free to use as an open-source project. However, OpenAI bills for the underlying API usage. Agents built and orchestration with Agents SDK using GPT-5.4, for example, will cost $2.50 per 1 million input tokens and $15 per 1 million output tokens. The enterprise decision Claude Managed Agents does give enterprises who find the actual deployment of production agents too complicated a reprieve. It reduces their engineering overhead while adding speed and simplicity in a fast-changing enterprise environment. But that comes with a choice: lose control, observability and portability and risk further vendor lock-in. Anthropic just made a case for why its ecosystem is becoming not just the foundation model of choice for enterprises, but also the orchestration infrastructure. It becomes more imperative for enterprises to balance ease with lesser control.
- Anthropic wants to own your agent's memory, evals, and orchestration — and that should make enterprises nervousJust a few weeks after announcing Claude Managed Agents, Anthropic has updated the platform with three new capabilities that collapse infrastructure layers like memory, evaluation, and multi-agent orchestration, into a single runtime. This move could threaten the standalone tools that many enterprises cobble together. The new capabilities — 'Dreaming,' 'Outcomes,' and 'Multi-Agent Orchestration' — aim to make agents inside Claude Managed Agents “more capable at handling complex tasks with minimal steering,” Anthropic said in a press release. Dreaming deals with memory, where agents “reflect” on their many sessions and curate memories so they learns and surface unknown patterns. Outcomes allows teams to define and set specific rubrics to measure an agent's success, while Multi-Agent Orchestration breaks jobs down so a lead agent can delegate to other agents. Claude Managed Agents ideally provides enterprises with a simpler path to deploy agents and embeds orchestration logic in the model layer. It’s an end-to-end platform to manage state, execution graphs, and routing. With the addition of Dreaming, Outcomes and Multi-agent Orchestration, Claude Managed Agents expands capabilities even further and directly competes with tools like LangGraph or CrewAI, as well as external evaluation frameworks, RAG memory architectures, and QA loops. An integration threat Enterprises must now ask: Should we ditch our flexible, modular system in favor of an agent platform that brings almost everything in-house? Anthropic designed Claude Managed Agents to share context, state, and traceability in one place. This means the platform sees every decision agents make, rather than enterprises having to wire separate systems together. It sounds practical to have one platform that does everything. But not all enterprises want a full-service system. Claude Managed Agents already faces criticism that it encourages vendor lock-in because it owns most of the architecture and tools that govern agents. In the current paradigm, an organization may run Managed Agents but keep multi-agent orchestration, memory, or evaluations in a separate space ensures flexibility. The platform offers a fully-hosted runtime, which means memory and orchestration run on infrastructure the enterprise does not own. This can become a compliance nightmare for some organizations that have to prove data residency. Another problem to consider is that enterprises already in the middle of large-scale AI transformations must cobble together workarounds to deal with the constraints of their tech stack. Not every workflow is easily replaceable by switching to Claude Managed Agents. Dreaming and outcomes against current tools Most enterprises have a fragmented approach to AI deployment. For example, they may use LangGraph or Crew AI for agent routing and workflow management, Pinecone as a vector database for long-term memory, DeepEval for external evaluation, and a human-in-the-loop quality assurance to review some tasks. Anthropic hopes to do away with all of that. With Dreaming, Anthropic approaches memory by allowing users to actively rewrite it between sessions, so the agent essentially learns from its mistakes. Anthropic says this capability is useful for long-running states and orchestration. Current systems often handle memory persistence by storing embeddings, retrieving relevant context, and adding more state over time. Outcomes addresses the evaluation portion by detailing expectations for agents. Instead of external quality checks, which are often done by a team of humans, Anthropic is bringing evaluation into the orchestration layer rather than above it. But it’s the Multi-Agent Orchestration capability that pits Claude Managed Agents against orchestration frameworks from Microsoft, LangChain, CrewAI, and others. Model providers like Anthropic and OpenAI have already begun pushing aggressively into this space, arguing that bringing this to the model layer gives teams better control. Big decisions to make Enterprises face a big decision, and this one could depend on where they are in agent maturity. If an organization is still experimenting with agents and has not deployed many in production, they may find moving to Claude Managed Agents and configuring Dreaming and Outcomes to their needs much easier. This is the stage of development where, even if enterprises are using a third-party orchestrator like LangChain, they’re still customizing it. But for those who are already further along in the process, the calculation becomes trickier. It’s now a matter of parallel evaluation and better understanding of their processes. Businesses, though, will face the same decision even if they don’t intend to use Claude Managed Agents. Anthropic has signaled that other model and platform providers will likely shift their product roadmaps to a similar model that keeps everything locked in the same system — because models may become interchangeable, but the tooling and orchestration infrastructure will not.
- Claude agents can finally connect to enterprise APIs without leaking credentialsThe reason enterprises have been slow to connect AI agents to internal APIs and databases isn't the models — it's the credentials. In most production deployments, the agent carries authentication tokens with it as it executes tool calls, which means a compromised or misbehaving agent takes the keys with it. Anthropic is addressing that problem with two new capabilities for Claude Managed Agents: self-hosted sandboxes, which let teams run tool execution inside their own infrastructure perimeter, and MCP tunnels, which connect agents to private MCP servers without exposing credentials in the agent's context. Together they move credential control to the network boundary rather than leaving it inside the agent. Right now, self-hosted sandboxes are available to Claude Managed Agent users in public beta, while MCP tunnels are currently in research preview. Anthropic isn't the only model provider making this bet. OpenAI added local execution to its Agents SDK in April in response to similar demand. The architectural distinction Anthropic draws is a split: the agent loop runs on Anthropic's infrastructure, while tool execution runs on the enterprise's own system — a separation that existing sandbox approaches, including OpenAI's, don't make. The architecture problem in sandboxes and agents MCP moved to enterprise production faster than the security architecture around it matured. In most deployments, credentials travel through the agent itself as it executes tool calls against internal systems — meaning a compromised or misbehaving agent has everything it needs to cause damage. Self-hosted sandboxes, such as those offered on Claude Managed Agents, help keep files and packages within an enterprise's infrastructure. The agentic loop—orchestration, context management and error recovery—moves to the platform, and ideally, enterprises control compute resources. This allows the agent to complete tool calls without holding the keys that unlock it. Private network connectivity works similarly — a lightweight outbound-only gateway inside the organization's network, with no credentials passing through the agent. Orchestration teams get some control For orchestration teams, the capabilities represent more than just a security update; they help agents run better. But the first thing they need to understand is how this split architecture can affect their deployment. Since sandboxes determine tool execution locations and the resources agents access, and MCP tunnels tell agents how to reach internal systems, these are separate concerns—splitting them up enables enterprises to map agents' workflows more effectively. For teams already on Claude Managed Agents, the practical starting point is sandboxes — move tool execution onto your own infrastructure and test the boundary before touching MCP tunnels, which are still in research preview. Teams evaluating the platform for the first time should treat the sandbox architecture as the primary technical differentiator: it's the piece that changes the threat model, not just the deployment model.