Scaling AI into production is forcing a rethink of enterprise infrastructure
Our take
As enterprises shift from AI experimentation to large-scale deployment, the need for a robust infrastructure becomes paramount. In a conversation with VentureBeat, Nutanix leaders Tarkan Maner and Thomas Cornely explore the challenges of transitioning AI from pilot projects to real-world applications across diverse industries. They emphasize the importance of balancing human decision-making with AI-driven automation and the operational complexities introduced by agentic AI.
Scaling AI from sandbox to production is no longer a technology curiosity; it's a business imperative. As Nutanix’s Tarkan Maner and Thomas Cornely explain, enterprises are moving past isolated pilots and demanding an infrastructure that can sustain thousands of agents, real‑time workloads, and the governance required by regulated sectors. Readers who have watched the hype around “agentic AI” will recognize the same pattern that turned early cloud adoption into today’s multi‑cloud reality. The conversation in the article dovetails with insights from “Cheaper tokens, bigger bills: The new math of AI infrastructure” and the recent launch of Nvidia’s enterprise AI agent platform, underscoring that cost, control, and composability are now the three pillars of any scalable AI strategy.
What makes this shift especially consequential is the emergence of autonomous agents that orchestrate multi‑step workflows across disparate data sources. Unlike traditional batch models, these agents need continuous access to compute, storage, and networking resources while respecting security boundaries. The “AI factory” concept that Nutanix promotes is essentially a shared, policy‑driven platform that lets developers self‑service AI workloads, yet gives infrastructure teams the tools to impose constraints and audit usage. This duality resolves the classic tension between speed‑focused AI engineers and risk‑averse IT operations, a gap that many organizations still feel acutely. By abstracting the underlying hypervisor and Kubernetes layers, the Nutanix solution promises to keep the developer experience simple—think “drag‑and‑drop” agent creation—while guaranteeing that data never leaves the premises when compliance demands it.
From a practical standpoint, the article highlights why hybrid environments are now a requirement rather than a compromise. Enterprises in banking, healthcare, and government cannot simply lift and shift workloads to a public cloud; data sovereignty, IP protection, and latency concerns dictate a nuanced placement strategy. Nutanix’s ability to span AWS, Azure, Google Cloud, and emerging neoclouds means that organizations can route each agent to the optimal execution zone without re‑architecting the application. This flexibility also opens the door for incremental migration: start experiments in the public cloud, then transition mature agents to on‑premise clusters where governance and cost control are tighter. The result is a smoother path from proof‑of‑concept to enterprise‑wide adoption, reducing the risk of costly re‑engineering projects later.
The broader implication for readers is clear: scaling AI is as much about rethinking operational models as it is about choosing the right algorithms. Companies that invest in a unified AI factory now will avoid the fragmentation that plagues many AI initiatives today—multiple silos, duplicated tooling, and inconsistent security postures. As the market matures, the differentiator will be the ability to deliver AI‑enhanced experiences—such as real‑time document search, predictive threat detection, or cashier‑less retail—without sacrificing governance or escalating spend. The next wave of innovation will likely focus on orchestration standards that let agents from different vendors cooperate safely, and on AI‑aware service‑level agreements that make performance guarantees measurable.
Looking ahead, the real test will be how quickly enterprises can align their cultural processes with this technical shift. Will organizations adopt the shared‑responsibility mindset that the AI factory demands, or will they fall back into isolated, hard‑to‑manage silos? Watching how the balance between developer agility and infrastructure governance evolves will be the key indicator of whether AI moves from a promising pilot to a transformative, enterprise‑wide capability.

Presented by Nutanix
Across industries, organizations are focused on how to move from AI pilots, proofs of concept, and cloud-based experimentation to deploying AI at scale: across real workloads, for real users, in real business environments. VentureBeat spoke with Tarkan Maner, president and chief commercial officer at Nutanix, and Thomas Cornely, EVP of product management, about what that transition demands, and what it will take to get it right.
“AI in general is shifting everything we do, not only in technology, but across all vertical industries, from regulated industries like banking, health care, government, education to non-regulated industries like manufacturing and retail,” Maner said. “As a complete platform company, we welcome this change. It’s creating more opportunities for us as a company to serve our customers in better ways as we move forward.”
But there’s still a practical gap between experimentation and production, Cornely said.
“It’s one thing to do an experiment, to do a prototype. It’s a different thing to take that prototype and deploy it for 10,000 employees,” he explained. “We went from people focusing on training models to chatbots to now doing agents, where the demand and pressures on AI infrastructure are growing exponentially.”
Agentic AI introduces a new layer of enterprise complexity
The rise of agentic AI is what makes this transition especially consequential. These systems introduce multi-step workflows across applications and data sources, along with a degree of autonomy that creates new operational demands.
Enterprises now have to contend with multiple agents running simultaneously, unpredictable and real-time workloads, and the need to coordinate access to infrastructure across teams.
“OpenClaw is making it very easy now for anybody to build agents and run with agents,” Cornely said. “You want those agents to be running on premises with your data. You need to have the right constructs around it to protect the enterprise from what an agent could do.”
As these systems become more autonomous, the challenge extends beyond how they operate to how they interact with enterprise data, systems, and teams.
AI is augmenting human work, not replacing it
Agentic AI is fundamentally an amplifier of human capability rather than a substitute for it, Maner said. The goal for enterprises is not to eliminate human work but to find the right balance between human decision-making, AI-driven automation, and agent-based workflows.
“We believe that there’s going to be love, peace, and harmony between AI, agentic tools, and robotics systems, and human capital,” Maner said. “That harmony can be optimized for better outcomes for businesses, enterprises, governments, and public sector organizations, if the right vendors provide the right tooling and the right services.”
How enterprises are getting started with AI at scale
In practice, the move from experimentation into real-world deployment is where the challenges become most visible. Despite the momentum, many organizations are still working through how to scale AI beyond initial use cases.
As they do, they quickly run into practical constraints. Many start in the cloud because of easy access to resources and services, but considerations such as data, governance and control, and cost quickly come to the forefront.
The cloud can be used to experiment, with the ultimate goal of bringing applications back on premises as they move toward production, using platforms that solve for security and cost.
The use cases gaining the most traction include document search and knowledge retrieval, security and predictive threat detection, software development and coding workflows, and customer support and service operations. In the security realm, banking customers and others in Europe and the U.S. are deploying AI-driven tools including facial recognition and predictive threat detection. Meanwhile, there’s a growing focus on end-to-end, 360-degree customer engagement, from pre-sales through post-sales advocacy, in the customer support industry.
Industry-specific AI transformation is already underway
Across industries, the shift from experimentation to real deployment is already taking shape in distinct ways. In retail, AI is transforming store operations: cameras and robotics are used for targeted in-aisle marketing at the moment of purchase decision, cashier-less checkout is replacing traditional POS systems, and the human capital freed up is being redeployed to back-office and merchandising functions.
In healthcare, Nutanix works with customers on applications spanning diagnosis, treatment, remote health, and hospital operations, with cloud partners including AWS and Azure. In manufacturing and logistics, the transformation is equally significant.
The operational challenges of scaling enterprise AI
As AI use cases scale, enterprises are running into a new class of operational challenges. Managing multiple AI workloads and agents, coordinating infrastructure access across teams, ensuring security and governance, and integrating AI systems with existing business processes are now top-of-mind concerns for IT and business leaders alike.
The gap between AI developers pushing for speed and access, and infrastructure teams responsible for security, uptime, and governance, is one of the defining challenges of this moment.
“Now I’m running agents, and they’re all going to fight to get access to resources to solve my problems,” Cornely said. “What you want now is infrastructure that allows you to set constraints, govern resources.”
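As a rough illustration of what "set constraints, govern resources" could look like, the following Python sketch models per-team GPU quotas on a shared pool. The class name, team names, and quota values are all hypothetical, and this is not Nutanix's API or implementation; it only shows one simple admission-control pattern under those assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class GpuPool:
    """Hypothetical shared GPU pool with per-team quotas (illustrative only)."""
    total_gpus: int
    quotas: dict[str, int]                        # team -> max GPUs it may hold
    in_use: dict[str, int] = field(default_factory=dict)

    def free_gpus(self) -> int:
        """GPUs not currently held by any team."""
        return self.total_gpus - sum(self.in_use.values())

    def request(self, team: str, gpus: int) -> bool:
        """Grant the request only if it fits both the team quota and the pool."""
        if gpus <= 0:
            return False
        held = self.in_use.get(team, 0)
        within_quota = held + gpus <= self.quotas.get(team, 0)
        if within_quota and gpus <= self.free_gpus():
            self.in_use[team] = held + gpus
            return True
        return False

    def release(self, team: str, gpus: int) -> None:
        """Return GPUs to the pool, never dropping a team below zero."""
        self.in_use[team] = max(0, self.in_use.get(team, 0) - gpus)


# Assumed example: two agent teams competing for an 8-GPU pool.
pool = GpuPool(total_gpus=8, quotas={"support-agents": 4, "coding-agents": 6})
granted_first = pool.request("support-agents", 4)  # fits quota and pool
granted_over = pool.request("support-agents", 1)   # would exceed the team quota
granted_big = pool.request("coding-agents", 6)     # quota allows it, but only 4 GPUs are free
granted_fit = pool.request("coding-agents", 4)     # fits the remaining capacity
```

The point of the sketch is that the quota check and the capacity check are separate: a team can be within its own limit and still be refused because other agents already hold the hardware.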
The AI factory: a shared platform for production AI
These challenges are driving demand for what Maner and Cornely describe as the AI factory: a shared infrastructure environment that supports multiple users and workloads simultaneously, enabling both experimentation and production while balancing developer agility with enterprise governance.
At GTC 2026, Nutanix announced the Nutanix Agentic AI Solution, a complete platform spanning core infrastructure, Kubernetes-based container services running on a topology-aware hypervisor, and advanced services for building and governing agents.
“We’re launching a complete platform, from core infrastructure through PaaS and advanced PaaS services to the whole management framework for your AI factories,” Cornely said. “Really enabling self-service for the teams that will build these applications in the enterprise.”
Hybrid environments are essential to enterprise AI strategy
Operating this kind of environment requires flexibility across infrastructure. Hybrid infrastructure is not a compromise, but a requirement. Some workloads will always run in the public cloud, while others must remain on premises due to security requirements, regulatory compliance, data sovereignty, or competitive IP considerations.
“Especially in the regulated industries, as sovereignty becomes a bigger issue, data gravity becomes a bigger issue, security, and also a lot of competitive differentiation in the industry, it’s going to depend on what the company wants for their own IP,” Maner said.
This is the foundation of Nutanix’s platform position, he added.
“We are the perfect harmony, bringing those applications, that data, and all the optimization for these use cases end to end, from on-prem to off-prem and in a hybrid mode,” he said. “Doing it not only in one cloud, but for multiple clouds.”
That flexibility also extends to the broader ecosystem. Nutanix works across hyperscalers including AWS, Azure, and Google Cloud, as well as regional service providers and emerging neoclouds. Nutanix offers neoclouds a full software stack to run their own clouds and deliver advanced AI services, giving enterprise customers already running Nutanix a simple extension of compute, networking, and AI capabilities.
Maner described the arrangement as a win for both sides. For enterprises, it means simplified access to hybrid AI services. For neoclouds, it means a proven platform to build on. It’s all automated and secure by default, Cornely added.
“All of those governance problems that now come up with agentic AI are the same problems we’ve been solving for the last 16 years for every other application running in your cloud,” he said.
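The placement considerations described in this section (compliance and sovereignty, competitive IP, cost, and cloud-first experimentation) can be sketched as a small rule set. The Python below is an assumption-laden illustration: the `Workload` fields, the rule order, and the example workloads are hypothetical and do not represent how any Nutanix product decides placement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    """Hypothetical description of an AI workload (illustrative fields only)."""
    name: str
    regulated_data: bool          # subject to compliance or sovereignty rules
    proprietary_ip: bool          # touches competitively sensitive IP
    stage: str                    # "experiment" or "production"
    cost_sensitive: bool = True   # production cost control is a priority


def choose_placement(w: Workload) -> str:
    """Return "on-prem" or "public-cloud" using simple, assumed rules."""
    # Regulated data or sensitive IP keeps the workload on premises.
    if w.regulated_data or w.proprietary_ip:
        return "on-prem"
    # Unrestricted experiments start in the cloud for easy access to resources.
    if w.stage == "experiment":
        return "public-cloud"
    # Unrestricted production workloads move on premises only when cost matters.
    return "on-prem" if w.cost_sensitive else "public-cloud"


# Assumed example workloads, not taken from the article.
workloads = [
    Workload("claims-search", regulated_data=True, proprietary_ip=False, stage="production"),
    Workload("prototype-bot", regulated_data=False, proprietary_ip=False, stage="experiment"),
    Workload("marketing-copy", regulated_data=False, proprietary_ip=False,
             stage="production", cost_sensitive=False),
]
placements = {w.name: choose_placement(w) for w in workloads}
```

Under these assumed rules, some workloads stay in the public cloud permanently while regulated ones never leave the premises, which matches the hybrid picture described above.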
From pilot to production: operationalizing AI across the enterprise
Ultimately, the goal is not to run a successful AI pilot, but to operationalize AI across real-world use cases, manage infrastructure as a shared resource, support collaboration between infrastructure teams and AI developers, and scale from initial projects to enterprise-wide deployment.
“There’s a massive gap right now between people building AI applications, those AI engineers, those agentic AI developers, and your classical infra teams,” Cornely said. “They need tooling to enable the infra teams, so they can support your AI engineers. That’s what we deliver with our agentic AI solution.”
Sponsored articles are content produced by a company that is either paying for the post or has a business relationship with VentureBeat, and they’re always clearly marked. For more information, contact sales@venturebeat.com.
Related Articles
- Cheaper tokens, bigger bills: The new math of AI infrastructure
Presented by Nutanix
As enterprises move from AI experimentation into production deployment, the primary cost driver has shifted away from foundation model training and toward the infrastructure required to run thousands of concurrent inference workloads at scale, with agentic AI as the accelerant. Where early enterprise AI projects involved a handful of large, scheduled training jobs, production agentic environments require continuous support for short-lived, unpredictable requests that consume GPU, networking, and storage resources in ways traditional infrastructure was never designed to handle. For enterprise technology leaders, that shift is turning infrastructure efficiency into a make-or-break factor in AI economics. "Every employee with an AI assistant, every automated workflow, every agent pipeline needs models for inferencing and generates a lot of tokens," says Anindo Sengupta, VP of products at Nutanix. "Those inferencing requests land on a GPU infrastructure, traverse specialized networks, and pull data from storage systems purpose built to support these AI workloads."
Why cost per token is becoming a core infrastructure metric
Inference costs per token have dropped by roughly an order of magnitude over the past two years, driven by model efficiency improvements and competitive pressure among cloud providers. The expectation would be that enterprise AI is getting cheaper. Instead, total costs are rising, Sengupta says, pointing to what economists call the Jevons paradox: when a resource becomes cheaper to use, consumption tends to increase faster than the price drops. So while the cost per token has gone down by almost an order of magnitude in the last couple of years, consumption has risen more than 100X. The result is that cost per token and GPU utilization are becoming primary operational metrics for enterprise IT, sitting alongside traditional measures like uptime and throughput.
"Cost per token is really about the total cost of ownership for serving inference models," Sengupta says. "Utilization is about making sure that once you have GPU assets, you're getting maximum return from them. These metrics will be critical for enterprise IT leaders." What makes this difficult is the number of variables involved. Token costs shift depending on which models an organization runs, where workloads execute, and how prompts are structured. "There are too many variables in cost to manage intuitively," Sengupta adds. "Optimizing it is an engineering problem, and one that requires continuous tuning."
Agentic workloads expose the limits of traditional infrastructure
Production agentic AI introduces a workload profile that traditional enterprise infrastructure was not designed to handle. Classic data center deployments are built around predictable loads and long planning cycles. Agentic environments produce unpredictable, high-frequency bursts of short inference requests, place new demands on networking and storage, and change faster than most procurement cycles allow. The infrastructure supporting agentic AI is also structurally different from CPU-based computing. GPU topology, high-speed interconnects, parallel storage systems for agent memory and KV cache, and networking architectures capable of handling DPU offloading all represent new capabilities that require new operational skills. Siloed infrastructure compounds these challenges. When GPU resources, networking, and data access are managed independently, scheduling inefficiencies accumulate, utilization drops, and costs climb. Organizations running fragmented stacks tend to underutilize expensive GPU assets while simultaneously bottlenecking on storage and network throughput.
Integrated stacks and the case for full-stack architecture
The response emerging among infrastructure vendors is a move toward tightly integrated, validated full-stack platforms designed specifically for production AI workloads.
The premise is that end-to-end optimization across compute, networking, storage, and software layers produces better utilization and lower per-token costs than assembling best-of-breed components from separate vendors. Nutanix's Agentic AI solution represents one approach to this problem. Built on the Nutanix AHV hypervisor, Nutanix Enterprise AI and Nutanix Kubernetes Platform, the solution is designed to manage both the traditional compute layer where agent orchestration runs and the accelerated compute layer where inference executes. The company has introduced NVIDIA topology-aware enhancements to AHV that automatically optimize how GPUs, CPUs, memory, and DPUs are allocated to virtual machines, and has offloaded the Nutanix Flow Virtual Networking to BlueField DPUs, to free GPU cycles and sustain throughput without compromising security. The solution supports instant deployment of NVIDIA NIM microservices and open-source models including Nemotron, and integrates an AI gateway that governs access to frontier cloud LLMs from Anthropic, Google, OpenAI, and others. The gateway also implements model context protocol (MCP) to allow agents to connect to enterprise data with granular access controls. The solution runs on Cisco infrastructure, allowing organizations to deploy on infrastructure they already operate. "By integrating everything from the AHV hypervisor and Flow Virtual Networking up to the Kubernetes platform, you remove the silos that slow down AI projects," Sengupta explains.
Platform teams and developer agility cannot be traded off against each other
One organizational tension that scales with agentic AI adoption is the relationship between platform teams managing shared infrastructure and the developers building and running agent applications on top of it. These groups have historically operated with different tooling, different priorities, and different time horizons, but Sengupta argues that the core dynamic hasn't changed even as the technology has.
"Platform teams will continue to deliver a catalog of self-service AI capabilities that are also compliant to business needs, that they can serve to agentic AI builders," Sengupta says. "Mature AI teams will do a great job not just in GPU utilization, but in creating an operating model that enables fast AI infrastructure delivery to meet the pace of innovation that developers want. That's what is very critical to success." The organizations that are managing GPU utilization most effectively tend to be further along in their AI adoption journey, with more established operating models and clearer cost accountability. For organizations earlier in that journey, the infrastructure design and operating model decisions being made now will determine whether AI projects can move from pilot to production without cost or complexity becoming the limiting factor.
The AI factory operating model
The emerging framework for enterprise AI infrastructure is the AI factory, a purpose-built environment for producing and running AI workloads at scale. The challenge is that most organizations will need to operate both traditional compute and accelerated compute simultaneously for years, requiring a common operating model that spans both technology paradigms without sacrificing agility. With Nutanix, running on Cisco as part of the Cisco AI Pods, powered by Intel and optimized for the NVIDIA reference architecture, organizations get a production-ready, full-stack foundation by enabling AI factories to be securely and efficiently shared by thousands of agents, to achieve the lowest costs per token. The solution bridges the gap between the infrastructure and platform engineering teams who manage the hardware and the AI engineering and agentic AI developer teams who build and run agentic AI applications, making it truly affordable to run AI at a massive scale.
"The metrics that will determine whether an organization can sustain and scale its AI investment (cost per token, GPU utilization, scheduling efficiency) are infrastructure metrics," Sengupta says. "Managing them well is increasingly a precondition for making AI viable, not just functional."
- Nvidia launches enterprise AI agent platform with Adobe, Salesforce, SAP among 17 adopters at GTC 2026
Jensen Huang walked onto the GTC stage Monday wearing his trademark leather jacket and carrying, as it turned out, the blueprints for a new kind of industry dominance. The Nvidia CEO unveiled the Agent Toolkit, an open-source platform for building autonomous AI agents, and then rattled off the names of the companies that will use it: Adobe, Salesforce, SAP, ServiceNow, Siemens, CrowdStrike, Atlassian, Cadence, Synopsys, IQVIA, Palantir, Box, Cohesity, Dassault Systèmes, Red Hat, Cisco and Amdocs. Seventeen enterprise software companies, touching virtually every industry and every Fortune 500 corporation, all agreeing to build their next generation of AI products on a shared foundation that Nvidia designed, Nvidia optimizes and Nvidia maintains. The toolkit provides the models, the runtime, the security framework and the optimization libraries that AI agents need to operate autonomously inside organizations: resolving customer service tickets, designing semiconductors, managing clinical trials, orchestrating marketing campaigns. Each component is open source. Each is optimized for Nvidia hardware. The combination means that as AI agents proliferate across the corporate world, they will generate demand for Nvidia GPUs not because companies choose to buy them but because the software they depend on was engineered to require them. "The enterprise software industry will evolve into specialized agentic platforms," Huang told the crowd, "and the IT industry is on the brink of its next great expansion." What he left unsaid is that Nvidia has just positioned itself as the tollbooth at the entrance to that expansion: open to all, owned by one.
Inside Nvidia's Agent Toolkit: the software stack designed to power every corporate AI worker
To grasp the significance of Monday's announcements, it helps to understand the problem Nvidia is solving.
Building an enterprise AI agent today is an exercise in frustration. A company that wants to deploy an autonomous system — one that can, say, monitor a telecommunications network and proactively resolve customer issues before anyone calls to complain — must assemble a language model, a retrieval system, a security layer, an orchestration framework and a runtime environment, typically from different vendors whose products were never designed to work together. Nvidia's Agent Toolkit collapses that complexity into a unified platform. It includes Nemotron, a family of open models optimized for agentic reasoning; AI-Q, an open blueprint that lets agents perceive, reason and act on enterprise knowledge; OpenShell, an open-source runtime enforcing policy-based security, network and privacy guardrails; and cuOpt, an optimization skill library. Developers can use the toolkit to create specialized AI agents that act autonomously while using and building other software to complete tasks. The AI-Q component addresses a pain point that has dogged enterprise AI adoption: cost. Its hybrid architecture routes complex orchestration tasks to frontier models while delegating research tasks to Nemotron's open models, which Nvidia says can cut query costs by more than 50 percent while maintaining top-tier accuracy. Nvidia used the AI-Q Blueprint to build what it claims is the top-ranking AI agent on both the DeepResearch Bench and DeepResearch Bench II leaderboards — benchmarks that, if they hold under independent validation, position the toolkit as not merely convenient but competitively necessary. OpenShell tackles what has been the single biggest obstacle in every boardroom conversation about letting AI agents loose inside corporate systems: trust. The runtime creates isolated sandboxes that enforce strict policies around data access, network reach and privacy boundaries. 
Nvidia is collaborating with Cisco, CrowdStrike, Google, Microsoft Security and TrendAI to integrate OpenShell with their existing security tools, a calculated move that enlists the cybersecurity industry as a validation layer for Nvidia's approach rather than a competing one.
The partner list that reads like the Fortune 500: who signed on and what they're building
The breadth of Monday's enterprise adoption announcements reveals Nvidia's ambitions more clearly than any specification sheet could. Adobe, in a simultaneously announced strategic partnership, will adopt Agent Toolkit software as the foundation for running hybrid, long-running creativity, productivity and marketing agents. Shantanu Narayen, Adobe's chair and CEO, said the companies will bring together "our Firefly models, CUDA libraries into our applications, 3D digital twins for marketing, and Agent Toolkit and Nemotron to our agentic frameworks to deliver high-quality, controllable and enterprise-grade AI workflows of the future." The partnership extends deep: Adobe will explore OpenShell and Nemotron as foundations for personalized, secure agentic loops, and will evaluate the toolkit for large-scale workflows powered by Adobe Experience Platform. Nvidia will provide engineering expertise, early access to software and targeted go-to-market support. Salesforce's integration may be the one enterprise IT leaders parse most carefully. The company is working with Nvidia Agent Toolkit software including Nemotron models, enabling customers to build, customize and deploy AI agents using Agentforce for service, sales and marketing. The collaboration introduces a reference architecture where employees can use Slack as the primary conversational interface and orchestration layer for Agentforce agents (powered by Nvidia infrastructure) that participate directly in business workflows and pull from data stores in both on-premises and cloud environments.
For the millions of knowledge workers who already conduct their professional lives inside Slack, this turns a messaging app into the command center for corporate AI. SAP, whose software underpins the financial and operational plumbing of most Global 2000 companies, is using open Agent Toolkit software including NeMo for enabling AI agents through Joule Studio on SAP Business Technology Platform, enabling customers and partners to design agents tailored to their own business needs. ServiceNow's Autonomous Workforce of AI Specialists leverages Agent Toolkit software, the AI-Q Blueprint and a combination of closed and open models, including Nemotron and ServiceNow's own Apriel models, a hybrid approach that suggests the toolkit is designed not to replace existing AI investments but to become the connective tissue between them.
From chip design to clinical trials: how agentic AI is reshaping specialized industries
The partner list extends well beyond horizontal software platforms into deeply specialized verticals where autonomous agents could compress timelines measured in years. In semiconductor design, where a single advanced chip can cost billions of dollars and take half a decade to develop, three of the four major electronic design automation companies are building agents on Nvidia's stack. Cadence will leverage Agent Toolkit and Nemotron with its ChipStack AI SuperAgent for semiconductor design and verification. Siemens is launching its Fuse EDA AI Agent, which uses Nemotron to autonomously orchestrate workflows across its entire electronic design automation portfolio, from design conception through manufacturing sign-off. Synopsys is building a multi-agent framework powered by its AgentEngineer technology using Nemotron and NeMo Agent Toolkit. Healthcare and life sciences present perhaps the most consequential use case.
IQVIA is integrating Nemotron and other Agent Toolkit software with IQVIA.ai, a unified agentic AI platform designed to help life sciences organizations work more efficiently across clinical, commercial and real-world operations. The scale is already significant: IQVIA has deployed more than 150 agents across internal teams and client environments, including 19 of the top 20 pharmaceutical companies. The security sector is embedding itself into the architecture from the ground floor. CrowdStrike unveiled a Secure-by-Design AI Blueprint that embeds its Falcon platform protection directly into Nvidia AI agent architectures (including agents built on AI-Q and OpenShell) and is advancing agentic managed detection and response using Nemotron reasoning models. Cisco AI Defense will provide AI security protection for OpenShell, adding controls and guardrails to govern agent actions. These are not aftermarket bolt-ons; they are foundational integrations that signal the security industry views Nvidia's agent platform as the substrate it needs to protect. Dassault Systèmes is exploring Agent Toolkit software and Nemotron for its role-based AI agents, called Virtual Companions, on its 3DEXPERIENCE agentic platform. Atlassian is working with the toolkit as it evolves its Rovo AI agentic strategy for Jira and Confluence. Box is using it to enable enterprise agents to securely execute long-running business processes. Palantir is developing AI agents on Nemotron that run on its sovereign AI Operating System Reference Architecture.
The open-source gambit: why giving software away is Nvidia's most aggressive business move
There is something almost paradoxical about a company with a multi-trillion-dollar market capitalization giving away its most strategically important software. But Nvidia's open-source approach to Agent Toolkit is less an act of generosity than a carefully constructed competitive moat. OpenShell is open source. Nemotron models are open.
AI-Q blueprints are publicly available. LangChain, the agent engineering company whose open-source frameworks have been downloaded over 1 billion times, is working with Nvidia to integrate Agent Toolkit components into the LangChain deep agent library for developing advanced, accurate enterprise AI agents at scale. When the most popular independent framework for building AI agents absorbs your toolkit, you have transcended the category of vendor and entered the category of infrastructure. But openness in AI has a way of being strategically selective. The models are open, but they are optimized for Nvidia's CUDA libraries — the proprietary software layer that has locked developers into Nvidia GPUs for two decades. The runtime is open, but it integrates most deeply with Nvidia's security partners. The blueprints are open, but they perform best on Nvidia hardware. Developers can explore Agent Toolkit and OpenShell on build.nvidia.com today, running on inference providers and Nvidia Cloud Partners including Baseten, CoreWeave, DeepInfra, DigitalOcean and others — all of which run Nvidia GPUs. The strategy has a historical analog in Google's approach to Android: give away the operating system to ensure that the entire mobile ecosystem generates demand for your core services. Nvidia is giving away the agent operating system to ensure that the entire enterprise AI ecosystem generates demand for its core product — the GPU. Every Salesforce agent running Nemotron, every SAP workflow orchestrated through OpenShell, every Adobe creative pipeline accelerated by CUDA creates another strand of dependency on Nvidia silicon. This also explains the Nemotron Coalition announced Monday — a global collaboration of model builders including Mistral AI, Cursor, LangChain, Perplexity, Reflection AI, Sarvam and Thinking Machines Lab, all working to advance open frontier models. 
The coalition's first project will be a base model codeveloped by Mistral AI and Nvidia, trained on Nvidia DGX Cloud, that will underpin the upcoming Nemotron 4 family. By seeding the open model ecosystem with Nvidia-optimized foundations, the company ensures that even models it does not build will run best on its hardware.
What could go wrong: the risks enterprise buyers should weigh before going all-in
For all the ambition on display Monday, several realities temper the narrative. Adoption announcements are not deployment announcements. Many of the partner disclosures use carefully hedged language ("exploring," "evaluating," "working with") that is standard in embargoed press releases but should not be confused with production systems serving millions of users. Adobe's own forward-looking statements note that "due to the non-binding nature of the agreement, there are no assurances that Adobe will successfully negotiate and execute definitive documentation with Nvidia on favorable terms or at all." The gap between a GTC keynote demonstration and an enterprise-grade rollout remains substantial.
Nvidia is not the only company chasing this market. Microsoft, with its Copilot ecosystem and Azure AI infrastructure, pursues a parallel strategy with the advantage of owning the operating systems and productivity software that most enterprises already use. Google, through Gemini and its cloud platform, has its own agent vision. Amazon, via Bedrock and AWS, is building comparable primitives. The question is not whether enterprise AI agents will be built on some platform but whether the market will consolidate around one stack or fragment across several.
The security claims, while architecturally sound, remain unproven at scale. OpenShell's policy-based guardrails are a promising design pattern, but autonomous agents operating in complex enterprise environments will inevitably encounter edge cases that no policy framework has anticipated.
CrowdStrike's Secure-by-Design AI Blueprint and Cisco AI Defense's OpenShell integration are exactly the kind of layered defense enterprise buyers will demand, but both are newly unveiled, not battle-hardened through years of adversarial testing. Deploying agents that can autonomously access data, execute code and interact with production systems introduces a threat surface that the industry has barely begun to map.
And there is the question of whether enterprises are ready for agents at all. The technology may be available, but organizational readiness (the governance structures, the change management, the regulatory frameworks, the human trust) often lags years behind what the platforms can deliver.
Beyond agents: the full scope of what Nvidia announced at GTC 2026
Monday's Agent Toolkit announcement did not arrive in isolation. It landed amid an avalanche of product launches that, taken together, describe a company remaking itself at every layer of the computing stack.
Nvidia unveiled the Vera Rubin platform, designed to power every phase of AI from pretraining to real-time agentic inference. It comprises seven new chips in full production, including the Vera CPU purpose-built for agentic AI, the Rubin GPU, and the newly integrated Groq 3 LPU inference accelerator. The Vera Rubin NVL72 rack integrates 72 Rubin GPUs and 36 Vera CPUs, delivering what Nvidia claims is up to 10x higher inference throughput per watt at one-tenth the cost per token compared with the Blackwell platform.
Dynamo 1.0, an open-source inference operating system that Nvidia describes as the "operating system for AI factories," entered production with adoption from AWS, Microsoft Azure, Google Cloud and Oracle Cloud Infrastructure alongside companies like Cursor, Perplexity, PayPal and Pinterest. The BlueField-4 STX storage architecture promises up to 5x token throughput for the long-context reasoning that agents demand, with early adopters including CoreWeave, Crusoe, Lambda, Mistral AI and Nebius.
BYD, Geely, Isuzu and Nissan announced Level 4 autonomous vehicle programs on Nvidia's DRIVE Hyperion platform, and Uber disclosed plans to launch Nvidia-powered robotaxis across 28 cities and four continents by 2028, beginning with Los Angeles and San Francisco in the first half of 2027. Roche, the pharmaceutical giant, announced it is deploying more than 3,500 Nvidia Blackwell GPUs across hybrid cloud and on-premises environments in the U.S. and Europe, which it calls the largest announced GPU footprint available to a pharmaceutical company. Nvidia also launched physical AI tools for healthcare robotics, with CMR Surgical, Johnson & Johnson MedTech and others adopting the platform, and released Open-H, the world's largest healthcare robotics dataset with over 700 hours of surgical video. And Nvidia even announced a Space Module based on the Vera Rubin architecture, promising to bring data-center-class AI to orbital environments.
The real meaning of GTC 2026: Nvidia is no longer selling picks and shovels
Strip away the product specifications and benchmark claims and what emerges from GTC 2026 is a single, clarifying thesis: Nvidia believes the era of AI agents will be larger than the era of AI models, and it intends to own the platform layer of that transition the way it already owns the hardware layer of the current one.
The 17 enterprise software companies that signed on Monday are making a bet of their own. They are wagering that building on Nvidia's agent infrastructure will let them move faster than building alone, and that the benefits of a shared platform outweigh the risks of shared dependency. For Salesforce, it means Agentforce agents that can draw from both cloud and on-premises data through a single Slack interface. For Adobe, it means creative AI pipelines that span image, video, 3D and document intelligence. For SAP, it means agents woven into the transactional fabric of global commerce. Each partnership is rational on its own terms.
Together, they form something larger: an industry-wide endorsement of Nvidia as the default substrate for enterprise intelligence. Huang, who opened his career designing graphics chips for video games, closed his keynote by gesturing toward a future in which AI agents do not just assist human workers but operate as autonomous colleagues — reasoning through problems, building their own tools, learning from their mistakes. He compared the moment to the birth of the personal computer, the dawn of the internet, the rise of mobile computing. Technology executives have a professional obligation to describe every product cycle as a revolution. But here is what made Monday different: this time, 17 of the world's most important software companies showed up to agree with him. Whether they did so out of conviction or out of a calculated fear of being left behind may be the most important question in enterprise technology — and it is one that only the next few years can answer.
- The AI governance mirage: Why 72% of enterprises don’t have the control and security they think they do
Decision makers at 72% of organizations claim to have two or more AI platforms that they identify as their "primary" layer, according to a survey of 40 enterprise companies conducted by VentureBeat last month, revealing real gaps in security and control.
For enterprise management and technical leaders, and especially security leaders, these multiple AI platforms extend the attack surfaces of most enterprises at a time when AI-driven attacks have become increasingly potent.
The multiple platforms (which include offerings from hyperscalers or AI labs like Microsoft Azure, Google, OpenAI or Anthropic, or big application companies like Epic, Workday or ServiceNow) reflect a state of sprawl that has emerged as these big software providers rush to offer their own AI to their enterprise customers. Those customers, in their own rush to scale AI, are finding they aren’t building a singular strategy; in fact, they may be building a collection of contradictions.
The strategic paradox: why leading enterprises are building around their vendors
For example, take the strategic paradox faced by the Mass General Brigham (MGB) hospital system, which has 90,000 employees and is the largest employer in Massachusetts. The hospital system last year had to shut down an uncontrolled number of internal proofs of concept that had sprouted up as employees had gotten carried away with AI projects, said CTO Nallan “Sri” Sriraman at the VentureBeat AI Impact event in Boston on March 26, which focused on the challenges of scaling AI.
Instead, the company decided it was better to wait for the software giants it already uses to deliver on their AI roadmaps. Since these companies have so many resources, and were making AI a top priority themselves, it made no sense for MGB to try to build its own AI layer that would be duplicative, he said. "Why are we building it ourselves?" he asked.
"Leverage it." Yet, even then, Sriraman’s team has been forced to build workarounds, where those companies haven’t done enough. For example, MGB has just completed a “full-scaled” custom build around Microsoft’s Copilot — to get essentially everything offered by that tool — by putting a "skin" around Copilot to handle the safety and data privacy concerns the major model providers haven't yet mastered. Specifically, MGB needed a way for employees to prompt the AI and not have their protected health information (PHI) leaked back to the Copilot LLM provider, OpenAI. The new secure platform, which can support up to 30,000 users, is really the ultimate contradiction: Even though the company has a mandate to leverage the AI provided by the bigger companies, it needs to build around its failures. The contradiction goes even further. These software vendors used by MGB — which also include Epic, Workday and ServiceNow — are all now building agents for their AI, all operating differently. So MGB has to invest in building a “control plane that coordinates and orchestrates all of these agents,” Sriraman said. “That’s where our investment is going to be.” He noted that companies like his are “discovering and experimenting as the landscape keeps shifting." The marketplace is "still nascent," he said, which makes decisions difficult. The "six blind men" problem Sriraman explained the current vendor landscape with an analogy: "When you ask six blind men to touch an elephant and say, what does this elephant look like?" Sriraman said. "You're gonna get six different answers." What emerges from the research VentureBeat conducted in the first quarter, along with conversations like the one in Boston, is a situation that we at VentureBeat are calling a “governance mirage.” While many enterprises say they have adequate governance, in reality they haven’t created clear accountability or specific guardrails, evaluations or security processes to ensure that governance. 
The data of disconnect: confidence vs. systematic oversight
The research comes from surveys across January, February and March by VentureBeat of enterprise companies with 100 or more employees, with 40 to 70 qualified respondents per topic area, covering agentic orchestration, AI security, RAG and governance. The data lacks statistical significance in many areas and should be treated as directional.
The research on governance found that a majority, or 56%, of respondents said they are “very confident” that they’d detect a misbehaving AI model, suggesting that most decision-makers believe they have sufficient basic governance at their companies. However, nearly a third of respondents have no systematic mechanism to detect AI misbehavior until it surfaces through users or audits. In a world where telemetry leakage accounts for 34% of GenAI incidents (Wiz), and the global average breach cost has hit $4.4M (IBM 2025 Cost of a Data Breach), finding out after the damage is done is the default for too many companies.
Moreover, 43% of respondents say a central team owns AI governance. That sounds reassuring, until you look at what’s happening everywhere else. Twenty-three percent say governance is unclear or actively contested between teams. Twenty percent say each platform team governs independently. Six percent say no one has formally addressed it. The rest (about 8%, by subtraction) said they were unsure who owned it.
More telling is the barrier data. When asked about the single biggest obstacle to governing AI across platforms, “no single owner or accountable team” ranked second at 29%, just behind vendor opacity. Accountability structure and lack of vendor transparency are the two dominant failure modes, and they compound each other: Without a central owner, no one has the mandate to demand transparency from the vendors.
The day-two bill: managing sprawl, creep, and lock-in
The scaling trap: Red Hat’s warning
Brian Gracely, Senior Director at Red Hat, who also spoke at the VentureBeat Boston event last month, addressed the infrastructure side of this sprawl, warning that many enterprises are falling into a trap of deceptive initial wins. Gracely noted that the barrier to entry is almost nonexistent at the start, with nearly anyone able to spin up a project using a credit card and an API key. "Day zero is very, very easy," Gracely said. "Day two is when the bill comes due."
Red Hat is positioning its software layer (OpenShift AI) as the necessary buffer to prevent enterprises from getting buried in a single provider's proprietary ecosystem. Gracely’s point is direct: If your control system is built entirely inside one cloud provider’s toolset, you are effectively "renting a cage." The illusion of speed in the early pilot phase often hides a technical debt that becomes obvious the moment you try to move your AI work to a different platform.
Gracely illustrated this with a recent example. A senior leader from Red Hat’s centralized CTO office spent part of her vacation contributing to an open-source agent project called OpenClaw, which became widely popular in the first quarter. Within days of her name appearing as a project maintainer, Red Hat was fielding calls from major New York banks. Their problem was immediate: They realized they already had upwards of 10,000 employees bringing "claws" (agent-based tools) into their infrastructure with zero centralized oversight.
Breaches caused by employees working on these sorts of unapproved technologies are costly. These so-called “shadow AI” incidents cost on average $670K more than standard incidents, according to IBM.
Red Hat’s Gracely noted that while organizations can try to shut down these unapproved ports, they eventually have to figure out how to make them productive and secure, a task that requires a serious investment in an orchestration or platform layer.
The dynamic defensive: MassMutual’s refusal to bet
While some enterprise companies seek an "AI operating system" that oversees all of their AI technologies and apps, others are simply refusing to sign the check. Sears Merritt, CIO and head of enterprise technology at MassMutual, is managing the governance conundrum by intentionally staying in a state of high-velocity flexibility. "Things are so dynamic, it’s hard to know which of the AI vendors will end up on top," Merritt said at the Boston event. For that reason, MassMutual is refusing to enter any long-term contracts with AI vendors.
Merritt’s strategy of “dynamic defensive” highlights a core finding of our research: Vendor popularity is changing radically month to month. Anthropic, for example, went from 0% in January to nearly 6% in February in the share of respondents naming it as the agent orchestration technology they were using. Again, the sample size was small, at 70 respondents. Still, even if directional, the dynamic landscape suggests picking a "primary" winner today is a fool’s errand.
The January figure likely reflects survey composition: Respondents represent the broader enterprise market, not the developer community where Anthropic has seen its strongest early traction. Until recently, most organizations had signed up early with leaders like Microsoft and OpenAI as their main orchestration providers, due to their early lead with Copilot. Our finding that Anthropic is just now pushing into enterprise agent orchestration may be a confirmation of the recent excitement around that platform.
One possible explanation is that enterprises already using Claude for model inference are now routing through Anthropic's native tooling rather than third-party frameworks, though the sample is too small to draw firm conclusions.
The rise of “platform creep”
The leading providers are also shifting toward "managed agents," as reflected by Anthropic’s recent announcement. This offering suggests possible continued platform creep, whereby providers like OpenAI and Anthropic take over more and more of the AI infrastructure: in this case, specifically the memory of agentic session details.
And there the trap is set. Once your session data and orchestration live inside a provider's proprietary database, you aren't just using a model; you are living in its ecosystem. Moreover, persistent agent memory is a prime target for memory poisoning via injected instructions that influence every future interaction. And when that memory lives in a provider's database, you lose your own forensic capability.
The security irony: The fox guarding the hen house
We are seeing this platform creep in our data as well. The most jarring finding in our Q1 data is what we call the "Security Irony": the fact that the providers most responsible for creating enterprise AI risk are the same ones enterprises are using to manage it.
Respondents said the top selection criterion for AI orchestration platforms was “security and permissions generally” (37.1%), beating out other criteria like cost, flexibility, control and ease of development. Yet, the market is choosing convenience over sovereignty. According to our survey, 26% of enterprises in February were using OpenAI as their primary security solution, the very same provider whose models create the risks they are trying to secure. That trend only seemed to strengthen in March, though, as stated before, we want to be careful. Our sample size is small, and this data should only be taken as directional.
It’s not clear whether enterprises are choosing OpenAI as a security solution, or just relying on its built-in security features offered by Microsoft Azure (which partnered with OpenAI when it pushed its Copilot solution aggressively in 2024) because customers were already on that platform.
Beyond the data, there are anecdotal signs that OpenAI's enterprise position may be shifting. Anthropic's Claude Code drew significant attention among developers early this year alongside the Claude 4.6 model. The subsequent announcement of Mythos, its security-focused model, prompted interest from enterprise security teams given its ability to identify vulnerabilities. OpenAI has also announced a security-focused model, GPT-5.4-Cyber.
Our data may also point to a drop in OpenAI’s relative position in a few enterprise AI categories. One area was data retrieval, where OpenAI again leads among third-party providers, but we saw an increase in the number of respondents instead using in-house solutions for retrieval, perhaps a sign that AI models and agents are getting better at natively being able to use tools to call directly to companies’ existing databases, and that custom code is often a way companies are building this in. However, here again we feel our data is at best directional for now.
We are asking the fox to guard the hen house. Hyperscaler security features (like those from OpenAI, Azure, and Google) are winning, because they are already integrated into the platforms enterprises are using. But it creates a single-provider dependency. As agents gain the power to modify documents, call APIs and access databases, the “governance mirage” suggests we have control, while the data shows we are simply clicking "I agree" on whatever the hyperscalers offer. The resulting risks, however, include content injection, privilege escalation and data exfiltration.
The path forward: toward a unified control plane
The search for the "Dynatrace for AI"
So, what is the way out?
Sriraman argued that the industry desperately needs a "central observability platform," a "Dynatrace for AI," that provides full end-to-end visibility, including model drift and safety prompting, agent behavior analytics, privilege escalation alerts, and forensic logging. He is currently working with a number of potential providers to deliver on this.
The “swivel chair” warning
Sriraman warned that without a unified control plane, enterprises are at risk of sliding back into a fragmented "swivel chair" world, reminiscent of the early, inefficient days of Robotic Process Automation (RPA), where employees are forced to constantly jump between different siloed AI tools to finish a single workflow. "We don’t want to create a world where you have to switch to do something here and then go back to the platform to do something else," he said.
But that desire for a single control plane conflicts with the desire to avoid lock-in. Our data shows the market has settled on the “hybrid control plane.” In other words, the most popular situation among our respondents (at 34.3%) was to use model provider-native solutions like Copilot Studio or OpenAI assistants for some workflows, while also running external options like LangGraph or custom orchestration for others. Smaller numbers of companies reported being more dogmatic here, whether that be deliberately removing the model provider from the orchestration layer entirely, relying only on custom orchestration tools, or relying only on the model provider’s technology. Enterprises trust no single provider enough to give them full control, yet they lack the engineering capacity to build entirely from scratch.
The bottom line: The “big red button”
Visibility and integration are only half the battle. In a high-stakes industry like healthcare, Sriraman argues that any legitimate control plane must also offer a hard-stop capability. "We need a big red button," he said. "Kill it.
We should be able to have that … without that, don't put anything in the operational setting." In fact, such a kill switch was formally called for by the security community group OWASP as part of a recommended security framework. The “governance mirage” is the belief that you can scale AI without deciding who owns the control and security plane. If you are one of the 72% of organizations claiming multiple "primary" platforms, be careful because you may not have a strategy; you may have a conflict of interest. It suggests that the winner of the war between the AI behemoths — OpenAI, Anthropic, Google, Microsoft, etc. — won’t necessarily be the one with the best model, but the one that manages to sit above the models and help enterprises enforce a single version of the truth. That may be difficult to achieve, though, given that companies won’t want lock-in with a single player. The data suggests enterprises are already resisting that outcome — and may need to formalize that resistance. Enterprises arguably need to own their control plane with independent security instrumentation, not wait for a vendor to win that role for them.
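The "big red button" Sriraman asks for, paired with the forensic logging he wants from a control plane, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a reference design from OWASP or any vendor: the `ControlPlane` class, its `run` and `kill` methods, and the log format are all hypothetical. The idea is simply that every agent action passes through one gate that the enterprise owns, and that gate can be closed for everyone at once.

```python
import threading
from datetime import datetime, timezone


class KillSwitchEngaged(RuntimeError):
    """Raised when an agent tries to act after the kill switch is set."""


class ControlPlane:
    """Hypothetical gate that every agent action must pass through."""

    def __init__(self):
        self._halted = threading.Event()  # thread-safe "big red button"
        self.audit_log = []               # enterprise-owned forensic record

    def _record(self, agent, action, outcome):
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "agent": agent,
            "action": action,
            "outcome": outcome,
        })

    def kill(self, reason):
        """Halt all agents; later calls to run() are refused."""
        self._halted.set()
        self._record("control-plane", "kill", reason)

    def run(self, agent, action, fn, *args):
        """Execute fn on behalf of an agent unless the switch is engaged."""
        if self._halted.is_set():
            self._record(agent, action, "blocked")
            raise KillSwitchEngaged(f"{agent} blocked from {action}")
        result = fn(*args)
        self._record(agent, action, "ok")
        return result
```

Because the switch and the log live in the enterprise's own process rather than in a provider's database, both the hard stop and the forensic trail remain available whichever vendor's agents are running behind the gate.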