Nvidia launches enterprise AI agent platform with Adobe, Salesforce, SAP among 17 adopters at GTC 2026

Jensen Huang walked onto the GTC stage Monday wearing his trademark leather jacket and carrying, as it turned out, the blueprints for a new kind of industry dominance.
The Nvidia CEO unveiled the Agent Toolkit, an open-source platform for building autonomous AI agents, and then rattled off the names of the companies that will use it: Adobe, Salesforce, SAP, ServiceNow, Siemens, CrowdStrike, Atlassian, Cadence, Synopsys, IQVIA, Palantir, Box, Cohesity, Dassault Systèmes, Red Hat, Cisco and Amdocs. Seventeen enterprise software companies, touching virtually every industry and every Fortune 500 corporation, all agreeing to build their next generation of AI products on a shared foundation that Nvidia designed, Nvidia optimizes and Nvidia maintains.
The toolkit provides the models, the runtime, the security framework and the optimization libraries that AI agents need to operate autonomously inside organizations — resolving customer service tickets, designing semiconductors, managing clinical trials, orchestrating marketing campaigns. Each component is open source. Each is optimized for Nvidia hardware. The combination means that as AI agents proliferate across the corporate world, they will generate demand for Nvidia GPUs not because companies choose to buy them but because the software they depend on was engineered to require them.
"The enterprise software industry will evolve into specialized agentic platforms," Huang told the crowd, "and the IT industry is on the brink of its next great expansion." What he left unsaid is that Nvidia has just positioned itself as the tollbooth at the entrance to that expansion — open to all, owned by one.
Inside Nvidia's Agent Toolkit: the software stack designed to power every corporate AI worker
To grasp the significance of Monday's announcements, it helps to understand the problem Nvidia is solving.
Building an enterprise AI agent today is an exercise in frustration. A company that wants to deploy an autonomous system — one that can, say, monitor a telecommunications network and proactively resolve customer issues before anyone calls to complain — must assemble a language model, a retrieval system, a security layer, an orchestration framework and a runtime environment, typically from different vendors whose products were never designed to work together.
Nvidia's Agent Toolkit collapses that complexity into a unified platform. It includes Nemotron, a family of open models optimized for agentic reasoning; AI-Q, an open blueprint that lets agents perceive, reason and act on enterprise knowledge; OpenShell, an open-source runtime enforcing policy-based security, network and privacy guardrails; and cuOpt, an optimization skill library. Developers can use the toolkit to create specialized AI agents that act autonomously while using and building other software to complete tasks.
The AI-Q component addresses a pain point that has dogged enterprise AI adoption: cost. Its hybrid architecture routes complex orchestration tasks to frontier models while delegating research tasks to Nemotron's open models, which Nvidia says can cut query costs by more than 50 percent while maintaining top-tier accuracy. Nvidia used the AI-Q Blueprint to build what it claims is the top-ranking AI agent on both the DeepResearch Bench and DeepResearch Bench II leaderboards. If those rankings hold under independent validation, they position the toolkit as not merely convenient but competitively necessary.
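To make the cost argument concrete, here is a minimal sketch of the routing idea in Python. It is not Nvidia's AI-Q code or API: the `Task` type, the `route` function, the tier names and both per-query prices are hypothetical, chosen only to show how sending most work to a cheaper open model changes the bill.

```python
from dataclasses import dataclass

# Hypothetical per-query prices in dollars. These are illustrative
# assumptions, not figures published by Nvidia or any model vendor.
FRONTIER_COST = 0.030
OPEN_MODEL_COST = 0.006


@dataclass
class Task:
    kind: str    # "orchestration" or "research" (assumed labels)
    prompt: str


def route(task: Task) -> str:
    """Send orchestration to the frontier tier, everything else to the open tier."""
    return "frontier" if task.kind == "orchestration" else "open"


def total_cost(tasks: list[Task]) -> float:
    """Sum the price of each task under the hybrid routing rule."""
    price = {"frontier": FRONTIER_COST, "open": OPEN_MODEL_COST}
    return sum(price[route(t)] for t in tasks)


def savings_vs_all_frontier(tasks: list[Task]) -> float:
    """Fraction saved compared with sending every task to the frontier tier."""
    if not tasks:
        raise ValueError("need at least one task")
    baseline = FRONTIER_COST * len(tasks)
    return 1 - total_cost(tasks) / baseline
```

With one orchestration step and four research steps at these invented prices, the mixed plan costs 0.054 against an all-frontier baseline of 0.150, a 64 percent reduction. That figure is an artifact of the assumed prices and task mix; the real saving depends on actual pricing and on how much of a workload is research rather than orchestration.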
OpenShell tackles what has been the single biggest obstacle in every boardroom conversation about letting AI agents loose inside corporate systems: trust. The runtime creates isolated sandboxes that enforce strict policies around data access, network reach and privacy boundaries. Nvidia is collaborating with Cisco, CrowdStrike, Google, Microsoft Security and TrendAI to integrate OpenShell with their existing security tools — a calculated move that enlists the cybersecurity industry as a validation layer for Nvidia's approach rather than a competing one.
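The general pattern of policy-based guardrails can be sketched in a few lines of Python. This is not OpenShell's policy format or API, which the announcement does not detail: the `Policy` type, its field names and the example hosts and paths are all hypothetical. The sketch shows only the deny-by-default idea: an agent action is allowed when the policy explicitly covers it and refused otherwise.

```python
import posixpath
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class Policy:
    # Hypothetical policy fields; anything not listed is denied.
    allowed_hosts: frozenset = frozenset()
    readable_prefixes: tuple = ()


def may_connect(policy: Policy, url: str) -> bool:
    """Allow a network call only to hosts named in the policy."""
    host = urlparse(url).hostname
    return host is not None and host in policy.allowed_hosts


def may_read(policy: Policy, path: str) -> bool:
    """Allow a read only inside an approved directory, after normalizing '..'."""
    normalized = posixpath.normpath(path)
    for prefix in policy.readable_prefixes:
        base = prefix.rstrip("/")
        if normalized == base or normalized.startswith(base + "/"):
            return True
    return False
```

Normalizing the path before comparing it matters: without that step, a request such as `/data/tickets/../secrets/key` would pass a naive prefix check. A production sandbox would enforce rules like these at the operating-system or network level rather than in application code, which is the harder part of the problem.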
The partner list that reads like the Fortune 500: who signed on and what they're building
The breadth of Monday's enterprise adoption announcements reveals Nvidia's ambitions more clearly than any specification sheet could.
Adobe, in a simultaneously announced strategic partnership, will adopt Agent Toolkit software as the foundation for running hybrid, long-running creativity, productivity and marketing agents. Shantanu Narayen, Adobe's chair and CEO, said the companies will bring together "our Firefly models, CUDA libraries into our applications, 3D digital twins for marketing, and Agent Toolkit and Nemotron to our agentic frameworks to deliver high-quality, controllable and enterprise-grade AI workflows of the future." The partnership extends deep: Adobe will explore OpenShell and Nemotron as foundations for personalized, secure agentic loops, and will evaluate the toolkit for large-scale workflows powered by Adobe Experience Platform. Nvidia will provide engineering expertise, early access to software and targeted go-to-market support.
Salesforce's integration may be the one enterprise IT leaders parse most carefully. The company is working with Nvidia Agent Toolkit software including Nemotron models, enabling customers to build, customize and deploy AI agents using Agentforce for service, sales and marketing. The collaboration introduces a reference architecture where employees can use Slack as the primary conversational interface and orchestration layer for Agentforce agents — powered by Nvidia infrastructure — that participate directly in business workflows and pull from data stores in both on-premises and cloud environments. For the millions of knowledge workers who already conduct their professional lives inside Slack, this turns a messaging app into the command center for corporate AI.
SAP, whose software underpins the financial and operational plumbing of most Global 2000 companies, is using open Agent Toolkit software, including NeMo, to enable AI agents through Joule Studio on SAP Business Technology Platform, so that customers and partners can design agents tailored to their own business needs. ServiceNow's Autonomous Workforce of AI Specialists leverages Agent Toolkit software, the AI-Q Blueprint and a combination of closed and open models, including Nemotron and ServiceNow's own Apriel models. That hybrid approach suggests the toolkit is designed not to replace existing AI investments but to become the connective tissue between them.
From chip design to clinical trials: how agentic AI is reshaping specialized industries
The partner list extends well beyond horizontal software platforms into deeply specialized verticals where autonomous agents could compress timelines measured in years.
In semiconductor design, where a single advanced chip can cost billions of dollars and take half a decade to develop, three of the four major electronic design automation companies are building agents on Nvidia's stack. Cadence will leverage Agent Toolkit and Nemotron with its ChipStack AI SuperAgent for semiconductor design and verification. Siemens is launching its Fuse EDA AI Agent, which uses Nemotron to autonomously orchestrate workflows across its entire electronic design automation portfolio, from design conception through manufacturing sign-off. Synopsys is building a multi-agent framework powered by its AgentEngineer technology using Nemotron and NeMo Agent Toolkit.
Healthcare and life sciences present perhaps the most consequential use case. IQVIA is integrating Nemotron and other Agent Toolkit software with IQVIA.ai, a unified agentic AI platform designed to help life sciences organizations work more efficiently across clinical, commercial and real-world operations. The scale is already significant: IQVIA has deployed more than 150 agents across internal teams and client environments, including 19 of the top 20 pharmaceutical companies.
The security sector is embedding itself into the architecture from the ground floor. CrowdStrike unveiled a Secure-by-Design AI Blueprint that embeds its Falcon platform protection directly into Nvidia AI agent architectures — including agents built on AI-Q and OpenShell — and is advancing agentic managed detection and response using Nemotron reasoning models. Cisco AI Defense will provide AI security protection for OpenShell, adding controls and guardrails to govern agent actions. These are not aftermarket bolt-ons; they are foundational integrations that signal the security industry views Nvidia's agent platform as the substrate it needs to protect.
Dassault Systèmes is exploring Agent Toolkit software and Nemotron for its role-based AI agents, called Virtual Companions, on its 3DEXPERIENCE agentic platform. Atlassian is working with the toolkit as it evolves its Rovo AI agentic strategy for Jira and Confluence. Box is using it to enable enterprise agents to securely execute long-running business processes. Palantir is developing AI agents on Nemotron that run on its sovereign AI Operating System Reference Architecture.
The open-source gambit: why giving software away is Nvidia's most aggressive business move
There is something almost paradoxical about a company with a multi-trillion-dollar market capitalization giving away its most strategically important software. But Nvidia's open-source approach to Agent Toolkit is less an act of generosity than a carefully constructed competitive moat.
OpenShell is open source. Nemotron models are open. AI-Q blueprints are publicly available. LangChain, the agent engineering company whose open-source frameworks have been downloaded over 1 billion times, is working with Nvidia to integrate Agent Toolkit components into the LangChain deep agent library for developing advanced, accurate enterprise AI agents at scale. When the most popular independent framework for building AI agents absorbs your toolkit, you have transcended the category of vendor and entered the category of infrastructure.
But openness in AI has a way of being strategically selective. The models are open, but they are optimized for Nvidia's CUDA libraries — the proprietary software layer that has locked developers into Nvidia GPUs for two decades. The runtime is open, but it integrates most deeply with Nvidia's security partners. The blueprints are open, but they perform best on Nvidia hardware. Developers can explore Agent Toolkit and OpenShell on build.nvidia.com today, running on inference providers and Nvidia Cloud Partners including Baseten, CoreWeave, DeepInfra, DigitalOcean and others — all of which run Nvidia GPUs.
The strategy has a historical analog in Google's approach to Android: give away the operating system to ensure that the entire mobile ecosystem generates demand for your core services. Nvidia is giving away the agent operating system to ensure that the entire enterprise AI ecosystem generates demand for its core product — the GPU. Every Salesforce agent running Nemotron, every SAP workflow orchestrated through OpenShell, every Adobe creative pipeline accelerated by CUDA creates another strand of dependency on Nvidia silicon.
This also explains the Nemotron Coalition announced Monday — a global collaboration of model builders including Mistral AI, Cursor, LangChain, Perplexity, Reflection AI, Sarvam and Thinking Machines Lab, all working to advance open frontier models. The coalition's first project will be a base model codeveloped by Mistral AI and Nvidia, trained on Nvidia DGX Cloud, that will underpin the upcoming Nemotron 4 family. By seeding the open model ecosystem with Nvidia-optimized foundations, the company ensures that even models it does not build will run best on its hardware.
What could go wrong: the risks enterprise buyers should weigh before going all-in
For all the ambition on display Monday, several realities temper the narrative.
Adoption announcements are not deployment announcements. Many of the partner disclosures use carefully hedged language — "exploring," "evaluating," "working with" — that is standard in embargoed press releases but should not be confused with production systems serving millions of users. Adobe's own forward-looking statements note that "due to the non-binding nature of the agreement, there are no assurances that Adobe will successfully negotiate and execute definitive documentation with Nvidia on favorable terms or at all." The gap between a GTC keynote demonstration and an enterprise-grade rollout remains substantial.
Nvidia is not the only company chasing this market. Microsoft, with its Copilot ecosystem and Azure AI infrastructure, pursues a parallel strategy with the advantage of owning the operating systems and productivity software that most enterprises already use. Google, through Gemini and its cloud platform, has its own agent vision. Amazon, via Bedrock and AWS, is building comparable primitives. The question is not whether enterprise AI agents will be built on some platform but whether the market will consolidate around one stack or fragment across several.
The security claims, while architecturally sound, remain unproven at scale. OpenShell's policy-based guardrails are a promising design pattern, but autonomous agents operating in complex enterprise environments will inevitably encounter edge cases that no policy framework has anticipated. CrowdStrike's Secure-by-Design AI Blueprint and Cisco AI Defense's OpenShell integration are exactly the kind of layered defense enterprise buyers will demand — but both are newly unveiled, not battle-hardened through years of adversarial testing. Deploying agents that can autonomously access data, execute code and interact with production systems introduces a threat surface that the industry has barely begun to map.
And there is the question of whether enterprises are ready for agents at all. The technology may be available, but organizational readiness — the governance structures, the change management, the regulatory frameworks, the human trust — often lags years behind what the platforms can deliver.
Beyond agents: the full scope of what Nvidia announced at GTC 2026
Monday's Agent Toolkit announcement did not arrive in isolation. It landed amid an avalanche of product launches that, taken together, describe a company remaking itself at every layer of the computing stack.
Nvidia unveiled the Vera Rubin platform — seven new chips in full production, including the Vera CPU purpose-built for agentic AI, the Rubin GPU, and the newly integrated Groq 3 LPU inference accelerator — designed to power every phase of AI from pretraining to real-time agentic inference. The Vera Rubin NVL72 rack integrates 72 Rubin GPUs and 36 Vera CPUs, delivering what Nvidia claims is up to 10x higher inference throughput per watt at one-tenth the cost per token compared with the Blackwell platform. Dynamo 1.0, an open-source inference operating system that Nvidia describes as the "operating system for AI factories," entered production with adoption from AWS, Microsoft Azure, Google Cloud and Oracle Cloud Infrastructure alongside companies like Cursor, Perplexity, PayPal and Pinterest.
The BlueField-4 STX storage architecture promises up to 5x token throughput for the long-context reasoning that agents demand, with early adopters including CoreWeave, Crusoe, Lambda, Mistral AI and Nebius. BYD, Geely, Isuzu and Nissan announced Level 4 autonomous vehicle programs on Nvidia's DRIVE Hyperion platform, and Uber disclosed plans to launch Nvidia-powered robotaxis across 28 cities and four continents by 2028, beginning with Los Angeles and San Francisco in the first half of 2027.
Roche, the pharmaceutical giant, announced it is deploying more than 3,500 Nvidia Blackwell GPUs across hybrid cloud and on-premises environments in the U.S. and Europe — what it calls the largest announced GPU footprint available to a pharmaceutical company. Nvidia also launched physical AI tools for healthcare robotics, with CMR Surgical, Johnson & Johnson MedTech and others adopting the platform, and released Open-H, the world's largest healthcare robotics dataset with over 700 hours of surgical video. And Nvidia even announced a Space Module based on the Vera Rubin architecture, promising to bring data-center-class AI to orbital environments.
The real meaning of GTC 2026: Nvidia is no longer selling picks and shovels
Strip away the product specifications and benchmark claims and what emerges from GTC 2026 is a single, clarifying thesis: Nvidia believes the era of AI agents will be larger than the era of AI models, and it intends to own the platform layer of that transition the way it already owns the hardware layer of the current one.
The 17 enterprise software companies that signed on Monday are making a bet of their own. They are wagering that building on Nvidia's agent infrastructure will let them move faster than building alone — and that the benefits of a shared platform outweigh the risks of shared dependency. For Salesforce, it means Agentforce agents that can draw from both cloud and on-premises data through a single Slack interface. For Adobe, it means creative AI pipelines that span image, video, 3D and document intelligence. For SAP, it means agents woven into the transactional fabric of global commerce. Each partnership is rational on its own terms. Together, they form something larger: an industry-wide endorsement of Nvidia as the default substrate for enterprise intelligence.
Huang, who opened his career designing graphics chips for video games, closed his keynote by gesturing toward a future in which AI agents do not just assist human workers but operate as autonomous colleagues — reasoning through problems, building their own tools, learning from their mistakes. He compared the moment to the birth of the personal computer, the dawn of the internet, the rise of mobile computing.
Technology executives have a professional obligation to describe every product cycle as a revolution. But here is what made Monday different: this time, 17 of the world's most important software companies showed up to agree with him. Whether they did so out of conviction or out of a calculated fear of being left behind may be the most important question in enterprise technology — and it is one that only the next few years can answer.
Related Articles
- Scaling AI into production is forcing a rethink of enterprise infrastructurePresented by Nutanix Across industries, organizations are focused on how to move from AI pilots, proofs of concept, and cloud-based experimentation to deploying it at scale — across real workloads, for real users, in real business environments. VentureBeat spoke with Tarkan Maner, president and chief commercial officer at Nutanix, and Thomas Cornely, EVP of product management, about what that transition demands, and what it will take to get it right. “AI in general is shifting everything we do, not only in technology, but across all vertical industries, from regulated industries like banking, health care, government, education to non-regulated industries like manufacturing and retail,” Maner said. “As a complete platform company, we welcome this change. It’s creating more opportunities for us as a company to serve our customers in better ways as we move forward.” But there’s still a practical gap between experimentation and production, Cornely said. “It’s one thing to do an experiment, to do a prototype. It’s a different thing to take that prototype and deploy it for 10,000 employees,” he explained. “We went from people focusing on training models to chatbots to now doing agents, where the demand and pressures on AI infrastructure are growing exponentially.” Agentic AI introduces a new layer of enterprise complexity The rise of agentic AI is what makes this transition especially consequential. These systems introduce multi-step workflows across applications and data sources, along with a degree of autonomy that creates new operational demands. Enterprises now have to contend with multiple agents running simultaneously, unpredictable and real-time workloads, and the need to coordinate access to infrastructure across teams. “OpenClaw is making it very easy now for anybody to build agents and run with agents,” Cornely said. “You want those agents to be running on premises with your data. 
You need to have the right constructs around it to protect the enterprise from what an agent could do.” As these systems become more autonomous, the challenge extends beyond how they operate to how they interact with enterprise data, systems, and teams. AI is augmenting human work, not replacing it Agentic AI is fundamentally an amplifier of human capability rather than a substitute for it, Maner said. The goal for enterprises is not to eliminate human work but to find the right balance between human decision-making, AI-driven automation, and agent-based workflows. “We believe that there’s going to be love, peace, and harmony between AI, agentic tools, and robotics systems, and human capital,” Maner said. “That harmony can be optimized for better outcomes for businesses, enterprises, governments, and public sector organizations, if the right vendors provide the right tooling and the right services.” How enterprises are getting started with AI at scale In practice, the move from experimentation into real-world deployment is where the challenges become most visible. Despite the momentum, many are still working through how to scale AI beyond initial use cases. As they do, organizations quickly run into practical constraints. Many start in the cloud because of easy access to resources and services, but practical considerations like data, governance and control, and cost quickly come to the forefront. The cloud can be used to experiment, with the ultimate goal of bringing applications back on premises as they move toward production, using platforms that solve for security and cost. The use cases gaining the most traction include document search and knowledge retrieval, security and predictive threat detection, software development and coding workflows, and customer support and service operations. In the security realm, banking customers and others in Europe and the U.S. are deploying AI-driven tools including facial recognition and predictive threat detection. 
Meanwhile, there’s a growing focus on end-to-end, 360-degree customer engagement, from pre-sales through post-sales advocacy, in the customer support industry. Industry-specific AI transformation is already underway Across industries, the shift from experimentation to real deployment is already taking shape in distinct ways. In retail, AI is transforming store operations with cameras and robotics used for targeted in-aisle marketing at the moment of purchase decision, while cashier-less checkout is replacing traditional POS systems, and the human capital freed up is being redeployed to back-office and merchandising functions. In healthcare, Nutanix works with customers on applications spanning diagnosis, treatment, remote health, and hospital operations, with cloud partners including AWS and Azure. In manufacturing and logistics, the transformation is equally significant. The operational challenges of scaling enterprise AI As AI use cases scale, enterprises are running into a new class of operational challenges. Managing multiple AI workloads and agents, coordinating infrastructure access across teams, ensuring security and governance, and integrating AI systems with existing business processes are now top-of-mind concerns for IT and business leaders alike. The gap between AI developers pushing for speed and access, and infrastructure teams responsible for security, uptime, and governance, is one of the defining challenges of this moment. “Now I’m running agents, and they’re all going to fight to get access to resources to solve my problems,” Cornely said. 
“What you want now is infrastructure that allows you to set constraints, govern resources.” The AI factory: a shared platform for production AI These challenges are driving demand for what Maner and Cornely describe as the AI factory: a shared infrastructure environment that supports multiple users and workloads simultaneously, enabling both experimentation and production while balancing developer agility with enterprise governance. At GTC 2026, Nutanix announced the Nutanix Agentic AI Solution, a complete platform spanning core infrastructure, Kubernetes-based container services running on a topology-aware hypervisor, and advanced services for building and governing agents. “We’re launching a complete platform, from core infrastructure through PaaS and advanced PaaS services to the whole management framework for your AI factories,” Cornely said. “Really enabling self-service for the teams that will build these applications in the enterprise.” Hybrid environments are essential to enterprise AI strategy Operating this kind of environment requires flexibility across infrastructure. Hybrid infrastructure is not a compromise, but a requirement. Some workloads will always run in the public cloud, while others must remain on premises due to security requirements, regulatory compliance, data sovereignty, or competitive IP considerations. “Especially in the regulated industries, as sovereignty becomes a bigger issue, data gravity becomes a bigger issue, security, and also a lot of competitive differentiation in the industry, it’s going to depend on what the company wants for their own IP,” Maner said. This is the foundation of Nutanix’s platform position, he added. “We are the perfect harmony, bringing those applications, that data, and all the optimization for these use cases end to end, from on-prem to off-prem and in a hybrid mode,” he said. “Doing it not only in one cloud, but for multiple clouds.” That flexibility also extends to the broader ecosystem. 
Nutanix works across hyperscalers including AWS, Azure, and Google Cloud, as well as regional service providers and emerging neoclouds. Nutanix offers neoclouds a full software stack to run their own clouds and deliver advanced AI services, giving enterprise customers already running Nutanix a simple extension of compute, networking, and AI capabilities. Maner described the arrangement as a win for both sides. For enterprises, it means simplified access to hybrid AI services. For neoclouds, it means a proven platform to build on. It’s all automated and secure by default, Cornely added. “All of those governance problems that now come up with agentic AI are the same problems we’ve been solving for the last 16 years for every other application running in your cloud,” he said. From pilot to production: operationalizing AI across the enterprise Ultimately, the goal is not to run a successful AI pilot, but to operationalize AI across real-world use cases, manage infrastructure as a shared resource, support collaboration between infrastructure teams and AI developers, and scale from initial projects to enterprise-wide deployment. “There’s a massive gap right now between people building AI applications, those AI engineers, those agentic AI developers, and your classical infra teams,” Cornely said. “They need tooling to enable the infra teams, so they can support your AI engineers. That’s what we deliver with our agentic AI solution.” Sponsored articles are content produced by a company that is either paying for the post or has a business relationship with VentureBeat, and they’re always clearly marked. For more information, contact sales@venturebeat.com.
- Salesforce launches Headless 360 to turn its entire platform into infrastructure for AI agentsSalesforce on Wednesday unveiled the most ambitious architectural transformation in its 27-year history, introducing "Headless 360" — a sweeping initiative that exposes every capability in its platform as an API, MCP tool, or CLI command so AI agents can operate the entire system without ever opening a browser. The announcement, made at the company's annual TDX developer conference in San Francisco, ships more than 100 new tools and skills immediately available to developers. It marks a decisive response to the existential question hanging over enterprise software: In a world where AI agents can reason, plan, and execute, does a company still need a CRM with a graphical interface? Salesforce's answer: No — and that's exactly the point. "We made a decision two and a half years ago: Rebuild Salesforce for agents," the company said in its announcement. "Instead of burying capabilities behind a UI, expose them so the entire platform will be programmable and accessible from anywhere." The timing is anything but coincidental. Salesforce finds itself navigating one of the most turbulent periods in enterprise software history — a sector-wide sell-off that has pushed the iShares Expanded Tech-Software Sector ETF down roughly 28% from its September peak. The fear driving the decline: that AI, particularly large language models from Anthropic, OpenAI, and others, could render traditional SaaS business models obsolete. Jayesh Govindarjan, EVP of Salesforce and one of the key architects behind the Headless 360 initiative, described the announcement as rooted not in marketing theory but in hard-won lessons from deploying agents with thousands of enterprise customers. "The problem that emerged is the lifecycle of building an agentic system for every one of our customers on any stack, whether it's ours or somebody else's," Govindarjan told VentureBeat in an exclusive interview. 
"The challenge that they face is very much the software development challenge. How do I build an agent? That's only step one." More than 100 new tools give coding agents full access to the Salesforce platform for the first time Salesforce Headless 360 rests on three pillars that collectively represent the company's attempt to redefine what an enterprise platform looks like in the agentic era. The first pillar — build any way you want — delivers more than 60 new MCP (Model Context Protocol) tools and 30-plus preconfigured coding skills that give external coding agents like Claude Code, Cursor, Codex, and Windsurf complete, live access to a customer's entire Salesforce org, including data, workflows, and business logic. Developers no longer need to work inside Salesforce's own IDE. They can direct AI coding agents from any terminal to build, deploy, and manage Salesforce applications. Agentforce Vibes 2.0, the company's own native development environment, now includes what it calls an "open agent harness" supporting both the Anthropic agent SDK and the OpenAI agents SDK. As demonstrated during the keynote, developers can choose between Claude Code and OpenAI agents depending on the task, with the harness dynamically adjusting available capabilities based on the selected agent. The environment also adds multi-model support, including Claude Sonnet and GPT-5, along with full org awareness from the start. A significant technical addition is native React support on the Salesforce platform. During the keynote demo, presenters built a fully functional partner service application using React — not Salesforce's own Lightning framework — that connected to org metadata via GraphQL while inheriting all platform security primitives. This opens up dramatically more expressive front-end possibilities for developers who want complete control over the visual layer. 
The second pillar — deploy on any surface — centers on the new Agentforce Experience Layer, which separates what an agent does from how it appears, rendering rich interactive components natively across Slack, mobile apps, Microsoft Teams, ChatGPT, Claude, Gemini, and any client supporting MCP apps. During the keynote, presenters defined an experience once and deployed it across six different surfaces without writing surface-specific code. The philosophical shift is significant: rather than pulling customers into a Salesforce UI, enterprises push branded, interactive agent experiences into whatever workspace their customers already inhabit. The third pillar — build agents you can trust at scale — introduces an entirely new suite of lifecycle management tools spanning testing, evaluation, experimentation, observation, and orchestration. Agent Script, the company's new domain-specific language for defining agent behavior deterministically, is now generally available and open-sourced. A new Testing Center surfaces logic gaps and policy violations before deployment. Custom Scoring Evals let enterprises define what "good" looks like for their specific use case. And a new A/B Testing API enables running multiple agent versions against real traffic simultaneously. Why enterprise customers kept breaking their own AI agents — and how Salesforce redesigned its tooling in response Perhaps the most technically significant — and candid — portion of VentureBeat's interview with Govindarjan addressed the fundamental engineering tension at the heart of enterprise AI: agents are probabilistic systems, but enterprises demand deterministic outcomes. Govindarjan explained that early Agentforce customers, after getting agents into production through "sheer hard work," discovered a painful reality. "They were afraid to make changes to these agents, because the whole system was brittle," he said. "You make one change and you don't know whether it's going to work 100% of the time. 
All the testing you did needs to be redone."
This brittleness problem drove the creation of Agent Script, which Govindarjan described as a programming language that "brings together the determinism that's in programming languages with the inherent flexibility in probabilistic systems that LLMs provide." The language functions as a single flat file (versionable, auditable) that defines a state machine governing how an agent behaves. Within that machine, enterprises specify which steps must follow explicit business logic and which can reason freely using LLM capabilities. Salesforce open-sourced Agent Script this week, and Govindarjan noted that Claude Code can already generate it natively because of its clean documentation.
The approach stands in sharp contrast to the "vibe coding" movement gaining traction elsewhere in the industry. As the Wall Street Journal recently reported, some companies are now attempting to vibe-code entire CRM replacements, a trend Salesforce's Headless 360 directly addresses by making its own platform the most agent-friendly substrate available.
Govindarjan described the tooling as a product of Salesforce's own internal practice. "We needed these tools to make our customers successful. Then our FDEs [forward-deployed engineers] needed them. We hardened them, and then we gave them to our customers," he told VentureBeat. In other words, Salesforce productized its own pain.
Inside the two competing AI agent architectures Salesforce says every enterprise will need
Govindarjan drew a revealing distinction between two fundamentally different agentic architectures emerging in the enterprise: one for customer-facing interactions and one he linked to what he called the "Ralph Wiggum loop."
Customer-facing agents, those deployed to interact with end customers for sales or service, demand tight deterministic control.
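To make the state-machine idea concrete, here is a minimal sketch in plain Python. It is not Agent Script syntax, and every name in it (`Step`, `fake_llm`, `verify_identity`, `run_agent`) is hypothetical; the model call is stubbed so the example runs offline. It only illustrates the split the article describes: some steps follow explicit business logic, others let a model reason freely, and the transitions between steps stay under the author's control.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Step:
    name: str
    deterministic: bool         # True: explicit business logic; False: LLM reasoning
    run: Callable[[dict], str]  # returns the name of the next step, or "done"


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call (assumption: no network access)."""
    return f"[model reply to: {prompt}]"


def verify_identity(ctx: dict) -> str:
    # Deterministic gate: the same input always takes the same branch.
    return "answer" if ctx.get("customer_id") else "escalate"


def answer(ctx: dict) -> str:
    # Flexible step: the model words the reply but cannot pick the next step.
    ctx["reply"] = fake_llm(ctx["question"])
    return "done"


def escalate(ctx: dict) -> str:
    ctx["reply"] = "Routing you to a human agent."
    return "done"


STEPS: Dict[str, Step] = {
    s.name: s
    for s in [
        Step("verify_identity", True, verify_identity),
        Step("answer", False, answer),
        Step("escalate", True, escalate),
    ]
}


def run_agent(ctx: dict, start: str = "verify_identity") -> dict:
    """Walk the state machine, recording each step for later auditing."""
    ctx["trace"] = []
    current = start
    while current != "done":
        step = STEPS[current]
        ctx["trace"].append(step.name)
        current = step.run(ctx)
    return ctx


result = run_agent({"customer_id": "C-1", "question": "Where is my order?"})
print(result["trace"])
```

Because the whole definition lives in one file and the transitions are ordinary code, a change can be diffed, reviewed and re-tested like any other source file, which is the property the brittleness complaint was about.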
"Before customers are willing to put these agents in front of their customers, they want to make sure that it follows a certain paradigm — a certain brand set of rules," Govindarjan told VentureBeat. Agent Script encodes these as a static graph — a defined funnel of steps with LLM reasoning embedded within each step. The "Ralph Wiggum loop," by contrast, represents the opposite end of the spectrum: a dynamic graph that unrolls at runtime, where the agent autonomously decides its next step based on what it learned in the previous step, killing dead-end paths and spawning new ones until the task is complete. This architecture, Govindarjan said, manifests primarily in employee-facing scenarios — developers using coding agents, salespeople running deep research loops, marketers generating campaign materials — where an expert human reviews the output before it ships. "Ralph Wiggum loops are great for employee-facing because employees are, in essence, experts at something," Govindarjan explained. "Developers are experts at development, salespeople are experts at sales." The critical technical insight: both architectures run on the same underlying platform and the same graph engine. "This is a dynamic graph. This is a static graph," he said. "It's all a graph underneath." That unified runtime — spanning the spectrum from tightly controlled customer interactions to free-form autonomous loops — may be Salesforce's most important technical bet, sparing enterprises from maintaining separate platforms for different agent modalities. Salesforce hedges its bets on MCP while opening its ecosystem to every major AI model and tool Salesforce's embrace of openness at TDX was striking. The platform now integrates with OpenAI, Anthropic, Google Gemini, Meta's LLaMA, and Mistral AI models. The open agent harness supports third-party agent SDKs. MCP tools work from any coding environment. 
And the new AgentExchange marketplace unifies 10,000 Salesforce apps, 2,600-plus Slack apps, and 1,000-plus Agentforce agents, tools, and MCP servers from partners including Google, Docusign, and Notion, backed by a new $50 million AgentExchange Builders Initiative.
Yet Govindarjan offered a surprisingly candid assessment of MCP itself, the protocol Anthropic created that has become a de facto standard for agent-tool communication. "To be very honest, not at all sure" that MCP will remain the standard, he told VentureBeat. "When MCP first came along as a protocol, a lot of us engineers felt that it was a wrapper on top of a really well-written CLI, which now it is. A lot of people are saying that maybe CLI is just as good, if not better."
His approach: pragmatic flexibility. "We're not wedded to one or the other. We just use the best, and often we will offer all three. We offer an API, we offer a CLI, we offer an MCP." This hedging explains the "Headless 360" naming itself: rather than betting on a single protocol, Salesforce exposes every capability across all three access patterns, insulating itself against protocol shifts.
Engine, the B2B travel management company featured prominently in the keynote demos, offered a real-world proof point for the open ecosystem approach. The company built its customer service agent, Ava, in 12 days using Agentforce, and Ava now handles 50% of customer cases autonomously. Engine runs five agents across customer-facing and employee-facing functions, with Data 360 at the heart of its infrastructure and Slack as its primary workspace. "CSAT goes up, costs to deliver go down. Customers are happier. We're getting them answers faster. What's the trade off? There's no trade off," an Engine executive said during the keynote.
Underpinning all of it is a shift in how Salesforce gets paid.
The company is moving from per-seat licensing to consumption-based pricing for Agentforce, a transition Govindarjan described as "a business model change and innovation for us." It's a tacit acknowledgment that when agents, not humans, are doing the work, charging per user no longer makes sense.
Salesforce isn't defending the old model: it's dismantling it and betting the company on what comes next
Govindarjan framed the company's evolution in architectural terms. Salesforce has organized its platform around four layers: a system of context (Data 360), a system of work (Customer 360 apps), a system of agency (Agentforce), and a system of engagement (Slack and other surfaces). Headless 360 opens every layer via programmable endpoints.
"What you saw today, what we're doing now, is we're opening up every single layer, right, with MCP tools, so we can go build the agentic experiences that are needed," Govindarjan told VentureBeat. "I think you're seeing a company transforming itself."
Whether that transformation succeeds will depend on execution across thousands of customer deployments, the staying power of MCP and related protocols, and the fundamental question of whether incumbent enterprise platforms can move fast enough to remain relevant when AI agents can increasingly build new systems from scratch. The software sector's bear market, the financial pressures bearing down on the entire industry, and the breathtaking pace of LLM improvement all conspire to make this one of the highest-stakes bets in enterprise technology.
But there is an irony embedded in Salesforce's predicament that Headless 360 makes explicit. The very AI capabilities that threaten to displace traditional software are the same capabilities that Salesforce now harnesses to rebuild itself. Every coding agent that could theoretically replace a CRM is now, through Headless 360, a coding agent that builds on top of one. The company is not arguing that agents won't change the game.
It's arguing that decades of accumulated enterprise data, workflows, trust layers, and institutional logic give it something no coding agent can generate from a blank prompt.
As Benioff declared on CNBC's Mad Money in March: "The software industry is still alive, well and growing." Headless 360 is his company's most forceful attempt to prove him right, by tearing down the walls of the very platform that made Salesforce famous and inviting every agent in the world to walk through the front door.
Parker Harris, Salesforce's co-founder, captured the bet most succinctly in a question he posed last month: "Why should you ever log into Salesforce again?"
If Headless 360 works as designed, the answer is: You shouldn't have to. And that, Salesforce is wagering, is precisely what will keep you paying for it.
- Cheaper tokens, bigger bills: The new math of AI infrastructure
Presented by Nutanix
As enterprises move from AI experimentation into production deployment, the primary cost driver has shifted away from foundation model training and toward the infrastructure required to run thousands of concurrent inference workloads at scale, with agentic AI as the accelerant.
Where early enterprise AI projects involved a handful of large, scheduled training jobs, production agentic environments require continuous support for short-lived, unpredictable requests that consume GPU, networking, and storage resources in ways traditional infrastructure was never designed to handle. For enterprise technology leaders, that shift is turning infrastructure efficiency into a make-or-break factor in AI economics.
"Every employee with an AI assistant, every automated workflow, every agent pipeline needs models for inferencing and generates a lot of tokens," says Anindo Sengupta, VP of products at Nutanix. "Those inferencing requests land on a GPU infrastructure, traverse specialized networks, and pull data from storage systems purpose built to support these AI workloads."
Why cost per token is becoming a core infrastructure metric
Inference costs per token have dropped by roughly an order of magnitude over the past two years, driven by model efficiency improvements and competitive pressure among cloud providers. The expectation would be that enterprise AI is getting cheaper. Instead, total costs are rising, Sengupta says, pointing to what economists call the Jevons paradox: when a resource becomes cheaper to use, consumption tends to increase faster than the price drops. So while the cost per token has fallen by almost an order of magnitude over the last couple of years, consumption has risen more than 100X.
The result is that cost per token and GPU utilization are becoming primary operational metrics for enterprise IT, sitting alongside traditional measures like uptime and throughput.
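The arithmetic behind that paradox is easy to check. The sketch below uses the article's rounded figures (price per token down about 10x, consumption up about 100x); the starting price and volume are invented purely to make the numbers easy to follow and do not come from Nutanix.

```python
def monthly_bill(price_per_million_tokens: float, million_tokens: float) -> float:
    """Total spend is unit price multiplied by volume."""
    return price_per_million_tokens * million_tokens


# Hypothetical starting point (assumed values, not from the article).
old_price = 10.0       # dollars per million tokens
old_volume = 1_000.0   # million tokens per month

# Rounded figures from the article: price falls ~10x, consumption rises ~100x.
new_price = old_price / 10
new_volume = old_volume * 100

old_bill = monthly_bill(old_price, old_volume)
new_bill = monthly_bill(new_price, new_volume)

print(f"old bill: ${old_bill:,.0f}")
print(f"new bill: ${new_bill:,.0f}")
print(f"spend multiplier: {new_bill / old_bill:.0f}x")
```

With these inputs the bill grows tenfold even though each token costs a tenth as much: the spend multiplier is simply the growth in usage divided by the drop in price.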
"Cost per token is really about the total cost of ownership for serving inference models," Sengupta says. "Utilization is about making sure that once you have GPU assets, you're getting maximum return from them. These metrics will be critical for enterprise IT leaders." What makes this difficult is the number of variables involved. Token costs shift depending on which models an organization runs, where workloads execute, and how prompts are structured. "There are too many variables in cost to manage intuitively," Sengupta adds. "Optimizing it is an engineering problem, and one that requires continuous tuning." Agentic workloads expose the limits of traditional infrastructure Production agentic AI introduces a workload profile that traditional enterprise infrastructure was not designed to handle. Classic data center deployments are built around predictable loads and long planning cycles. Agentic environments produce unpredictable, high-frequency bursts of short inference requests, place new demands on networking and storage, and change faster than most procurement cycles allow. The infrastructure supporting agentic AI is also structurally different from CPU-based computing. GPU topology, high-speed interconnects, parallel storage systems for agent memory and KV cache, and networking architectures capable of handling DPU offloading all represent new capabilities that require new operational skills. Siloed infrastructure compounds these challenges. When GPU resources, networking, and data access are managed independently, scheduling inefficiencies accumulate, utilization drops, and costs climb. Organizations running fragmented stacks tend to underutilize expensive GPU assets while simultaneously bottlenecking on storage and network throughput. Integrated stacks and the case for full-stack architecture The response emerging among infrastructure vendors is a move toward tightly integrated, validated full-stack platforms designed specifically for production AI workloads. 
The premise is that end-to-end optimization across compute, networking, storage, and software layers produces better utilization and lower per-token costs than assembling best-of-breed components from separate vendors.
Nutanix's Agentic AI solution represents one approach to this problem. Built on the Nutanix AHV hypervisor, Nutanix Enterprise AI and Nutanix Kubernetes Platform, the solution is designed to manage both the traditional compute layer where agent orchestration runs and the accelerated compute layer where inference executes. The company has introduced NVIDIA topology-aware enhancements to AHV that automatically optimize how GPUs, CPUs, memory, and DPUs are allocated to virtual machines, and has offloaded Nutanix Flow Virtual Networking to BlueField DPUs to free GPU cycles and sustain throughput without compromising security.
The solution supports instant deployment of NVIDIA NIM microservices and open-source models including Nemotron, and integrates an AI gateway that governs access to frontier cloud LLMs from Anthropic, Google, OpenAI, and others. The gateway also implements the Model Context Protocol (MCP) to allow agents to connect to enterprise data with granular access controls. The solution runs on Cisco infrastructure, allowing organizations to deploy on infrastructure they already operate.
"By integrating everything from the AHV hypervisor and Flow Virtual Networking up to the Kubernetes platform, you remove the silos that slow down AI projects," Sengupta explains.
Platform teams and developer agility cannot be traded off against each other
One organizational tension that scales with agentic AI adoption is the relationship between platform teams managing shared infrastructure and the developers building and running agent applications on top of it. These groups have historically operated with different tooling, different priorities, and different time horizons, but Sengupta argues that the core dynamic hasn't changed even as the technology has.
"Platform teams will continue to deliver a catalog of self-service AI capabilities that are also compliant to business needs, that they can serve to agentic AI builders," Sengupta says. "Mature AI teams will do a great job not just in GPU utilization, but in creating an operating model that enables fast AI infrastructure delivery to meet the pace of innovation that developers want. That's what is very critical to success." The organizations that are managing GPU utilization most effectively tend to be further along in their AI adoption journey, with more established operating models and clearer cost accountability. For organizations earlier in that journey, the infrastructure design and operating model decisions being made now will determine whether AI projects can move from pilot to production without cost or complexity becoming the limiting factor. The AI factory operating model The emerging framework for enterprise AI infrastructure is the AI factory, a purpose-built environment for producing and running AI workloads at scale. The challenge is that most organizations will need to operate both traditional compute and accelerated compute simultaneously for years, requiring a common operating model that spans both technology paradigms without sacrificing agility. With Nutanix, running on Cisco as part of the Cisco AI Pods, powered by Intel and optimized for the NVIDIA reference architecture, organizations get a production-ready, full-stack foundation by enabling AI factories to be securely and efficiently shared by thousands of agents, to achieve the lowest costs per token. The solution bridges the gap between the infrastructure and platform engineering teams who manage the hardware and the AI engineering and agentic AI developer teams who build and run agentic AI applications, making it truly affordable to run AI at a massive scale. 
"The metrics that will determine whether an organization can sustain and scale its AI investment — cost per token, GPU utilization, scheduling efficiency — are infrastructure metrics," Sengupta says. "Managing them well is increasingly a precondition for making AI viable, not just functional." Secure and scale your AI factory — explore the full-stack approach here. Sponsored articles are content produced by a company that is either paying for the post or has a business relationship with VentureBeat, and they’re always clearly marked. For more information, contact sales@venturebeat.com.
- AI agents that automatically prevent, detect and fix software issues are here as NeuBird AI launches Falcon, FalconClaw
The mantra of the modern tech industry was arguably coined by Facebook (before it became Meta): "move fast and break things." But as enterprise infrastructure has shifted into a dizzying maze of hybrid clouds, microservices, and ephemeral compute clusters, the "breaking" part has become a structural tax that many organizations can no longer afford to pay.
Today, two-year-old startup NeuBird AI is launching a full-scale offensive against this "chaos tax," announcing a $19.3 million funding round alongside the release of its Falcon autonomous production operations agent.
The launch isn't just a product update; it is a philosophical pivot. For years, the industry has focused on "Incident Response": making the fire trucks faster and the hoses bigger. NeuBird AI is arguing that the only sustainable path forward is "Incident Avoidance."
As Venkat Ramakrishnan, President and COO of NeuBird AI, put it in a recent interview: "Incident management is so old school. Incident resolution is so old school. Incident avoidance is what is going to be enabled by AI."
By grounding AI in real-time enterprise context rather than just large language model reasoning, the company aims to move site reliability engineering and devops teams from a reactive posture to a predictive one.
The AI divide: a reality check on automation
Accompanying the launch is NeuBird AI's 2026 State of Production Reliability and AI Adoption Report, a survey of over 1,000 professionals that reveals a massive disconnect between the boardroom and the server room.
While 74% of C-suite executives believe their organizations are actively using AI to manage incidents, only 39% of the practitioners (the engineers actually on call at 2:00 AM) agree. This 35-point "AI Divide" suggests that while leadership is writing checks for AI platforms, the technology is often failing to reach the frontline.
For engineers, the reality remains manual and grueling: the study found that engineering teams spend an average of 40% of their time on incident management rather than building new products.
Gou Rao, co-founder and CEO of NeuBird AI, told VentureBeat that this is a persistent operational reality: "Over the past 18 months that we have been in production, this is not a marketing slide. We have concretely been able to demonstrate a massive reduction in time to incident response and resolution."
The consequences of this "toil" are more than just lost productivity. Alert fatigue has transitioned from a morale issue to a direct reliability risk. According to the report, 83% of organizations have teams that ignore or dismiss alerts occasionally, and 44% of companies experienced an outage in the past year tied directly to a suppressed or ignored alert. In many cases, the systems are so noisy that customers discover failures before the monitoring tools do.
Introducing NeuBird AI Falcon
NeuBird AI's answer to this systemic failure is the Falcon engine. While the company's previous iteration, Hawkeye, focused on autonomous resolution, Falcon extends that capability into predictive intelligence.
"When we launched NeuBird AI in 2023, our first version of the agent was called Hawkeye," Rao explains. "What we're announcing next week at HumanX is our next-generation version of the agent, codenamed Falcon. Falcon is easily three times faster than Hawkeye and is averaging around 92% in confidence scores."
That level of confidence, the company argues, allows engineers to trust the agent's output at face value. Falcon represents a significant leap over previous generative AI applications in the space, particularly in its ability to forecast failure.
"Falcon is really good at preventive prediction, so it can tell you what can go wrong," Rao says. "It's pretty accurate on a 72-hour window, even better at 48 hours, and by 24 hours it gets really, really accurate."
One of the standout features of the new release is the Advanced Context Map. Unlike static dashboards, this is a real-time view of infrastructure dependencies and service health. It allows teams to visualize the "blast radius" of an issue as it propagates across an environment, helping engineers understand not just what is broken, but why it is failing in the context of its neighbors.
'Minority Report' for incident management
While many AI tools favor flashy web interfaces, NeuBird AI is leaning into the developer's native habitat with NeuBird AI Desktop. This allows engineers to invoke the production ops agent directly from a command-line interface to explore root causes and system dependencies.
"Falcon has a desktop mode which allows it to interact with a developer's local tools," Rao noted. "We're getting a lot more traction from a hands-on developer audience, especially as people go to Claude Desktop and Cursor. They're completing the loop by using production agents talking to their coding agents."
This integration enables a "multi-agent" workflow where an engineer can use NeuBird AI's agent to diagnose a root cause in production and then hand off that diagnosis to a coding agent like Claude Code to implement the fix.
During a live demo, Rao showcased how the agent could be set to "Sentinel Mode," constantly sweeping a cluster for risks. If it detects an anomaly (such as a projected 5% spike in AWS costs or a misconfigured Kubernetes pod), it can flag the specific engineer on call who has the domain expertise to fix it.
"This is like 'Minority Report for Incident Management'," one financial services executive reportedly told the team after a demo.
Context engineering: a gateway for security
A primary concern for enterprises deploying AI is security: ensuring large language models don't go "crazy" or exfiltrate sensitive data. NeuBird AI addresses this through a proprietary approach to "context engineering."
"The way we implemented our agent is that the large language models themselves are never actually touching the data directly," Rao explains. "We become the gateway for how the context can be accessed”. This means the model is the reasoning engine, but NeuBird AI is the middleman that wraps the data. Furthermore, the company has implemented strict guardrails on what the agent can actually execute. “We’ve created a language that confines and restricts the agent from what it can do," says Rao. "If it comes up with something anomalous, or something we don’t know, it won’t run. We won’t do it”. This architectural choice allows NeuBird AI to remain model-agnostic. If a newer model from Anthropic or Google outperforms the current reasoning engine, NeuBird AI can simply switch it out without requiring the customer to change their platform. "Customers don’t want to be tied to a specific way of reasoning," Rao asserts. "They want to be tied to a platform from which they can get the value of an agentic system”. Displacing the "army": displacing expensive observability One of the most radical claims NeuBird AI makes is that agentic systems can actually reduce the amount of data enterprises need to store in the first place. Currently, teams rely on massive storage platforms with complex query languages. "People use very complex observability tools like Datadog, Dynatrace, and Sysdig," Rao says. "This is the norm today, which is why it takes an army of people to solve a problem. What we’ve been able to demonstrate with agentic systems is that you don’t need to store all that data in the first place”. Because the agent can reason across raw data sources, it can identify which signals are junk and which are critical. This shift, Rao argues, “reduces human toil and effort while simultaneously reducing your reliance on these insanely expensive observability tools”. The practical impact of this "incident avoidance" was recently demonstrated at Deep Health. 
Rao recounts how their agent detected a systemic issue that was invisible to traditional tools: "Our agent was able to go in and prevent an issue from happening which would have caused this company, Deep Health, a major production outage. The customer is completely beside themselves and happy about what it could do."
FalconClaw: operationalizing 'tribal knowledge'
One of the most persistent problems in IT operations is the loss of "tribal knowledge," the hard-won expertise of senior engineers that exists only in their heads. NeuBird AI is attempting to solve this with FalconClaw, a curated, enterprise-grade skills hub compatible with the OpenClaw ecosystem.
FalconClaw allows teams to capture best practices and resolution steps as "validated and compliant skills." The tech preview launched today with 15 initial skills that work natively with NeuBird AI's toolchain.
According to Francois Martel, Field CTO at NeuBird AI, this turns hard-won expertise into a reusable asset that the AI can use automatically. It's an attempt to standardize how agents interact with infrastructure, moving away from proprietary "black box" systems toward a multi-agent world where different AI tools can share a common set of operational abilities.
Scaling the moat: funding and leadership
The $19.3 million round was led by Xora Innovation, a Temasek-backed firm, with participation from Mayfield, M12, StepStone Group, and Prosperity7 Ventures. This brings NeuBird AI's total funding to approximately $64 million.
The investor interest is fueled largely by the pedigree of the founding team. Gou Rao and Vinod Jayaraman previously co-founded Portworx, which was acquired by Pure Storage, and Ocarina Networks, acquired by Dell.
They have recently bolstered their leadership with Venkat Ramakrishnan, another Pure Storage veteran, as President and COO.
For investors like Phil Inagaki of Xora, the value lies in NeuBird AI's "best-in-class results across accuracy, speed and token consumption."
As cloud costs continue to spiral, the ability of an AI agent to not only fix bugs but also optimize infrastructure capacity is becoming a "must-have" rather than a "nice-to-have." NeuBird AI claims its agent can save enterprise teams more than 200 engineering hours per month.
The path to 'self-healing' infrastructure
As the State of Production Reliability report notes, current incident management practices are "no longer sustainable." With 61% of organizations estimating that a single hour of downtime costs $50,000 or more, the financial stakes of staying in a reactive loop are enormous.
NeuBird AI's launch of Falcon and FalconClaw marks a definitive attempt to break that loop. By focusing on prevention and the "context engineering" required to make AI trustworthy for enterprise production, the company is positioning itself as the critical intelligence layer for the modern stack.
While the "AI Divide" between executives and practitioners remains a significant hurdle for the industry, NeuBird AI is betting that as engineers see the value of a CLI-driven agent averaging 92% confidence scores that can "see around corners," the skepticism will fade. For the site reliability engineers currently drowning in a flood of non-actionable alerts, the arrival of a reliable AI teammate couldn't come soon enough.
NeuBird AI Falcon is available starting today, with organizations able to sign up for a free trial at neubird.ai.