Salesforce’s Agentforce Vibes 2.0 targets a hidden failure: context overload in AI agents
Our take

When startup fundraising platform VentureCrowd began deploying AI coding agents, they saw the same gains as other enterprises, cutting the front-end development cycle by 90% on some projects.
However, those gains didn’t come easily, or without a lot of trial and error.
VentureCrowd’s first challenge was data and context quality. Diego Mogollon, chief product officer at VentureCrowd, told VentureBeat that “agents reason against whatever data they can access at runtime,” which means they can be confidently “wrong” when the context given to them is incomplete or inaccurate.
The company’s other roadblock, one many enterprises share, was messy data and unclear processes. As with context, Mogollon said coding agents amplify bad data, so the company had to build a well-structured codebase first.
“The challenges are rarely about the coding agents themselves; they are about everything around them,” said Mogollon. “It’s a context problem disguised as an AI problem, and it is the number one failure mode I see across agentic implementations.”
Mogollon said VentureCrowd encountered several roadblocks in overhauling its software development.
VentureCrowd's experience illustrates a broader issue in AI agent development: the models are not failing the agents; rather, the agents are overwhelmed by too much context and too many tools at once.
Too much context
This stems from a phenomenon called context bloat: as workflows become more complex, AI systems accumulate ever more data, tools and instructions.
The problem arises because agents need context to work well, but too much of it creates noise. The more context an agent has to sift through, the more tokens it uses, so work slows down and costs rise.
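To make that cost dynamic concrete, here is a minimal Python sketch of how re-sending an ever-growing context inflates per-call token usage. The 4-characters-per-token heuristic and the per-token price are illustrative assumptions, not any provider's actual tokenizer or rates.

```python
# Rough illustration of how accumulated context inflates per-call cost.
# The 4-chars-per-token heuristic and the price are assumptions, not
# any specific provider's tokenizer or billing rates.

def estimate_tokens(text: str) -> int:
    """Crude token estimate: roughly 4 characters per token."""
    return max(1, len(text) // 4)

def call_cost(context_chunks: list[str], price_per_1k_tokens: float = 0.01) -> float:
    """Cost of one model call that re-sends the entire context."""
    total = sum(estimate_tokens(c) for c in context_chunks)
    return total / 1000 * price_per_1k_tokens

context = []
for step in range(5):
    # Each agent step appends tool output, file contents, and so on.
    context.append("tool output " * 200)  # ~2,400 characters per step
    # Cost grows with every step because the full history is re-sent.
    print(f"step {step}: ~{sum(estimate_tokens(c) for c in context)} tokens, "
          f"${call_cost(context):.4f} per call")
```

The same work done at step 0 costs five times as much by step 4, purely because of the context carried along.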
One way to curb context bloat is context engineering: deliberately curating what an agent sees so it can understand code changes or pull requests and align them with its tasks.
However, context engineering often remains an external task rather than a capability built into the coding platforms enterprises use to build their agents.
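When it is handled externally, teams often hand-roll the curation themselves. A minimal sketch of the idea, using naive keyword overlap as a stand-in for the embedding- or retrieval-based scoring real systems use; the task and snippets are invented examples:

```python
# Minimal context-engineering sketch: instead of sending everything,
# score candidate snippets for relevance to the task and pack only the
# best ones into a fixed token budget. The keyword-overlap scoring is a
# naive stand-in for embedding or retrieval-based relevance.

def relevance(task: str, snippet: str) -> int:
    """Naive relevance: number of distinct task words found in the snippet."""
    task_words = set(task.lower().split())
    return sum(1 for w in set(snippet.lower().split()) if w in task_words)

def pack_context(task: str, snippets: list[str], budget_tokens: int = 500) -> list[str]:
    """Greedily keep the most relevant snippets that fit the token budget."""
    ranked = sorted(snippets, key=lambda s: relevance(task, s), reverse=True)
    chosen, used = [], 0
    for s in ranked:
        cost = len(s) // 4  # ~4 chars per token, a rough heuristic
        if used + cost <= budget_tokens and relevance(task, s) > 0:
            chosen.append(s)
            used += cost
    return chosen

task = "fix the login redirect bug in the checkout flow"
snippets = [
    "login redirect handler code",        # relevant
    "marketing site changelog",           # noise, gets filtered out
    "checkout flow state machine notes",  # relevant
]
print(pack_context(task, snippets))
```

The point is the shape of the pipeline, not the scoring: something outside the agent decides what it gets to see, and everything else stays out of the prompt.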
How coding agent providers respond
VentureCrowd relied on one solution in particular to help it overcome the issues with context bloat plaguing its enterprise AI agent deployment: Salesforce’s Agentforce Vibes, a coding platform that lives within Salesforce and is available on all plans, including the free tier.
Salesforce recently updated Agentforce Vibes to version 2.0, expanding support for third-party frameworks like ReAct. Most important for companies like VentureCrowd, Agentforce Vibes added Abilities and Skills, which they can use to direct agent behavior.
“For context, our entire platform, frontend and backend, runs on the Salesforce ecosystem. So when Agentforce Vibes launched, it slotted naturally into an environment we already knew well,” Mogollon said.
Salesforce’s approach doesn’t minimize the context agents use; rather, it helps enterprises keep that context within their own data models and codebases. Agentforce Vibes adds more controlled execution through the new Skills and Abilities feature: Abilities define what an agent should accomplish, and Skills are the tools it uses to get there.
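That goal/tool split can be pictured with a small, purely hypothetical sketch; none of these class or field names come from Salesforce's actual Agentforce Vibes API, they only illustrate the separation the article describes:

```python
# Hypothetical illustration of the goal/tool separation: an "ability"
# states what the agent should accomplish, and "skills" are the concrete
# tools it may use. These class and field names are invented for this
# sketch, not taken from Salesforce's Agentforce Vibes API.

from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Skill:
    name: str
    run: Callable[[str], str]  # a tool the agent can invoke

@dataclass
class Ability:
    goal: str                  # what the agent should accomplish
    skills: list[Skill] = field(default_factory=list)

    def allowed_tools(self) -> list[str]:
        # The agent only sees tools attached to this ability, which keeps
        # the context it has to reason over small and scoped to the goal.
        return [s.name for s in self.skills]

ability = Ability(
    goal="open a pull request that fixes the failing lint check",
    skills=[
        Skill("read_file", run=lambda path: f"<contents of {path}>"),
        Skill("run_linter", run=lambda target: f"lint report for {target}"),
    ],
)
print(ability.allowed_tools())  # the scoped tool list handed to the agent
```

Scoping tools per goal is one way platforms keep each agent call from dragging the entire tool catalog into the prompt.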
Other coding agent platforms manage context differently. For example, Claude Code and OpenAI’s Codex focus on autonomous execution: continuously reading files, running commands and expanding context as tasks evolve. Claude Code has a context indicator and compacts context when it becomes too large.
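Threshold-triggered compaction of this kind can be sketched in a few lines. This illustrates the general pattern only, not Claude Code's actual mechanism, and the summarizer is a stand-in for what would be a model call:

```python
# Sketch of threshold-based context compaction: once the conversation
# grows past a limit, older messages are collapsed into a short summary
# so recent turns stay verbatim. An illustration of the pattern, not
# Claude Code's implementation.

def summarize(messages: list[str]) -> str:
    """Stand-in summarizer; a real system would call a model here."""
    return f"[summary of {len(messages)} earlier messages]"

def compact(history: list[str], max_messages: int = 6, keep_recent: int = 3) -> list[str]:
    """Collapse everything but the most recent turns once history is too long."""
    if len(history) <= max_messages:
        return history
    old, recent = history[:-keep_recent], history[-keep_recent:]
    return [summarize(old)] + recent

history = [f"turn {i}" for i in range(10)]
print(compact(history))  # one summary line followed by the last 3 turns
```

The trade-off is built in: compaction caps token growth, but whatever the summary drops is no longer available to the agent verbatim.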
Across these different approaches, the consistent pattern is that most systems manage growing context for agents rather than limit it. Context keeps growing, especially as workflows become more complex, making it harder for enterprises to control costs, latency and reliability.
Mogollon said his company chose Agentforce Vibes not only because a large portion of their data already lives on Salesforce, making it easier to integrate, but also because it would allow them to control more of the context they feed their agents.
What builders should know
There’s no single way to address context bloat, but the pattern is now clear: more context doesn't always mean better results.
Along with investing in context engineering, enterprises have to experiment to find the context-constraint approach they are most comfortable with. The challenge isn’t just giving agents more information; it’s deciding what to leave out.
Related Articles
- The AI scaffolding layer is collapsing. LlamaIndex's CEO explains what survives.

  The scaffolding layer that developers once needed to ship LLM applications — indexing layers, query engines, retrieval pipelines, carefully orchestrated agent loops — is collapsing. And according to Jerry Liu, co-founder and CEO of LlamaIndex, that's not a problem. It's the point. “As a result, there's less of a need for frameworks to actually help users compose these deterministic workflows in a light and shallow manner,” Liu explains in a new VentureBeat Beyond the Pilot podcast.

  Context is becoming the moat. Liu’s LlamaIndex is one of the foremost retrieval-augmented generation (RAG) frameworks connecting private, custom and domain-specific data to LLMs. But even he acknowledges that these types of frameworks are becoming less relevant. With every new release, models demonstrate incremental capabilities to reason over “massive amounts” of unstructured data, and they’re getting better at it than humans, he notes. They can be trusted to reason extensively, self-correct and perform multi-step planning; Model Context Protocol (MCP) and Claude Agent Skills plug-ins allow models to discover and use tools without requiring a separate integration for each one. Agent patterns have consolidated toward what Liu calls a "managed agent diagram" — a harness layer combined with tools, MCP connectors and skills plug-ins, rather than custom-built orchestration for every workflow. Further, coding agents excel at writing code, meaning devs don’t need to rely on extensive libraries. In fact, about 95% of LlamaIndex code is generated by AI. “Engineers are not actually writing real code,” Liu said. “They're all typing in natural language.”

  This means the line between programmers and non-programmers is collapsing, because “the new programming language is essentially English.” Instead of manual coding or struggling to understand API and document integration, devs can just point Claude Code at it. “This type of stuff was either extremely inefficient or just would break the agent three years ago,” said Liu. “It's just way easier for people to build even relatively advanced retrieval with extremely simple primitives.”

  So what’s the core differentiator when the stack collapses? Context, Liu says. Agents need to be able to decipher file formats to extract the right information. Providing higher accuracy and cheaper parsing becomes key, and LlamaIndex is well-positioned here, he contends, because of its developments with agentic document processing via optical character recognition (OCR). “We've really identified that there's a core set of data that has been locked up in all these file format containers,” he said. Ultimately, “whether you use OpenAI Codex or Claude Code doesn't really matter. The thing that they all need is context.”

  On keeping stacks modular: there’s growing concern about builders like Anthropic locking in session data; in light of this, Liu emphasizes the importance of modularity and agnosticism. Builders shouldn’t bet on any one frontier model, or overbuild in a way that overcomplicates components of the stack. Retrieval has evolved into “agent-plus-sandbox,” as he describes it, and enterprises must ensure that their codebases are free of tech debt and adaptable to changing patterns. They also have to acknowledge that some parts of the stack will eventually need to be thrown away as a matter of course. “Because with every new model release, there's always a different model that is kind of the winner,” Liu said. “You want to make sure you actually have some flexibility to take advantage of it.”

  Listen to the podcast to hear more about: LlamaIndex’s beginnings as a ‘toy project’ with initially only about 40% accuracy; how SaaS companies can tap into complicated workflows that must be standardized and repeatable for average knowledge workers; and why vertical AI companies are taking off and why ‘build versus buy’ is still a very valid question in the agent age. You can also listen and subscribe to Beyond the Pilot on Spotify, Apple or wherever you get your podcasts.
- Salesforce launches Agentforce Operations to fix the workflows breaking enterprise AI

  Enterprise AI teams are hitting a wall — not because their models can't reason, but because the workflows underneath them were never built for agents. Tasks fail, handoffs break, and the problem compounds as organizations push agents deeper into back-office systems. A new architectural layer is emerging to address it: workflow execution control planes that impose deterministic structure on processes agents are expected to run. One of the companies bringing this to the forefront is Salesforce, with a new workflow platform that turns back-office workflows into a set of tasks for specialized agents to complete. Users can upload their processes or use one of the set Blueprints provided by Salesforce, and Agentforce Operations will break it down for agents.

  Salesforce senior vice president of product Sanjna Parulekar told VentureBeat in an interview that the problem is that many enterprise workflows are not built for agents. “What we’ve observed with customers is that a lot of times, the brokenness in a process is probably in your product requirements document,” Parulekar said. “So when that’s uploaded into a product, it doesn’t quite work. We can optimize it and cut out some things and replace it with an agent.” Without this control plane layer, enterprises risk deploying agents that increase cost rather than fix their workflow problems.

  Making the workflow work for agents, not just humans: enterprises deploying agents are learning a costly lesson. Their workflows were designed around human judgment gaps, not machine execution. Processes that evolved through years of workarounds — loosely defined steps, implicit decisions, coordination that depends on individuals knowing what to do next — break when agents are asked to follow them literally. Even with all of an enterprise’s context at its fingertips, an AI system will have difficulty completing tasks if it is not clear what it’s supposed to do. Parulekar said her team found that focusing on what makes the process tick and breaking it down into more explicit steps and workflows makes the system more deterministic. Then, when platforms like Agentforce Operations introduce agents, those agents already know their specific tasks. “It forces companies to rethink their processes and introduces observability into the mix because of the session tracing model in the system,” she said. Parulekar said human checks can be built into the system, so the process is more transparent. What makes this approach different from other workflow automation offerings is that it doesn’t rely on agents to decide what to do next; the system does. Unlike more traditional automation tools that route tasks and agents on probabilistic decision-making, this enforces execution on a pre-defined, deterministic structure.

  The problem it introduces: codifying a workflow doesn't fix a broken one. If a process has flawed steps, encoding it for agents locks in the problem at scale. And once workflows are distributed across agents, the challenge shifts from execution to governance: who owns the process, who validates it, and how it evolves when business conditions change. It puts the onus on teams to take a hard look at what works for them and what doesn’t. Organizations need to consider that, along with the execution control plane offered by platforms like Agentforce Operations, someone should be made responsible for task completion and success. Brandon Metcalf, founder and CEO of workforce orchestration company Asymbl, told VentureBeat in a separate interview that the key to both humans and agents following a workflow is a shared goal. “You have to understand the goal or the agent or human won’t complete the task successfully,” Metcalf said. “Someone has to manage that outcome that has to be delivered. It can be a person or an agent.”

  The bottleneck has moved. As Metcalf framed it, the question is no longer whether agents can reason through a task; it's whether the workflow underneath them is coherent enough to execute. For enterprises that built their processes around human judgment and institutional memory, that's a harder fix than swapping in a smarter model.
- Google and AWS split the AI agent stack between control and execution

  The era of enterprises stitching together prompt chains and shadow agents is nearing its end as more options for orchestrating complex multi-agent systems emerge. As organizations move AI agents into production, the question remains: how will we manage them? Google and Amazon Web Services offer fundamentally different answers, illustrating a split in the AI stack. Google’s approach is to run agentic management on the system layer, while AWS’s harness method sets up in the execution layer. The debate over how to manage and control agents gained new energy this past month as competing companies released or updated their agent builder platforms — Anthropic with the new Claude Managed Agents and OpenAI with enhancements to the Agents SDK — giving developer teams options for managing agents. AWS, with new capabilities added to Bedrock AgentCore, is optimizing for velocity — relying on harnesses to bring agents to production faster — while still offering identity and tool management. Meanwhile, Google’s Gemini Enterprise adopts a governance-focused approach using a Kubernetes-style control plane. Each method offers a glimpse into how agents move from short-burst task helpers to longer-running entities within a workflow.

  Upgrades and umbrellas: to understand where each company stands, here’s what’s actually new. Google released a new version of Gemini Enterprise, bringing its enterprise AI agent offerings — Gemini Enterprise Platform and Gemini Enterprise Application — under one umbrella. The company has rebranded Vertex AI as Gemini Enterprise Platform, though it insists that, aside from the name change and new features, it’s still fundamentally the same interface. “We want to provide a platform and a front door for companies to have access to all the AI systems and tools that Google provides,” Maryam Gholami, senior director of product management for Gemini Enterprise, told VentureBeat in an interview. “The way you can think about it is that the Gemini Enterprise Application is built on top of the Gemini Enterprise Agent Platform, and the security and governance tools are all provided for free as part of the Gemini Enterprise Application subscription.” On the other hand, AWS added a new managed agent harness to Bedrock AgentCore. The company said in a press release shared with VentureBeat that the harness “replaces upfront build with a config-based starting point powered by Strands Agents, AWS’s open source agent framework.” Users define what the agent does, the model it uses and the tools it calls, and AgentCore does the work to stitch all of that together to run the agent.

  Agents are now becoming systems: the shift toward stateful, long-running autonomous agents has forced a rethink of how AI systems behave. As agents move from short-lived tasks to long-running workflows, a new class of failure is emerging: state drift. As agents continue operating, they accumulate state — memory, tool responses and evolving context. Over time, that state becomes outdated. Data sources change, or tools can return conflicting responses. As a result, the agent becomes more vulnerable to inconsistencies and its outputs become less trustworthy. Agent reliability becomes a systems problem, and managing that drift may need more than faster execution; it may require visibility and control. It’s this failure point that platforms like Gemini Enterprise and AgentCore try to prevent. Though this shift is already happening, Gholami admitted that customers will dictate how they want to run and control any long-running agent. “We are going to learn a lot from customers where they would be using long-running agents, where they just assign a task to these autonomous agents to just go ahead and do,” Gholami said. “Of course, there are tricks and balances to get right and the agent may come back and ask for more input.”

  The new AI stack: what’s becoming increasingly clear is that the AI stack is separating into distinct layers, solving different problems. AWS and, to a certain extent, Anthropic and OpenAI optimize for faster deployment. Claude Managed Agents abstracts much of the backend work for standing up an agent, while the Agents SDK now includes support for sandboxes and a ready-made harness. These approaches aim to lower the barrier to getting agents up and running. Google offers a centralized control plane to manage identity, enforce policies and monitor long-running behaviors. Enterprises likely need both. As some practitioners see it, businesses have to have a serious conversation about how much risk they are willing to take. “The main takeaway for enterprise technology leaders considering these technologies at the moment may be formulated this way: while the agent harness vs. runtime question is often perceived as build vs. buy, this is primarily a matter of risk management. If you can afford to run your agents through a third-party runtime because they do not affect your revenue streams, that is okay. On the contrary, in the context of more critical processes, the latter option will be the only one to consider from a business perspective,” Rafael Sarim Oezdemir, head of growth at EZContacts, told VentureBeat in an email. Iterating quickly lets teams experiment and discover what agents can do, while centralized control adds a layer of trust. What enterprises need is to ensure they are not locked into systems designed purely for a single way of executing agents.