May 29, 2026 • 6 min read • from VentureBeat

AI agents are entering their rebuild era as enterprises confront the reliability problem

Our take

As enterprises embrace AI agents, they face a reliability challenge that underscores the need for robust infrastructure. Preeti Somal, Senior VP of Engineering at Temporal Technologies, says many organizations are now rethinking their initial implementations and prioritizing workflow orchestration, observability, and recovery mechanisms. This shift reflects a growing understanding that long-running AI workflows must withstand interruptions and manage state effectively.

As enterprise AI agents move into production, organizations are increasingly facing the reliability problem that has come to define the next phase of AI integration. Many teams are realizing that the performance of large language models (LLMs) alone does not ensure success in real-world applications. Long-running AI workflows must survive crashes, preserve state, recover from failures, and coordinate across APIs and enterprise systems. The challenge echoes other recent developments, such as Pinterest cutting AI costs 90% by gutting a frontier model's vision layer, an example of optimizing existing systems rather than continually pushing ahead with untested innovations.

Preeti Somal, Senior VP of Engineering at Temporal Technologies, highlights a crucial realization among enterprises: the need to revisit first-generation AI agent implementations. The initial focus on rapid deployment has left many organizations grappling with foundational issues, much like the early days of cloud adoption when businesses rushed to migrate workloads without adequate redesigns. This lack of foresight can lead to significant operational pitfalls, where teams are forced to rebuild agents from the ground up after experiencing failures due to inadequate architecture. As organizations begin to understand that AI is not merely a plug-and-play solution, but requires a robust infrastructure, the design of AI systems must evolve to prioritize workflow orchestration, observability, governance, and recovery.

The implications of this shift extend beyond technical specifications; they touch on the economic realities facing enterprises today. As AI becomes a strategic priority, leaders must evaluate the return on investment (ROI) associated with these systems. Costs can spiral when workflows fail, requiring reruns of entire processes, thereby driving up inference expenses and impacting customer experiences. The idea of a "deterministic spine," as articulated by Somal, provides a framework for understanding how orchestration software can support the reliability of probabilistic models, ensuring consistent execution even when faced with interruptions. This perspective is crucial as enterprises navigate the complexities of integrating AI into their existing workflows.

Looking ahead, the need for governance will become even more pronounced. As organizations seek to build standardized frameworks that balance flexibility with necessary controls, the focus will shift from merely adopting AI solutions to creating sustainable, long-term systems that enhance productivity. As seen in the healthcare example with Abridge, where workflows are complex and multifaceted, successful AI agents must be able to maintain continuity over time and withstand interruptions. This raises a significant question for enterprises: how will they ensure that their AI systems are not only innovative but also resilient and economically viable?

As organizations embark on this journey, the importance of collaboration with experts in workflow orchestration will only grow. The challenges presented by agentic AI are not merely technical hurdles; they are opportunities for enterprises to reimagine their data management practices and improve overall operational efficiency. The trend toward revisiting and refining first-generation implementations underscores a pivotal moment in the evolution of enterprise AI, encouraging organizations to build a foundation that will support the transformative potential of AI in the future. The journey is just beginning, and the successful enterprises will be those that not only adopt new technologies but also construct the robust systems that enable them to thrive.

As enterprise AI agents move into production, organizations are confronting a growing reliability problem. Many teams are discovering that LLM performance alone does not determine whether agents succeed in production. Long-running AI workflows must survive crashes, preserve state, recover from failures, manage inference costs, and coordinate across APIs, tools, and enterprise systems.

After a first wave focused on rapid deployment, organizations now need to revisit those first-generation implementations and redesign early agent architectures around workflow orchestration, observability, governance, and recovery, said Preeti Somal, Senior VP of Engineering at Temporal Technologies, during the latest AI Impact Series event in New York.

“We do have a lot of customers that come to us where they’re building version 2.0 of the same agent,” Somal said. “They had to move really fast, but they didn’t take care of the plumbing. Things crash and burn, and then they’re back to rebuilding with the reliable foundation.”

For workflow orchestration company Temporal, whose infrastructure predates the current wave of agentic AI, the shift reflects a broader enterprise realization: production AI systems require durable execution, state management, visibility into workflows, and mechanisms to recover when models or downstream systems fail.

Agentic AI has supercharged familiar engineering problems

“These patterns aren’t necessarily new,” Somal said. “AI just supercharges them.”

Agentic systems introduce additional complexity because they often involve long-running, multi-step processes spanning multiple services, models, APIs, and tools. A single workflow might call several large language models, access retrieval systems, trigger external applications, and manage state over hours or days. The engineering questions, Somal said, often emerge only after deployment.

“People will write agents but haven’t thought about what happens if the agent crashes,” she said. “Am I going to need to run the entire agent flow again?”

For enterprises operating under cost constraints, the answer matters. Restarting workflows after failures can multiply inference expenses, increase latency, and create poor customer experiences.

Somal compared the current moment to an earlier period of enterprise cloud adoption, when organizations rushed to migrate workloads without first redesigning the underlying architectures those workloads needed to hold up over the long term.

“This rush to do AI in a world where you haven’t even modernized your application reminds me a little bit of that lift-and-shift that happened in the cloud,” she said. “Everybody realized you’re spending more money on cloud and we haven’t gotten value there.”

Why long-running agents force a new architecture

Enterprise workflows increasingly involve agents that execute over long windows, sometimes spanning many hours, while interacting with tools and systems. Reliability challenges compound the longer a workflow persists, and they affect both state and memory, two ideas that are often treated interchangeably in AI conversations.

State concerns workflow execution. It includes where an agent is in a process, which actions have already completed, and where recovery should resume after a failure. Memory, or context, captures the information an agent carries forward across interactions or tasks.

“The state of the agent is around what step and what actions have been performed, and if something crashes, where do you want to recover from, versus the context and memory piece,” Somal explained.
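One way to picture the distinction Somal draws is to keep the two concerns in separate data structures. The sketch below is illustrative only: the class names, step names, and the remembered note are hypothetical, and it does not represent Temporal's API or Abridge's actual pipeline.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ExecutionState:
    """Where the workflow is: which steps finished and where to resume."""
    steps: List[str]
    completed: List[str] = field(default_factory=list)

    def mark_done(self, step: str) -> None:
        self.completed.append(step)

    def resume_point(self) -> Optional[str]:
        # First step that has not completed yet, or None if all are done.
        for step in self.steps:
            if step not in self.completed:
                return step
        return None


@dataclass
class AgentMemory:
    """What the agent carries forward: context gathered across tasks."""
    notes: List[str] = field(default_factory=list)

    def remember(self, note: str) -> None:
        self.notes.append(note)


# Hypothetical three-step workflow; the first step has already finished.
state = ExecutionState(steps=["process_audio", "summarize", "after_visit_summary"])
memory = AgentMemory()

state.mark_done("process_audio")
memory.remember("clinician prefers short summaries")
```

After a crash, recovery would consult only `state.resume_point()` to decide where to restart, while `memory.notes` would be handed back to the model as context. Losing one does not imply losing the other, which is why the two are worth modeling separately.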

That distinction becomes increasingly important when enterprises begin moving beyond simple chatbot interactions toward longer-running business processes. Somal pointed to a healthcare example involving customer Abridge, where workflows process physician visits through multiple stages, including audio processing, summarization, model calls, and after-visit summary generation.

“There’s not just one piece to that flow,” Somal said. “Taking videos and slicing that, taking summaries, calling the LLMs, generating the after-visit summary, all of that is being orchestrated.”

The implication for enterprises is that successful agents increasingly depend on systems that can survive interruptions, coordinate across services, and maintain continuity over time.

The rise of the deterministic spine

A useful framework for enterprise AI design is the deterministic spine, Somal said, describing how the company thinks about Temporal's role.

“It is denoting the path you want to take,” she said. “It is calling the brain, but if the brain doesn’t respond, it will call it again. If the brain responds but the next step is going to fail, it will pick up from where that failure happened.”

In this framing, the language model acts as a probabilistic system producing variable outputs, while orchestration software maintains execution reliability around it. The concept matters because enterprise systems increasingly require consistency even when models remain non-deterministic. A procurement workflow, healthcare summary, customer support escalation, or compliance process cannot simply fail silently because a model call timed out or an external dependency crashed.

“What you care most about is making sure that you can recover and that you’re not paying the token tax if something goes wrong,” Somal said.
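The two behaviors Somal describes, calling the brain again when it does not respond and picking up from the failed step, can be sketched in plain Python. This is a minimal illustration under stated assumptions, not Temporal's implementation or API: the function names, the JSON checkpoint file, and the simulated `brain` and `publish` steps are all hypothetical.

```python
import json
import os
import tempfile


def call_with_retry(step_fn, payload, max_attempts=3):
    """Call a step again if it raises, up to max_attempts times."""
    last_error = None
    for _ in range(max_attempts):
        try:
            return step_fn(payload)
        except Exception as err:  # a real system would catch narrower errors
            last_error = err
    raise last_error


def run_workflow(steps, payload, checkpoint_path):
    """Run (name, fn) steps in order, saving each result so a later run
    resumes after the last completed step instead of starting over."""
    done = {}
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            done = json.load(f)
    for name, fn in steps:
        if name in done:
            payload = done[name]   # reuse the saved result, skip the call
            continue
        payload = call_with_retry(fn, payload)
        done[name] = payload
        with open(checkpoint_path, "w") as f:
            json.dump(done, f)     # checkpoint after every completed step
    return payload


# Simulated steps: the "brain" succeeds, the downstream system is down at first.
calls = {"brain": 0, "publish": 0}
status = {"downstream_up": False}


def brain(text):
    calls["brain"] += 1
    return text + " -> summarized"


def publish(text):
    calls["publish"] += 1
    if not status["downstream_up"]:
        raise RuntimeError("downstream system unavailable")
    return text + " -> published"


steps = [("brain", brain), ("publish", publish)]
path = os.path.join(tempfile.mkdtemp(), "checkpoint.json")

try:
    run_workflow(steps, "visit notes", path)       # fails at "publish"
except RuntimeError:
    pass

status["downstream_up"] = True
result = run_workflow(steps, "visit notes", path)  # resumes at "publish"
```

In this sketch the model step runs once across both attempts, so its cost is paid once. A production system would also need each step to be safe to repeat, since a crash can land between finishing a step and recording its checkpoint.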

Reliability, visibility, and the economics of token spend

As enterprise leaders evaluate AI ROI, cost visibility has become a growing concern. Long-running agents frequently make multiple model calls across complex workflows, which can create opaque spending patterns. Somal described one operational advantage of orchestration as visibility into where costs accumulate. Because workflows are observable step-by-step, teams can see where tokens are being consumed across an agent process.

“You’ve got visibility into that entire flow in a single pane of glass,” she said. “You can now see where you’re spending the tokens in an agent that is multiple steps and calling multiple different systems.”
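As a rough illustration of step-level cost visibility, the sketch below accumulates token counts per workflow step and ranks the steps by spend. The `TokenLedger` class, the step names, and the token figures are all made up for the example; the article does not describe how Temporal surfaces this data.

```python
from collections import defaultdict


class TokenLedger:
    """Accumulates token counts per workflow step."""

    def __init__(self):
        self.by_step = defaultdict(int)

    def record(self, step, tokens):
        self.by_step[step] += tokens

    def total(self):
        return sum(self.by_step.values())

    def report(self):
        # Steps sorted from most to least expensive.
        return sorted(self.by_step.items(), key=lambda kv: kv[1], reverse=True)


ledger = TokenLedger()
ledger.record("retrieve_context", 1200)
ledger.record("summarize", 5400)
ledger.record("summarize", 600)    # a second model call in the same step
ledger.record("draft_reply", 2100)

for step, tokens in ledger.report():
    print(f"{step}: {tokens} tokens")
```

Grouping spend by step rather than by model call is the design choice that matters here: it ties cost to a stage of the business process, which is where a team would decide what to optimize.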

Workflow recovery also shapes cost efficiency. Without durable orchestration, a late-stage failure can force organizations to rerun an entire process from the beginning, including all prior model calls. Somal said systems designed around recovery can resume execution from the point of interruption.

“You pick up from where the crash happened,” she said. “We save you the cost of running the agent from step one again.”
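The saving can be made concrete with simple arithmetic. The sketch below compares total tokens paid when a workflow restarts from step one with the total when it resumes at the failed step. The per-step token counts are invented, and the model assumes a single failure that occurs before the failing step consumes any tokens.

```python
def tokens_paid(step_tokens, failed_index, resume):
    """Total tokens paid when the step at failed_index fails once, before
    spending any tokens, and the workflow then runs to completion."""
    full_run = sum(step_tokens)
    if resume:
        # Completed steps are not repeated, so each step is paid for once.
        return full_run
    # A restart pays again for every step completed before the failure.
    return sum(step_tokens[:failed_index]) + full_run


step_tokens = [1000, 4000, 2000, 500]   # made-up per-step token counts
restart_cost = tokens_paid(step_tokens, failed_index=3, resume=False)
resume_cost = tokens_paid(step_tokens, failed_index=3, resume=True)

print(restart_cost, resume_cost)
```

With these illustrative numbers, a failure at the last step nearly doubles the bill under a restart. The later the failure, the larger the gap, which matches Somal's point about late-stage failures.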

Enterprises need to build paved paths and enlist partner expertise

Governance concerns are another emerging pattern as agentic AI takes hold. Rather than adopting fully managed agent systems wholesale, Somal said, enterprises increasingly want standardized internal frameworks that provide guardrails while preserving flexibility and that implement necessary features such as governance controls, model selection policies, identity systems, cost management, and observability.

“The enterprises are looking at building these paved paths,” she said. “Taking something off the shelf is maybe not going to work because there are all of these other requirements.”

As organizations revisit first-generation deployments, challenges like these increasingly look less like a model problem and more like a systems engineering problem. Temporal is positioned to help enterprises take this next step, in part because, for many organizations, it was already in place as part of broader modernization programs before AI became a strategic priority.

“Temporal is already in the enterprise,” Somal said. “Taking that and extending that to AI and agent platforms feels very natural.”
