D&B's database of 642 million businesses was built for humans, not AI agents. So they rebuilt it.

Dun & Bradstreet's overhaul of its Commercial Graph, which covers 642 million businesses, comes as organizations integrate AI agents into workflows spanning credit assessments, procurement, and supply chain operations. Historically, D&B's systems were designed for human analysts, who can make nuanced judgments and navigate complex relationships between entities. As AI agents began to take on roles traditionally held by humans, it became clear that the existing architecture was inadequate, leading to a significant reengineering of the company's data infrastructure. The shift reflects a broader industry trend in which legacy systems are being challenged by the demands of modern AI capabilities.

The challenges D&B faced are typical of enterprises moving to AI-driven processes. Its fragmented legacy systems, made up of siloed databases and custom integrations, were ill-suited to the rapid, high-volume queries that AI agents issue. As Gary Kotovets, D&B's Chief Data and Analytics Officer, points out, the architecture that served humans well for decades was not built with machine intelligence in mind. The company's decision to migrate to a unified cloud infrastructure and develop a new data fabric layer is a step toward a more flexible data environment. The move addresses the immediate needs of AI agents and offers a reference point for other organizations facing similar transitions, many of which still rely on outdated frameworks that could hinder their AI plans.

Moreover, the concept of "Know Your Agent," which D&B implemented to ensure machine accountability, reflects a growing recognition of the complexities involved in machine identity verification. This approach not only safeguards the integrity of the data being accessed but also aligns with the evolving regulatory landscape that increasingly emphasizes data accountability. As organizations consider deploying AI agents, they must prioritize robust data foundations and ensure that their systems can elegantly handle relationships that shift over time. This necessity for adaptability is becoming a core requirement across industries, suggesting that the insights gained from D&B's experience could serve as a valuable roadmap for businesses navigating the complexities of AI integration.

Looking ahead, the implications of D&B's rebuild extend far beyond their own operations. As more enterprises recognize the importance of a solid data foundation for AI, we may see a significant transformation in how data architectures are designed and implemented. The demand for dynamic relationships and entity consistency checks will likely drive innovation in data management practices, pushing organizations to rethink traditional approaches. As we continue to observe these developments, a key question emerges: how will organizations balance the need for rapid AI deployment with the foundational work required to support it effectively? The answers will shape the future of data management and the role of AI within it.

Dun & Bradstreet has spent over 180 years building a comprehensive commercial database. Its Commercial Graph, covering 642 million businesses and their relationships, corporate hierarchies and risk profiles, was designed for people: credit analysts, risk managers and sales professionals who could wait for query results and work through ambiguous entity matches. AI agents cannot do any of those things.

When D&B's customers started pushing agents into credit, procurement and supply chain workflows, the Commercial Graph that had reliably served nearly 200,000 customers globally became a problem. The systems built to serve human analysts were the wrong architecture for machines. So D&B rebuilt.

"We need to think about agents as our new consumer category, evolving from our standard credit analysts or sales and marketing professionals, et cetera, to also now catering to these customers' agents," Gary Kotovets, Chief Data and Analytics Officer at Dun & Bradstreet, told VentureBeat.

What broke when agents started querying

The Commercial Graph was not a single database. It was a collection of separate systems built for different use cases and different markets, held together by custom integrations. Human analysts navigated that fragmentation through SQL queries or pre-built interfaces. Agents could not.

The scale of the underlying data compounded the problem. The database had nearly doubled in five years, expanding from more than 300 million to more than 642 million business records, with 11,000 fields per record, according to D&B. The firm now runs approximately 100 billion data quality checks per month as records move through its systems. Querying that at the sub-second latency agents require, against a fragmented architecture, was not workable.

The relationships the graph tracked were also the wrong kind. Legacy systems recorded static connections between entities. A CEO was linked to a company. That was the line. Agents working on credit assessments or third-party risk need dynamic relationships: when that CEO leaves for a new company, which organization does their track record follow? When a subsidiary changes ownership, how does that propagate across a corporate hierarchy? Those questions required custom analyst work before. Agents cannot wait for custom analyst work.
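One way to picture the difference between a static link and a dynamic relationship is to give every edge a validity period, so a query can ask "as of when?". The sketch below is purely illustrative: the `Relationship` class, its fields and the sample names are assumptions made for this example, not D&B's schema.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class Relationship:
    """Hypothetical time-bounded edge between a person and a company."""
    person: str
    company: str
    role: str
    start: date
    end: Optional[date] = None  # None means the relationship is still active

    def active_on(self, day: date) -> bool:
        # The end date is treated as exclusive in this sketch.
        return self.start <= day and (self.end is None or day < self.end)


def companies_for(edges: List[Relationship], person: str, as_of: date) -> List[str]:
    """Companies the person was linked to on a given day."""
    return [e.company for e in edges if e.person == person and e.active_on(as_of)]


def history_for(edges: List[Relationship], person: str) -> List[Relationship]:
    """Full track record for a person, oldest relationship first."""
    return sorted((e for e in edges if e.person == person), key=lambda e: e.start)


# Invented sample data: a CEO who moves from one company to another.
edges = [
    Relationship("Jane Doe", "Acme Corp", "CEO", date(2015, 1, 1), date(2022, 6, 30)),
    Relationship("Jane Doe", "Globex Inc", "CEO", date(2022, 7, 1)),
]

print(companies_for(edges, "Jane Doe", date(2020, 1, 1)))
print([e.company for e in history_for(edges, "Jane Doe")])
```

With a static model only the latest link would exist. Here the earlier edge is closed rather than overwritten, so an agent can still follow the executive's track record across both companies.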

The broader problem is not unique to D&B. Kotovets said he has spoken with hundreds of CDOs and CIOs over the past six months and consistently heard the same constraint: they could not build what they wanted in AI because their data foundations were not standardized, normalized or agent-queryable. D&B had that foundation, built over decades to serve human analysts. It still had to rebuild for agents.

What they actually built

The rebuild started with consolidation. D&B migrated its fragmented databases to cloud infrastructure, redesigned the underlying schema and built a data fabric layer that normalizes records across markets while preserving regional compliance requirements. The result is a unified knowledge graph that tracks billions of relationships across 642 million companies, continuously updated and enriched by AI-driven data processing.

On top of that graph, D&B built a structured access layer for agents. Raw SQL access at agent query volumes and latency requirements was not the answer. Instead, D&B created a set of tools and skills available through MCP that package data with context and route agents to the right records for specific queries. A match and entity resolution engine sits behind every query, confirming that when an agent asks about a company, the answer resolves to a verified, specific entity rather than a name match.
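The distinction between a name match and a resolved entity can be shown with a small sketch. Everything here is an assumption for illustration (the `Entity` class, the `E-…` identifiers, the in-memory registry and the `resolve` function); it is not D&B's matching engine, which the article does not describe in detail. The point is only that an ambiguous name should return nothing rather than a guess.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Entity:
    """Hypothetical verified record with a stable identifier."""
    entity_id: str
    name: str
    country: str


# Invented registry: two distinct companies share the same name.
REGISTRY: List[Entity] = [
    Entity("E-001", "Acme Corp", "US"),
    Entity("E-002", "Acme Corp", "GB"),
    Entity("E-003", "Globex Inc", "US"),
]


def normalize(name: str) -> str:
    """Lowercase and collapse whitespace so trivial variants compare equal."""
    return " ".join(name.lower().split())


def resolve(name: str, country: Optional[str] = None) -> Optional[Entity]:
    """Return an entity only when the query identifies exactly one record."""
    candidates = [e for e in REGISTRY if normalize(e.name) == normalize(name)]
    if country is not None:
        candidates = [e for e in candidates if e.country == country]
    # Zero or several candidates means the query is unresolved, not "close enough".
    return candidates[0] if len(candidates) == 1 else None


print(resolve("acme  corp"))        # ambiguous name, so no entity is returned
print(resolve("Acme Corp", "GB"))   # extra context narrows it to one record
```

A human analyst can look at two "Acme Corp" rows and pick the right one. An agent needs the access layer to refuse the ambiguous case explicitly.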

D&B solved agent identity from both directions

Rebuilding the graph and adding MCP access solved the data retrieval problem. It did not solve the identity problem. Agents are not humans, and the authentication model built for human users did not extend to machines.

D&B built a new registration model for agents. Each agent must map to a verified IP address and register an individual access key, and is then treated as an authenticated identity in the same pipeline as a human user.
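A registration check of this kind might look roughly like the following. This is a minimal sketch under stated assumptions: the `AgentRegistration` record, the `verify_agent` function, the sample agent ID and the choice to store a SHA-256 hash of the key are all hypothetical, and the article does not say how D&B actually stores or compares credentials.

```python
import hashlib
import hmac
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class AgentRegistration:
    """Hypothetical registration record for one agent."""
    agent_id: str
    owner: str          # the customer the agent belongs to
    allowed_ip: str     # the verified IP address the agent must call from
    key_sha256: str     # hash of the access key, so the raw key is not stored


def hash_key(key: str) -> str:
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


# Invented sample registration (203.0.113.0/24 is a documentation address range).
REGISTERED: Dict[str, AgentRegistration] = {
    "agent-42": AgentRegistration(
        "agent-42", "Example Bank", "203.0.113.10", hash_key("demo-key")
    ),
}


def verify_agent(agent_id: str, source_ip: str, access_key: str) -> bool:
    """Accept a request only if the agent is known and both IP and key match."""
    reg = REGISTERED.get(agent_id)
    if reg is None:
        return False
    ip_ok = reg.allowed_ip == source_ip
    # compare_digest avoids leaking how many leading characters matched.
    key_ok = hmac.compare_digest(reg.key_sha256, hash_key(access_key))
    return ip_ok and key_ok


print(verify_agent("agent-42", "203.0.113.10", "demo-key"))
print(verify_agent("agent-42", "198.51.100.7", "demo-key"))
```

Because the registration carries an owner, a successful check also tells the provider which customer the agent acts for, which is what entitlement decisions would hang on.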

"We actually have a concept of Know Your Agent, similar to know your customer, that does those additional verifications," Kotovets said.

That handles the inbound problem: knowing which company an agent belongs to and what data it is entitled to query. But D&B also built for the outbound problem: what happens when a customer's own multi-agent workflow loses track of which company it is analyzing.

In a workflow that chains a credit check agent, a KYC agent and a third-party risk agent, each queries D&B at a different step. Without a mechanism to confirm they are all referencing the same entity, a workflow can complete while operating on divergent records.

"They have to come back to our verification agent to ensure that they're still talking to each other about the same entity," Kotovets said. "It's almost like a digital handshake, in a sense."

D&B's business verification agent can be embedded into any workflow as a persistent reference point and is available on Google's A2A protocol regardless of which orchestration tool a customer uses.
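The "digital handshake" idea can be illustrated with a simple consistency gate at the end of a chained workflow. The sketch assumes each agent reports the entity identifier it worked on; `StepResult`, `EntityMismatch`, `verify_same_entity` and the sample IDs are hypothetical names for this example, and a real deployment would call out to a verification service rather than compare values locally.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class StepResult:
    """Hypothetical output of one agent in a chained workflow."""
    agent: str
    entity_id: str
    finding: str


class EntityMismatch(Exception):
    """Raised when steps in one workflow refer to different entities."""


def verify_same_entity(results: List[StepResult]) -> str:
    """Return the shared entity ID, or raise if the steps diverged (or none ran)."""
    ids = {r.entity_id for r in results}
    if len(ids) != 1:
        detail = ", ".join(f"{r.agent}={r.entity_id}" for r in results)
        raise EntityMismatch(f"workflow diverged: {detail}")
    return ids.pop()


consistent = [
    StepResult("credit-check", "E-001", "score ok"),
    StepResult("kyc", "E-001", "verified"),
    StepResult("third-party-risk", "E-001", "low risk"),
]
diverged = consistent[:2] + [StepResult("third-party-risk", "E-002", "low risk")]

print(verify_same_entity(consistent))
try:
    verify_same_entity(diverged)
except EntityMismatch as err:
    print(err)
```

Without the gate, the diverged run would finish and produce a clean-looking result built on two different companies.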

Four things enterprises must get right before deploying AI agents

The rebuild exposed requirements that go beyond D&B's own stack.

Data foundations come before agent infrastructure. The CDOs and CIOs Kotovets spoke with over the past six months consistently hit the same wall: they cannot build what they want in AI until their data is clean, normalized and consolidated. D&B had that foundation already. Most enterprises do not, and they will feel it.
Design for dynamic relationships, not static ones. Enterprise data systems typically record point-in-time connections: a person belongs to a company, an asset belongs to a subsidiary. Agents working on credit, risk or supply chain decisions need to reason across relationships that shift over time. If the underlying data only captures the static line, the agent will too.
Build entity consistency checks into multi-agent workflows. When multiple agents touch the same entity at different steps, there is no guarantee they are all referencing the same record by the time the workflow completes. That gap needs to be engineered for explicitly. Entity verification is a workflow design requirement, not an optional guardrail.
Embed lineage from the start, not as an afterthought. Every agent-produced answer should carry a traceable path back to its source. In credit, risk and supply chain decisions, the cost of an error is concrete. Lineage needs to be built in before scaling, not added after problems surface.
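The lineage requirement can be sketched as values that carry their own provenance. The `Traced` class, the `trace` function, the source labels and the numbers below are invented for illustration and are not taken from D&B's system; the idea is only that a derived answer keeps references to the inputs it was computed from.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Traced:
    """Hypothetical value that remembers where it came from."""
    value: float
    source: str                              # label for a source or a calculation
    derived_from: Tuple["Traced", ...] = ()  # inputs, empty for original data


def trace(item: Traced) -> List[str]:
    """Walk from an answer back to its original sources, depth-first."""
    path = [item.source]
    for parent in item.derived_from:
        path.extend(trace(parent))
    return path


# Invented figures and source labels.
revenue = Traced(1200.0, "filing:annual-report-2023")
debt = Traced(300.0, "registry:charges-register")
ratio = Traced(debt.value / revenue.value, "calc:debt_to_revenue", (debt, revenue))

print(ratio.value)
print(trace(ratio))
```

Because lineage is part of the value itself, it travels with the answer through each agent step instead of having to be reconstructed after a problem surfaces.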

"You could always click and see where it came from, and validate it all the way back to the original source," Kotovets said. "That's been the key for us in unlocking a lot of other capabilities, because we have that level of certainty in the things that we've done."
