June 19, 2026 • 1 min read • from InfoQ

Presentation: AI Agents to Make Sense of Data at OpenAI

Our take

Bonnie Xu of OpenAI describes how Kepler, an internal AI data analyst agent, handles queries over 600+ petabytes of data. The presentation covers how the team works around context window limits using MCP, automated code crawling, and Retrieval-Augmented Generation (RAG). Xu also outlines the team’s use of scoped semantic memory for continuous self-learning and a regression-free evaluation pipeline built with AST-based LLM grading.


OpenAI’s Kepler, an internal AI data analyst agent, is more than a technical deep dive; it is a signal about where data interaction is heading. Bonnie Xu’s presentation, covered by InfoQ, describes a move beyond querying data toward agents that can *understand* and *reason* about it at very large scale. Kepler works across 600+ petabytes of data, far more than any model can hold in its context window, so the agent has to be selective about what it reads. The techniques used for that, namely the Model Context Protocol (MCP), automated code crawling, and Retrieval-Augmented Generation (RAG), are what make a large language model (LLM) powered system practical at this size. This echoes the complexities addressed in "Behind the Scenes: Block 450 JVM Repositories Into Monorepo to Reduce Dependency Drift" [/post/behind-the-scenes-block-450-jvm-repositories-into-monorepo-t-cmql8irz60727yt0pmhufm9hm], where managing interrelated components at scale is paramount, albeit in a different technological domain. The need for robust architectures and efficient management strategies is a common thread. The focus on scoped semantic memory for self-learning and on AST-based LLM grading for evaluation also points to a deliberate approach to building reliable, continuously improving AI systems, a concept explored further in "System Design for ML Interviews: 10 Real Problems Walked Through" [/post/system-design-for-ml-interviews-10-real-problems-walked-thro-cmql8ink0071dyt0p4lc3ekg3], where rigorous system design is essential for real-world ML deployments.
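To make the retrieval idea concrete, here is a minimal sketch of selecting only the most relevant catalog entries for a question so that they fit a fixed context budget. It is not Kepler's implementation, which the summary does not describe. The catalog, the table names, and the `retrieve` helper are all hypothetical, and plain token overlap stands in for the embedding search a production RAG system would more likely use.

```python
import re
from collections import Counter

# Hypothetical catalog (table name -> description); illustrative only.
CATALOG = {
    "billing.invoices": "one row per invoice with customer id, amount and currency",
    "product.sessions": "user sessions with start time, duration and device",
    "support.tickets": "customer support tickets with status and resolution time",
}

def tokens(text):
    """Lowercased word counts for a piece of text."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def score(question, description):
    """Token-overlap score; a stand-in for embedding similarity."""
    q, d = tokens(question), tokens(description)
    return sum(min(q[t], d[t]) for t in q)

def retrieve(question, catalog, k=2, budget_chars=200):
    """Return up to k best-matching entries that fit within budget_chars."""
    ranked = sorted(catalog.items(),
                    key=lambda kv: score(question, kv[1]),
                    reverse=True)
    picked, used = [], 0
    for name, desc in ranked[:k]:
        entry = f"{name}: {desc}"
        if used + len(entry) > budget_chars:
            break  # stop once the context budget would be exceeded
        picked.append(entry)
        used += len(entry)
    return picked

context = retrieve("average resolution time for support tickets", CATALOG, k=1)
print(context)
```

Only the retrieved entries would be placed in the prompt, which is the point of the technique: the model sees a small, relevant slice of the catalog rather than the whole thing.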

What’s particularly compelling about Kepler is the emphasis on practical implementation. While many organizations are exploring AI agents, OpenAI's approach addresses the core engineering hurdles that often derail such projects. AST-based LLM grading is one example: grading that presumably looks at the parsed structure of generated code rather than its raw text underpins a regression-free evaluation pipeline, which is essential for consistency and reliability while models and prompts keep changing. This signals a shift toward more disciplined development practices in the AI space, away from the initial hype and toward measurable outcomes. The modularity implied by automated code crawling and MCP suggests an adaptable architecture that can take on new data sources and functionality, a vital characteristic for long-term viability. It also fits the trend toward systems tailored to specific tasks rather than relying solely on general-purpose models. The challenges of integrating disparate LLM providers, as addressed in "Project Tutorial: Build a Multi-Provider LLM Gateway" [/post/project-tutorial-build-a-multi-provider-llm-gateway-cmql8i9x0071dyt0prqr5bf0v], highlight the need for flexible and interoperable architectures, and Kepler's design suggests a similar consideration.
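As a rough illustration of why comparing syntax trees is more stable than comparing text, the sketch below checks whether two snippets parse to the same tree. The summary does not say which language Kepler grades or how the LLM grader is wired in, so this uses Python's standard `ast` module on Python snippets purely as a stand-in, and `same_structure` is a hypothetical helper name.

```python
import ast

def same_structure(candidate: str, reference: str) -> bool:
    """True when both snippets parse to the same AST.

    Whitespace and comments do not appear in the tree, so purely
    cosmetic differences are ignored.
    """
    return ast.dump(ast.parse(candidate)) == ast.dump(ast.parse(reference))

reference = "total = sum(x * 2 for x in rows)"
reformatted = "total = sum( x*2   for x in rows )  # same logic"
changed = "total = sum(x * 3 for x in rows)"

print(same_structure(reformatted, reference))  # formatting-only difference
print(same_structure(changed, reference))      # a constant was changed
```

In a grading pipeline, a structural match like this could settle the easy cases deterministically, leaving only real differences for an LLM to judge. That division of labor is an assumption here, not something the summary confirms.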

The broader significance of Kepler’s development lies in its potential to democratize access to complex data analysis. Currently, extracting meaningful insights from massive datasets often requires specialized expertise and significant time investment. AI data analyst agents like Kepler could automate many of these tasks, letting a wider range of users work with their data. This isn't about replacing data scientists; it's about augmenting their capabilities and freeing them from repetitive tasks so they can focus on more strategic and creative problem-solving. Think of it as shifting from manually sifting through mountains of data to having an assistant that proactively identifies patterns and anomalies. The implications for fields like scientific research, financial analysis, and healthcare could be significant, potentially accelerating discovery across many domains.

Looking ahead, the key question will be how effectively these techniques can be adapted and scaled for broader application. While OpenAI’s internal deployment benefits from significant computational resources and engineering expertise, replicating this success in other organizations will require careful consideration of cost, infrastructure, and talent. The challenges of ensuring data security and privacy will also become increasingly critical as AI agents gain access to sensitive information. Ultimately, OpenAI’s Kepler provides a glimpse into a future where data analysis is less a specialized skill and more an integral part of everyday workflows, and the innovations demonstrated pave the way for a new generation of AI-powered data tools. How will organizations adapt their data governance and security protocols to accommodate these increasingly intelligent agents, and what new ethical considerations will arise as AI takes on more responsibilities in data analysis?

OpenAI’s Bonnie Xu discusses Kepler, an internal AI data analyst agent built to query 600+ petabytes of data. She explains how they overcome context window limits using MCP, automated code crawling, and RAG. Xu also shares how their team leverages scoped semantic memory for self-learning and utilizes AST-based LLM grading to build a robust, regression-free evaluation pipeline.

By Bonnie Xu

