1 min read, from Towards Data Science

Context Engineering for RAG : The Four Typed Inputs Behind Every RAG Answer

Our take

Context engineering for Retrieval-Augmented Generation (RAG) is becoming an important practice in enterprise AI. Tobi Lütke and Andrej Karpathy named the practice in 2025, and the article "Context Engineering for RAG" applies it to RAG: for a single document, each component (a "brick") emits typed pieces, organized as four typed inputs that converge on a single Large Language Model (LLM) call. Extensions to the corpus, the conversation, and tools are described as follow-up work rather than as part of the four inputs.

The emergence of context engineering, a practice Tobi Lütke and Andrej Karpathy named in 2025, marks an evolution in how Retrieval-Augmented Generation (RAG) is approached in enterprise settings. The core idea in "Context Engineering for RAG" is to structure input data into “typed pieces” emitted by individual components that converge on a single LLM call. This moves beyond simply feeding documents to a large language model and towards a more deliberate, engineered approach to information delivery. It is especially relevant given the increasing need for AI agents to operate effectively, as highlighted in "AI agents need context everywhere they run, even where the cloud can't follow," which underscores the rising importance of context as a competitive differentiator in enterprise AI. The article frames corpus, conversation, and tool extensions as “follow-up work,” which suggests a roadmap for expanding this architecture beyond the single-document case and a continued focus on the provenance and utility of the information presented to the LLM. This isn't merely about retrieval; it’s about *curation* and structured delivery of relevant information, which matters for reliable and actionable answers.

The shift to typed inputs is a departure from the more ad-hoc methods often used in early RAG implementations. Simply passing a whole document to an LLM and hoping it extracts the right information is unreliable and prone to hallucinations. Context engineering, by contrast, implies a framework that defines the *type* of information each component provides, whether it’s a summary, a key entity, a sentiment score, or a specific fact. This structure gives greater control over the information flow and can support more sophisticated reasoning by the LLM. Consider the implications for video content, as explored in "Google's Gemini Omni Flash hits the API, turning enterprise video production into a conversation": engineering context from video transcripts, by identifying key moments and actions, could make AI considerably more useful in video analysis and editing workflows. The precision of typed inputs also helps to mitigate the risks associated with LLM inaccuracies, which supports a more robust and trustworthy AI experience.
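To make the idea of a typed piece concrete, here is a minimal Python sketch. It is not taken from the original article: the `TypedPiece` class, the four kind names (borrowed from the examples in the paragraph above), and the `group_by_kind` helper are all hypothetical illustrations, and the article's actual four typed inputs may be defined differently.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical piece kinds, borrowed from the examples in the text above.
# The original article may define its typed inputs differently.
ALLOWED_KINDS = {"summary", "entity", "sentiment", "fact"}


@dataclass(frozen=True)
class TypedPiece:
    """One unit of context with an explicit type and provenance."""
    kind: str     # what sort of information this is
    content: str  # the text handed to the LLM
    source: str   # which component produced it (provenance)

    def __post_init__(self) -> None:
        # Reject pieces whose type is not part of the agreed vocabulary.
        if self.kind not in ALLOWED_KINDS:
            raise ValueError(f"unknown piece kind: {self.kind!r}")
        if not self.content.strip():
            raise ValueError("piece content must not be empty")


def group_by_kind(pieces: List[TypedPiece]) -> Dict[str, List[TypedPiece]]:
    """Group pieces by kind, preserving their original order."""
    grouped: Dict[str, List[TypedPiece]] = {}
    for piece in pieces:
        grouped.setdefault(piece.kind, []).append(piece)
    return grouped
```

Validating the kind at construction time is one possible design choice: a mistyped or empty piece fails early, before it can reach the prompt.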

The conceptual model of "bricks" emitting typed pieces points to a modular and potentially scalable architecture. This aligns with the growing trend toward composable AI, where specialized components are integrated to build more complex systems. The emphasis on a single LLM call per document suggests an optimization for efficiency and latency, which is essential for real-time applications. While the article doesn’t delve into the specifics of implementation, the foundational concept of typed data streams offers a useful framework for building more reliable and performant RAG systems. Google’s work on Nano Banana 2 Lite, or Gemini 3.1 Flash-Lite, which is focused on fast image generation, is a related development: it reflects the broader industry drive toward AI that is both powerful and resource-efficient, and that may come to rely more on structured inputs.
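The brick model described above can be sketched as follows. This is an illustrative assumption, not the article's implementation: `Piece`, `summary_brick`, `fact_brick`, `assemble_prompt`, and `answer` are hypothetical names, the bricks use trivial placeholder logic, and `llm` stands for any callable that takes a prompt string and returns text.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Piece:
    kind: str     # hypothetical type label, e.g. "summary" or "fact"
    content: str


# A brick is any function that reads one document and emits typed pieces.
Brick = Callable[[str], List[Piece]]


def summary_brick(document: str) -> List[Piece]:
    # Placeholder logic: the first sentence stands in for a real summary.
    first = document.split(".")[0].strip()
    return [Piece("summary", first)]


def fact_brick(document: str) -> List[Piece]:
    # Placeholder logic: sentences containing a digit are treated as facts.
    sentences = [s.strip() for s in document.split(".") if s.strip()]
    return [Piece("fact", s) for s in sentences if any(c.isdigit() for c in s)]


def assemble_prompt(question: str, document: str, bricks: List[Brick]) -> str:
    # Every brick runs on the same document; their pieces converge here.
    pieces = [p for brick in bricks for p in brick(document)]
    lines = [f"[{p.kind}] {p.content}" for p in pieces]
    return "\n".join(lines + [f"[question] {question}"])


def answer(question: str, document: str, bricks: List[Brick],
           llm: Callable[[str], str]) -> str:
    # Exactly one LLM call per document, made after all pieces are assembled.
    return llm(assemble_prompt(question, document, bricks))
```

Because each brick shares the same signature, a new one can be added to the list without touching the assembly step, which is the modularity the brick metaphor suggests.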

Looking ahead, the success of Context Engineering for RAG will hinge on the development of standardized typing systems and tools to facilitate their implementation. The ability to easily define and manage these typed pieces will be critical for widespread adoption. A key question to watch is how this architectural approach will adapt to increasingly complex data sources, particularly those involving multiple modalities and real-time streams. Will the "brick" model evolve to accommodate dynamic, continuously updating contexts? The current framing suggests a focus on single documents; extending this concept to manage ongoing conversations and integrate data from disparate sources will be essential for unlocking the full potential of context-aware AI.

Enterprise Document Intelligence [Vol.1 #7bis] - Tobi Lütke and Andrej Karpathy named the practice in 2025. For a single document, each brick emits typed pieces that converge on one LLM call. Corpus, conversation, and tool extensions are follow-up work.

The post Context Engineering for RAG : The Four Typed Inputs Behind Every RAG Answer appeared first on Towards Data Science.

