June 24, 2026•9 min read•from VentureBeat

OpenAI unveils first custom AI inference chip, Jalapeño, with Broadcom, and its development was sped up with OpenAI's own models

Our take

OpenAI and Broadcom have jointly unveiled Jalapeño, a custom AI inference chip designed to accelerate large language model (LLM) workloads. This purpose-built processor, developed in a remarkably swift nine-month timeframe, aims to reduce inference costs by approximately 50% and will initially support ChatGPT, Codex, and future agentic products. Jalapeño represents a strategic expansion for OpenAI, enabling greater efficiency and broader access to advanced AI—a move that mirrors similar initiatives by tech giants like Google and Amazon.

The unveiling of OpenAI’s Jalapeño chip, co-developed with Broadcom, marks a significant shift in the AI landscape, away from reliance on general-purpose GPUs and toward specialized hardware optimized for large language model inference. This isn’t just about incremental performance gains; it’s about fundamentally reshaping the cost structure of deploying and scaling AI. The need for specialized, efficient hardware is becoming increasingly evident as AI moves beyond simple conversational interfaces to more demanding workloads, a theme Intuit is set to address at VB Transform 2026 when it shows how it rebuilt its AI infrastructure to support fast and complex tasks. The reported 50% reduction in inference costs is significant, especially given OpenAI’s disclosed financial position, including substantial R&D expenditures and reliance on Microsoft’s compute infrastructure. Amazon, likewise, will present its framework for engineering trustworthy AI agents at VB Transform 2026, underscoring the growing importance of both performance *and* reliability as AI systems take on more critical responsibilities.

The move is a direct response to OpenAI's own substantial operational expenses, driven primarily by compute costs – a problem echoed by many organizations scaling AI. The audited financial documents revealing a near $21 billion operating loss in 2025 serve as a stark reminder of the financial pressures inherent in cutting-edge AI development. Jalapeño represents a strategic attempt to address this head-on, moving OpenAI closer to a vertically integrated model similar to those employed by tech giants like Google, Amazon, and Microsoft. By designing its own chips, OpenAI gains greater control over its infrastructure, potentially unlocking significant cost savings and improving performance beyond what’s achievable with off-the-shelf components. This isn’t to say that OpenAI is abandoning its existing partnerships with Nvidia, AMD, Cerebras, and AWS; rather, it's diversifying its compute landscape and mitigating reliance on a single vendor while simultaneously building a competitive advantage.

The broader implications of Jalapeño extend beyond OpenAI’s own bottom line. It signals the beginning of a true silicon arms race within the AI ecosystem, with major players vying to control every layer of the stack. The emergence of custom chips from Alibaba, Huawei, and even ByteDance underscores the growing recognition that specialized hardware is essential for achieving both performance and cost efficiency at scale. This trend will likely accelerate the development of new chip architectures and manufacturing processes, further driving innovation in the semiconductor industry. The race isn’t just about computational power; it's about optimizing for specific AI workloads, reducing energy consumption, and ultimately democratizing access to advanced AI capabilities. The focus on performance per watt, as highlighted by OpenAI’s president Greg Brockman, signifies a move towards more sustainable and environmentally responsible AI deployments.

Looking ahead, the key question is whether OpenAI can successfully scale Jalapeño production and integrate it seamlessly into its existing infrastructure. The nine-month development timeline, facilitated by OpenAI’s own models, is impressive, but the transition to gigawatt-scale data centers will present significant engineering and logistical challenges. Moreover, the long-term performance of Jalapeño relative to Nvidia’s and AMD’s GPUs remains to be seen. While initial reports suggest outstanding performance, sustained real-world results will be crucial for validating OpenAI’s strategic bet on custom hardware. The true test will be whether Jalapeño can not only reduce costs but also enable entirely new AI applications and capabilities that were previously unattainable.

OpenAI and Broadcom this morning unveiled their first custom AI accelerator chip, named "Jalapeño," positioning it as a purpose-built processor for large language model (LLM) inference, rather than a more general-purpose GPU of the kind offered by Nvidia or AMD.

According to its creators, Jalapeño is designed to support workloads behind ChatGPT, Codex, the API and future agentic products, though notably, both OpenAI's and Broadcom's news releases position it as a product that could be made available to external AI firms as well — "built from the ground up for current and future LLMs across the industry." [Emphasis mine.]

It reportedly cuts inference costs by about 50%, according to Bloomberg. Recall that inference is the stage at which a finished AI model is served to end users; training, research and development carry their own high costs.
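As a rough illustration of what a cut of that size implies, the sketch below works through the arithmetic: if the per-unit cost of inference falls by a given fraction, a fixed budget buys proportionally more inference. The function name is hypothetical and chosen only for this example; the only figure taken from the reporting is the roughly 50% reduction.

```python
def volume_multiplier(cost_reduction: float) -> float:
    """How much more inference a fixed budget buys after a per-unit cost cut.

    cost_reduction is a fraction, e.g. 0.5 for a 50% reduction.
    """
    if not 0 <= cost_reduction < 1:
        raise ValueError("cost_reduction must be in [0, 1)")
    return 1.0 / (1.0 - cost_reduction)


# "About 50%", per the Bloomberg figure cited in the article.
REPORTED_REDUCTION = 0.50

multiplier = volume_multiplier(REPORTED_REDUCTION)
print(f"Same budget serves {multiplier:.1f}x the inference volume")
# prints: Same budget serves 2.0x the inference volume
```

Put differently, halving the unit cost either halves the bill for today's traffic or doubles the traffic served for today's bill. It says nothing about training costs, which are separate.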

Jalapeño's engineering timeline set a blistering pace for the semiconductor industry, moving from early schematics to fabrication readiness within a brief nine-month window, when new processor development cycles are typically measured in years. Indeed, the OpenAI and Broadcom partnership itself was only publicly announced in October 2025.

The companies attributed this speed to a deep software-hardware co-development process that actively used OpenAI’s own models to accelerate parts of the chip design. Sources close to the firms told VentureBeat the development process relied on prior generation OpenAI models, though an OpenAI spokesperson declined to specify exactly which when asked by VentureBeat.

After receiving an early physical model on Wednesday, OpenAI outlined plans to begin rolling out these processors across active data centers by the end of this year. OpenAI says it has already begun running at least one of its prior generation models, GPT‑5.3‑Codex‑Spark, on the chips under a production workload, though in a test environment.

The release marks a major strategic expansion for the ChatGPT creator as it attempts to build the full computational stack required to make advanced AI faster, more reliable, and more accessible.

There remain, of course, many outstanding questions, including how the new Jalapeño chip performs compared to direct competitors, what it costs, and whether it can be manufactured at scale. Sources close to the company said the initial performance itself was, ironically, "outstanding."

Greg Brockman, OpenAI's president and co-founder, appeared on CNBC alongside Broadcom CEO Hock Tan this morning to discuss the news, and Brockman noted in the interview that "this is a real performance improvement...on performance per watt and performance per dollar." In a separate post on X, Brockman wrote that "Perf[ormance] per watt looking incredible."
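For readers unfamiliar with the two metrics Brockman mentions, here is a minimal sketch of how they are typically computed for an inference accelerator. Everything in it is an assumption made for illustration: the `Accelerator` class, its field names, and the numbers for "chip_a" and "chip_b" are hypothetical placeholders, not measurements of Jalapeño or of any real GPU. No such figures have been published.

```python
from dataclasses import dataclass


@dataclass
class Accelerator:
    name: str
    tokens_per_second: float   # sustained inference throughput
    power_watts: float         # power drawn at that throughput
    cost_per_hour_usd: float   # all-in hourly cost of running the chip

    def perf_per_watt(self) -> float:
        """Tokens generated per second for each watt consumed."""
        return self.tokens_per_second / self.power_watts

    def perf_per_dollar(self) -> float:
        """Tokens generated for each dollar spent."""
        tokens_per_hour = self.tokens_per_second * 3600
        return tokens_per_hour / self.cost_per_hour_usd


# HYPOTHETICAL placeholder values, not data for any real chip.
chip_a = Accelerator("chip_a", tokens_per_second=1000.0,
                     power_watts=1000.0, cost_per_hour_usd=4.0)
chip_b = Accelerator("chip_b", tokens_per_second=1000.0,
                     power_watts=500.0, cost_per_hour_usd=2.0)

for chip in (chip_a, chip_b):
    print(f"{chip.name}: {chip.perf_per_watt():.2f} tokens/s per watt, "
          f"{chip.perf_per_dollar():,.0f} tokens per dollar")
```

The point of the two ratios is that a chip can win without being faster in absolute terms: in this made-up example both chips have the same throughput, but the second does the same work on half the power and half the hourly cost.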

Why OpenAI Built an ASIC

To understand why OpenAI is moving into chip design, it helps to look at the architecture. Jalapeño is an Application-Specific Integrated Circuit, or ASIC.

Unlike a GPU, which can handle many types of workloads, an ASIC is tuned for narrower uses, as industry experts note. That narrower focus can make it cheaper and more efficient for specific AI tasks, though less adaptable than Nvidia-style GPUs.

In Jalapeño’s case, OpenAI is starting from a clean design focused on modern LLM serving, instead of adapting a broader accelerator to fit its needs. The company says the architecture is shaped by its experience running large-scale AI products and is meant to reduce unnecessary data movement while better matching compute, memory and networking resources.

Broadcom is contributing core silicon implementation and networking technology, including Tomahawk networking silicon, while Celestica is helping with board, rack and system integration. The goal is to move the chip closer to its practical performance ceiling in real workloads, not just improve theoretical benchmarks.

However, OpenAI's pivot into proprietary hardware is not just a quest for technical supremacy: it may also make its core unit economics far more sustainable.

Audited financial documents posted recently by AI critic and AI public relations specialist Ed Zitron revealed that while OpenAI generated an impressive $13.07 billion in revenue throughout 2025, its total operational expenses for the year ballooned to $34 billion, resulting in an operating loss of nearly $20.92 billion.

The primary culprit behind this cash hemorrhage involved pure compute requirements, though more is likely due to training than inference.

In 2025 alone, research and development costs—driven largely by the infrastructure required to train and serve massive language models—accounted for $19.18 billion, or approximately 56 percent of the company's entire spending footprint. Furthermore, OpenAI reportedly paid Microsoft over $10.59 billion just for R&D and compute infrastructure last year.
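The figures quoted above can be sanity-checked with a few lines of arithmetic. The sketch below uses only the rounded numbers given in the article; the variable names are chosen for this example. The small gap between the computed loss and the reported $20.92 billion is presumably because the $34 billion expense figure is itself rounded, which is an assumption here rather than something the documents state.

```python
# All values in billions of US dollars, as quoted in the article.
revenue_bn = 13.07
operating_expenses_bn = 34.0   # reported as "$34 billion" (rounded)
rd_costs_bn = 19.18

# Operating loss is expenses minus revenue.
operating_loss_bn = operating_expenses_bn - revenue_bn

# Share of total spending attributed to research and development.
rd_share = rd_costs_bn / operating_expenses_bn

print(f"Operating loss: ${operating_loss_bn:.2f}B")  # prints: Operating loss: $20.93B
print(f"R&D share of spending: {rd_share:.1%}")      # prints: R&D share of spending: 56.4%
```

Both results line up with the article's "nearly $20.92 billion" and "approximately 56 percent" to within rounding.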

Still, as OpenAI lays the groundwork for a heavily anticipated public offering in 2026, the Jalapeño inference chip may offer some reassurance to private investors and public markets that OpenAI has a plan for digging itself out of the financial hole and moving toward profitability. If it can drive down the costs of AI inference, then maybe it can recoup some of the losses spent on costly training runs.

"By designing more of the stack ourselves, we can serve more intelligence with greater efficiency and keep pushing advanced AI toward broader access," said Brockman included in Broadcom's release.

What Does This Mean for Nvidia and All of OpenAI's Other Chip Providers?

The introduction of Jalapeño immediately raises questions about OpenAI's strategic positioning within the fiercely competitive semiconductor and GPU market.

Since kicking off the generative AI boom in late 2022, OpenAI has remained one of the largest customers of GPU market leader Nvidia's premium products, but has also taken billions in investment dollars from the firm (engendering accusations of "circular dealing"), and expanded to work with other rival chipmakers to fuel its appetites.

Nvidia: In February 2026, Nvidia finalized a $30 billion direct investment into OpenAI as part of a massive $110 billion funding round. This deal secured an agreement to deploy 10 gigawatts of computing systems, including 3 gigawatts of dedicated inference capacity and 2 gigawatts of training capacity, utilizing Nvidia's next-generation Vera Rubin platform. Sources close to the companies tell VentureBeat Nvidia will remain central to OpenAI, particularly on the model training and development side.
Amazon Web Services (AWS): As part of the same February 2026 funding round, Amazon invested $50 billion into OpenAI. This deal included a commitment for OpenAI to consume approximately two gigawatts of AWS's proprietary Trainium computing capacity over the next eight years.
Advanced Micro Devices (AMD): OpenAI signed agreements with Nvidia's chief hardware rival, AMD, to use AMD Instinct™ MI450 Series GPUs.
Cerebras: The company also struck a pact with Cerebras, an AI chipmaker that executed its initial public offering in May 2026.

Sources with knowledge of these deals said they currently remain in place, unaltered.

The Global Silicon Arms Race: OpenAI Joins AI Infrastructure Heavyweights

Before the introduction of Jalapeño, OpenAI operated at a distinct structural disadvantage compared to the world's vertically integrated technology empires.

Tech giants like Google and Amazon have for years used their own mature custom silicon programs (Google's Tensor Processing Units, or TPUs, and Amazon's Trainium lines) to serve massive computational workloads at drastically lower cost.

Microsoft, OpenAI's primary cloud provider and single biggest financial backer, aggressively entered the bespoke silicon market by launching the Azure Maia 100 accelerator in late 2023.

Microsoft subsequently escalated this effort in January 2026 by introducing the Maia 200, an inference powerhouse built on TSMC's 3-nanometer process that already actively powers OpenAI's GPT-5.2 models within Azure data centers.

Similarly, Meta has aggressively expanded its Meta Training and Inference Accelerator (MTIA) portfolio in recent years, debuting the MTIA 300, 400, 450, and 500 series to power its recommendation engines and generative artificial intelligence features without relying solely on Nvidia.

Jalapeño provides OpenAI with the opportunity to match and offset the hyperscaler advantage. By baking its software architecture directly into a proprietary processor, OpenAI has the chance to replicate, at least in part, the playbook used by Google, Amazon, Microsoft, and Meta — transitioning from a captive cloud customer into a more independent AI infrastructure provider.

The timing is ripe amid a rapidly escalating global silicon arms race. Driven in part by United States export restrictions, Chinese tech heavyweights are pursuing more of their own custom AI chip hardware, too:

In May, Alibaba's semiconductor division, T-Head, unveiled the Zhenwu M890, a proprietary processor expressly engineered for autonomous AI agents that require massive memory bandwidth and long-running context windows.
Huawei is reportedly gearing up to release its new Ascend 950DT chip next month.
ByteDance, the corporate parent of TikTok, reportedly entered active negotiations with Qualcomm in June 2026 to design custom application-specific integrated circuits for its data centers to escape third-party dependency.

By successfully finalizing the Jalapeño design, OpenAI is seeking to move beyond the traditional confines of a software laboratory and stand shoulder-to-shoulder with international cloud and infrastructure titans.

The Gigawatt Future

This sprawling web of vendor agreements highlights the sheer scale of OpenAI's infrastructural ambitions. The ultimate goal of the OpenAI and Broadcom partnership involves deploying gigawatt-scale data centers with Microsoft and other partners beginning in 2026 — that is, data centers with compute requiring energy on the order of cities.
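To give a sense of what "gigawatt-scale" means in energy terms, the following sketch converts a continuous power draw into annual energy use. The function name and the `utilization` parameter are assumptions made for this example; the 10-gigawatt input is the Nvidia deployment figure quoted earlier in the article, and running it at full load all year is a simplification, not a claim about how such sites would actually operate.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours, ignoring leap years


def annual_energy_twh(power_gw: float, utilization: float = 1.0) -> float:
    """Energy in terawatt-hours consumed in a year at a given average load.

    power_gw is the facility's power draw in gigawatts; utilization is the
    fraction of that draw sustained on average (1.0 = full load all year).
    """
    gwh = power_gw * utilization * HOURS_PER_YEAR
    return gwh / 1000.0  # 1 TWh = 1,000 GWh


print(f"1 GW for a year:  {annual_energy_twh(1.0):.2f} TWh")   # prints 8.76 TWh
print(f"10 GW for a year: {annual_energy_twh(10.0):.1f} TWh")  # prints 87.6 TWh
```

A single gigawatt sustained for a year therefore works out to about 8.76 terawatt-hours, which is why comparisons to the electricity demand of cities come up.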

For Broadcom, the partnership acts as a massive reputational catalyst. The company has been among the biggest beneficiaries of the generative AI boom, helping hyperscalers and frontier labs engineer custom silicon.

Broadcom shares reflect this momentum, demonstrating an 18% year-over-year increase in the first part of 2026 and a nearly 7X boost since the end of 2022, according to CNBC.

Ultimately, Jalapeño confirms that OpenAI believes it is ready to move beyond software and code into the realm of real-world, custom hardware.

By controlling the physics of its inference pipeline, while simultaneously leveraging the capital and hardware of Nvidia, Amazon, AMD, and Cerebras, OpenAI is attempting to rapidly rewrite the future unit economics of its AI.
