OpenAI Unveils Jalapeño, Its First Custom AI Chip Built With Broadcom

OpenAI revealed Jalapeño, a custom inference chip co-designed with Broadcom in a roughly nine-month cycle, aimed at cheaper, more efficient AI serving.

Sam Carter · Jun 25, 2026 · 8 min read

Cover image. Photo: Daveography.ca / flickr (BY-NC-SA 2.0)

On June 24, 2026, OpenAI and Broadcom unveiled Jalapeño, OpenAI's first custom AI chip, built for one job: running large language models more cheaply and efficiently than the general-purpose Nvidia GPUs the whole industry has leaned on. It is OpenAI's clearest move yet to control its own destiny on hardware, not just models.

Quick answer

Jalapeño is OpenAI's first in-house chip, co-developed with Broadcom and announced June 24, 2026. It is an ASIC (application-specific integrated circuit) built for inference, the work of running already-trained models, rather than training them. The companies say it went from design to tape-out in roughly nine months, an unusually fast cycle that OpenAI's own models reportedly helped accelerate. It targets better performance per watt than current top accelerators, with initial deployment planned for late 2026. It reduces OpenAI's dependence on Nvidia but does not replace GPUs.

Key takeaways

Jalapeño is OpenAI's first in-house chip, co-developed with Broadcom.
It is an ASIC built for inference, the work of running already-trained models.
The companies say it went from design to tape-out in about nine months, an unusually fast cycle.
It targets better performance per watt than current state-of-the-art accelerators.
Initial deployment is planned for late 2026, expanding in following years.

What happened

OpenAI and Broadcom jointly announced Jalapeño, describing it as an LLM-optimized inference processor. Unlike a graphics processing unit, which is flexible enough for many workloads, Jalapeño is an ASIC, an application-specific integrated circuit, purpose-built for one job: serving the kinds of large models OpenAI runs in production. That specialization trades flexibility for efficiency and cost.

The companies highlighted the development speed. According to the announcement, Jalapeño moved from initial design to manufacturing tape-out in roughly nine months, which Broadcom characterized as among the fastest cycles for a chip of this complexity. OpenAI said its own models helped accelerate parts of the design work. Engineering samples are reportedly already running machine learning workloads in the lab at target frequency and power.

A semiconductor wafer with a grid of chip dies reflecting colored light — Photo: jurvetson / flickr (BY 2.0)

Note

Training builds a model; inference runs it. Most of the day-to-day cost of operating an AI service comes from inference, since every user request consumes compute. Chips tuned for inference can lower that recurring cost substantially.
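
To make the recurring-cost point concrete, here is a minimal sketch of how energy per request turns into a daily electricity bill. Every name and number in it (the function daily_inference_energy_cost, the request volume, the joules per request, the electricity price) is a hypothetical placeholder chosen for illustration, not a figure from OpenAI or Broadcom.

```python
JOULES_PER_KWH = 3_600_000  # 1 kWh = 3.6 million joules


def daily_inference_energy_cost(requests_per_day: int,
                                joules_per_request: float,
                                usd_per_kwh: float) -> float:
    """Return the daily electricity cost in USD of serving the given traffic."""
    kwh_per_day = requests_per_day * joules_per_request / JOULES_PER_KWH
    return kwh_per_day * usd_per_kwh


# Hypothetical inputs: 1 billion requests/day, 1,000 J each, $0.10 per kWh.
baseline = daily_inference_energy_cost(1_000_000_000, 1000.0, 0.10)

# Same traffic on hypothetical hardware that needs 30% less energy per request.
improved = daily_inference_energy_cost(1_000_000_000, 700.0, 0.10)

print(f"baseline: ${baseline:,.0f}/day, improved: ${improved:,.0f}/day")
```

With these made-up inputs the baseline comes to roughly $27,800 a day in electricity and the improved case to roughly $19,400. The sketch counts energy only; hardware, cooling, and networking costs would sit on top of it.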

GPU vs custom inference ASIC

The core of the story is a trade-off between flexibility and efficiency. Here is how a general-purpose GPU stacks up against a purpose-built inference chip like Jalapeño.

	Nvidia GPU (e.g. H200, B200)	Jalapeño (inference ASIC)
Workloads	Training and inference, many tasks	Inference of LLMs only
Flexibility	High	Low, fixed function
Efficiency for its job	Good	Potentially much better per watt
Supply	Constrained, expensive	Controlled in-house
Software maturity	Years of CUDA ecosystem	New stack, must be built
Strategic value	Depend on one supplier	Reduce dependence, tune to own models

The catch is in the software maturity row. A GPU benefits from a software ecosystem built up over years around CUDA, while a brand-new chip needs its software stack built and proven before it carries real traffic.

Why it matters

For years, AI labs have depended heavily on Nvidia GPUs, which are powerful but expensive and often supply-constrained. Designing a custom inference chip is a way to reduce that dependence, control costs, and tune hardware to a company's exact serving patterns. OpenAI joins a list of firms building their own silicon, alongside Google, Apple, Amazon, and others.

The move also reflects how acute the compute shortage has become. Demand for AI serving capacity is outpacing supply across the industry, a strain visible in Google rationing Gemini access to Meta and in the broader scramble that has Amazon selling its Trainium chips directly to challenge Nvidia. A cheaper, more efficient chip would ease that pressure if it performs as promised. The efficiency angle ties into a broader shift toward squeezing more output from every unit of compute, which we explore in "Why companies are suddenly counting every AI token."

The details

A few specifics worth noting:

Jalapeño is positioned as the first step in a multi-generation compute platform, not a one-off.
It is built around the kernels, memory movement, and networking patterns that matter most for frontier models, per OpenAI's description.
Broadcom handles the chip design and manufacturing partnership; OpenAI supplies the workload expertise.
Initial deployment is targeted for the end of 2026, with capacity expanding in later years.

Custom silicon is a long game. Designing a chip is only part of the challenge; manufacturing at scale, building the software stack, and integrating it into data centers all take time and carry risk. Performance claims made at announcement are based on early samples and should be confirmed by real-world deployment.

It is worth putting OpenAI's move in context, because it is following a well-worn path rather than blazing a new one. Google has run its own Tensor Processing Units (TPUs) for years, Amazon designed Trainium and Inferentia for training and inference, Microsoft built Maia, and Meta has its MTIA accelerators. The logic is identical across all of them: when you operate models at enormous scale, the recurring cost of inference dwarfs almost everything else, and shaving even a modest percentage off the energy and silicon bill per query compounds into billions of dollars. A chip tuned to your exact workloads, rather than a general-purpose GPU that must serve everyone, is the obvious lever to pull.

What is notable about Jalapeño is the speed: a roughly nine-month design-to-tape-out cycle is aggressive for a chip of this complexity, and OpenAI's claim that its own models helped accelerate the design work hints at a feedback loop where AI helps build the hardware that runs AI.

Warning

Vendor benchmarks at launch describe lab conditions and target specs. Real efficiency depends on production yields, software maturity, and how the chip performs under live traffic. Treat early performance-per-watt claims as goals, not guarantees.
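
As a rough illustration of why a launch figure and a production figure can differ, the sketch below computes performance per watt as tokens per joule and compares a chip against a baseline. The Accelerator class, the relative_efficiency helper, and all throughput and power numbers are hypothetical assumptions for the example; none come from the announcement.

```python
from dataclasses import dataclass


@dataclass
class Accelerator:
    name: str
    tokens_per_second: float  # sustained throughput
    watts: float              # power draw at that throughput

    def tokens_per_joule(self) -> float:
        # A watt is one joule per second, so (tokens/s) / W gives tokens per joule.
        return self.tokens_per_second / self.watts


def relative_efficiency(candidate: Accelerator, baseline: Accelerator) -> float:
    """Return how many times more tokens per joule the candidate delivers."""
    return candidate.tokens_per_joule() / baseline.tokens_per_joule()


# Hypothetical numbers only.
gpu = Accelerator("hypothetical GPU", tokens_per_second=10_000, watts=1_000)
asic_lab = Accelerator("hypothetical ASIC (lab)", tokens_per_second=12_000, watts=600)
asic_prod = Accelerator("hypothetical ASIC (production)", tokens_per_second=9_000, watts=600)

for chip in (asic_lab, asic_prod):
    print(f"{chip.name}: {relative_efficiency(chip, gpu):.1f}x baseline")
```

With these inputs the lab sample comes out at 2.0x the baseline and the production case at 1.5x. The gap between the two is invented, but it shows the shape of the risk: a chip can still win on efficiency while falling short of its launch number once throughput under live traffic is lower than in the lab.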

What is next

Things to watch:

First production deployment. Whether Jalapeño ships into OpenAI's data centers on schedule by late 2026.
Cost impact. If custom silicon meaningfully lowers inference costs, it could change pricing for AI services.
Software readiness. A new chip needs a mature software stack to be useful; watch how quickly OpenAI's workloads move over.
Nvidia's position. Custom chips erode, but do not replace, Nvidia's dominance; the balance is worth tracking.

Frequently asked questions

What is Jalapeño?

It is OpenAI's first custom AI chip, an inference-focused ASIC co-developed with Broadcom and announced on June 24, 2026.

How is it different from a GPU?

A GPU is flexible and handles many tasks; an ASIC like Jalapeño is built for one specific job. That makes it less versatile but potentially cheaper and more efficient for running large language models.

When will it be deployed?

OpenAI is targeting initial deployment by the end of 2026, with capacity expanding in subsequent years.

Does this mean OpenAI is dropping Nvidia?

No. Custom chips reduce dependence on a single supplier but do not eliminate GPU use. Training, experimentation, and any workload the ASIC is not tuned for still run on GPUs. OpenAI is adding an option, not abandoning Nvidia.

Why work with Broadcom instead of building it alone?

Designing and manufacturing a leading-edge chip requires deep expertise in physical design, packaging, and fab relationships that OpenAI does not have in-house. Broadcom is an established custom-silicon partner (it has helped Google build TPUs), so OpenAI supplies the workload knowledge while Broadcom handles the chip engineering and manufacturing partnership.

Why is it called Jalapeño?

It is simply OpenAI's internal codename for the chip. Tech companies routinely give projects playful codenames before, and sometimes after, they ship.

Jalapeño marks OpenAI's entry into custom hardware, a strategic step toward controlling the full stack from model to silicon as compute demand keeps climbing.
