8 min read, from VentureBeat

OpenAI unveils GPT-5.6 Sol, Terra and Luna models — but only accessible to limited preview partners for now, per US Gov

Our take

OpenAI today initiates a limited preview of its next-generation GPT-5.6 model series (Sol, Terra, and Luna), designed to transform developer and enterprise workflows. Following coordination with the U.S. government, access is currently restricted to approximately 20 organizations. Sol, the top-tier model, excels in complex reasoning and security applications, while Terra balances performance and efficiency, and Luna prioritizes speed and cost-effectiveness. This phased release reflects a novel landscape of safety interventions and compliance parameters for enterprise buyers.

OpenAI’s unveiling of the GPT-5.6 models (Sol, Terra, and Luna) represents a significant, and somewhat complex, moment for the AI landscape. The tiered approach, designed for varying enterprise needs, signals a maturing market where generalized models are giving way to specialized tools. It’s not about replacing everything with a single behemoth, but about choosing the right instrument for the task, whether that’s the demanding complexities of Sol, the reliable throughput of Terra, or the speedy efficiency of Luna. This shift echoes trends we've seen in other industries, like cloud computing, where different tiers of service cater to different workloads. As we've noted previously in "Autonomous security agents need complete data. Here's how to check if yours is ready," autonomous security agents need complete data to be effective, and these new models offer enhanced capabilities for such applications, particularly given Sol’s emphasis on security workflows. The introduction of “max reasoning effort mode” and “ultra mode,” the latter leveraging subagents for complex tasks, also points towards a future where AI isn't just about generating text, but about orchestrating sophisticated work processes.

The decision to coordinate the release with the U.S. government, and the ensuing limitations on early access, adds a layer of geopolitical complexity that is reshaping the trajectory of AI development. While OpenAI rightly critiques this "sovereign gatekeeping," the reality is that the dual-use potential of increasingly powerful AI necessitates a degree of oversight. The export control order against Anthropic, and the subsequent removal of access to its most powerful models, underscore the seriousness of these concerns. It’s not about stifling innovation, but about mitigating risks in a rapidly evolving technological landscape. The mandated safety protocols (model-level refusals, real-time classifiers, and reasoning review pauses), while potentially introducing operational friction, are a necessary step towards responsible deployment. The fact that OpenAI dedicated 700,000 A100e GPU hours to automated red-teaming is a testament to the company's commitment to safety, and the observed limitation of Sol’s ability to autonomously engineer full-chain exploits is reassuring. We’ve long observed that AI model development isn't just about performance; it's about understanding and mitigating potential harms, something that aligns with the broader conversation around responsible AI deployment, as discussed in "It’s not about Anthropic vs. OpenAI anymore."

The pricing structure, while placing OpenAI’s cheapest option in a mid-priced tier, highlights the growing cost of frontier AI. While cheaper alternatives exist, the performance gains offered by Sol, particularly for long-running coding, cybersecurity, and agentic tasks, justify the premium for many enterprise users. The introduction of prompt caching mechanics, with their predictable cost structure, is a crucial development for businesses looking to integrate these models into production environments. Furthermore, the planned deployment of Sol on Cerebras hardware, targeting real-time reasoning applications, suggests a focus on specialized use cases and on addressing latency concerns. This is particularly pertinent as businesses explore a wider range of applications for AI that require more than simple, quick responses: they demand complex reasoning and analysis in real time. It’s a stark contrast to the prevailing focus on “early bird” opportunities, as seen in "Early Bird pricing ends tonight for TechCrunch Founder Summit," illustrating a shift from speculative investment to practical, enterprise-grade solutions.

Ultimately, the release of GPT-5.6 and its tiered model structure marks a turning point in the AI landscape. It’s a move away from the hype surrounding “revolutionary” breakthroughs and towards a more pragmatic focus on delivering tangible value to businesses. The coordinated release with the U.S. government, while presenting challenges, underscores the growing importance of responsible AI development and deployment. The key question moving forward isn't just about how powerful these models become, but about how effectively we can integrate them into our workflows while mitigating the inherent risks. Will the focus on specialized models and clear safety protocols usher in an era of more stable and predictable AI adoption, or will the evolving geopolitical landscape continue to introduce unforeseen obstacles?

OpenAI is announcing a limited preview of its next-generation GPT-5.6 model series today, introducing three distinct, capability-tiered models—Sol, Terra, and Luna—designed to re-engineer developer and enterprise workflows.

The initial rollout is available through the API and Codex to a narrow set of approximately 20 organizations. OpenAI first shared the models and release plans with the U.S. government, following an executive order issued by President Donald J. Trump earlier this month, on June 2, 2026. The order calls upon various federal agencies to collaborate on a process for benchmarking and assessing the capabilities of new AI models to ensure they are safe and appropriate for wide release.

While this process remains underway (the order gave it 30 days, which puts the deadline at July 2), OpenAI says in its release blog post that it "previewed our plans and the models’ capabilities ahead of today’s launch. At [the U.S. government's] request, we are starting with a limited preview for a small group of trusted partners."

OpenAI's limited preview release strategy also follows the U.S. government's drastic step of issuing an export control order against Anthropic, OpenAI's top U.S. competitor, over jailbreaks found in its most powerful generally released model, Claude Fable 5. Anthropic responded by removing all access, by public or private parties, to that model and to its cybersecurity-focused counterpart, Claude Mythos 5.

Because OpenAI is coordinating its release framework with the White House ahead of a broader public launch, enterprise buyers must navigate a novel landscape of real-time safety interventions, mandatory compliance parameters, and structured token caching systems.

How the 3 new GPT-5.6 models differ: Sol vs. Terra vs. Luna

The three GPT-5.6 models are designed to address different enterprise needs and performance profiles.

Sol is the top-tier option, built for the most demanding tasks such as complex reasoning, extended coding sessions, advanced agent-driven workflows, and security-focused applications. It delivers the highest level of capability but comes with the greatest resource requirements.

It's priced at $5.00 per million input tokens / $30.00 per million output tokens — the same as GPT-5.5 — and OpenAI says it delivers a major performance gain for long-running coding, cybersecurity and agentic tasks.

Terra balances strong performance with efficiency. It is intended for large-scale production environments where organizations need reliable results across high volumes of work without the overhead of the most advanced model. It's available for $2.50/$15 per 1M tokens.

Luna is the most lightweight and cost-efficient option, optimized for speed and everyday use cases. It is well suited for simpler tasks, routine workflows, and applications where responsiveness and scalability are more important than maximum depth of reasoning, and is the most affordably priced at $1/$6 per million tokens in and out, respectively.
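To make the tier gap concrete, here is a minimal sketch that applies the list prices quoted above to a single request. The tier keys (`sol`, `terra`, `luna`) are illustrative labels rather than confirmed API model identifiers, and the 200,000-input / 20,000-output workload is a hypothetical example, not a figure from OpenAI.

```python
# Per-million-token list prices quoted in the article, in USD: (input, output).
# The dictionary keys are illustrative labels, not confirmed API model IDs.
PRICES_PER_MILLION = {
    "sol": (5.00, 30.00),
    "terra": (2.50, 15.00),
    "luna": (1.00, 6.00),
}


def request_cost(tier: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request at the quoted list prices."""
    input_price, output_price = PRICES_PER_MILLION[tier]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 200k input tokens and 20k output tokens per request.
    for tier in PRICES_PER_MILLION:
        print(f"{tier}: ${request_cost(tier, 200_000, 20_000):.2f}")
```

Under these assumptions the same request costs $1.60 on Sol, $0.80 on Terra, and $0.32 on Luna, a 5x spread between the top and bottom tiers.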

Sources with knowledge of OpenAI's inner workings told VentureBeat that the new naming scheme was designed to move away from the "nano" and "mini" variants of GPT-5, because these models are not so different in size or raw intelligence but are instead designed for distinct use cases.

As OpenAI states in its blog post about the new naming scheme: "In this new naming system introduced with GPT‑5.6, the number identifies a model’s generation, while Sol, Terra, and Luna identify durable capability tiers that can advance on their own cadence. Together, the family gives people and developers clearer choices across intelligence, speed, and cost."

Also, sources said OpenAI sought to evoke a sense of inspiration by looking to the cosmos and names associated with it.

Further, Sol fits well alongside OpenAI's Daybreak opt-in program for organizations interested in cyber defense, which is an added bonus. The "Sol" voice style for OpenAI's voice mode on ChatGPT is unrelated, and will likely be renamed.

Here's how they stack up on price against the rest of the current leading LLM field. Note that OpenAI's cheapest option is a mid-priced model overall, and still more expensive than the frontier-level GLM-5.2.

VentureBeat Frontier AI Model API Pricing Snapshot

| Model | Input (per 1M tokens) | Output (per 1M tokens) | Total Cost (input + output) | Source |
|---|---|---|---|---|
| MiMo-V2.5 Flash | $0.10 | $0.30 | $0.40 | Xiaomi MiMo |
| deepseek-v4-flash | $0.14 | $0.28 | $0.42 | DeepSeek |
| deepseek-v4-pro | $0.435 | $0.87 | $1.305 | DeepSeek |
| MiniMax-M3 | $0.30 | $1.20 | $1.50 | MiniMax |
| Gemini 3.1 Flash-Lite | $0.25 | $1.50 | $1.75 | Google |
| Qwen3.7-Plus | $0.40 | $1.60 | $2.00 | Alibaba Cloud |
| MiMo-V2.5 | $0.40 | $2.00 | $2.40 | Xiaomi MiMo |
| Grok 4.3 (low context) | $1.25 | $2.50 | $3.75 | xAI |
| MiMo-V2.5 Pro (≤256K) | $1.00 | $3.00 | $4.00 | Xiaomi MiMo |
| Kimi-K2.6 | $0.95 | $4.00 | $4.95 | Moonshot/Kimi |
| GLM-5.2 | $1.40 | $4.40 | $5.80 | Z.ai |
| GPT-5.6 Luna | $1.00 | $6.00 | $7.00 | OpenAI |
| Grok 4.3 (high context) | $2.50 | $5.00 | $7.50 | xAI |
| MiMo-V2.5 Pro (>256K) | $2.00 | $6.00 | $8.00 | Xiaomi MiMo |
| Qwen3.7-Max | $2.50 | $7.50 | $10.00 | Alibaba Cloud |
| Gemini 3.5 Flash | $1.50 | $9.00 | $10.50 | Google |
| Gemini 3.1 Pro Preview (≤200K) | $2.00 | $12.00 | $14.00 | Google |
| GPT-5.6 Terra | $2.50 | $15.00 | $17.50 | OpenAI |
| GPT-5.4 | $2.50 | $15.00 | $17.50 | OpenAI |
| Gemini 3.1 Pro Preview (>200K) | $4.00 | $18.00 | $22.00 | Google |
| Claude Opus 4.8 | $5.00 | $25.00 | $30.00 | Anthropic |
| GPT-5.5 | $5.00 | $30.00 | $35.00 | OpenAI |
| GPT-5.5 Instant (chat-latest) | $5.00 | $30.00 | $35.00 | OpenAI |
| Sakana Fugu Ultra (≤272K) | $5.00 | $30.00 | $35.00 | Sakana AI |
| GPT-5.6 Sol | $5.00 | $30.00 | $35.00 | OpenAI |
| Claude Fable 5 / Claude Mythos 5 | $10.00 | $50.00 | $60.00 | Anthropic |

Technology: Deep Reasoning and the Multi-Agent Paradigm

The core architectural evolution of the GPT-5.6 series centers on how compute is allocated during inference. Rather than relying on instantaneous token generation, OpenAI introduces a new max reasoning effort mode, which explicitly grants the flagship Sol model extended time to reason deeply through highly complex problems. Debuting alongside it is an ultra mode.

This configuration expands past the structural boundaries of a single standalone model, instead deploying specialized "subagents" to divide, conquer, and accelerate multi-step, long-horizon projects. Data from initial evaluations indicates that this subagent coordination shifts the frontier for programmatic execution:

  • Command-Line Automation: On Terminal-Bench 2.1—which evaluates planning, tool usage, and iterative error correction in command-line environments—GPT-5.6 Sol (Ultra) achieves a state-of-the-art score of 91.91%. This edges out GPT-5.6 Sol (Max) at 88.76% and eclipses Claude Mythos 5 at 88%.

  • Professional Workflows: On Agent's Last Exam, a benchmark spanning 55 professional domains to test long-running workflows, GPT-5.6 Sol is the only model to clear the 50% success threshold, scoring 50.9% in code mode while displaying superior token efficiency relative to preceding architectures.

  • Quantitative Biology: On GeneBench v1, which measures long-horizon genomics analysis, the flagship model systematically outperforms GPT-5.5 while consuming fewer total tokens across simulated latency periods.

Predictable Prompt Caching Mechanics

To help enterprises control the unpredictable cost curves of running agentic loops, the GPT-5.6 API introduces a revamped prompt caching protocol.

Developers can now implement explicit cache breakpoints, backed by a guaranteed 30-minute minimum cache lifetime. Under this framework, initial cache writes carry a 1.25x premium over the model's standard uncached input rate, but subsequent cache reads receive a steep 90% discount. For systems that routinely pass massive context windows or codebase definitions back into the model, this predictability is a critical financial guardrail.
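The arithmetic behind those multipliers can be sketched as follows. This is a rough illustration under stated assumptions: the 90% discount is taken relative to the standard uncached input rate, the whole prompt prefix is cached, and every call lands inside the cache lifetime. The function name and the 100,000-token example are hypothetical; output tokens are ignored.

```python
CACHE_WRITE_MULTIPLIER = 1.25  # first call writes the cache at a 1.25x premium
CACHE_READ_MULTIPLIER = 0.10   # assumed: a 90% discount off the uncached rate


def prefix_input_cost(prefix_tokens: int, calls: int,
                      price_per_million: float, cached: bool) -> float:
    """USD input cost of sending the same prompt prefix `calls` times.

    Assumes every call after the first hits the cache, i.e. all calls fall
    within the cache lifetime. Output-token costs are not included.
    """
    if calls <= 0:
        return 0.0
    base = prefix_tokens * price_per_million / 1_000_000
    if not cached:
        return calls * base
    return base * CACHE_WRITE_MULTIPLIER + (calls - 1) * base * CACHE_READ_MULTIPLIER


if __name__ == "__main__":
    # Hypothetical agent loop: a 100k-token codebase prefix sent 10 times
    # at Sol's quoted $5.00 per million input tokens.
    print(f"uncached: ${prefix_input_cost(100_000, 10, 5.00, cached=False):.3f}")
    print(f"cached:   ${prefix_input_cost(100_000, 10, 5.00, cached=True):.3f}")
```

Under these assumptions the ten-call loop costs $5.00 uncached versus about $1.08 cached. A single call is more expensive with caching (1.25x versus 1x), but the write premium is recovered as soon as the prefix is reused once, since two cached calls cost 1.35x against 2x uncached.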

Furthermore, for enterprise applications where latency is the primary barrier to adoption, OpenAI is launching GPT-5.6 Sol on Cerebras hardware this July. This infrastructure partnership claims processing speeds of up to 750 tokens per second, targeting specialized enterprise applications requiring real-time, frontier-grade reasoning.

Enterprise Implications: High Security and Algorithmic Friction

For corporate engineering, information security, and compliance teams, the deployment of GPT-5.6 requires a meticulous look at its security architecture. The models are accessible under a commercial enterprise API license, with open-source options completely off the table due to the dual-use risks inherent to their cyber capabilities.

To achieve clearance for release, OpenAI dedicated roughly 700,000 A100e GPU hours solely to automated red-teaming. This compute was allocated to discovering "universal jailbreaks"—systemic attack vectors designed to bypass safeguards across varied contexts, rather than single-prompt workarounds.

This massive testing phase feeds directly into a highly strict, multi-layered safeguard stack that operates in real time:

  1. Model-Level Refusals: Hardcoded boundaries trained directly into the base weights to resist masked intent or adversarial obfuscation.

  2. Real-Time Classifiers: Auxiliary systems that evaluate cyber and biological output token-by-token as it is generated.

  3. Reasoning Review Pauses: If a potential high-risk violation is flagged mid-generation, the pipeline automatically pauses. A secondary, larger reasoning model reviews the context of the conversation; if verified as malicious, the output is withheld before it reaches the user endpoint.
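The control flow of layers 2 and 3 can be illustrated with a toy sketch. This is not OpenAI's implementation: the function names, the keyword-based classifier, and the keyword-based reviewer are all hypothetical stand-ins chosen only to show the flag, pause, review, and withhold sequence. Layer 1 lives in the model weights and is not modeled here.

```python
from typing import Callable, Iterable, List, Optional


def guarded_stream(tokens: Iterable[str],
                   classifier: Callable[[List[str]], bool],
                   reviewer: Callable[[List[str]], bool]) -> Optional[str]:
    """Return the generated text, or None if the output is withheld.

    `classifier` stands in for the real-time check run as each token arrives;
    `reviewer` stands in for the secondary model consulted during a pause.
    """
    buffered: List[str] = []
    for token in tokens:
        buffered.append(token)
        if classifier(buffered):      # layer 2: flag raised mid-generation
            # Layer 3: generation pauses here while the reviewer looks at context.
            if reviewer(buffered):
                return None           # verified malicious: withhold the output
            # Otherwise it was a false positive and generation resumes.
    return "".join(buffered)


def toy_classifier(buffered: List[str]) -> bool:
    """Hypothetical keyword check on the newest token only."""
    return "exploit" in buffered[-1]


def toy_reviewer(buffered: List[str]) -> bool:
    """Hypothetical context review over everything generated so far."""
    return "weaponize" in "".join(buffered)


if __name__ == "__main__":
    print(guarded_stream(["patch ", "the ", "exploit ", "primitive"],
                         toy_classifier, toy_reviewer))
    print(guarded_stream(["weaponize ", "the ", "exploit"],
                         toy_classifier, toy_reviewer))
```

In the first call the classifier fires on a defensive request, the reviewer clears it, and the text is released after a pause. In the second the reviewer confirms the flag and nothing is returned. Even in this toy version, every cleared false positive still costs a review round trip.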

Operational Friction for Dual-Use Security Work

This real-time safety stack introduces distinct operational hurdles for enterprise security teams.

Because legitimate defensive work—such as code reviews, vulnerability discovery, patch engineering, and defensive testing—frequently utilizes the exact same code primitives as offensive exploits, OpenAI admits that its classifiers may regularly trigger false positives. During this preview period, enterprise developers should expect localized latency spikes, paused API generations, and intermittent request refusals.

Persistent flagging can trigger automated account-level reviews across historical conversations to determine whether an enterprise client is engaging in malicious behavior or standard security research. OpenAI is currently negotiating longer-term enterprise safety compliance controls, including customer-operated safety overrides and privacy-preserving detection mechanisms, to insulate corporate data from manual review pipelines.

Importantly, OpenAI notes that under testing, Sol remains optimized for defensive containment rather than offensive deployment. In evaluations running against the Chromium and Firefox codebases, the model successfully isolated bugs and exploitation primitives but was unable to autonomously engineer a functional, full-chain exploit, keeping it safely below the organization's "Cyber Critical" alert threshold.

The Geopolitics of the Phased Release

The broader rollout of the GPT-5.6 series reflects an escalating entanglement between frontier AI labs and national security protocols. The decision to limit initial access to a small circle of vetted partners whose details are shared with the U.S. government stems from direct coordination regarding the developing cyber Executive Order framework. OpenAI has taken the unusual step of publicly critiquing this sovereign gatekeeping within its official product announcement documentation. The company states plainly:

"We don’t believe this kind of government access process should become the long-term default. It keeps the best tools from users, developers, enterprises, cyber defenders, and global partners who need them."

This tension highlights the precarious position of modern tech enterprises. While organizations can leverage unprecedented agentic efficiency and robust defensive patching capabilities, as measured on benchmarks like ExploitGym and ExploitBench, they must also accept that access to premier tools remains subject to diplomatic and regulatory authorization. General availability across ChatGPT and the wider public API is expected to roll out incrementally over the coming weeks.
