3 Sources
[1]
OpenAI launches GPT-5.6 but restricts rollout after US request
OpenAI has introduced GPT-5.6, a new family of large language models led by its flagship Sol model, alongside Terra and Luna variants built for different performance and cost requirements. However, the company is limiting the initial rollout to a small group of trusted U.S.-based partners after a request from the U.S. government. The GPT-5.6 series introduces a new naming system, with Sol representing the highest capability tier, Terra offering GPT-5.5-level performance at half the cost, and Luna targeting lower-cost, faster AI applications. OpenAI said the models will become generally available through ChatGPT, Codex, and its API in the coming weeks. GPT-5.6 Sol also introduces a new maximum reasoning mode that gives the model more time to solve complex tasks. OpenAI is also launching an Ultra mode that uses subagents to tackle sophisticated workflows beyond the capabilities of a single AI agent. The company said GPT-5.6 Sol delivers its strongest performance yet in coding, biology, and cybersecurity while introducing its "most robust safety stack to date." According to OpenAI, GPT-5.6 Sol achieved a new state of the art on TerminalBench 2.1, a benchmark for command-line coding workflows. In biology, the company said the model outperformed GPT-5.5 on GeneBench v1 while using fewer output tokens. OpenAI also highlighted gains in cybersecurity. On ExploitBench, GPT-5.6 Sol matched the performance of Anthropic's Mythos Preview while using roughly one-third of the output tokens. On ExploitGym, developed by researchers at UC Berkeley with OpenAI and other frontier AI labs, all three GPT-5.6 models showed improved cyber capabilities as reasoning increased. Despite those gains, OpenAI said GPT-5.6 Sol does not cross the Cyber Critical threshold under its Preparedness Framework. "GPT-5.6 Sol is better at helping people find and fix vulnerabilities than reliably carrying out end-to-end attacks," the company said. 
The company also introduced a layered safety system that combines model-level protections, real-time misuse detection, account-level monitoring, differentiated access, and extensive automated and human red-teaming. OpenAI said it dedicated more than 700,000 A100-equivalent GPU hours to automated red-teaming to uncover jailbreak techniques before release. Unlike previous launches, GPT-5.6 will initially be available only to a select group of trusted partners. "As part of our ongoing engagement with the U.S. government, we previewed our plans and the models' capabilities ahead of today's launch," OpenAI said. "At their request, we are starting with a limited preview for a small group of trusted partners whose participation has been shared with the government, before releasing more broadly." OpenAI said it does not want government previews to become standard practice. "We don't believe this kind of government access process should become the long-term default," the company said, adding that it is taking the temporary step while working with the administration on a repeatable framework for future frontier AI releases. CEO Sam Altman echoed that view on X, saying the government requested a limited preview instead of the broader launch OpenAI had planned. He added that the company hopes to make GPT-5.6 widely available as quickly as possible while developing a transparent process for future releases.
[2]
OpenAI debuts GPT-5.6 models with limited access after US government request
OpenAI has introduced GPT-5.6, a new family of large language models led by its flagship Sol model, alongside Terra and Luna variants built for different performance and cost requirements. However, the company is limiting the initial rollout to a small group of trusted U.S.-based partners after a request from the U.S. government. The GPT-5.6 series introduces a new naming system, with Sol representing the highest capability tier, Terra offering GPT-5.5-level performance at half the cost, and Luna targeting lower-cost, faster AI applications. OpenAI said the models will become generally available through ChatGPT, Codex, and its API in the coming weeks. GPT-5.6 Sol also introduces a new maximum reasoning mode that gives the model more time to solve complex tasks. OpenAI is also launching an Ultra mode that uses subagents to tackle sophisticated workflows beyond the capabilities of a single AI agent. The company said GPT-5.6 Sol delivers its strongest performance yet in coding, biology, and cybersecurity while introducing its "most robust safety stack to date." According to OpenAI, GPT-5.6 Sol achieved a new state of the art on TerminalBench 2.1, a benchmark for command-line coding workflows. In biology, the company said the model outperformed GPT-5.5 on GeneBench v1 while using fewer output tokens. OpenAI also highlighted gains in cybersecurity. On ExploitBench, GPT-5.6 Sol matched the performance of Anthropic's Mythos Preview while using roughly one-third of the output tokens. On ExploitGym, developed by researchers at UC Berkeley with OpenAI and other frontier AI labs, all three GPT-5.6 models showed improved cyber capabilities as reasoning increased. Despite those gains, OpenAI said GPT-5.6 Sol does not cross the Cyber Critical threshold under its Preparedness Framework. "GPT-5.6 Sol is better at helping people find and fix vulnerabilities than reliably carrying out end-to-end attacks," the company said. 
The company also introduced a layered safety system that combines model-level protections, real-time misuse detection, account-level monitoring, differentiated access, and extensive automated and human red-teaming. OpenAI said it dedicated more than 700,000 A100-equivalent GPU hours to automated red-teaming to uncover jailbreak techniques before release. Unlike previous launches, GPT-5.6 will initially be available only to a select group of trusted partners. "As part of our ongoing engagement with the U.S. government, we previewed our plans and the models' capabilities ahead of today's launch," OpenAI said. "At their request, we are starting with a limited preview for a small group of trusted partners whose participation has been shared with the government, before releasing more broadly." OpenAI said it does not want government previews to become standard practice. "We don't believe this kind of government access process should become the long-term default," the company said, adding that it is taking the temporary step while working with the administration on a repeatable framework for future frontier AI releases. CEO Sam Altman echoed that view on X, saying the government requested a limited preview instead of the broader launch OpenAI had planned. He added that the company hopes to make GPT-5.6 widely available as quickly as possible while developing a transparent process for future releases.
[3]
OpenAI introduces GPT-5.6 to challenge Claude Mythos 5
OpenAI Group PBC today introduced GPT-5.6, a new series of large language models that it says can outperform Claude Mythos 5 across certain coding tasks. The most advanced algorithm in the lineup is known as Sol. It's available alongside a mid-range option called Terra and an entry-level model dubbed Luna. All three LLMs come with two modes that weren't included in GPT-5.5. The first is a "max" setting that increases the amount of time GPT-5.6 spends on a task to boost reasoning quality. Additionally, OpenAI has developed an "ultra" mode that can spin up multiple subagents to parallelize work. The company describes Sol as the most capable LLM it has built to date. The model scored 88.8% on a popular AI benchmark called TerminalBench-2.1 that includes 89 complex programming tasks. When the company enabled the "ultra" setting, Sol's score increased to 91.9%. Anthropic PBC's flagship Claude Mythos 5 model managed 88%. Claude Mythos 5 was preceded by a model called Mythos Preview that made its debut in April. According to Anthropic, the latter LLM has identified more than 10,000 high-severity and critical software vulnerabilities. OpenAI says that Sol nearly matches Mythos Preview's performance on a cybersecurity research benchmark called ExploitBench. The GPT-5.6 series also brings efficiency improvements. OpenAI had Sol tackle GeneBench v1, a collection of scientific data analysis tasks that it released in April. The model matched the performance of the company's previous flagship LLM using fewer tokens. Sol includes guardrails designed to prevent it from supporting malicious activities such as developing hacking campaigns. If the controls fail to prevent the LLM from generating harmful output, a specialized large reasoning model filters the prompt response before it reaches the user. OpenAI says that the GPT-5.6 series can not only block risky requests but also fend off cyberattacks. 
The company ran a series of red teaming exercises to find universal jailbreaks, hacking tactics that can be used to create not one but multiple malicious prompts. Some of the tests were carried out automatically using "700,000 A100-equivalent GPU hours." OpenAI used the test findings to improve its new model lineup's security. Terra and Luna, the two lower-end GPT-5.6 models that debuted alongside Sol, trade off some output quality for increased cost-efficiency. Sol is priced at $5 per million input tokens and $30 per million output tokens. Terra costs half as much while Luna offers 80% lower rates. At the request of the U.S. government, OpenAI is limiting GPT-5.6 access to a "small group of trusted partners" on launch. The company plans to move the LLM series into general availability in a few weeks. Additionally, OpenAI will bring Sol to newly public Cerebras Systems Inc.'s WSE-3 wafer-size AI chip.
OpenAI has launched GPT-5.6, a new family of large language models featuring Sol, Terra, and Luna variants designed for different performance tiers. The company is restricting initial access to a small group of trusted U.S.-based partners following a government request, marking an unusual departure from typical AI model launches and raising questions about future oversight frameworks.
OpenAI has unveiled GPT-5.6, a new family of large language models that introduces a tiered naming system designed to address varying performance and cost requirements [1]. The flagship GPT-5.6 Sol represents the highest capability tier, while Terra delivers GPT-5.5-level performance at half the cost, and Luna targets lower-cost, faster AI applications [2]. However, in a significant departure from previous AI model launches, the company is limiting the initial rollout to a small group of trusted U.S.-based partners following a U.S. government request [1]. This limited-access approach signals growing government involvement in frontier AI development and raises important questions about how future releases will be managed.
The GPT-5.6 models demonstrate significant advances in coding, biology, and cybersecurity benchmarks. According to OpenAI, GPT-5.6 Sol achieved 88.8% on TerminalBench 2.1, a benchmark for command-line coding workflows that includes 89 complex programming tasks [3]. When the Ultra mode setting was enabled, Sol's score increased to 91.9%, surpassing Anthropic's Claude Mythos 5, which managed 88% [3]. In biology, the model outperformed GPT-5.5 on GeneBench v1 while using fewer output tokens, demonstrating both capability gains and efficiency improvements [1].
On ExploitBench, GPT-5.6 Sol matched the performance of Anthropic's Mythos Preview while using roughly one-third of the output tokens [2]. On ExploitGym, developed by researchers at UC Berkeley with OpenAI and other frontier AI labs, all three GPT-5.6 models showed improved cyber capabilities as reasoning increased [1]. Despite these gains, OpenAI emphasized that GPT-5.6 Sol does not cross the Cyber Critical threshold under its Preparedness Framework, stating the model "is better at helping people find and fix vulnerabilities than reliably carrying out end-to-end attacks" [2].
GPT-5.6 Sol introduces a new maximum reasoning mode that gives the model more time to solve complex tasks [1]. The Ultra mode uses subagents to tackle sophisticated workflows beyond the capabilities of a single AI agent, enabling parallel processing of complex tasks [2]. OpenAI describes these large language models as featuring its "most robust safety stack to date," combining model-level protections, real-time misuse detection, account-level monitoring, differentiated access, and extensive automated and human red-teaming [1].
The company dedicated more than 700,000 A100-equivalent GPU hours to automated red-teaming to uncover jailbreak techniques before release [2]. If Sol's built-in guardrails fail to prevent the model from generating harmful output, a specialized large reasoning model filters the response before it reaches the user [3].
Unlike previous launches, GPT-5.6 will initially be available only to a select group of trusted partners. "As part of our ongoing engagement with the U.S. government, we previewed our plans and the models' capabilities ahead of today's launch," OpenAI stated. "At their request, we are starting with a limited preview for a small group of trusted partners whose participation has been shared with the government, before releasing more broadly" [1]. OpenAI emphasized it does not want government previews to become standard practice, calling this a temporary step while working with the administration on a repeatable framework for future frontier AI releases [2].
CEO Sam Altman echoed this position, saying the government requested a limited preview instead of the broader launch OpenAI had planned [1]. The company plans to make GPT-5.6 widely available through ChatGPT, Codex, and its API in the coming weeks [2]. Additionally, OpenAI will bring Sol to Cerebras Systems' WSE-3 wafer-size AI chip [3]. Sol is priced at $5 per million input tokens and $30 per million output tokens, while Terra costs half as much and Luna offers 80% lower rates [3].
Summarized by Navi
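To make the reported pricing concrete, the sketch below estimates what a given workload would cost on each tier. It is an illustration only: the function name `estimate_cost` and the workload figures are hypothetical, and it assumes that Terra's "half as much" and Luna's "80% lower rates" apply equally to input and output tokens, which the sources do not spell out.

```python
from decimal import Decimal

# Sol's reported list prices, in USD per million tokens.
SOL_INPUT_PRICE = Decimal("5")
SOL_OUTPUT_PRICE = Decimal("30")

# Price multipliers relative to Sol.
# Assumption: the discount is the same for input and output tokens.
TIER_MULTIPLIER = {
    "sol": Decimal("1"),
    "terra": Decimal("0.5"),  # "costs half as much"
    "luna": Decimal("0.2"),   # "80% lower rates"
}

TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost(tier: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Return the estimated cost in USD for one workload on the given tier."""
    multiplier = TIER_MULTIPLIER[tier]
    sol_cost = (
        SOL_INPUT_PRICE * input_tokens + SOL_OUTPUT_PRICE * output_tokens
    ) / TOKENS_PER_MILLION
    return sol_cost * multiplier


if __name__ == "__main__":
    # Hypothetical workload: 2M input tokens and 500K output tokens.
    for tier in TIER_MULTIPLIER:
        cost = estimate_cost(tier, 2_000_000, 500_000)
        print(f"{tier}: ${cost:.2f}")
```

Because output tokens cost six times as much as input tokens at these rates, the output-token savings OpenAI reports on ExploitBench and GeneBench v1 would affect a bill more than an equivalent cut in input tokens.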