30 Sources
[1]
New ChatGPT 5.4 Model Is 'Built for Agents.' Will It Lure Back Claude Converts?
Already bored with ChatGPT 5.3 Instant, which came out Tuesday? OpenAI has a new model family ready for you. The company released two new models, named GPT 5.4 Thinking and GPT 5.4 Pro, on Thursday. GPT 5.4 Thinking is built for enterprise work, like coding and overseeing AI agents. ChatGPT 5.4 is a "thinking" model, which means it takes a little bit longer to cook its answers but those responses should be more accurate and handle more complex tasks. It's meant to be used by AI agents, which are bots that can operate independently. OpenAI said that 5.4 can more efficiently support agentic activity, meaning it uses less computing power and therefore costs less money. OpenAI is calling GPT 5.4 its "most factual model yet," inadvertently highlighting the very real problem of when AI models hallucinate, or make stuff up. OpenAI said that should be less of an issue with 5.4, as its benchmark reporting shows that responses from 5.4 are 18% less likely to contain errors and that individual claims are 33% less likely to be false, compared to GPT 5.2. Still, always fact-check what an AI tool tells you. GPT 5.4 Thinking and Pro are available now for paying ChatGPT users and in the API (a developer tool). GPT 5.4 Thinking is also in Codex, OpenAI's coding application. GPT 5.4 is a boost for OpenAI. Having an agent-centric model like 5.4, designed for power users willing to cough up a monthly subscription, certainly feels like OpenAI is taking a swing at Anthropic's Claude. Anthropic and OpenAI have been locked in a growing feud, recently kicked into higher gear by Anthropic's Super Bowl commercials attacking ChatGPT's ad rollout. But things have escalated from there, with recent reports showing that Anthropic's popularity is growing.
The Claude mobile apps nabbed the top spots in Apple's and Google's app stores, and AI users are filling online forums with advice for how to transfer your data from ChatGPT to Claude. One reason some users are making the switch is because of the growing controversy between the two AI titans and the US government. In the middle of starting a war in Iran and dealing with the subsequent growing international crisis, the Department of War (formerly the Defense Department) has been negotiating contracts with AI companies. The original deal was for Anthropic's Claude, but it fell apart last week when Anthropic refused to allow the US government to use its AI to surveil citizens and support autonomous weapon systems. OpenAI stepped into that void, with CEO Sam Altman clarifying this week that it would implement safeguards and wouldn't be made available to intelligence agencies like the NSA. The company previously announced it had struck a $200 million deal with the defense department in 2025. There are still a lot of unanswered questions about how AI, made by any company, is being used by government agencies and defense contractors.
[2]
New GPT-5.4 clobbers humans on pro-level work in OpenAI's tests - by 83%
GPT-5.4's 83% score suggests AI rivals expert professionals. Tests span nine industries and 44 real-world occupations. New capabilities boost coding, tools, and computer control. It seems like only yesterday that OpenAI released its GPT-5.2 model to the world. In fact, it's been less than three months. Thursday, OpenAI is releasing the thinking model of GPT-5.4. What exactly does that mean? In this article, I'll briefly touch on the official announcement and availability details, and then I'll dive into what I think is the most startling detail: GPT-5.4 can match or outperform human professionals 83% of the time, according to OpenAI. (Disclosure: Ziff Davis, ZDNET's parent company, filed an April 2025 lawsuit against OpenAI, alleging it infringed Ziff Davis copyrights in training and operating its AI systems.) OpenAI says GPT-5.4 is "the most capable and efficient frontier model for complex professional work." Within ChatGPT, the company calls this model GPT 5.4 Thinking. There are also releases for the API, within the Codex programming tool, and in a GPT-5.4 Pro version. In terms of overall performance, the company says that GPT-5.4 is "18% less likely to contain errors, and individual claims are 33% less likely to be false compared to GPT-5.2, based on prompts where users previously flagged factual mistakes." It's always nice when an extremely powerful artificial intelligence makes stuff up less frequently. As for availability, the company will offer GPT-5.4 via API on Friday. It will be "rolling out" across ChatGPT paid tiers and in Codex, which presumably means it will show up fairly soon for most users. It gives me no joy to say this, but OpenAI's naming conventions give me a headache.
When it comes to naming, it feels like it fired all its experienced product managers and replaced them with a GPT-3.5 instance from 2022. So, OK, OpenAI released GPT-5.3-Codex last month. That's the first version of Codex that used itself to help build itself. Skynet, anyone? Then, two days ago...two days ago it released GPT-5.3 Instant. This, according to the company, "makes everyday conversations more consistently helpful and fluid." It's available to all users of ChatGPT. In the API, it's released as gpt-5.3-chat-latest. Not gpt-5.3-chat-instant, because that would make too much sense. And now, we have GPT-5.4. So in the space between Tuesday and Thursday, OpenAI has released a GPT-5.3 and a GPT-5.4 model. You'd have to be an AI to keep track of it all. Because such crimes against coherent versioning make me twitchy, I had to ask the OpenAI communications team about it. They were patient and kind enough to answer: "GPT-5.4 is our first mainline reasoning model that incorporates the frontier coding capabilities of gpt-5.3-codex, and that is rolling out across ChatGPT, the API, and Codex. We're calling it GPT-5.4 to reflect that jump, and to simplify the choice between models when using Codex. Over time, you can expect our Instant models and Thinking models to evolve at different speeds." I still don't like it. If Instant and Thinking are really two separate products, they should have completely separate versioning. 5.3 and 5.4 are too close and too confusing. If they're considered to be different variants of the same product, they should share version numbers. But hey. OpenAI is worth something on the order of $840 billion, and I own a 14-year-old Ford. What do I know? Let's move on to the part where we all worry about our job security. In September, OpenAI introduced a new AI evaluation test called GDPval.
It's a test designed to measure how well AI models perform doing "economically valuable, real-world tasks." The test measures performance in nine industries and 44 occupations. OpenAI chose the industries based on those contributing 5% or more to the US gross domestic product. Each industry has unique occupations. For the test, the company selected up to five occupations, choosing those that had less than 40% physical or manual work, and which make up those jobs with the highest total wages and most overall compensation. It basically picked a cross-section of knowledge-related jobs where AI could have the most impact "on real-world productivity." The intent was that the GPT models could help professionals get more done, but it's not too big a leap to infer that these occupations are also the most at risk from AI replacement or augmentation. Here's how those occupations fit into their industries. I could get picky about which occupations are the most impactful in the various industries, but this selection is a good one for testing model performance overall. The tests themselves are interesting in both how they are constructed and how they are measured. OpenAI worked with experienced professionals in each occupation to create a set of tasks that "reflect their day-to-day work." The task sets all went through many rounds of expert review and resulted in a series of fully reviewed, complex tasks per industry. One of the manufacturing engineer tasks, for example, involves the design of a jig (guides a tool) or a fixture (holds the work) to simplify the reeling in and reeling out of a cable spool for underground mining operations. Grading for each of these tests was done by human professionals in each of the occupations.
The graders weren't told whether the results were from the AI, or from other professionals in their fields. Additionally, OpenAI built an automated grading system based on the work of the human graders, so that the humans don't have to take their time grading each iteration of the AI model. I'm sure OpenAI constructed this automated system with all appropriate safeguards, but I worry that some level of inherent bias might be possible when letting an AI grade the performance of an AI. Ethan Mollick, associate professor and co-director of the Generative AI Lab at Wharton, describes the GDPval test as "probably the most economically relevant measure of AI ability." The speed of improvement is insane. GPT-5.1 was released in November and had a GDPval score of 38.8%. In December, just a month later, GPT-5.2 performance exploded to nearly double that, to 70.9%. Professor Mollick described the importance of GDPval running on GPT-5.2. He said, "In head-to-head competition with human experts on tasks that require 4-8 hours for a human to do, GPT-5.2 wins 71% of the time as judged by other humans." Now, in early March, less than three months after GPT-5.2, GPT-5.4 matches or exceeds the performance of human professionals 83% of the time! In other words, almost every time the same task was given to an experienced human pro and GPT-5.4, the AI either kept up with or blew past the experienced human pro, at least, according to its grader, which may have been human or AI. Sit with that for a few minutes. We're not just talking about programming tasks. We're talking about a wide range of industries and a wider range of high-value occupations. According to Daniel Swiecki, head of Artificial Intelligence Solutions at Walleye Capital, "On our toughest internal finance and Excel evaluations, GPT-5.4 outperformed prior models, improving accuracy by 30 percentage points.
This step change in reliability materially expands our automation of model updates and scenario analyses for fundamental investors." The freaky thing is this sort of performance could take us in two directions. On the one hand, it could help augment human pros, giving experienced folks the ability to get more done, faster. On the other hand, it could well be seen as the harbinger of a time when the AI is simply replacing the humans in high-value, high-skill jobs. The future is probably not going to be all one or all the other. But even as OpenAI takes a victory lap for its latest release, those of us who support our families based on a lifetime of skill building within those professions have to rock back on our heels, take deep, worried breaths, and hope for the best. Speaking personally, my approach has been to learn all I can, as quickly as I can, and use AI as much as I can. That helps me describe all of this to you, but it also helps me augment my individual productivity using AI resources, particularly for programming. But I worry. AI slop is a real thing, and as AI slop increases more and more in quality, each of us will be competing with a giant superbrain that never sleeps, never eats, and is improving at almost supernatural speed. In addition to overall performance, GPT-5.4 improves on other core capabilities. Stay tuned. GPT-5.4 Thinking will be in your ChatGPT interface shortly. Let the competition begin. What do you think about GPT-5.4's claim that it can match or outperform human professionals 83% of the time? Does that seem like a meaningful benchmark for real-world work? Have you started integrating AI into your own professional workflow? If so, where does it help the most or fall short? Looking ahead, do you see tools like this mostly augmenting human expertise, or eventually replacing parts of it?
Share your thoughts and experiences in the comments below.
[3]
GPT 5.4 Is Here: New Model Prepares for Autonomous Agents, Shares Fewer Errors
ChatGPT is getting another upgrade, and this time it's moving up to GPT 5.4, just days after the release of GPT 5.3 Instant. OpenAI says this new release brings together its "recent advances in reasoning, coding, and agentic workflows into a single frontier model." Some of GPT 5.4's biggest updates are coming to professional tools, such as improvements to AI-generated spreadsheets, documents, and presentations, but it also sees improvements across search and how you can interact with the chatbot. One big change is OpenAI's next step toward fully autonomous, agentic technologies: the model can use native computer resources, enabling tools to complete complex tasks across applications. GPT 5.4 can write code to operate computers, responding to mouse and keyboard commands depending on what's captured in a screenshot. This means developers can better leverage GPT 5.4 to build agents that operate other services with limited human interaction. A benchmark called OSWorld-Verified, designed to monitor AI's ability to navigate desktop environments, found that GPT 5.4 scored 75%, up from 47.3% with its GPT 5.2 model. That also beats the average human result on the same benchmark of 72.4%. Inside OpenAI's chatbot, you'll get the model as GPT 5.4 Thinking, which now lets you adjust an answer mid-response while it's generating. If it misunderstands your question or you change your mind about the direction of a query, you can now interrupt to make it more closely aligned with an answer without having to start afresh. This feature is immediately available on Android and ChatGPT's website, and it's coming soon to the iPhone app. ChatGPT 5.4 also allows for deeper web research, which OpenAI says is particularly helpful for "highly specific queries, while better maintaining context," especially when answering longer questions. For professional tasks, OpenAI says it has finessed its ability to create and edit documents, making the files it generates easier to read. 
Its internal tests found that spreadsheets generated to emulate a junior investment banking analyst achieved a mean success rate of 87.3% with human raters. OpenAI also says there will be fewer errors and hallucinations with the new model, calling GPT 5.4 its most factual model yet. It says, "On a set of de-identified prompts where users flagged factual errors, GPT‑5.4's individual claims are 33% less likely to be false and its full responses are 18% less likely to contain any errors, relative to GPT‑5.2." The GPT 5.4 Thinking upgrades are rolling out now for Plus, Pro, and Team subscribers, replacing the GPT 5.2 Thinking model. The existing version won't go away immediately, but it'll be moved to the Legacy Models and then removed on June 5. There's also a GPT 5.4 Pro option in the brand's API for those with Pro and Enterprise plans. So far, there's no word on whether free users will be able to use GPT 5.4 Thinking. Separately, OpenAI has also introduced a dedicated tool, ChatGPT for Excel, to make it easier to plug your spreadsheets into its models. OpenAI says it'll help you use Excel data from workbooks to "run scenarios, and generate outputs based on cells and formulas." Disclosure: Ziff Davis, PCMag's parent company, filed a lawsuit against OpenAI in April 2025, alleging it infringed Ziff Davis copyrights in training and operating its AI systems.
[4]
OpenAI Releases New Financial-Services Tools, Rivaling Anthropic
OpenAI is releasing a new flagship artificial intelligence model and a suite of financial-services tools that are meant to be better at tackling office work, ramping up competition with Anthropic PBC as the rival firm faces new risks from a showdown with the Pentagon. The model, GPT-5.4, is more capable of taking on tasks like generating spreadsheets, documents and presentations, and requires less back-and-forth with a user, OpenAI said Thursday. It's also meant to be better at using the web to get answers to tricky questions, such as by finding information from several sources, analyzing it and coming up with a response. The company also announced a new set of tools intended to help professionals streamline financial analysis, investment memos and other work. The product can connect with ChatGPT apps from financial data and research firms like FactSet Research Systems Inc. and Third Bridge. OpenAI said people can also use ChatGPT directly in Microsoft Excel and Google Sheets to make and examine financial models. OpenAI and Anthropic have been pushing to convince more business professionals to pay up for their services to offset the immense cost of developing AI systems and support their lofty valuations. Anthropic, in particular, has emphasized financial services professionals in some of its recent releases. The company also introduced Claude for Financial Services last year. The growing rivalry between the two firms coincides with a period of uncertainty for Anthropic's business. On Friday, the Pentagon declared Anthropic a "supply-chain risk" after the AI developer demanded protections to ensure its technology not be used for mass surveillance of Americans or the deployment of fully autonomous weapons. Hours later, OpenAI struck an agreement to let the Pentagon deploy the company's artificial intelligence models in the agency's classified network. 
Chief Executive Officer Sam Altman later said OpenAI's rush to forge a deal with the Defense Department looked "opportunistic and sloppy." He also said his company was working with the department to "make some additions in our agreement to make our principles very clear." Anthropic, meanwhile, has resumed talks with the Pentagon, Bloomberg News has reported. OpenAI, Anthropic and other AI developers have unveiled a number of models in recent months they say are capable of doing a wider range of work on a person's behalf, building on their traction with tools that streamline the process of writing and debugging code. That push has rattled investors in legacy software companies that some fear could be rendered obsolete by the AI firms. The ChatGPT maker is rolling out its new model for paid subscribers with the name GPT-5.4 Thinking, indicating it will spend more time computing a response to a prompt. It will be available through the company's AI coding assistant, Codex, as well. GPT-5.4 Thinking will show a breakdown of how it plans to tackle complex assignments and let users weigh in while generating a response if they want to change course.
[5]
OpenAI, in Desperate Need of a Win, Launches GPT-5.4
The latest model is supposed to be a big step forward for AI agents. In the wake of a much-maligned decision to do business with the Department of Defense, OpenAI is looking to course-correct and win back the public with the release of GPT-5.4, the company's latest model. OpenAI called the model its "most capable and efficient frontier model for professional work" and claimed that it brings its advancements in reasoning, coding, and agentic workflows under one unified model. GPT-5.4 is rolling out starting today and will be available in ChatGPT, Codex, and OpenAI's API. GPT-5.4 Thinking will be available for Plus, Teams, and Pro users, and GPT-5.4 Pro will be available through the API, as well as for ChatGPT Enterprise and Edu subscribers.
[6]
OpenAI's GPT-5.4 sets new records on professional benchmarks
The new model introduces native computer use, a 1-million-token context window, and a reworked tool-calling system. Whether it actually holds off Anthropic and Google is less clear. OpenAI is moving fast enough these days that it teased its next model on the same afternoon it launched its last one. Just two days ago, the company released GPT-5.3 Instant. On Thursday, it shipped GPT-5.4 -- a considerably more substantial release, and one that arrives amid an unusually turbulent moment for the company, whose deal with the US Department of Defense has triggered a wave of user cancellations and a public spat with Anthropic's CEO. The model itself, at least, is a genuine step forward. OpenAI is positioning GPT-5.4 as "our most capable and efficient frontier model for professional work," and has released it in three configurations. A standard version for general use, GPT-5.4 Thinking for tasks that benefit from extended chain-of-thought reasoning, and GPT-5.4 Pro for the highest-demand workloads. In ChatGPT, Thinking is available to Plus, Team, and Pro subscribers starting today, replacing GPT-5.2 Thinking. Pro is reserved for the $200-per-month ChatGPT Pro and Enterprise tiers. The benchmark story is striking. On GDPval, OpenAI's internal evaluation measuring performance on knowledge work tasks across 44 occupations, from legal analysis to financial modelling, GPT-5.4 matched or exceeded industry professionals in 83% of comparisons, up from 70.9% for GPT-5.2. On OSWorld-Verified, which measures a model's ability to navigate a desktop environment using screenshots and keyboard and mouse input, GPT-5.4 hit a 75% success rate, ahead of the reported human performance benchmark of 72.4%, and a substantial jump from GPT-5.2's 47.3%. It also claimed the top position on Mercor's APEX-Agents benchmark, designed to evaluate agents on sustained professional tasks across investment banking, consulting, and corporate law. 
On hallucinations, OpenAI reports that individual factual claims are 33% less likely to be incorrect compared to GPT-5.2, and that overall responses are 18% less likely to contain errors. These figures are self-reported, and benchmark comparisons are against GPT-5.2 rather than the more recent GPT-5.3 -- a pattern worth noting when reading the headline numbers. The most consequential new capability is native computer use in Codex and the API. GPT-5.4 is the first general-purpose OpenAI model with this built in, allowing agents to operate software, navigate file systems, and carry out multi-step workflows across applications, the kind of behaviour previously associated with specialised agentic frameworks that layered on top of models. For developers building automation pipelines, the significance is less about demos and more about reliability: a general-purpose model that handles computer interaction natively removes one category of integration complexity. The API version also supports context windows up to 1 million tokens, more than double the 400,000 available in GPT-5.3, and the largest OpenAI has shipped. For organisations dealing with sprawling document sets, long codebases, or multi-quarter financial records, keeping the full context in-window rather than relying on retrieval workarounds is a genuine practical advantage. It is worth noting, though, that the 1-million-token window comes with a pricing caveat: OpenAI charges double the standard rate per million tokens once input exceeds 272,000 tokens. Google's Gemini 3.1 Pro, by comparison, offers a 2-million-token context at a lower base price. A secondary efficiency improvement is worth attention for developers. The new Tool Search system changes how API calls handle tool definitions. Previously, every call included the full specification for all available tools upfront, a practice that could add tens of thousands of tokens to each request as tool ecosystems grew. 
Under the new system, the model retrieves tool definitions on demand when it needs them. In internal testing using 250 tasks across 36 MCP servers, OpenAI reported a 47% reduction in total token usage. For developers running large agentic systems with many integrations, that translates directly into lower costs and faster responses. The Mercor APEX-Agents result is presented in the launch materials as a straightforward win, but there is important context attached. When Mercor introduced the benchmark in January, it found that even the best frontier models at the time completed fewer than 25% of professional tasks on the first attempt, and with eight tries, the ceiling was around 40%. GPT-5.4 topping the leaderboard means it is the best-performing model in a field where no model is yet close to professional-grade reliability on long-horizon tasks. Brendan Foody, Mercor's co-founder and CEO, acknowledged as much when he introduced the benchmark: "Right now it's fair to say it's like an intern that gets it right a quarter of the time." That caveat does not diminish the progress. It does affect how the headline benchmark result should be read, particularly when OpenAI presents GDPval, its internal benchmark, as evidence of matching or exceeding "industry professionals." GDPval and APEX-Agents measure quite different things: GDPval evaluates individual deliverables across broad occupation categories, while APEX-Agents tests sustained multi-step workflows inside simulated enterprise environments. Both matter; neither tells the complete story. OpenAI has included a new open-source evaluation called CoT Controllability, designed to test whether reasoning models can deliberately obscure their chain-of-thought to evade monitoring. This addresses a concern that has been building in AI safety research for some time: that a sufficiently capable model might learn to misrepresent its internal reasoning when being observed.
The company reports that GPT-5.4 Thinking shows low ability to control its chain-of-thought in this way, which OpenAI frames as a positive safety signal, suggesting that monitoring the model's visible reasoning remains a meaningful safeguard. Anthropic published related research in February noting that its own models sometimes engage in reasoning that differs from their stated chain-of-thought under specific conditions; OpenAI explicitly links to that work in its launch materials. Whether the chain-of-thought controllability evaluation will hold as models become more capable is an open question. The fact that OpenAI is publishing the evaluation methodology as open source is at least a step toward external scrutiny. GPT-5.4 arrives during what is arguably the most competitive month in frontier AI to date. Anthropic's Claude Opus 4.6, released in February, still leads on several coding benchmarks. Google's Gemini 3.1 Pro leads on abstract reasoning measures and offers a larger context window at a lower price. GPT-5.4 appears to take the lead on desktop computer use and professional knowledge work tasks, as measured by the benchmarks OpenAI is choosing to highlight. No single model sweeps everything. The release cadence itself is worth noticing. GPT-5.3 Instant launched Monday; GPT-5.4 landed Thursday. That pace, two significant model releases in under a week, with the next one already being hinted at, suggests OpenAI is betting that staying visible in the news cycle is as important as any single capability leap.
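The long-context pricing caveat reported in this article (double the standard rate once input exceeds 272,000 tokens) can be illustrated with a minimal Python sketch. The function name, the `base_rate_per_million` argument, and the `whole_request` flag are assumptions made for illustration only. The article gives no base price, and it does not say whether the doubled rate applies to the entire input or only to the tokens past the threshold, so the sketch covers both readings rather than asserting either one.

```python
# Figures taken from the article; everything else here is hypothetical.
LONG_CONTEXT_THRESHOLD = 272_000  # input tokens before the surcharge applies
LONG_CONTEXT_MULTIPLIER = 2.0     # "double the standard rate"


def input_cost(tokens: int, base_rate_per_million: float,
               whole_request: bool = True) -> float:
    """Estimate input cost under the reported long-context surcharge.

    base_rate_per_million is a placeholder price, not a real one.
    whole_request=True  -> assume the doubled rate covers the entire input.
    whole_request=False -> assume only tokens past the threshold are doubled.
    """
    if tokens < 0:
        raise ValueError("tokens must be non-negative")
    rate = base_rate_per_million / 1_000_000
    if tokens <= LONG_CONTEXT_THRESHOLD:
        return tokens * rate
    if whole_request:
        return tokens * rate * LONG_CONTEXT_MULTIPLIER
    extra = tokens - LONG_CONTEXT_THRESHOLD
    return (LONG_CONTEXT_THRESHOLD * rate
            + extra * rate * LONG_CONTEXT_MULTIPLIER)


if __name__ == "__main__":
    # Placeholder rate of 1.0 per million tokens, chosen only for readability.
    for n in (100_000, 272_000, 300_000, 1_000_000):
        print(n, input_cost(n, 1.0), input_cost(n, 1.0, whole_request=False))
```

Under either reading, a request just past the threshold costs noticeably more than one just under it, which is why the article treats the 1-million-token window as coming with a caveat.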
[7]
OpenAI GPT-5.4 launches with native computer-use and 1M tokens
The release includes several versions. The standard model powers the API and developer tools, while GPT-5.4 Thinking appears in ChatGPT as a reasoning-focused variant. OpenAI also launched GPT-5.4 Pro, designed for users who want maximum performance on complex workloads. The company positions GPT-5.4 as its most capable and efficient frontier model so far, particularly for tasks involving documents, spreadsheets, coding, and multi-step workflows. OpenAI says GPT-5.4 improves significantly across several independent benchmarks used to evaluate AI reasoning and professional skills. The model posted record results on computer-use benchmarks OSWorld-Verified and WebArena Verified, which measure how effectively AI systems interact with software environments and online tools. It also achieved an 83% score on OpenAI's GDPval benchmark, which evaluates knowledge work tasks such as writing, research, and analysis. In addition, GPT-5.4 topped the APEX-Agents benchmark, developed by Mercor to test AI performance in professional domains like law and finance.
[8]
OpenAI is upgrading its professional AI model to ChatGPT-5.4
Following an update to how ChatGPT models react to users, OpenAI is rolling out an even newer version to cater to complex problems with ChatGPT-5.4. Just a couple of days prior to OpenAI's recent release, the company announced a change to its Instant model. The chatbot meant for quick answers and more conversational flow got an update to its ability to, well, converse. Following that behavioral change, the company has tagged on an additional release. OpenAI made ChatGPT-5.4 official, marking it as an upgrade for users who need the model for "professional" work and for those who need more performance on complex tasks. Those models will be available as ChatGPT-5.4 Thinking and Pro, respectively. OpenAI says the ChatGPT-5.4 Thinking model allows users to make changes during its thinking process. The response will start with a plan of action, so users have a chance to alter course if necessary. The model also gets improvements to deep web research without losing context for other conversational elements. It seems like the AI company is positioning the Thinking version as its main model. The company is also aiming the Thinking model at those who need more from an AI model when creating spreadsheets, presentations, and other documents. The model supposedly operates much better than ChatGPT-5.2, from scoring 68.4% to 87.3%. Another point of interest is ChatGPT-5.4's ability to use computers. The press release notes that the new model is "excellent at writing code to operate computers via libraries like Playwright, as well as issuing mouse and keyboard commands in response to screenshots." OpenAI notes that ChatGPT-5.4 is rolling out gradually, starting today. The Thinking model is available for Plus, Team, and Pro users, while the Pro variant is only available for Pro and Enterprise plans. Free users don't seem to be included, for now.
[9]
GPT-5.4 is here -- and OpenAI just made every other AI model look slow
OpenAI drops its newest model, giving users a new reason to keep using ChatGPT. * OpenAI has officially launched GPT-5.4, a new frontier model that consolidates its best reasoning, coding and agentic capabilities into a single package * Faster than GPT-5.2, dramatically better at real-world professional tasks * Capable of controlling computers natively OpenAI is not having a quiet week. From amending Pentagon deals to managing the PR fallout from a leaked internal transcript, the company appears to be dealing with plenty behind closed doors. Yet despite the turmoil, OpenAI has just launched GPT-5.4, its most capable and efficient frontier model to date, rolling it out simultaneously across ChatGPT, the Codex platform and its developer API. For users on Plus, Team and Pro plans, the new model -- called GPT-5.4 Thinking inside ChatGPT -- begins rolling out today. This is hardly a minor refresh. GPT-5.4 combines the elite coding abilities of GPT-5.3 Codex with significantly improved reasoning, computer use and knowledge-work capabilities. The result is a model designed to do real work, actually operating software, analyzing spreadsheets and powering long-horizon agent workflows with minimal hand-holding. What makes GPT-5.4 different? The biggest shift here is the rise of native computer use. GPT-5.4 is the first general-purpose OpenAI model that can take control of a computer -- clicking, typing and navigating software using screenshots and mouse/keyboard commands, without relying on a separate specialized model. Developers can now build agents that actually operate websites and applications, not just generate text about them. On OSWorld-Verified -- the benchmark that measures a model's ability to navigate a real desktop environment -- GPT-5.4 scores 75.0%, which not only destroys GPT-5.2's 47.3% score but also edges past the measured human baseline of 72.4%. 
In other words, this model is already better than the average person at navigating a computer via screenshots alone. Professional work: where it really shines OpenAI says GPT-5.4 is specifically engineered to be better at the kind of work real professionals do every day: building financial models, editing presentations, drafting legal documents and managing complex spreadsheets. On an internal benchmark of spreadsheet modeling tasks designed for junior investment banking analysts, GPT-5.4 scored 87.5% -- up from 68.4% for GPT-5.2. That's a massive improvement for anyone automating financial workflows. Similarly, human evaluators preferred GPT-5.4's presentations over GPT-5.2's 68% of the time, citing stronger visual variety and better use of image generation. Hallucinations are down significantly. According to OpenAI, GPT-5.4's individual factual claims are 33% less likely to be false than GPT-5.2's, and its full responses are 18% less likely to contain any errors -- a meaningful upgrade for professionals who rely on accurate outputs. Coding: faster, smarter, more visual GPT-5.4 now serves as OpenAI's primary coding model too -- replacing the need to choose between ChatGPT and Codex for most tasks. It matches or outperforms GPT-5.3-Codex on SWE-Bench Pro while also being faster, especially at lower reasoning effort settings. Within the chat, you can start coding without needing to choose. A new fast mode in Codex delivers up to 1.5x speed improvement across all supported models. OpenAI also highlights that GPT-5.4 is notably better at complex front-end coding tasks, producing results that are both more aesthetically polished and more functionally correct. A new experimental feature -- "Playwright (Interactive)" -- lets Codex visually debug web and Electron apps in real time, even testing apps as it builds them using its native computer-use capabilities. 
What everyday users can expect with ChatGPT-5.4 For everyday ChatGPT users, the most noticeable change is that GPT-5.4 Thinking now shows an upfront plan before it starts working on complex tasks. You can intervene, redirect or adjust mid-response without starting over -- a feature that promises to save significant time on multi-step research or creative projects. The model can also maintain coherent context across much longer workflows, handling extended conversations and complex prompts without losing track of earlier steps. This is now live on chatgpt.com and Android, with iOS coming soon. Availability, tool use and agents ChatGPT Plus, Team, and Pro users get GPT-5.4 Thinking starting today. Enterprise and Edu plan users can enable early access via admin settings. GPT-5.4 Pro is exclusive to Pro and Enterprise plans. Developers can access both gpt-5.4 and gpt-5.4-pro via the API immediately. Perhaps the reason for this level to receive the features first is that GPT-5.4 offers a significant upgrade with tool search: instead of loading every available tool's full definition into context upfront (which can burn tens of thousands of tokens per request), the model receives a lightweight list and looks up specific tools only when needed. In testing on 250 tasks from Scale's MCP Atlas benchmark with 36 MCP servers enabled, the tool-search configuration reduced total token usage by 47% while maintaining accuracy. For developers building large agentic systems, that translates directly to lower costs and faster response times. It's clear OpenAI is catering to developers and power users with this rollout. Final thoughts GPT-5.4 is a legitimately significant release. Native computer use alone would make it noteworthy -- but combined with best-in-class professional knowledge performance, a 1M token context window, and dramatically improved tool efficiency, it represents a meaningful step change for anyone building with or working alongside AI. 
Bottom line: This is the model to watch in 2026.
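The tool-search idea described in this article (give the model a lightweight tool list and fetch full definitions only on demand) can be illustrated with a short Python sketch. This is not OpenAI's implementation or API: the ToolDefinition and ToolRegistry names, the three example tools, and the whitespace-based token estimate are all hypothetical and chosen only to show why deferring full definitions can shrink the context.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolDefinition:
    """Hypothetical tool record; not an OpenAI API type."""
    name: str
    summary: str  # short description, cheap to keep in context
    schema: str   # full definition, potentially very large


class ToolRegistry:
    """Illustrative registry: lightweight listing plus on-demand lookup."""

    def __init__(self, tools):
        self._tools = {tool.name: tool for tool in tools}

    def lightweight_list(self):
        # Only names and one-line summaries are exposed upfront.
        return [(t.name, t.summary) for t in self._tools.values()]

    def search(self, query):
        # Naive substring match; a real system would use something better.
        q = query.lower()
        return [
            t for t in self._tools.values()
            if q in t.name.lower() or q in t.summary.lower()
        ]


def approx_tokens(text):
    # Crude stand-in for a tokenizer: count whitespace-separated words.
    return len(text.split())


tools = [
    ToolDefinition("get_weather", "Look up a weather forecast", "schema " * 200),
    ToolDefinition("send_email", "Send an email message", "schema " * 300),
    ToolDefinition("query_db", "Run a SQL query", "schema " * 500),
]
registry = ToolRegistry(tools)

# Naive approach: every full definition goes into the prompt.
upfront = sum(approx_tokens(t.schema) for t in tools)

# Tool-search approach: the listing, plus only the definitions actually needed.
listing = sum(approx_tokens(f"{n} {s}") for n, s in registry.lightweight_list())
needed = registry.search("email")
deferred = listing + sum(approx_tokens(t.schema) for t in needed)

print(f"upfront={upfront} deferred={deferred}")
```

The size of the saving here depends entirely on the made-up schema lengths; the 47% figure quoted in the article comes from OpenAI's own evaluation and is not something this sketch reproduces.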
[10]
OpenAI releases new ChatGPT model for working in Excel and Google Sheets
Why it matters: The update keeps the pressure on rivals like Google and Anthropic in the fast-moving race to add AI features, especially for the workplace. Driving the news: GPT-5.4 is designed to be less error-prone, more efficient and better at workplace tasks like drafting documents, OpenAI said. * The new model can create files in fewer tries with less back-and-forth than prior models, the company said. * GPT-5.4 outperformed office workers 83% of the time on GDPval, an OpenAI benchmark measuring performance on real-world tasks across 44 occupations. * The model can also solve problems using fewer tokens, OpenAI says -- which can translate to faster responses and lower costs. * The company is also debuting OpenAI for Financial Services, a set of new tools that includes the version of ChatGPT that runs inside spreadsheets and new apps and skills within ChatGPT. * Partners include FactSet, MSCI, Third Bridge and Moody's. Zoom in: OpenAI will offer some of GPT 5.4's advances first to developers and those using OpenAI's Codex developer tool, including native computer use, the ability to search for tools and connectors, and support for up to 1 million tokens of context.
[11]
OpenAI launches GPT-5.4 with native computer use mode, financial plugins for Excel, Sheets
The AI updates aren't slowing down. Literally two days after OpenAI launched a new underlying AI model for ChatGPT called GPT-5.3 Instant, the company has unveiled another, even more massive upgrade: GPT-5.4. Actually, GPT-5.4 comes in two varieties: GPT-5.4 Thinking and GPT-5.4 Pro, the latter designed for the most complex tasks. Both will be available in OpenAI's paid application programming interface (API) and Codex software development application, while GPT-5.4 Thinking will be available to all paid subscribers of ChatGPT (Plus, the $20-per-month plan, and up) and Pro will be reserved for ChatGPT Pro ($200 monthly) and Enterprise plan users. ChatGPT Free users will also get a taste of GPT-5.4, but only when their queries are auto-routed to the model, according to an OpenAI spokesperson. The big headlines on this release are efficiency, with OpenAI reporting that GPT-5.4 uses far fewer tokens (47% fewer on some tasks) than its predecessors, and, arguably even more impressively, a new "native" Computer Use mode available through the API and its Codex that lets GPT-5.4 navigate a user's computer like a human and work across applications. The company is also releasing a new suite of ChatGPT integrations allowing GPT-5.4 to be plugged directly into users' Microsoft Excel and Google Sheets spreadsheets and cells, enabling granular analysis and automated task completion that should speed up work across the enterprise, but may make fears of white collar layoffs even more pronounced on the heels of similar offerings from Anthropic's Claude and its new Cowork application. OpenAI says GPT-5.4 supports up to 1 million tokens of context in the API and Codex, enabling agents to plan, execute, and verify tasks across long horizons -- however, it charges double the cost per 1 million tokens once the input exceeds 272,000 tokens. 
The most consequential capability OpenAI highlights is that GPT-5.4 is its first general-purpose model released with native, state-of-the-art computer-use capabilities in Codex and the API, enabling agents to operate computers and carry out multi-step workflows across applications. OpenAI says the model can both write code to operate computers via libraries like Playwright and issue mouse and keyboard commands in response to screenshots. OpenAI also claims a jump in agentic web browsing. Benchmark results are presented as evidence that this is not merely a UI wrapper. On BrowseComp, which measures how well AI agents can persistently browse the web to find hard-to-locate information, OpenAI reports GPT-5.4 improving by 17% absolute over GPT-5.2, and GPT-5.4 Pro reaching 89.3%, described as a new state of the art. On OSWorld-Verified, which measures desktop navigation using screenshots plus keyboard and mouse actions, OpenAI reports GPT-5.4 at 75.0% success, compared to 47.3% for GPT-5.2, and notes reported human performance at 72.4%. On WebArena-Verified, GPT-5.4 reaches 67.3% success using both DOM- and screenshot-driven interaction, compared to 65.4% for GPT-5.2. On Online-Mind2Web, OpenAI reports 92.8% success using screenshot-based observations alone. OpenAI also links computer use to improvements in vision and document handling. On MMMU-Pro, GPT-5.4 reaches 81.2% success without tool use, compared with 79.5% for GPT-5.2, and OpenAI says it achieves that result using a fraction of the "thinking tokens." On OmniDocBench, GPT-5.4's average error is reported at 0.109, improved from 0.140 for GPT-5.2. The post also describes expanded support for high-fidelity image inputs, including an "original" detail level up to 10.24M pixels. OpenAI positions GPT-5.4 as built for longer, multi-step workflows -- work that increasingly looks like an agent keeping state across many actions rather than a chatbot responding once. 
As tool ecosystems get larger, OpenAI argues that the naive approach -- dumping every tool definition into the prompt -- creates a tax paid on every request: cost, latency, and context pollution. GPT-5.4 introduces tool search in the API as a structural fix. Instead of receiving all tool definitions upfront, the model receives a lightweight list of tools plus a search capability, and it retrieves full tool definitions only when they're actually needed. OpenAI describes the efficiency win with a concrete comparison: on 250 tasks from Scale's MCP Atlas benchmark, running with 36 MCP servers enabled, the tool-search configuration reduced total token usage by 47% while achieving the same accuracy as a configuration that exposed all MCP functions directly in context. That 47% figure is specifically about the tool-search setup in that evaluation -- not a blanket claim that GPT-5.4 uses 47% fewer tokens for every kind of task. OpenAI's coding pitch is that GPT-5.4 combines the coding strengths of GPT-5.3-Codex with stronger tool and computer-use capabilities that matter when tasks aren't single-shot. GPT-5.4 matches or outperforms GPT-5.3-Codex on SWE-Bench Pro while being lower latency across reasoning efforts. Codex also gets workflow-level knobs. OpenAI says /fast mode delivers up to 1.5× faster performance across supported models, including GPT-5.4, describing it as the same model and intelligence "just faster." And it describes releasing an experimental Codex skill, "Playwright (Interactive)", meant to demonstrate how coding and computer use can work in tandem -- visually debugging web and Electron apps and testing an app as it's being built. Alongside GPT-5.4, OpenAI is announcing a suite of secure AI products in ChatGPT built for enterprises and financial institutions, powered by GPT-5.4 for advanced financial reasoning and Excel-based modeling. 
The centerpiece is ChatGPT for Excel and Google Sheets (beta), which OpenAI describes as ChatGPT embedded directly in spreadsheets to build, analyze, and update complex financial models using the formulas and structures teams already rely on. The suite also includes new ChatGPT app integrations intended to unify market, company, and internal data into a single workflow, naming FactSet, MSCI, Third Bridge, and Moody's. And it introduces reusable "Skills" for recurring finance work such as earnings previews, comparables analysis, DCF analysis, and investment memo drafting. OpenAI anchors the finance push with an internal benchmark claim: model performance increased from 43.7% with GPT-5 to 88.0% with GPT-5.4 Thinking on an OpenAI internal investment banking benchmark. OpenAI leans on benchmarks intended to resemble real office deliverables, not just puzzle-solving. On GDPval, an evaluation spanning "well-specified knowledge work" across 44 occupations, OpenAI reports that GPT-5.4 matches or exceeds industry professionals in 83.0% of comparisons, compared to 71.0% for GPT-5.2. The company also highlights specific improvements in the kinds of artifacts that tend to expose model weaknesses: structured tables, formulas, narrative coherence, and design quality. In an internal benchmark of spreadsheet modeling tasks modeled after what a junior investment banking analyst might do, GPT-5.4 reaches a mean score of 87.5%, compared to 68.4% for GPT-5.2. And on a set of presentation evaluation prompts, OpenAI says human raters preferred GPT-5.4's presentations 68.0% of the time over GPT-5.2's, citing stronger aesthetics, greater visual variety, and more effective use of image generation. OpenAI describes GPT-5.4 as its most factual model yet and connects that claim to a practical dataset: de-identified prompts where users previously flagged factual errors. 
On that set, OpenAI reports GPT-5.4's individual claims are 33% less likely to be false and its full responses are 18% less likely to contain any errors compared to GPT-5.2. In statements provided to VentureBeat from OpenAI and attributed to early GPT-5.4 testers, Daniel Swiecki of Walleye Capital says that on internal finance and Excel evaluations, GPT-5.4 improved accuracy by 30 percentage points, which he links to expanded automation for model updates and scenario analysis. Brendan Foody, CEO of Mercor, calls GPT-5.4 the best model the company has tried and says it's now top of Mercor's APEX-Agents benchmark for professional services work, emphasizing long-horizon deliverables like slide decks, financial models, and legal analysis. In the API, OpenAI says GPT-5.4 Thinking is available as and GPT-5.4 Pro as . Pricing is as follows: This makes GPT-5.4 among the more expensive models to run over API compared to the entire field, as seen in the table below. Another important note: with GPT-5.4, requests that exceed 272,000 input tokens are billed at 2X the normal rate, reflecting the ability to send prompts larger than earlier models supported. In Codex, compaction defaults to 272k tokens, and the higher long-context pricing applies only when the input exceeds 272k -- meaning developers can keep sending prompts at or under that size without triggering the higher rate, but can opt into larger prompts by raising the compaction limit, with only those larger requests billed differently. An OpenAI spokesperson said that in the API the maximum output is 128,000 tokens, the same as previous models. 
Finally, on why GPT-5.4 is priced higher at baseline, the spokesperson attributed it to three factors: higher capability on complex tasks (including coding, computer use, deep research, advanced document generation, and tool use), major research improvements from OpenAI's roadmap, and more efficient reasoning that uses fewer reasoning tokens for comparable tasks -- adding that OpenAI believes GPT-5.4 remains below comparable frontier models on pricing even with the increase. Across the release and the follow-up clarifications, GPT-5.4 is positioned as a model meant to move beyond "answer generation" and into sustained professional workflows -- ones that require tool orchestration, computer interaction, long context, and outputs that look like the artifacts people actually use at work. OpenAI's emphasis on token efficiency, tool search, native computer use, and reduced user-flagged factual errors all point in the same direction: making agentic systems more viable in production by lowering the cost of retries -- whether that retry is a human re-prompting, an agent calling another tool, or a workflow re-running because the first pass didn't stick.
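The long-context billing rule reported in this article (requests above 272,000 input tokens billed at 2X the normal rate) can be expressed as a small Python sketch. The per-token prices are not reproduced in the text above, so the base rate below is a caller-supplied placeholder rather than a real price. The function name and constants are hypothetical, and the sketch assumes the doubled rate applies to the whole input of an over-threshold request, which is one reading of the article's wording.

```python
# Threshold and multiplier as reported in the article.
LONG_CONTEXT_THRESHOLD = 272_000   # input tokens
LONG_CONTEXT_MULTIPLIER = 2.0      # "billed at 2X the normal rate"


def input_cost(input_tokens: int, base_rate_per_million: float) -> float:
    """Estimate input cost for one request.

    base_rate_per_million is a placeholder supplied by the caller; the
    article's actual pricing table is not available here.
    Assumption: the 2X rate applies to the entire input of a request that
    exceeds the threshold, not only to the tokens beyond it.
    """
    if input_tokens < 0:
        raise ValueError("input_tokens must be non-negative")
    if input_tokens > LONG_CONTEXT_THRESHOLD:
        multiplier = LONG_CONTEXT_MULTIPLIER
    else:
        multiplier = 1.0
    return input_tokens / 1_000_000 * base_rate_per_million * multiplier


# Example with a made-up base rate of 1.0 currency unit per million tokens.
at_threshold = input_cost(272_000, 1.0)    # normal rate still applies
over_threshold = input_cost(500_000, 1.0)  # whole request billed at 2X
print(at_threshold, over_threshold)
```

Under this reading, a prompt at exactly 272,000 tokens stays at the normal rate, which matches the article's note that Codex compaction defaults to 272k so that developers do not trigger the higher rate unless they opt in.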
[12]
ChatGPT just got another brain boost with GPT-5.4 Thinking -- and it's built for bigger, more complex tasks
* OpenAI has introduced ChatGPT 5.4 Thinking in ChatGPT to handle more complex reasoning and multi-step tasks * ChatGPT 5.3 Instant will remain the default model for everyday conversations in ChatGPT * The new model allows Plus, Team, and Pro users to guide the AI as it plans and completes larger projects Barely a few days after OpenAI introduced ChatGPT 5.3 Instant, ChatGPT has received an additional upgrade called ChatGPT 5.4 Thinking, powered by the new GPT-5.4 model. ChatGPT 5.4 Thinking, along with a higher-performance ChatGPT 5.4 Pro, is rolling out for the "Thinking" tool, while ChatGPT 5.3 Instant remains the default model powering most ChatGPT conversations. The most important detail is how the new model fits into ChatGPT's structure. ChatGPT 5.3 is basically the version optimized for speed and conversational flow, while ChatGPT 5.4 Thinking is designed for situations that require more complex reasoning. It appears as an optional mode for Plus, Team, and Pro subscribers, allowing users to switch into a more analytical version of the model when they need it. That distinction between speed and depth reflects a broader shift in how ChatGPT is being developed. Instead of relying on a single model to handle every situation, OpenAI is building a layered system in which different models serve different roles. The most visible difference with ChatGPT 5.4 Thinking is how the model explains its approach. When given a complicated prompt, the chatbot will offer a short explanation of how it plans to solve the problem. This preview allows the user to intervene while the response is still unfolding. If the model's plan misses the mark, the user can adjust its direction before the final output arrives. Essentially, ChatGPT 5.4 thinks out loud and lets you interrupt and correct it. How it works Imagine asking ChatGPT to plan a family vacation. With ChatGPT 5.4 Thinking enabled, the system might start by describing the steps it intends to take. 
It could say it will look for travel dates, estimate costs, compare flight options, and outline a daily itinerary. If the user prefers road trips instead of flights or wants to prioritize museums over beaches, that change can happen before the itinerary is generated. The feature also works well with longer prompts. ChatGPT 5.4 is designed to maintain a stronger awareness of earlier parts of the conversation while working through multi-step problems. That allows it to handle more complicated requests without losing track of the overall goal. OpenAI describes this improvement as part of a broader effort to make ChatGPT more useful for professional tasks rather than simply interesting conversation. OpenAI says GPT-5.4 performs especially well in areas such as document creation, spreadsheet analysis, and presentation design. The model was evaluated using tasks that resemble the kind of work performed across dozens of professional occupations. In those tests, GPT-5.4 matched or exceeded human professionals in a large majority of comparisons. Spreadsheet modeling is a particularly strong area. OpenAI says the model scored significantly higher than GPT-5.2 on tasks similar to the financial modeling assignments often handled by junior analysts. The improvements also extend to visual understanding. GPT-5.4 can analyze high-resolution images and complex documents more effectively, which helps when interpreting charts, diagrams, or scanned paperwork. For a regular ChatGPT user, that combination of capabilities means the chatbot becomes more useful for tasks that involve real information rather than abstract questions. A user might ask ChatGPT to analyze a dataset, produce a spreadsheet model, write a report summarizing the results, and create slides explaining the findings. GPT-5.4 is designed to handle that entire chain of tasks with fewer revisions. 
Speed versus power Even with the arrival of GPT-5.4 Thinking, GPT-5.3 Instant continues to play a central role in the ChatGPT experience. GPT-5.3 Instant remains the default model that powers the standard chat interface. It was introduced to make conversations smoother, reduce unnecessary refusals, and produce faster answers. That focus on speed and responsiveness makes it well-suited for everyday interactions such as brainstorming ideas, answering questions, or drafting quick messages. GPT-5.4 Thinking is positioned differently. Because it performs deeper reasoning and more complex analysis, it requires more computational effort. OpenAI has limited access to the Thinking mode to Plus, Team, and Pro subscribers. Users who need even more power can access GPT-5.4 Pro through Pro and Enterprise plans. That version is designed for particularly demanding tasks such as advanced coding or complex analytical work. This layered approach allows ChatGPT to balance speed and capability. GPT-5.3 Instant handles the majority of everyday conversations, while GPT-5.4 Thinking is available when users need more sophisticated reasoning. OpenAI's decision to keep GPT-5.3 Instant as the default also highlights something important about the direction of modern AI. Raw capability is only part of the equation. For most people, what matters just as much is speed, responsiveness, and a system that feels natural to use. The future of AI may not arrive through sudden breakthroughs alone, but through quieter updates like this one -- small shifts in how these systems reason, respond, and fit into everyday tasks.
[13]
OpenAI launches GPT-5.4, its most powerful model for enterprise work | Fortune
The new model consolidates some of the capabilities that OpenAI had previously spread across separate models, bringing together the coding strengths of GPT-5.3-Codex -- the company's leading programming model -- improved reasoning skills, and the agentic ability for the model to navigate desktops, browsers, and software applications autonomously. The model is rolling out on Thursday to ChatGPT Plus, Team, and Pro subscribers, as well as through the company's API. A more powerful version, GPT-5.4 Pro, is also available for users who need maximum performance on complex tasks. Alongside the model launch, OpenAI is introducing ChatGPT for Excel and Google Sheets in beta, a version of ChatGPT embedded directly in spreadsheets to build, analyze, and update complex financial models. The company is also launching new ChatGPT app integrations -- including FactSet, MSCI, Third Bridge, and Moody's -- designed to let teams pull market, company, and internal data into a single workflow. It also puts OpenAI in more direct competition with rival Anthropic, which launched similar products as part of the Claude for Financial Services launch in July 2025 that were expanded on later in the year. Both companies are racing to capture the enterprise market with tools that can do real-work for sectors ready to adopt AI. The new announcement could also spark a fresh wave of investor anxiety about the impact of AI on traditional financial data providers, many of whose stocks have already been affected by broader fears of AI-driven disruption to enterprise software. Earlier this year, the release of Anthropic's Cowork plug-ins triggered a broad selloff across SaaS stocks, as markets spooked on the idea that AI tools could make legacy software providers obsolete. One of the most significant aspects of GPT-5.4 for enterprise customers is the out-of-the-box agentic capabilities. 
The model can operate computers and software autonomously, search for and use external tools on demand, and handle complex, multi-step tasks, all without developers having to build that infrastructure themselves. OpenAI says GPT-5.4 is the company's most factual and reliable model to date. The company said in a blogpost they had reduced the rate of hallucinations, with individual claims 33% less likely to be false and full responses 18% less likely to contain errors compared to GPT-5.2. "Developers don't just need a model that writes code. They need one that thinks through problems the way they do. We're seeing GPT-5.4 perform exceptionally well at logical reasoning and executing intricate, multi-step, tool-dependent Workflows," Mario Rodriguez, Chief Product Officer at GitHub, said of the model. These new features position OpenAI as a more direct player in the increasingly competitive field of agentic products, such as Perplexity Computer, Microsoft's Copilot Tasks, and OpenClaw. As demonstrated with the recent popularity of OpenClaw, users are increasingly looking for AI systems that can handle extended workflows with minimal human intervention. OpenAI also says GPT-5.4 is significantly more token-efficient than its predecessors -- meaning it uses fewer tokens to solve problems. Despite being priced slightly higher per token than GPT-5.2, OpenAI says those efficiency gains mean it requires fewer tokens to complete many tasks, which could offset the higher cost for some users.
[14]
OpenAI's new GPT-5.4 model makes ChatGPT better at handling your complex, multi-step workflows
The update brings smarter reasoning, better task automation, and a new "Thinking" mode to help you get more done. OpenAI today unveiled GPT-5.4, the latest update to its flagship AI model, bringing improvements in reasoning, coding, and real-world task automation. The model is rolling out across ChatGPT, the API, and developer tools, with new variants aimed at both everyday users and enterprise workloads. One of the biggest changes is the model's ability to interact with computers more directly. GPT-5.4 can interpret screenshots, operate browsers, and issue keyboard and mouse commands to complete tasks across different apps and services. This makes it capable of carrying out multi-step workflows that previously required human input. This marks a major step forward toward more autonomous AI agents. The update also improves the model's ability to research complex questions. OpenAI says GPT-5.4 can search across multiple rounds of information gathering and combine findings into clearer, more structured answers. The company claims the model is its most factual yet, reducing false claims by about 33 percent compared with GPT-5.2. A new "Thinking" mode for tougher questions Alongside the core model, OpenAI has introduced GPT-5.4 Thinking inside ChatGPT. It's designed for more complex prompts and provides a visible outline of the model's reasoning process while it works through a problem. Users can adjust instructions mid-response, helping guide the AI toward a more desirable outcome without restarting the conversation. GPT-5.4 is also designed to handle longer and more complex tasks, retaining information across multiple steps and extended workflows. These improvements could be particularly useful in coding tools like OpenAI Codex, where the model can help automate large or time-consuming development tasks. The model is now rolling out to ChatGPT users on the web and Android, with iOS support expected soon. 
OpenAI is also offering a GPT-5.4 Pro version for enterprise and academic customers who need maximum performance on complex workloads.
[15]
OpenAI Launches GPT-5.4 Just Days After Last Version as 'QuitGPT' Exodus Gains Steam - Decrypt
Enterprise users benefit most as GPT-5.4 delivers faster AI agents with fewer tokens. OpenAI began rolling out GPT-5.4 -- its most capable model to date -- on Thursday as the company scrambles to contain a PR crisis that has seen an estimated 2.5 million users take actions against the company, either by canceling their subscription or sharing the boycott on social media. The so-called QuitGPT movement exploded after OpenAI revealed a deal with the U.S. Department of Defense hours after Anthropic publicly walked away from the same contract -- earning the Claude maker the public scorn of President Trump and other government officials. Anthropic's sticking point: The DoD refused to include language explicitly prohibiting the deployment of autonomous weapons and mass surveillance of U.S. citizens. OpenAI took the deal anyway. CEO Sam Altman, who has been fielding questions about the apparent gap between his company's stated safety red lines and the contract's actual language, needs those users back. Enter GPT-5.4... just two days after GPT-5.3 was introduced. The new model consolidates reasoning, coding, and agentic capabilities into a single release. It also has a million tokens of context capability, which translates into users having more freedom to handle large amounts of information in a single session. On paper, the numbers look promising. On GDPval -- a benchmark testing knowledge work across 44 occupations -- GPT-5.4 matches or beats industry professionals in 83.0% of comparisons, up from 70.9% for GPT-5.2. Computer use is the biggest leap: On OSWorld-Verified, which measures a model's ability to operate a desktop through screenshots and keyboard/mouse actions, GPT-5.4 hits a 75.0% success rate versus GPT-5.2's 47.3% -- and clears the human baseline of 72.4%. On BrowseComp, a test of deep web research, it jumps 17 percentage points over GPT-5.2. 
The 1 million token context window and a mid-response steering feature -- letting users redirect the model while it's still thinking -- round out the headline features. The feature saves time and computation by avoiding the need to discard all previously generated tokens when an error is detected. It's important to note that some benchmarks mostly compare GPT-5.4 -- and most of the time, reasoning was set to extra high effort, which free and Plus users don't get to enjoy -- to GPT-5.2, skipping over GPT-5.3 entirely. For users already on GPT-5.3, several gains may feel more incremental than the charts suggest. Coders have the most reason to temper expectations: On SWE-Bench Pro, the improvement from GPT-5.3-Codex (56.8%) to GPT-5.4 (57.7%) is barely a rounding error. The model also claims significantly fewer tokens are required to complete tasks compared to GPT-5.2. "GPT‑5.4 is our most token-efficient reasoning model yet, using significantly fewer tokens to solve problems when compared to GPT‑5.2", OpenAI said. That said, any improvement in this field is a positive for developers who use OpenAI models via API and get charged per token used. A model with an efficient chain of thought may provide the same results at a fraction of the cost, versus a model that tends to overthink things to ensure it reaches the proper conclusion. There's another wrinkle for anyone hoping to use the new model right now: OpenAI says GPT-5.4 will be released today, but it wasn't yet available as of this writing, so it is likely being slowly rolled out. For most users, the best model is GPT 5.3, and it can only be used for instant replies, meaning it provides answers that don't require too much effort. Users who rely on thinking -- OpenAI's terminology for extended chain-of-thought reasoning on complex tasks -- are still on GPT-5.2. In other words, the users most likely to push the model's limits are the last ones to get it. 
The clearest beneficiaries are enterprise users doing document-heavy work. On an internal spreadsheet modeling benchmark, GPT-5.4 scored 87.3% against GPT-5.2's 68.4%. Legal research firm Harvey said it scored 91% on its BigLaw Bench eval. Mainstay, which runs agents across 30,000 property tax portals, reported a 95% first-attempt success rate and sessions running "~3x faster while using ~70% fewer tokens." That's the kind of efficiency argument that might matter to enterprise procurement teams -- but it's a harder sell to the individual user reconsidering whether to delete their account.
[16]
OpenAI's GPT-5.4 AI Model Is Here, and It Can Use Your Computer
OpenAI released the GPT-5.4 artificial intelligence (AI) model on Thursday. The third major update to the GPT-5 foundational model arrives with improvements across parameters and new agentic capabilities, including computer-use. Unlike GPT-5.3-Codex, which was only released as a coding model for the Codex platform, the latest large language model (LLM) is being released for both ChatGPT and the application programming interface (API). Additionally, the San Francisco-based AI giant also released a more advanced version of the model dubbed GPT-5.4 Pro. GPT-5.4 AI Models Released In a post, the AI giant announced and detailed the release of the new AI models. GPT-5.4 is available in both Codex and API, as well as in ChatGPT as GPT-5.4 Thinking. At launch, the model will only be available to the Plus, Team, and Pro subscribers, replacing GPT-5.2 Thinking. On the other hand, the GPT-5.4 Pro is currently available only to Pro and Enterprise subscribers. OpenAI says that the GPT-5.4 model brings improvements in reasoning, coding, and agentic workflows. It incorporates the coding capabilities of 5.3-Codex, as well as improves general-purpose behaviour when working with tools and carrying out tasks involving spreadsheets, presentations, and documents. However, the most interesting new capability is computer-use. It refers to an LLM's ability to deploy AI agents that can access a virtual machine and carry out complex tasks involving web browsing, placing orders, and more. OpenAI claims GPT-5.4 can write code to operate computers, issue mouse and keyboard commands, and has a steerable behaviour. Based on internal evaluations, the AI giant claimed that the model scored a 75 percent success rate on the OSWorld-Verified benchmark test, which measures a model's ability to navigate a desktop environment through screenshots and keyboard and mouse actions. OpenAI says the model's improved computer-use skills are powered by its enhanced visual perception capabilities. 
On the MMMU-Pro benchmark, which measures an AI model's visual reasoning, GPT-5.4 scored an 81.2 percent success rate. Coming to safety, the company said GPT-5.4 is classified as High cyber capability, and is being deployed with an expanded cyber safety stack that includes monitoring systems, trusted access controls, and asynchronous blocking. OpenAI has also introduced chain-of-thought (CoT) controllability as a new open-source evaluation. It measures whether models can deliberately obfuscate their reasoning to avoid being monitored.
[17]
OpenAI launches GPT‑5.4 Thinking and Pro, its 'most factual and efficient' model yet
OpenAI has introduced GPT-5.4 Thinking and GPT-5.4 Pro, the newest upgrades to its GPT-5 AI models. The company says the model is more factual and token-efficient, with faster responses, improved research abilities, stronger context retention, better benchmark scores, and lower error rates, alongside new steerability and safety evaluation features. OpenAI has unveiled GPT‑5.4 Thinking and GPT‑5.4 Pro, the latest upgrades to its GPT-5 family of artificial intelligence (AI) models, designed to provide solutions for professional workflows. Taking to social media platform X, the company described GPT‑5.4 as its most "factual and efficient" model, using fewer tokens while providing faster responses. In ChatGPT, GPT‑5.4 Thinking offers improved deep web research and better context retention over longer interactions, the company said. "....and oh -- you can now interrupt the model and add instructions or adjust its direction mid-response," it added. GPT‑5.4: Key features The API version of GPT‑5.4 will support context windows as large as 1 million tokens, the largest available from OpenAI to date. OpenAI emphasised the model's improved token efficiency, noting it can solve the same problems with far fewer tokens than its predecessor. To put that in context, tokens refer to the fundamental, smallest units of data that AI models (especially Large Language Models) use to process, understand, and generate text or images. "GPT‑5.4 is our most token efficient reasoning model yet, using significantly fewer tokens to solve problems when compared to GPT‑5.2 -- translating to reduced token usage and faster speeds," OpenAI said in a blog post. "We've designed GPT‑5.4 to be performant across a wide range of computer-use workloads. It is excellent at writing code to operate computers via libraries such as Playwright, as well as issuing mouse and keyboard commands in response to screenshots ..... 
Developers can even configure the model's safety behavior to suit different levels of risk tolerance by specifying custom confirmation policies," it wrote. GPT‑5.4 shows record-breaking benchmark performance, including top scores in computer use benchmarks OSWorld-Verified and WebArena Verified, according to OpenAI. It achieved 83% on OpenAI's GDPval test for knowledge work. OpenAI has also continued its efforts to reduce hallucinations and factual errors. GPT‑5.4 is 33% less likely to make errors in individual claims compared with GPT‑5.2, and overall responses are 18% less likely to contain mistakes. Steerability GPT‑5.4 Thinking in ChatGPT introduces a preamble for longer, more complex queries, similar to Codex (OpenAI's coding agent, which understands and generates code from natural language). Users can add instructions or change the model's direction mid-response, making it easier to guide outputs without starting over or requiring multiple additional turns. This feature is already available on ChatGPT and the Android app, with iOS access coming soon. The model can also think for longer on difficult tasks while maintaining a strong awareness of earlier conversation steps. This allows it to handle longer workflows and more complex prompts while keeping responses coherent and relevant throughout. Availability and pricing GPT‑5.4 is being gradually rolled out from Friday across ChatGPT and Codex. * ChatGPT Plus, Team, and Pro users now have access to GPT‑5.4 Thinking, which replaces GPT‑5.2 Thinking. * GPT‑5.2 Thinking remains available for three months for paid users under the Legacy Models section, after which it will be retired on June 5, 2026. * Enterprise and Education plans can enable early access via admin settings. * GPT‑5.4 Pro is available for Pro and Enterprise plans. * Context windows in ChatGPT for GPT‑5.4 Thinking remain unchanged from GPT‑5.2 Thinking. 
Safety A new safety evaluation has been added to examine the model's chain-of-thought (CoT), the running commentary used to explain reasoning in multi-step tasks. Researchers have long been concerned that models could misrepresent their CoT under certain conditions. "We find that GPT‑5.4 Thinking's ability to control its CoT is low, which is a positive property for safety, suggesting that the model lacks the ability to hide its reasoning and that CoT monitoring remains an effective safety tool," OpenAI said.
[18]
ChatGPT 5.4 Pro Adds Native Desktop Control for Real-Time Work
OpenAI's ChatGPT 5.4 Pro represents a significant development in artificial intelligence, excelling in tasks that require advanced reasoning and precision. According to AI Grid, the model achieved a 52% success rate on professional task benchmarks, effectively completing complex workflows in industries like finance and healthcare. Its performance on the Frontier Math benchmark, where it solved research-level mathematical problems with high accuracy, further highlights its ability to tackle knowledge-intensive challenges with speed and reliability. Discover how ChatGPT 5.4 Pro supports applications such as financial modeling, legal research and cybersecurity, while also addressing the dual-use risks associated with its deployment. The feature examines its real-time task execution capabilities, including navigating desktop environments, alongside the ethical and cost-related considerations that may influence its broader adoption. ChatGPT 5.4 Pro has achieved unprecedented results in rigorous performance evaluations, solidifying its position as the most advanced AI model to date. It surpasses competitors such as Gemini 3.1 and Claude Opus 4.6 in critical areas, including: One of its most notable achievements is its performance in the Frontier Math benchmark, where it solved research-level mathematical problems with unparalleled accuracy. In professional task benchmarks, ChatGPT 5.4 Pro achieved a 52% success rate, double that of earlier models, while completing tasks in a fraction of the time. Across 44 occupations, it matches or exceeds human-level performance in 83% of tasks, offering faster and more cost-effective solutions. These results highlight its potential to transform industries that rely on complex, knowledge-intensive workflows, such as finance, healthcare and engineering. The versatility of ChatGPT 5.4 Pro is one of its defining features, allowing it to excel across a wide range of professional and creative domains. 
Its applications include: For the first time, an AI model integrates native computer-use capabilities. ChatGPT 5.4 Pro can navigate desktop environments, execute tasks in real time and interact seamlessly with software applications. This functionality allows it to automate workflows, enhance productivity and streamline operations across industries. For example, in financial services, it can analyze market data and generate actionable insights, while in legal contexts, it can review case law and draft documents with remarkable efficiency. Its ability to adapt to diverse tasks makes it a valuable tool for businesses, researchers and creative professionals alike. By reducing the time and effort required for complex tasks, ChatGPT 5.4 Pro enables users to focus on higher-level strategic decisions and innovation. Despite its new capabilities, the high operational costs of ChatGPT 5.4 Pro present a significant challenge to widespread adoption. At $30 per million input tokens and $180 per million output tokens, the financial burden may deter smaller organizations and independent professionals from using its full potential. While its efficiency and productivity gains can offset some of these costs, the price-to-performance ratio remains a critical consideration for many potential users. For larger enterprises, the investment may be justified by the model's ability to streamline operations and reduce labor costs. However, smaller businesses and startups may struggle to afford the technology, potentially widening the gap between organizations that can access innovative AI tools and those that cannot. This raises important questions about equity and accessibility in the rapidly evolving AI landscape. ChatGPT 5.4 Pro's advanced capabilities extend into the realm of cybersecurity, where it is classified as a "high capability" model. 
On one hand, it can assist organizations in identifying vulnerabilities, strengthening defenses and responding to potential threats. On the other hand, its ability to execute complex, multi-step cyberattacks poses significant risks if misused. The dual-use nature of ChatGPT 5.4 Pro underscores the importance of implementing robust access controls and security measures. For example, mandatory ID verification and tiered access levels could help prevent unauthorized exploitation of the model's capabilities. Without such safeguards, the risk of large-scale cyberattacks and other malicious activities increases, potentially endangering critical infrastructure and sensitive data. The broader implications of these risks highlight the need for comprehensive regulatory frameworks. Policymakers, researchers and industry leaders must collaborate to establish guidelines that ensure the responsible use of advanced AI technologies while minimizing potential harm. The rapid advancements demonstrated by ChatGPT 5.4 Pro have far-reaching implications for society, particularly in the context of workforce dynamics and ethical considerations. Its ability to perform complex tasks at a human level could disrupt job markets, especially in white-collar professions such as finance, law and research. As AI systems take on roles traditionally performed by humans, industries may need to adapt by reskilling workers and redefining job functions. At the same time, the accelerating pace of AI development raises pressing ethical and regulatory challenges. As models like ChatGPT 5.4 Pro approach human-level performance, questions about accountability, transparency and societal impact become increasingly urgent. For instance, who is responsible for decisions made by AI systems and how can biases in training data be mitigated to ensure fair outcomes? Addressing these challenges will require a collaborative approach involving policymakers, researchers and industry stakeholders. 
By prioritizing transparency, fairness and accountability, it is possible to harness the benefits of AI while minimizing its potential drawbacks. ChatGPT 5.4 Pro represents a pivotal moment in the evolution of artificial intelligence, offering unmatched capabilities in reasoning, problem-solving and professional task execution. Its potential to transform industries and enhance productivity is undeniable. However, its high cost and potential cybersecurity risks highlight the need for careful consideration of its broader implications. As AI technologies continue to advance, the challenge lies in balancing innovation with ethical responsibility. Making sure that the benefits of AI are realized without compromising safety, security, or societal well-being will require thoughtful regulation, transparent development practices and ongoing dialogue among stakeholders. By addressing these challenges proactively, it is possible to unlock the full potential of AI while safeguarding the interests of individuals, organizations and society as a whole.
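To make the per-token pricing quoted in this article concrete, here is a minimal sketch of the arithmetic, assuming the reported rates of $30 per million input tokens and $180 per million output tokens apply uniformly. The function name and the example token counts are hypothetical and are not taken from OpenAI's documentation; actual billing may differ.

```python
# Rates as reported in the article above for GPT-5.4 Pro (USD per million tokens).
# Treat them as illustrative figures, not authoritative pricing.
INPUT_USD_PER_MILLION = 30.0
OUTPUT_USD_PER_MILLION = 180.0


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request in USD (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (input_tokens * INPUT_USD_PER_MILLION
             + output_tokens * OUTPUT_USD_PER_MILLION)
    return total / 1_000_000


if __name__ == "__main__":
    # Example: a 200,000-token prompt that produces a 20,000-token answer.
    print(f"${estimate_cost_usd(200_000, 20_000):.2f}")
```

At these rates, output tokens cost six times as much as input tokens, so long generated answers dominate the bill even when the prompt is large.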
[19]
Just a Day After Releasing GPT-5.3 Instant, OpenAI Teases GPT-5.4 Model
OpenAI teased the release of the GPT-5.4 artificial intelligence (AI) model on Wednesday. The San Francisco-based AI giant said that the major update to the existing model could arrive soon, making it the fastest launch teaser after the launch of a major model. Interestingly, on Tuesday, the company released the GPT-5.3 Instant to all ChatGPT users. The model focuses on improved conversations and better writing quality. Notably, the developments arrived after a website claimed that about 2.5 million ChatGPT users intend to quit the platform. GPT-5.4 AI Model Could Arrive Soon In a post on X (formerly known as Twitter), the official handle of OpenAI said, "5.4 sooner than you Think." For the unaware, the GPT-5.4 is the next major update to the GPT 5 series model and is expected to arrive with significant improvements across different capabilities. While teasing the launch of a new AI model is not a new thing for the company, the timing of it is interesting. Just a day before the post, OpenAI released the GPT-5.3 Instant model. It is a general-purpose model which is being rolled out to all ChatGPT users, regardless of their subscription status. While the model does not bring a significant jump in multimodal or coding capabilities, it focuses on conversations and writing-based tasks. In a post, the company said the GPT-5.3 Instant will show fewer disclaimers and better balance information sourced from the web with its own knowledge and reasoning. Additionally, the model is said to have a more "to-the-point" conversation style and a better grasp of writing content that deals with heavy emotions or involves an imaginative prompt. While it cannot be said when OpenAI might release the GPT-5.4 AI model, announcing it so soon after the release of GPT-5.3 Instant could be related to the reports of a large number of users planning to ditch ChatGPT. 
A website claims, based on individual pledges, social media posts, and app usage data, that as many as 2.5 million users could stop using the AI platform. It is said that the user exodus is related to OpenAI's deal with the Pentagon to let the US government use its technology for any lawful purpose.
[20]
ChatGPT 5.4: All-new 'powerful' OpenAI update for Excel users; Availability, usage in daily life and more
ChatGPT 5.4: OpenAI has launched ChatGPT for Excel in beta, a specialized add-in powered by the new GPT-5.4 Thinking model designed specifically for financial workflows. OpenAI introduced deep data integrations with trusted providers like FactSet, S&P Global, and LSEG, enabling finance professionals to pull real-time market data into ChatGPT to automate research, due diligence, and valuation tasks. OpenAI has officially entered the financial software market with the release of ChatGPT for Excel, a beta add-in that embeds its most advanced artificial intelligence directly into Microsoft spreadsheets. Powered by the newly unveiled GPT-5.4 Thinking model, the tool is engineered to automate labor-intensive tasks such as three-statement modeling, scenario analysis, and data extraction that previously took analysts days to complete. This launch is accompanied by a suite of high-level financial data partnerships, signaling a strategic shift by OpenAI to provide specialized, industry-specific solutions for investment banks, accountants, and researchers. ChatGPT for Excel is an embedded tool that lets users talk to their spreadsheets in plain English to create or change live models. Instead of manually typing every formula or tracing links between cells, a user can describe a financial scenario, and the AI will build the formatted Excel-native workbook automatically. It can also look at large, inherited templates to explain how formulas connect or find and fix errors in the logic. To ensure accuracy, the AI asks for permission before making changes and provides links to the exact cells it used for its calculations. The system runs on GPT-5.4, specifically the "GPT-5.4 Thinking" version, which is optimized for complex financial reasoning. In internal benchmarks for investment banking tasks, this model improved accuracy from 43.7% seen in GPT-5 to a significantly higher 87.3%. 
This model is designed to handle longer sets of data and more difficult tasks like underwriting and long-form investment research with much higher reliability than previous versions. OpenAI has added direct connections to major data providers including Moody's, S&P Global, Dow Jones Factiva, and LSEG. These integrations allow teams to pull trusted market and company data directly into their ChatGPT workflow without switching between different apps. This makes it much faster to produce cited outputs like earnings summaries, credit memos, and valuation snapshots because the AI can access the necessary filings and transcripts instantly. The beta is currently rolling out to ChatGPT Plus, Team, Enterprise, and Edu users in the United States, Canada, and Australia. While it is powerful, OpenAI notes that some complex formulas might still need a human to double-check them, and the AI might occasionally need help with specific formatting styles. For companies worried about privacy, OpenAI confirmed that data shared within the Enterprise version is not used to train their AI models.
[21]
OpenAI Launches GPT-5.4 to Automate Complex Professional Work | PYMNTS.com
The company released the model, GPT-5.4, Thursday (March 5) in ChatGPT (as GPT-5.4 Thinking), the API and Codex. It also released GPT-5.4 Pro in ChatGPT and the API for users seeking "maximum performance on complex tasks," OpenAI said in a Thursday press release. GPT-5.4 includes the coding capabilities of GPT-5.3-Codex as well as improvements in working across tools, software environments, and tasks that involve spreadsheets, presentations and documents, according to the release. GPT-5.4 Thinking improves deep web research, better maintains context, and provides users with an upfront plan of its thinking so they can adjust course while it's working, the release said. In the API and Codex, GPT-5.4 has computer-use capabilities, supports up to 1 million tokens of context, helps agents find the right tools, and is OpenAI's most token-efficient reasoning model yet, per the release. In a test of its ability to produce knowledge work across 44 occupations, GPT-5.4 matches or exceeds industry professionals in 83% of comparisons, according to the release. The test included tasks such as generating sales presentations, accounting spreadsheets, urgent care schedules, manufacturing diagrams or short videos. "The result is a model that gets complex real work done accurately, effectively and efficiently -- delivering what you asked for with less back and forth," OpenAI said in the release. OpenAI Chief Financial Officer Sarah Friar told CNBC in January that she expects the share of OpenAI's business that is made up of enterprise customers to increase from the current 40% to 50% by the end of the year. In February, OpenAI announced that it now offers both a platform and the services of human engineers to help enterprises use AI agents. The new platform, Frontier, helps enterprises build, deploy and manage AI agents. 
During the same month, OpenAI said it is collaborating with four consulting firms to help it deploy Frontier and help customers that adopt the enterprise platform "define strategy, integrate systems, redesign workflows, and scale deployment globally."
[22]
ChatGPT 5.4 Introduces Native Computer Control for Web Tasks
OpenAI has introduced ChatGPT 5.4, a new model that expands the boundaries of artificial intelligence with notable updates in functionality and scope. One key feature is its 1-million-token context window, which supports extended tasks such as drafting detailed overviews or processing large datasets without interruptions. According to Better Stack, the model also incorporates native computer control, allowing it to perform actions like executing mouse and keyboard commands or integrating with libraries such as Playwright. These capabilities enhance its range of applications but come with trade-offs, including slower processing speeds and higher operational costs. ChatGPT 5.4's dynamic search capabilities optimize workflows by retrieving information only when needed, reducing unnecessary token usage. You'll also see how its mid-response steering allows users to adjust outputs in real time, offering greater control over the generation process. Additionally, this feature examines the model's strengths in technical problem-solving and its challenges in creative tasks like interface design. ChatGPT 5.4 offers a suite of features that elevate its utility in both technical and professional environments. These capabilities are tailored to meet the demands of modern workflows: GPT-5.4 combines the technical expertise of Codex 5.3 with the professional task-handling capabilities of GPT-5.2, making it one of the most versatile AI models available. It ranks among the top performers in intelligence benchmarks, tying with Gemini AI in various evaluations. Whether you are debugging complex code, solving iterative problems, or automating workflows, ChatGPT 5.4 delivers consistent and reliable results. Its ability to handle both technical and general-purpose applications ensures it remains a valuable asset across industries. 
Several aspects set GPT-5.4 apart from its predecessors and competitors, reinforcing its position as a leader in AI technology: Despite its impressive capabilities, ChatGPT 5.4 is not without its limitations. Users should be aware of the following challenges: The pricing for GPT-5.4 reflects its advanced capabilities and extended context functionality. Users can choose between different tiers based on their needs: ChatGPT 5.4 represents a significant evolution in artificial intelligence, offering advanced coding capabilities, professional task automation and an extended context window that supports complex workflows. Its innovative features, such as native computer control and mid-response steering, make it a versatile tool for a wide range of applications. However, potential users should carefully consider its slower token generation, higher costs and limitations in creative tasks before adopting the model. For those prioritizing precision, adaptability and functionality, GPT-5.4 is a powerful and valuable resource that can transform the way technical and professional tasks are approached.
[23]
OpenAI's New ChatGPT 5.4 Thinking Model Adds Computer Interaction for Apps & Web Workflows
The integration of GPT-5.4 Thinking into frontend development introduces a new level of efficiency and precision, particularly through its enhanced Computer Use Ability (CUA). This feature allows the model to interact with digital systems in a human-like manner, eliminating the need for external environments and streamlining complex workflows. OpenAI highlights how ChatGPT 5.4 Thinking can handle intricate tasks, such as designing and testing a 3D chess game with advanced textures and rule adherence, all while significantly reducing computational overhead. These capabilities not only simplify technical processes but also prioritize high-quality output and usability. In this overview, you'll explore how ChatGPT 5.4 Thinking enables developers to convert design inputs, like images, into fully functional websites with accurate styling and responsive layouts. You'll also learn how its self-checking mechanisms ensure alignment between design and output, reducing manual adjustments. Additionally, the model's ability to manage concurrent processes, such as generating visual assets and validating functionality, offers practical insights into optimizing workflows for both small-scale and complex projects. This breakdown provides a clear look at how these features can enhance your development process and elevate the end-user experience. GPT-5.4 Thinking Overview Human-Like Computer Interaction (CUA): A Breakthrough in Usability A defining feature of ChatGPT 5.4 Thinking is its enhanced Computer Use Ability (CUA), which enables the model to interact with digital systems in a manner that closely resembles human behavior. This eliminates the need for external tools or environments, allowing tasks to be executed seamlessly within existing systems. For instance, consider assigning the model a complex task such as building and testing a 3D chess game. 
GPT-5.4 Thinking can render advanced visual effects, including glass and marble textures, while adhering to intricate game rules like castling or en passant. This human-like interaction simplifies workflows, ensuring both adaptability and precision. Additionally, the model significantly reduces computational overhead by cutting token usage by up to two-thirds in specific scenarios. This makes it a cost-efficient solution for managing intricate rule-based systems or designing interactive applications, allowing developers to save both time and resources without compromising on quality. From Design to Fully Functional Websites GPT-5.4 Thinking introduces an innovative capability: converting design inputs, such as images, into fully functional websites. By using advanced image-to-website conversion tools, the model analyzes design elements and translates them into cohesive, responsive web pages. For example, if you provide an image of a website layout, ChatGPT 5.4 Thinking can generate a corresponding site with accurate alignment, styling and functionality. This feature is further enhanced by its ability to handle concurrent image generation, allowing multiple visual assets to be created simultaneously. This not only accelerates the development process but also ensures consistency across all generated elements. To uphold quality, the model employs self-checking mechanisms that compare the final output with the original design input. This ensures that the resulting website aligns with your vision and meets high standards of usability and design, reducing the need for extensive manual adjustments. Optimizing Efficiency and Streamlining Workflows Efficiency lies at the core of GPT-5.4 Thinking's design. By automating repetitive tasks and optimizing resource usage, the model allows developers to focus on higher-level decision-making and creative problem-solving. 
Its self-checking mechanisms play a crucial role in maintaining output quality, autonomously identifying and correcting potential errors. This minimizes the need for extensive manual reviews, saving valuable time and effort. The model's ability to handle concurrent processes further enhances workflow optimization. For example, during the development of a web application, GPT-5.4 Thinking can simultaneously generate visual assets, write code and validate functionality. This integrated approach ensures that even complex projects remain streamlined, efficient and on schedule. Expanding Capabilities: 3D Rendering and Rule-Based Precision Beyond frontend development, GPT-5.4 Thinking excels in specialized areas such as 3D rendering and rule-based task execution. Its ability to create realistic effects, like glass and marble textures, adds a new level of sophistication to digital design. Whether you're developing a gaming application or a virtual showroom, the model's rendering capabilities deliver visually stunning results that enhance user engagement. In addition, ChatGPT 5.4 Thinking's proficiency in handling complex rules makes it an ideal tool for applications requiring precision and logic. For instance, in a chess game, the model can manage intricate moves like castling or en passant with ease, making sure all interactions adhere to established rules. This level of accuracy extends to other rule-based systems, making it a versatile solution for a wide range of technical applications. Redefining AI-Driven Development ChatGPT 5.4 Thinking establishes a new standard in app and web development by combining human-like interaction, design-to-code automation and resource efficiency. Its ability to transform images into functional websites, optimize workflows and handle complex tasks with precision demonstrates the potential of AI to transform development environments. 
By integrating these capabilities into your projects, you can enhance productivity, reduce costs and deliver exceptional user experiences, setting a new benchmark for innovation in the digital landscape. Media Credit: OpenAI
[24]
OpenAI to release GPT-5.4 model with expanded context window - The Information By Investing.com
Investing.com - OpenAI plans to release its next GPT model, GPT-5.4, according to The Information, citing a person with knowledge of the development. The new model will feature a context window exceeding 1 million tokens, more than double the 400,000 token capacity of the current GPT-5.2 model. This expansion will allow the model to process queries containing significantly more words or data. The upgrade will match the context window capabilities of competing models from Google and Anthropic, which already support 1 million tokens. Previous OpenAI models, including GPT-4.1, supported a 1 million token context window, but this feature was not present in version 5.2. GPT-5.4 will demonstrate improved performance on tasks requiring several hours to complete, according to the person. The model will better retain information about user requests and operational parameters across multiple steps, while showing reduced error rates. These enhancements could benefit applications such as OpenAI's Codex coding tool, which uses AI to automate complex, extended tasks. The model will introduce an extreme reasoning mode, enabling it to allocate substantially more time and computational resources to difficult questions, the person said. This article was generated with the support of AI and reviewed by an editor. For more information see our T&C.
[25]
ChatGPT 5.4 vs Gemini 3.1 Pro Benchmarks : Thinking, Pro & Instant Tiers
GPT-5.4 represents a significant step forward in AI capabilities, offering tailored solutions through its three specialized versions: GPT-5.4 Thinking, GPT-5.4 Pro and GPT-5.3 Instant. Each version is designed to address specific user needs, from handling complex workflows to delivering rapid responses. Skill Leap AI explores how features like reduced hallucination rates and native computer integration make this model more reliable and versatile for tasks such as coding, document formatting and data analysis. For example, ChatGPT 5.4 Thinking allows users to adjust effort levels, making it particularly effective for in-depth research or strategic planning. In this guide, you'll find a breakdown of how GPT-5.4's advancements can enhance your daily workflows. Learn how its enhanced coding efficiency can save time for developers, how ChatGPT for Excel simplifies spreadsheet tasks and how native integration bridges the gap between AI and personal computing. Whether you're a professional managing high-stakes projects or someone looking for quick, accurate assistance, this overview highlights practical ways to make the most of GPT-5.4's capabilities. ChatGPT 5.4 Key Highlights Three Tailored Versions for Every User GPT-5.4 is available in three specialized variants, each optimized to meet specific needs and use cases: * GPT-5.4 Thinking: Designed for handling complex and detailed tasks, this version allows users to adjust effort levels, striking a balance between speed and thoroughness. It is ideal for in-depth research, strategic planning and intricate problem-solving. * GPT-5.4 Pro: Tailored for professionals, this version excels in high-precision projects and advanced research. It offers enhanced performance for tasks requiring accuracy and reliability, making it suitable for industries such as finance, law and engineering. * GPT-5.3 Instant: Focused on speed, this version is perfect for quick responses and fast-paced environments. 
It delivers high-quality outputs in minimal time, making it ideal for tasks like drafting emails, summarizing documents, or generating quick insights. These options empower users to select the version that best aligns with their specific goals, whether they require detailed analysis or rapid execution. Advancing Knowledge Work ChatGPT 5.4 introduces significant advancements in knowledge work, streamlining tasks such as creating spreadsheets, formatting documents and designing presentations. With minimal input, users can generate professional-grade outputs, saving both time and effort. The integration of ChatGPT for Excel as a paid add-on further enhances productivity by automating spreadsheet tasks, including data analysis, formula generation and formatting. For professionals managing multiple responsibilities, these features translate into tangible benefits. Whether drafting overviews, preparing presentations, or organizing data, GPT-5.4 simplifies complex workflows while maintaining a high standard of accuracy. Its ability to adapt to user prompts ensures a seamless experience, even for intricate tasks. Native Computer Integration: Bridging AI and Everyday Tasks For the first time, GPT-5.4 offers native integration with personal computers, allowing it to perform tasks such as data entry, email management and file organization directly within the user's system. This feature eliminates the need for manual intervention, creating a more seamless interaction between AI and everyday computer operations. For example, users can instruct GPT-5.4 to draft an email, attach relevant files and send it, all without leaving the ChatGPT interface. This integration not only enhances usability but also represents a significant step forward in making AI a practical tool for daily tasks. 
By bridging the gap between AI and traditional computer functions, GPT-5.4 redefines convenience and efficiency. Enhanced Coding Efficiency Developers will find GPT-5.4 particularly valuable for its improved coding capabilities. Matching the performance of GPT-5.3 Codex, a specialized programming model, GPT-5.4 introduces greater efficiency in tool usage and reduces token consumption. These improvements lower costs and accelerate the coding process, making it an indispensable resource for software development. While occasional inconsistencies in coding outputs may require manual adjustments, GPT-5.4 remains a reliable assistant for generating code, debugging and optimizing workflows. Its ability to adapt to user feedback ensures that it can handle a wide range of programming tasks, from simple scripts to complex algorithms. Reducing Hallucination for Greater Reliability One of the persistent challenges in AI development has been hallucination, where models generate inaccurate or nonsensical information. GPT-5.4 addresses this issue by further reducing hallucination rates, making it more dependable for critical tasks. Whether conducting research, drafting legal documents, or analyzing data, users can trust GPT-5.4 to deliver accurate and relevant information. This improvement enhances the model's reliability, particularly for professionals who rely on AI for high-stakes projects. By minimizing errors and inconsistencies, GPT-5.4 ensures that users can focus on their work without the need for extensive fact-checking or corrections. Performance Benchmarks and Competitive Edge When compared to competitors such as Anthropic's Opus 4.6 and Google's Gemini 3.1 Pro, GPT-5.4 consistently outperforms in key areas, including knowledge work, coding and adaptability. While the margin of improvement varies depending on the task, these benchmarks highlight GPT-5.4's position as a leading AI model in the current market. 
Its ability to handle diverse tasks with precision and efficiency sets it apart from other models, making it a preferred choice for users across industries. As the AI landscape continues to evolve, GPT-5.4's performance benchmarks will serve as a reference point for future innovations. Practical Applications Across Industries The versatility of GPT-5.4 is evident in its wide range of practical applications. From creating presentations and automating spreadsheet tasks to developing coding projects, the model adapts to user needs with minimal input. Its ability to follow prompts and adjust during tasks ensures a seamless experience, even for complex workflows. For instance, a business professional can use GPT-5.4 to draft a comprehensive proposal, complete with charts and data visualizations, based on initial instructions. Similarly, educators can use the model to create lesson plans, while developers can rely on it for debugging and optimizing code. This adaptability makes GPT-5.4 an invaluable tool for professionals across various fields. Addressing Limitations and Future Potential Despite its advancements, GPT-5.4 is not without limitations. Some inconsistencies in writing style adherence and prompt execution have been observed, requiring occasional user intervention. Additionally, while its coding functionality is robust, minor errors in outputs may necessitate manual review. These limitations, though relatively minor, highlight areas where future iterations could further refine the model's performance. By addressing these challenges, GPT-5.4 has the potential to become even more reliable and versatile, solidifying its position as a leader in the AI industry. Cost Efficiency and Token Optimization Although GPT-5.4 comes with a higher price tag than its predecessors, its improved token efficiency helps offset the cost. By consuming fewer tokens for the same tasks, the model provides better value for users with high-volume needs. 
This makes it a cost-effective choice for individuals and organizations seeking a balance between performance and affordability. For businesses managing large-scale projects, the combination of enhanced capabilities and token optimization ensures that GPT-5.4 delivers a strong return on investment. Its ability to handle complex tasks efficiently further underscores its value as a productivity tool. Shaping the Future of AI As the AI field continues to advance, GPT-5.4 faces competition from models like Google Gemini and Anthropic Claude. These emerging technologies will undoubtedly push the boundaries of what AI can achieve, creating a dynamic and competitive landscape. Future comparisons between these models will provide valuable insights into their respective strengths and weaknesses, helping users make informed decisions. GPT-5.4's combination of versatility, reliability, and efficiency positions it as a frontrunner in this evolving market. Whether managing data, writing code, or creating presentations, it offers practical solutions that streamline workflows and enhance productivity. Media Credit: Skill Leap AI Disclosure: Some of our articles include affiliate links. If you buy something through one of these links, Geeky Gadgets may earn an affiliate commission. Learn about our Disclosure Policy.
[26]
New ChatGPT 5.4 : 1M-Token Context & "Extreme Reasoning" Targets Long Tasks
OpenAI's GPT-5.4 introduces a range of advancements that push the boundaries of artificial intelligence, with a particular focus on extended memory and coding capabilities. As highlighted by World of AI, one of its standout features is the 1 million token context window, allowing the model to process and retain vast amounts of information. This makes it especially effective for handling intricate workflows, such as analyzing lengthy legal documents or managing multi-step technical tasks. Additionally, its Extreme Reasoning Mode is designed to tackle complex challenges with greater depth, offering nuanced solutions over extended problem-solving sessions. In this overview, you'll explore how GPT-5.4 performs in real-world scenarios, from generating full-fledged applications like a browser-based operating system to automating front-end development tasks. You'll also gain insights into its creative potential, including its ability to produce SVG graphics and procedural simulations. Whether you're a developer seeking to streamline workflows or a creative professional exploring new design possibilities, this breakdown will provide a clear understanding of what GPT-5.4 brings to the table. OpenAI ChatGPT 5.4 Overview Core Features and Technological Advancements GPT-5.4 introduces a range of features that significantly enhance its utility across diverse applications: * 1 Million Token Context Window: This feature enables the model to process and retain vast amounts of information, making it ideal for handling lengthy documents, intricate workflows and multi-step tasks with exceptional efficiency. * Extreme Reasoning Mode: Designed for tackling complex challenges, this mode allows the model to engage in deeper, more nuanced reasoning over extended periods, offering solutions to intricate problems. 
* Enhanced Memory Optimization: By improving memory usage, GPT-5.4 minimizes errors in multi-step workflows, ensuring greater accuracy and reliability in long-running processes and detailed research tasks. * Automation Integration: Seamless compatibility with tools like OpenAI's Codex enhances its ability to automate coding and development tasks, streamlining workflows for technical professionals and reducing manual effort. These advancements collectively position GPT-5.4 as a versatile and powerful tool, capable of addressing complex challenges across industries with precision and efficiency. Real-World Applications and Use Cases The practical capabilities of GPT-5.4 extend beyond theoretical improvements, with real-world applications demonstrating its fantastic potential: * Advanced Coding Proficiency: GPT-5.4 excels in generating complex applications, such as a browser-based operating system resembling macOS and a fully functional Minecraft clone. Its ability to support 3D environment creation and procedural simulations further highlights its technical prowess. * Front-End Development: The model effectively produces tools like cropping applications, fitness trackers and React landing pages. While some outputs may exhibit minor inconsistencies in design quality, its overall performance in front-end development is highly commendable. * Creative Outputs: Enhanced capabilities in generating SVG graphics and procedural simulations make GPT-5.4 a valuable asset for creative industries, offering new possibilities for designers, animators and content creators. These examples illustrate the model's ability to deliver high-value solutions for developers, researchers and creative professionals, making it a practical choice for a wide range of industries. Design and Usability Considerations While GPT-5.4 demonstrates exceptional capabilities, certain design-related challenges persist. 
For instance, its focus on maintaining consistent UI styling can sometimes result in repetitive or overly uniform designs. Additionally, while its performance in generating React landing pages has improved compared to earlier iterations, there is still room for refinement in terms of design quality and creativity. These observations suggest that while the model is highly capable, further fine-tuning may be required to optimize its performance for specific design-focused tasks. Competitive Positioning in the AI Market GPT-5.4 enters a competitive AI landscape, where models like Google DeepMind's Gemini and Anthropic's Claude are pushing the boundaries of multimodal capabilities and extended context handling. OpenAI's latest model distinguishes itself with its advanced features, such as the 1 million token context window and extreme reasoning mode. However, its premium capabilities are likely to come at a higher cost, positioning GPT-5.4 as a tool primarily suited for professionals and enterprises with demanding requirements. This strategic focus ensures that the model caters to users who can fully use its advanced functionalities. Applications Across Diverse Industries The versatility of GPT-5.4 opens up a wide range of possibilities for various sectors, allowing innovation and efficiency in numerous fields: * Scientific and Academic Research: The model's ability to process extensive datasets and engage in deep reasoning makes it an invaluable tool for researchers tackling complex problems and conducting detailed analyses. * Software Development: Integration with tools like Codex allows developers to automate coding tasks, streamline workflows and enhance productivity, making it a powerful asset for technical professionals. * Creative Industries: Advanced capabilities in graphics generation and procedural simulations cater to designers, animators and other creative professionals, allowing them to explore new possibilities and push creative boundaries. 
* Long-Running Workflows: Optimized memory usage and reduced error rates make GPT-5.4 ideal for tasks requiring sustained focus and precision, such as legal document analysis, financial modeling and other complex processes. These applications underscore the model's potential to transform workflows, drive innovation and deliver tangible benefits across diverse fields. Shaping the Future of Artificial Intelligence GPT-5.4 represents a significant advancement in artificial intelligence, combining innovative reasoning, coding expertise and creative capabilities into a single, powerful model. While minor design inconsistencies remain, its overall performance and versatility position it as a leading contender in the competitive AI market. As OpenAI prepares for its release, GPT-5.4 is set to redefine the possibilities of AI, offering practical solutions and driving innovation for professionals and industries worldwide. Media Credit: WorldofAI
[27]
New OpenAI GPT-5.4 AI Model : Everything You Need to Know
GPT-5.4 brings significant updates designed to meet the demands of complex professional and technical workflows. According to Prompt Engineering, one notable improvement is the expanded context window, which can process up to 1 million tokens in a single input. This capability eliminates the need to divide large datasets into smaller segments, making it particularly effective for tasks such as analyzing lengthy legal documents or conducting multi-step reasoning. Another key feature is interruptible reasoning, which allows users to modify the model's approach during a task without restarting, offering flexibility for projects that require ongoing adjustments. You'll learn how GPT-5.4's dynamic resource management reduces token usage by 47%, improving efficiency in tasks like debugging and data processing. You'll also see how its native computer interaction enables precise navigation of software environments, supporting workflows in areas such as design and analysis. This overview provides a detailed look at these features and their practical applications across diverse professional contexts. GPT-5.4 delivers a suite of innovative features that enhance its versatility and utility across a wide range of applications. Below are its standout capabilities: One of the most notable advancements in GPT-5.4 is its ability to interact natively with computer systems. By navigating user interfaces with human-like precision, the model eliminates the need for custom agent-specific designs. This capability allows seamless integration into workflows for tasks such as file management, process automation and navigating complex software environments. The result is a more efficient and intuitive interaction between AI and existing tools, reducing the time and effort required for manual operations. 
The expanded context window of GPT-5.4, now capable of processing up to 1 million tokens, represents a major leap in handling large-scale tasks. This feature enables the model to tackle intricate workflows without requiring you to break down or simplify inputs. Professionals working on extensive coding projects, legal document analysis, or multi-step problem-solving will find this particularly beneficial. By retaining a broader context, GPT-5.4 ensures that outputs remain accurate and relevant, even for highly complex tasks. Dynamic tool loading is a key innovation in GPT-5.4, designed to enhance efficiency by accessing tools only when needed. Unlike preloading, this approach reduces token usage by nearly half, minimizing computational overhead while improving performance. For example, when debugging code, the model dynamically loads debugging tools, ensuring resources are allocated effectively. This feature not only accelerates task completion but also reduces operational costs, making it an invaluable asset for resource-intensive applications. Interruptible reasoning introduces a new level of control and flexibility to AI interactions. If the model begins solving a problem in an unintended direction, you can intervene mid-task to redirect its approach without restarting the session. This feature fosters collaboration and ensures more accurate and relevant outcomes. It is particularly useful for complex or evolving tasks, where maintaining control over the AI's reasoning process is critical to achieving desired results. GPT-5.4 sets new performance benchmarks, scoring 83% on the GDPval benchmark, which evaluates AI effectiveness across 44 knowledge work occupations. This represents a 20% improvement over GPT-5.2 at comparable costs. Its coding capabilities are especially noteworthy, delivering faster and more efficient outputs for medium reasoning tasks. 
These advancements make GPT-5.4 a powerful tool for professionals in fields such as software development, data analysis and technical design, where precision and speed are paramount. For developers and designers, GPT-5.4 offers significant improvements that streamline both technical and creative workflows. Its coding performance rivals that of GPT-5.3 Codex, with better token efficiency and the introduction of a fast mode for quicker outputs. On the design side, the model has achieved a 9-position jump on the Design Arena benchmark, reflecting its ability to create and refine user interfaces with greater precision. These enhancements save time and effort, allowing professionals to focus on higher-level tasks. Efficiency lies at the heart of GPT-5.4's design. Improved token usage and reduced computational overhead translate to faster processing times and lower operational costs. The fast mode in Codex accelerates outputs, making the model ideal for time-sensitive projects. Whether you're generating detailed overviews, debugging complex code, or analyzing large datasets, GPT-5.4 ensures tasks are completed with speed and accuracy, meeting the demands of modern workflows. While GPT-5.4 is priced slightly higher than GPT-5.2, its enhanced capabilities and efficiency offset the additional cost. The Pro versions, tailored for specialized use cases, come at a premium but deliver unparalleled performance for demanding applications. For most users, the standard version strikes a balance between cost and functionality, offering a practical solution for a wide range of tasks. This balance ensures that GPT-5.4 remains accessible while delivering exceptional value. The release of GPT-5.4 underscores the rapid pace of AI innovation, with three major updates introduced in less than a year. This accelerated development cycle reflects a commitment to continuous improvement and responsiveness to user needs. 
By staying at the forefront of AI advancements, GPT-5.4 ensures you have access to the latest tools and technologies, empowering you to meet the demands of modern workflows with confidence and efficiency. GPT-5.4 represents a significant step forward in artificial intelligence, offering a comprehensive suite of features designed to enhance productivity, usability, and efficiency. From native computer interaction to expanded context windows and dynamic tool loading, the model is tailored to meet the needs of professionals across diverse fields. Its improved performance benchmarks, coding capabilities and UI design enhancements make it a versatile and powerful tool. While its cost may be slightly higher than previous versions, the value it delivers in terms of functionality and efficiency makes it a worthwhile investment for those looking to integrate advanced AI into their workflows.
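The claim that token efficiency can offset a higher price comes down to simple arithmetic. The sketch below is illustrative only: it assumes a single flat per-token price, assumes the 47% token reduction quoted in this excerpt applies uniformly to a task, and uses a made-up price ratio of 1.4, since the excerpt gives no actual prices. The function names are hypothetical.

```python
def relative_cost(price_ratio: float, token_reduction: float) -> float:
    """Per-task cost of the new model relative to the old one.

    price_ratio: new per-token price divided by old per-token price
                 (hypothetical input; the excerpt quotes no prices).
    token_reduction: fraction of tokens saved, e.g. 0.47 for the
                     47% figure quoted in the excerpt.
    """
    if not 0.0 <= token_reduction < 1.0:
        raise ValueError("token_reduction must be in [0, 1)")
    return price_ratio * (1.0 - token_reduction)


def break_even_price_ratio(token_reduction: float) -> float:
    """Price ratio at which the new model costs the same per task."""
    if not 0.0 <= token_reduction < 1.0:
        raise ValueError("token_reduction must be in [0, 1)")
    return 1.0 / (1.0 - token_reduction)


if __name__ == "__main__":
    reduction = 0.47  # figure quoted in the excerpt
    assumed_price_ratio = 1.4  # assumption for illustration only
    print(f"break-even price ratio: {break_even_price_ratio(reduction):.2f}")
    print(f"relative cost at 1.4x price: {relative_cost(assumed_price_ratio, reduction):.3f}")
```

Under these assumptions, any price increase below the break-even ratio leaves the per-task cost lower than before. Real pricing usually separates input and output tokens, so this is a rough guide at best.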
[28]
OpenAI Leak Highlights "Extreme Reasoning Mode" for GPT-5.4
The recent leak of GPT-5.4 has sparked significant interest, offering a detailed glimpse into the model's upgraded capabilities. Universe of AI highlights one of the standout features: an expanded context window capable of processing up to 1 million tokens in a single session. This improvement allows users to handle large-scale data more efficiently, whether analyzing entire books, managing complex coding projects, or processing extensive datasets without losing context. Such advancements aim to address existing limitations in workflows that demand comprehensive analysis and continuity. In this breakdown, you'll explore how ChatGPT 5.4's Extreme Reasoning Mode enhances its ability to tackle intricate multi-step problems, making it a valuable resource for professionals in fields like engineering, law and science. Additionally, its improved memory retention across sessions ensures smoother collaboration on long-term projects. The guide also covers its lower error rates and expanded automation capabilities, offering insights into how these features can streamline operations and improve accuracy across industries. A standout feature of ChatGPT 5.4 is its expanded context window, which can process approximately 1 million tokens in a single session. This enhancement enables the model to handle vast amounts of data without losing context, making it particularly valuable for tasks that require comprehensive analysis. This capability positions GPT-5.4 as an indispensable tool for professionals who deal with complex, data-intensive tasks. GPT-5.4 introduces an Extreme Reasoning Mode, a feature tailored for solving intricate, multi-step logical problems with precision. This advancement is particularly beneficial for industries where logical accuracy is paramount. By enhancing its reasoning capabilities, GPT-5.4 becomes a powerful asset for tackling challenges that demand high levels of logical precision. 
The model's upgraded memory capabilities allow it to retain information across multiple interactions, reducing the need for users to repeatedly provide context. This feature is especially useful in scenarios requiring continuity and consistency. This improvement enhances productivity and ensures smoother workflows in both professional and research environments. Accuracy remains a cornerstone of AI performance and ChatGPT 5.4 delivers notable improvements in reducing error rates. This makes it a reliable choice for tasks where precision is essential. These advancements make GPT-5.4 a dependable tool for industries where even minor errors can have significant consequences. GPT-5.4 is optimized for agent-based automation, allowing seamless integration with tools like Codex. This feature supports businesses in streamlining their operations and reducing manual intervention. This capability allows businesses to focus on innovation and strategic growth while relying on GPT-5.4 for operational efficiency. GPT-5.4 demonstrates significant potential in scientific research, offering tools to accelerate discoveries and innovations across various fields. By providing accurate and actionable outputs, ChatGPT 5.4 has the potential to transform the pace and scope of scientific research. For developers, GPT-5.4 offers enhanced coding capabilities, making it a versatile tool for technical projects. The model can generate up to 6,000 lines of code in a single prompt, supporting a wide range of applications. These advancements make GPT-5.4 an essential resource for developers seeking to push the boundaries of what is possible in software engineering. In a rapidly evolving AI landscape, GPT-5.4 distinguishes itself through its unique strengths. 
While competitors like Gemini excel in real-time adaptability and Claude in writing tasks, GPT-5.4 offers a balanced combination of features that cater to enterprise and research-focused use cases. These features position GPT-5.4 as a strong contender in the competitive AI market, appealing to a diverse range of users. OpenAI's commitment to monthly updates ensures that GPT-5.4 remains responsive to evolving user requirements. This approach allows the model to stay relevant in a fast-paced technological environment. This strategy underscores OpenAI's focus on maintaining ChatGPT 5.4's relevance and effectiveness over time. While GPT-5.4 excels in reasoning, memory and coding, its general-purpose writing capabilities still fall short compared to competitors like Claude. Tasks requiring nuanced, creative, or highly polished writing remain a challenge. However, OpenAI's focus on regular updates suggests that these gaps may be addressed in future iterations, ensuring the model continues to evolve to meet user expectations.
[29]
OpenAI launches GPT 5.4 with better reasoning, coding and professional task support
The model is rolling out across ChatGPT, the API, and OpenAI Codex. OpenAI has launched a new AI model called GPT 5.4, which the company claims is its most 'capable and efficient frontier model' yet for professional work. The model is rolling out across ChatGPT, the API, and OpenAI Codex. Alongside it, OpenAI has also introduced GPT 5.4 Pro in ChatGPT and the API, a version designed for users who want the highest level of performance when working on complex tasks. According to the AI giant, GPT 5.4 combines the company's latest advances in reasoning, coding and agentic workflows into a single model. It builds on the strong coding abilities of GPT 5.3 Codex while improving how the AI interacts with different tools, software environments, and professional tasks. OpenAI also claims that the new model can complete complicated tasks with greater accuracy and efficiency, delivering what users asked for with less back and forth. In ChatGPT, the new model appears as GPT 5.4 Thinking and lets users see an early outline of how the AI plans to approach a task. By showing this plan before the final response is completed, users can review it and adjust the direction if needed while the model is still working. OpenAI says this helps users guide the AI more easily and get outputs that better match their expectations. The company also claims the model improves deep web research, especially for very specific queries. 'GPT 5.4 Thinking also improves deep web research, particularly for highly specific queries, while better maintaining context for questions that require longer thinking. Together, these improvements mean higher-quality answers that arrive faster and stay relevant to the task at hand,' OpenAI explains. 
For developers using the API or Codex, GPT 5.4 introduces native computer-use abilities. This allows AI agents to interact directly with computers and perform workflows across multiple applications. The model also supports up to 1 million tokens of context. OpenAI says GPT 5.4 is its 'most token efficient reasoning model' so far, using significantly fewer tokens compared to GPT 5.2. 'Together with advances in general reasoning, coding, and professional knowledge work, GPT 5.4 enables more reliable agents, faster developer workflows, and higher-quality outputs across ChatGPT, the API, and Codex,' OpenAI said.
[30]
GPT 5.4 may release soon: Features, release timeline and more details about OpenAI upcoming model
It could include an extreme reasoning mode for solving complex problems. OpenAI is reportedly working on releasing a new version of its ChatGPT model. Reports indicate that the new GPT 5.4 model is a complete upgrade from its predecessors. For instance, the new model is reportedly able to process a much larger context window. It can process over a million tokens. This means that the model can read books or long legal documents in a single session. It can also read large code projects in a single session. It is also believed that the model will have an extreme reasoning mode. This means that the model can solve problems using long calculations. Other improvements may include better memory and more consistent performance during long tasks. These changes could make a big difference in scientific research and multi-step reasoning. Read on to find out what to expect and when the GPT 5.4 may launch. Early reports suggest that GPT 5.4 will support an expanded context window exceeding one million tokens, more than double the roughly 400,000 tokens available in the latest GPT 5.2 models. This means it could read and work with entire books, long legal papers, or large code projects in a single go, which would help a lot in research, coding, and data-heavy tasks. GPT-5.4 is also said to come with an extreme reasoning mode, which will let the AI tool spend more time and computing power thinking through really complex problems. Analysts suggest that the new model could improve scientific research, multi-step reasoning, and tasks that need careful, long chains of thought. Reports have also indicated that the new model may come with enhancements in memory retention and better sustained performance over extended workflows. 
The company has not officially announced a launch date for the new GPT 5.4 model yet. Leaks and reports in the developer community indicate that it could launch as early as March or April 2026.
OpenAI launched GPT-5.4 Thinking and Pro models on Thursday, designed specifically for AI agents and enterprise applications. The company claims the new AI model delivers 33% fewer false claims and matches human professionals 83% of the time across 44 occupations. The release comes amid growing competition with Anthropic's Claude and controversy over OpenAI's Pentagon deal.
OpenAI released its latest AI model, GPT-5.4, on Thursday, just two days after launching GPT-5.3 Instant. The company describes GPT-5.4 as its "most capable and efficient frontier model for complex professional work," bringing together recent advancements in reasoning and coding alongside agentic workflows into a unified system [1]. Available as GPT 5.4 Thinking for ChatGPT subscribers and GPT 5.4 Pro through the API, the model is specifically designed to support AI agents: autonomous systems that can operate independently with minimal human intervention [3].
The release positions OpenAI directly against Anthropic and its Claude models, particularly as both companies compete for enterprise applications and business professionals willing to pay monthly subscriptions [1]. Recent reports indicate Anthropic's popularity has surged, with Claude mobile apps claiming top spots in Apple's and Google's app stores, while online forums fill with advice on transferring data from ChatGPT to Claude [1].
OpenAI calls GPT-5.4 its "most factual model yet," addressing ongoing concerns about AI hallucinations, where models generate false information. According to OpenAI's benchmarks, responses from GPT-5.4 are 18% less likely to contain errors, while individual claims are 33% less likely to be false compared to GPT-5.2 [1][3]. The company emphasizes that users should still fact-check AI-generated content despite these improvements.
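The 18% and 33% figures are relative reductions, so what they mean in absolute terms depends on a baseline error rate that the reporting does not give. The sketch below shows the arithmetic; the 10% and 30% baselines are purely hypothetical values chosen for illustration.

```python
def reduced_rate(baseline_rate: float, relative_reduction: float) -> float:
    """Apply a relative reduction (0.33 means '33% less likely') to a rate."""
    return baseline_rate * (1.0 - relative_reduction)


if __name__ == "__main__":
    # HYPOTHETICAL baselines; the source does not state GPT-5.2's actual rates.
    claim_baseline = 0.10     # assume 10% of individual claims were false
    response_baseline = 0.30  # assume 30% of responses contained an error

    new_claim_rate = reduced_rate(claim_baseline, 0.33)
    new_response_rate = reduced_rate(response_baseline, 0.18)

    print(f"false claims: {claim_baseline:.1%} -> {new_claim_rate:.1%}")
    print(f"responses with errors: {response_baseline:.1%} -> {new_response_rate:.1%}")
```

Whatever the true baselines are, a relative reduction leaves a nonzero error rate, which is why the fact-checking advice still applies.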
Perhaps most striking, OpenAI's testing indicates that GPT-5.4 can match or outperform human professionals 83% of the time across nine industries and 44 real-world occupations [2]. The company introduced GDPval in September, an evaluation measuring AI performance on "economically valuable, real-world tasks" in industries contributing at least 5% to US gross domestic product [2]. On the OSWorld-Verified benchmark, which measures an AI's ability to navigate desktop environments, GPT-5.4 scored 75%, up from 47.3% with GPT-5.2 and exceeding the average human result of 72.4% [3].
One of GPT-5.4's most significant updates is native computer use, which enables autonomous agents to complete complex tasks across multiple applications. The model can write code to operate computers and can issue mouse and keyboard commands in response to screenshots [3]. This capability allows developers to build agents that operate other services with limited human interaction, marking a substantial step toward fully autonomous systems [3].
OpenAI emphasizes that GPT-5.4 can more efficiently support agentic activity, using less computing power and therefore costing less money than previous models [1]. The model also allows users to adjust answers while they are still being generated, enabling course corrections without starting fresh. The feature is immediately available on Android and ChatGPT's website, with iPhone support coming soon [3].
OpenAI announced a new suite of financial-services tools alongside GPT-5.4, designed to help professionals streamline financial analysis, investment memos, and other specialized work [4]. The product connects with ChatGPT apps from financial data firms like FactSet Research Systems Inc. and Third Bridge, while also enabling direct ChatGPT use in Microsoft Excel and Google Sheets for creating and examining financial models [4].
Internal testing found that spreadsheets the model generated to emulate the work of a junior investment banking analyst achieved a mean success rate of 87.3% with human raters [3]. The model demonstrates improved capabilities in generating spreadsheets, documents, and presentations, requiring less back-and-forth interaction with users [4]. OpenAI also introduced ChatGPT for Excel as a dedicated tool to help users run scenarios and generate outputs based on cells and formulas [3].
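The screenshot-driven computer use described above follows an observe-then-act loop: the agent looks at the screen, chooses one mouse or keyboard action, applies it, and repeats. The sketch below simulates that loop with a toy desktop and a hand-written policy standing in for the model. Every name in it (`FakeDesktop`, `stub_policy`, `run_agent`) is hypothetical and is not part of any OpenAI API.

```python
from dataclasses import dataclass


@dataclass
class Action:
    kind: str         # "type", "click", or "done"
    target: str = ""  # which on-screen element the action applies to
    text: str = ""    # text to enter, for "type" actions


@dataclass
class FakeDesktop:
    """Toy stand-in for a real desktop: one text field and one submit button."""
    field_text: str = ""
    submitted: bool = False

    def screenshot(self) -> dict:
        # A real agent would receive pixels; this toy returns the state directly.
        return {"field_text": self.field_text, "submitted": self.submitted}

    def apply(self, action: Action) -> None:
        if action.kind == "type" and action.target == "field":
            self.field_text += action.text
        elif action.kind == "click" and action.target == "submit":
            self.submitted = True


def stub_policy(observation: dict, goal_text: str) -> Action:
    """Hypothetical stand-in for the model: pick the next action from the screen."""
    if observation["submitted"]:
        return Action("done")
    if observation["field_text"] != goal_text:
        return Action("type", target="field", text=goal_text)
    return Action("click", target="submit")


def run_agent(desktop: FakeDesktop, goal_text: str, max_steps: int = 10) -> list[str]:
    """Observe, act, repeat until the policy reports completion or steps run out."""
    history = []
    for _ in range(max_steps):
        action = stub_policy(desktop.screenshot(), goal_text)
        history.append(action.kind)
        if action.kind == "done":
            break
        desktop.apply(action)
    return history


if __name__ == "__main__":
    desktop = FakeDesktop()
    print(run_agent(desktop, "Q3 report"))
```

In a real agent the policy would be a model call that interprets an actual screenshot, and the step limit would serve as a guard against an agent that never reaches its goal.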
The GPT-5.4 release arrives amid controversy surrounding OpenAI's relationship with the Department of Defense. After the Pentagon declared Anthropic a "supply-chain risk" following its refusal to allow AI use for mass surveillance of Americans or fully autonomous weapon systems, OpenAI stepped into the void. The company had previously announced a $200 million deal with the defense department in 2025 [1][4].
Sam Altman later acknowledged that OpenAI's rush to forge the Pentagon agreement looked "opportunistic and sloppy," stating the company was working to "make some additions in our agreement to make our principles very clear" [4]. Altman clarified that safeguards would be implemented and that the technology wouldn't be made available to intelligence agencies like the NSA, though significant questions remain about how AI is being used by government agencies and defense contractors [1]. Anthropic has reportedly resumed talks with the Pentagon following the initial breakdown [4].
GPT-5.4 Thinking is rolling out now for Plus, Pro, and Team subscribers through ChatGPT, replacing the GPT-5.2 Thinking model [3]. The previous version will be moved to Legacy Models before removal on June 5 [3]. GPT-5.4 Pro is available through the API for Pro and Enterprise plans, as well as for ChatGPT Enterprise and Edu subscribers [5]. The model is also available in Codex, OpenAI's coding application, with API access beginning Friday [1][2]. No announcement has been made regarding availability for free users.
Summarized by Navi