Curated by THEOUTPOST
On Wed, 26 Mar, 12:06 AM UTC
38 Sources
[1]
ChatGPT's image generation feature gets an upgrade | TechCrunch
During a livestream on Tuesday, OpenAI CEO Sam Altman announced the first major upgrade to ChatGPT's image generation capabilities in over a year. ChatGPT can now leverage the company's GPT-4o model to natively create and modify images and photos. GPT-4o has long underpinned the AI-powered chatbot platform, but until now, the model has only been able to generate and edit text -- not images. Altman said GPT-4o native image generation is live today in ChatGPT and Sora, OpenAI's AI video generation product, for subscribers to the company's $200-a-month Pro plan. OpenAI says the feature is rolling out soon to Plus and free users of ChatGPT, as well as to developers using the company's API service. GPT-4o with image output "thinks" a bit longer than the image generation model it effectively replaces, DALL-E 3, to make what OpenAI describes as more accurate and detailed images. GPT-4o can edit existing images, including images with people in them -- transforming them or "inpainting" details like foreground and background objects. OpenAI didn't reveal which image data it used to enable the new image generation capabilities. Many generative AI vendors see training data as a competitive advantage and so keep it and any information related to it close to the chest. But training data details are also a potential source of IP-related lawsuits, another disincentive for companies to reveal much. OpenAI offers an opt-out form that allows creators to request that their works be removed from its training datasets. The company also says that it respects requests to disallow its web-scraping bots from collecting training data, including images, from websites. ChatGPT's upgraded image generation feature follows on the heels of Google's experimental native image output for Gemini 2.0 Flash, one of the company's flagship models. The powerful feature went viral on social media -- but not necessarily for the best reasons.
Gemini 2.0 Flash's image component turned out to have few guardrails, allowing people to remove watermarks and create images depicting copyrighted characters.
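The article above notes that OpenAI says it respects requests to disallow its web-scraping bots. Such requests are conventionally expressed in a site's robots.txt file. The following is a minimal sketch, not taken from the article, of how a site owner might check that a rule blocks OpenAI's documented crawler user agent, GPTBot, while leaving other crawlers unaffected. The robots.txt contents, the example.com URL, the "SomeOtherBot" name, and the helper function name are all hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block GPTBot everywhere, allow everyone else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def crawler_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/art/piece.png"  # hypothetical URL
    for agent in ("GPTBot", "SomeOtherBot"):
        print(agent, crawler_allowed(ROBOTS_TXT, agent, page))
```

Note that robots.txt is only a request: it works when the crawler chooses to honor it, which is what OpenAI says its bots do.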
[2]
ChatGPT's new AI image feature is delayed for free users | TechCrunch
OpenAI CEO Sam Altman announced on Wednesday that the rollout of ChatGPT's viral new AI image features to free users would be delayed, citing significantly higher demand than the company expected. "Images in ChatGPT are wayyyy more popular than we expected (and we had pretty high expectations)," Altman said in a post on X on Wednesday. On Tuesday, the company announced the launch of GPT-4o's native image generation, which lets users upload and modify images, saying it would soon come to all tiers of ChatGPT. As of Wednesday, OpenAI has only rolled out the feature to subscribers to ChatGPT Pro, Plus, and Teams. In recent months, OpenAI has faced similar hiccups around product launches, repeatedly blaming a lack of compute capacity. Shortly after launching Sora in December, OpenAI disabled signups. Perhaps the company's planned $500 billion Stargate data center project will help with those capacity constraints.
[3]
OpenAI's new image generator aims to be practical enough for designers and advertisers
The image generator, which is now part of the company's GPT-4o model, was promised by OpenAI last May but wasn't released. Requests for generated images on ChatGPT were filled by an older image generator called DALL-E. OpenAI has been tweaking the new model since then and will now release it over the coming weeks to all tiers of users starting today, replacing the older one. The new model makes progress on technical issues that have plagued AI image generators for years. While most have been great at creating fantastical images or realistic deepfakes, they've been terrible at something called binding, which refers to the ability to identify certain objects correctly and put them in their proper place (like a sign that says "hot dogs" properly placed above a food cart, not somewhere else in the image). It was only a few years ago that models started to succeed at things like "Put the red cube on top of the blue cube," a feature that is essential for any creative professional use of AI. Generators also struggle with text generation, typically creating distorted jumbles of letter shapes that look more like captchas than readable text. Example images from OpenAI show progress here. The model is able to generate 12 discrete graphics within a single image -- like a cat emoji or a lightning bolt -- and place them in proper order. Another shows four cocktails accompanied by recipe cards with accurate, legible text. More images show comic strips with text bubbles, mock advertisements, and instructional diagrams. The model also allows you to upload images to be modified, and it will be available in the video generator Sora as well as in GPT-4o. It's "a new tool for communication," says Gabe Goh, the lead designer on the generator at OpenAI. Kenji Hata, a researcher at OpenAI who also worked on the tool, puts it a different way: "I think the whole idea is that we're going away from, like, beautiful art." 
It can still do that, he clarifies, but it will do more useful things too. "You can actually make images work for you," he says, "and not just look at them." It's a clear sign that OpenAI is positioning the tool to be used more by creative professionals: think graphic designers, ad agencies, social media managers, or illustrators. But in entering this domain, OpenAI has two paths, both difficult. One, it can target the skilled professionals who have long used programs like Adobe Photoshop, which is also investing heavily in AI tools that can fill images with generative AI.
[4]
OpenAI Is Bringing AI Image Generation Directly to ChatGPT
OpenAI is bringing AI image generation directly into its flagship chatbot. Using ChatGPT's 4o model, you can now natively create AI images in your normal ChatGPT window -- no need to use Dall-E. This new feature is available now for free and paying users, with plans to bring it to those on enterprise and education plans next week. You can also try out image generation in Sora, the company's AI video generator. OpenAI's foray into image generation thus far has been focused on Dall-E, a separate program you can use through ChatGPT. You can use Dall-E to create everything from scenes from a sci-fi space opera to stock photography-style shots. Dall-E is our top pick for the best AI image generators, partly because its unique conversational style makes creating and editing images easy. Luckily, that "chat to edit" ability is coming to ChatGPT, too. (Dall-E will still be available to use.) ChatGPT is available for free, with paid plans offering more features starting at $20 per month. The limits of your current plan will apply to image generation -- if you're on the free plan, you may run into limits using the 4o model for messaging, file uploads and data analysis. The same goes for Sora users. ChatGPT Plus users will get one image per prompt. Image generation in ChatGPT 4o will focus more on creating work-related images, like infographics and diagrams. OpenAI says it's improved text rendering to make that happen -- something extremely necessary as AI consistently hallucinates and messes up words in images. You can also upload your own images and edit them with AI. There are some serious limitations to ChatGPT's ability.
Most importantly, it says that you may not be able to precisely edit specific regions of an image -- an essential task as AI models can hallucinate things like eleven-fingered hands. If you upload your own image and make edits to a subject's face, those edits may be lost from edit to edit. You may also see issues with cropping and struggles with data visualizations and multilingual text. The company says in a blog post that it is working on improving these things and hopes to introduce fixes as early as next week. Like Dall-E, images made in ChatGPT don't have any visible watermarks denoting they are AI-generated. OpenAI said that its images will have C2PA metadata, an industry standard that lets folks know behind-the-scenes that an image is made by AI. In terms of safety, OpenAI says it will follow the same content guidelines as the rest of the 4o model. It said it has "heightened restrictions" around nudity and graphic violence.
[5]
ChatGPT finally gets a much better image generator - how to try it for free
Say goodbye to warped and misspelled text in your image generations - and hello to looser safeguards. OpenAI has continually expanded its ChatGPT offerings, adding an AI voice assistant, file and image understanding, advanced research capabilities, AI agents, and more. However, there's been one glaring omission -- a really capable image generator. On Tuesday, OpenAI launched 4o image generation. This image model is significantly better -- albeit slower -- than the DALL-E models previously offered by OpenAI. It tackles very difficult prompts such as realistic images and, most impressively, accurate text. For example, in the live stream demo, OpenAI CEO Sam Altman, joined by researchers Gabriel Goh and Prafulla Dhariwal, prompted 4o to create a photo from a specific POV with a flyer that included lots of text. After loading for a few seconds, it got the cinematic direction right and accurately printed all the text. It also boasts many other capabilities OpenAI's previous image generator didn't have, such as image referencing, which can be used to render a new version of the image (such as an anime version or a selfie), or as inspiration for creating a completely new work. Because this tool is meant to integrate into creatives' workflows, it can generate images on transparent backgrounds, use specific colors from HEX codes, or implement the chatbot's advanced conversational capabilities in the generations. For example, when prompted to include "humor" in the photo during the demo, it included text that met that criterion. Because the image generator is accessible in ChatGPT, users can also refine images through a multi-turn conversation. This makes tweaking images easier and allows the model to use the context of previous generations to create new ones. Since GPT-4o has access to the web, that context is also added to creating the images.
According to the company, GPT-4o's image generation also has strong instruction adherence. It can handle 10-20 different objects, which means you can prompt it to generate a high volume of objects in one go. Another new aspect of the image generator is that it can now create more risque content, something Elon Musk's Grok model is known for. During the live stream, Altman shared that you will be able to use GPT-4o's image generation to create offensive content "within reason." In an X post after the livestream, Altman added: "What we'd like to aim for is that the tool doesn't create offensive stuff unless you want it to, in which case within reason it does. As we talk about in our model spec, we think putting this intellectual freedom and control in the hands of users is the right thing to do, but we will observe how it goes and listen to society." The blog post announcing the model noted that it will block requests that violate content policies, including child sexual abuse materials and sexual deepfakes. Another safeguard in place is limiting what can be created when real people are in the context, including "particularly robust safeguards around nudity and graphic violence." Users can visit the System Card for all the safety information in the 4o image generation model. The updated image generation features are rolling out today in ChatGPT and Sora. Regardless of whether they are subscribed, all users (including free) will have access to GPT-4o image generation as the default. If users still want to access DALL-E, they can do so through a dedicated DALL-E GPT. Enterprise and Education users will be given access soon, with access to developers via the API slated for the upcoming weeks. When DALL-E first launched, it lived on its standalone website; at the time, it felt like the greatest and latest.
Since then, it has been moved to only reside in ChatGPT; there, the model paled compared to more advanced image generation models from competitors such as Midjourney, Google, and Adobe. This update now helps level the playing field, enabling it to compete better with other models.
[6]
ChatGPT's new image generator creates stunning images - for some users
Say goodbye to warped and misspelled text in your image generations - and hello to looser safeguards. OpenAI has continually expanded its ChatGPT offerings, adding an AI voice assistant, file and image understanding, advanced research capabilities, AI agents, and more. However, there was one glaring omission -- a really capable image generator. On Tuesday, OpenAI launched 4o image generation. This image model is significantly better -- albeit slower -- than the DALL-E models previously offered by OpenAI. It tackles very difficult prompts such as realistic images and, most impressively, accurate text. For example, in the live stream demo, OpenAI CEO Sam Altman, joined by researchers Gabriel Goh and Prafulla Dhariwal, prompted 4o to create a photo from a specific POV with a flyer that included lots of text. After loading for a few seconds, it got the cinematic direction right and accurately printed all the text. It also boasts many other capabilities OpenAI's previous image generator didn't have, such as image referencing, which can be used to render a new version of the image (such as an anime version or a selfie), or as inspiration for creating a completely new work. Because this tool is meant to integrate into creatives' workflows, it can generate images on transparent backgrounds, use specific colors from HEX codes, or implement the chatbot's advanced conversational capabilities in the generations. For example, when prompted to include "humor" in the photo during the demo, it included text that met that criterion. Because the image generator is accessible in ChatGPT, users can also refine images through a multi-turn conversation. This makes tweaking images easier and allows the model to use the context of previous generations to create new ones. Since GPT-4o has access to the web, that context is also added to creating the images.
According to the company, GPT-4o's image generation also has strong instruction adherence. It can handle 10-20 different objects, which means you can prompt it to generate a high volume of elements in one go. Another new aspect of the image generator is that it can now create more risque content, something Elon Musk's Grok model is known for. During the live stream, Altman shared that you will be able to use GPT-4o's image generation to create offensive content "within reason." In an X post after the livestream, Altman added: "What we'd like to aim for is that the tool doesn't create offensive stuff unless you want it to, in which case within reason it does. As we talk about in our model spec, we think putting this intellectual freedom and control in the hands of users is the right thing to do, but we will observe how it goes and listen to society." The blog post announcing the model noted that it will block requests that violate content policies, including child sexual abuse materials and sexual deepfakes. Another safeguard in place is limiting what can be created when real people are in the context, including "particularly robust safeguards around nudity and graphic violence." Users can visit the System Card for all the safety information in the 4o image generation model. The updated image generation features are rolling out now in ChatGPT and Sora. At launch, the model was announced to be coming to all users (including free), with GPT-4o image generation becoming the new default. However, due to high demand, Altman announced a day after that the rollout to the free tier would now be "delayed for awhile." This means that to access the image generation, you now have to be subscribed. For individual users, the best option is ChatGPT Plus, which costs $20 per user per month and comes with many other perks, including OpenAI's Sora video generator.
At the time of writing this article, I was able to access the image generator from my Plus account. Enterprise and Education users will be given access soon, with access to developers via the API slated for the upcoming weeks. When DALL-E first launched, it lived on its standalone website; at the time, it felt like the greatest and latest. Since then, it has been moved to only reside in ChatGPT; there, the model paled compared to more advanced image generation models from competitors such as Midjourney, Google, and Adobe. This update now helps level the playing field, enabling it to compete better with other models. However, if users still want to access DALL-E, they can do so through a dedicated DALL-E GPT.
[7]
ChatGPT Gets Native AI Image Generation
Users can drop prompts (and images) and get AI-generated images within the regular ChatGPT window, thanks to GPT-4o. It's available now to Plus, Pro, Team, and Free users with limitations. OpenAI has added AI image generation capabilities to ChatGPT. Users can now select the GPT-4o model, provide prompts, and get desired images within the regular ChatGPT window. Previously, ChatGPT was dependent on OpenAI's DALL-E model for images. Now, it uses the 4o model's native multimodal capabilities to provide "precise, accurate, photorealistic outputs." OpenAI touts GPT‑4o's skill for "accurately rendering text, precisely following prompts, and leveraging 4o's inherent knowledge base and chat context -- including transforming uploaded images or using them as visual inspiration." Translation: Expect fewer weird results. This was achieved by training the models on "the joint distribution of online images and text, learning not just how images relate to language, but how they relate to each other," OpenAI says. GPT-4o can also handle more objects within an image than usual. While other chatbots can generate up to eight objects for an image, GPT-4o can produce up to 20, according to OpenAI. It can also edit and improve user-uploaded images. In a demo video, an OpenAI researcher is seen uploading a hand-drawn sketch for a comic book page and getting a full-colored digital version delivered by ChatGPT. Still, OpenAI warns, "Our model isn't perfect. We're aware of multiple limitations at the moment, which we will work to address through model improvements after the initial launch." OpenAI will embed each output with C2PA metadata. This will allow AI image detectors to identify images generated by GPT-4o accurately. Additionally, ChatGPT will reject requests for child sexual abuse materials (CSAM) and sexual deepfakes. 
"When images of real people are in context, we have heightened restrictions regarding what kind of imagery can be created, with particularly robust safeguards around nudity and graphic violence," OpenAI says. In an addendum added later, OpenAI said it won't block GPT-4o from generating images of adult public figures, but those "who wish for their depiction not to be generated can opt out." At launch, ChatGPT's native image generation is available for all Plus, Pro, Team, and Free users, with support for Enterprise and Edu customers coming soon. The feature is also available on OpenAI's video-generation tool, Sora. OpenAI hasn't announced a daily limit for free users but tells The Verge that it will mirror DALL-E, which limits users to three free images per day. However, these numbers "may change over time based on demand," a spokesperson adds. None of this means DALL-E is going away. "For those who hold a special place in their hearts for DALL-E, it can still be accessed through a dedicated DALL-E GPT," OpenAI says.
[8]
Now you can generate images directly from ChatGPT and Sora
OpenAI just announced that all users will soon be able to generate images directly inside of ChatGPT. It's rolling out to ChatGPT Plus, Pro, Team and, most importantly, Free users. This will be the default image generation tool in 4o, so there will be no need to open Dall-E whenever you want to whip up a picture of a cat in space eating lasagna or whatever. The feature's also coming to Sora. The company says that the platform will "generate high-quality images based on your prompt, conversation and uploaded files." To the latter point, it'll be able to transform pre-existing images based on prompts. OpenAI is also boasting about significant improvements in text rendering and contextual understanding. These new tools are intended for both personal and professional use. As such, OpenAI gives a number of examples as to where this type of image generation could come in handy. These include the creation of infographics, social media promotional graphics and images with plenty of text, as seen below. This being a modern generation tool, it can also handle high-end visuals. The company says it offers a "strong capability for photorealism, including light, shadow, and texture accuracy." The ability to understand context could also be useful, as OpenAI says this could be used to create a "poster of birds found in Central Park" or a "visualization of an art history era discussed previously in the conversation." It's built on GPT-4o, an AI model that was . The "o" stands for "omni", which is a reference to the model's multimodal capabilities. This is what allows many of the aforementioned features, like being able to iterate on uploaded files. Today's news looks like another step on the long road toward the "one AI to rule them all" functionality that .
[9]
OpenAI delays rollout of ChatGPT's image generator to free users
It's currently available to those with Plus, Pro and Team accounts. Free ChatGPT users will have to wait a while longer to be able to use its built-in image generation capability. OpenAI has just launched a feature that will allow users to generate images directly inside of ChatGPT, and it was supposed to roll out to all Plus, Pro, Team and Free users. But according to company CEO Sam Altman, it has been way more popular than OpenAI had expected, even though the company already had high expectations to begin with. As such, its rollout to the free tier is "unfortunately going to be delayed for a while." People have been posting ChatGPT's output all over social media. One popular trend that emerged from the launch is turning existing photos into Studio Ghibli-style illustrations. Even Altman poked fun at it and used a Ghibli-fied photo of himself as his profile picture on X/Twitter. Users have been creating Ghibli versions of everything from personal images to photos of political figures and memes. As VentureBeat has noted, Ghibli creator Hayao Miyazaki likely wouldn't approve of them, seeing as he reacted quite negatively to AI-powered animation years ago, but we doubt that would stop the trend until people get tired of it. Altman didn't give a timeline for when the feature will make its way to free users. For now, those who want to use the chatbot's built-in generator will have to log into ChatGPT with a Plus, Pro or Team account and then select GPT-4o as their AI model in the lower-left corner.
[10]
ChatGPT's viral image-generation AI is 'melting' OpenAI's GPUs
OpenAI CEO Sam Altman on Thursday announced that viral use of ChatGPT's new image-generation AI, introduced earlier this week, is overloading the company's servers. While it's "super fun seeing people love images" in ChatGPT, "our GPUs are melting," Altman posted on X Thursday, adding that the company will temporarily limit the feature's usage as it works to make it more efficient. The rate limits affect OpenAI's Tuesday debut of native image generation within ChatGPT. The company debuted the "high-quality" image-generation tool as a way to produce everything from diagrams, infographics and logos to business cards and stock photos. The feature can also use an image as a starting point for art, such as a custom painting of a pet or editing a professional headshot. The image-generation feature began rolling out to ChatGPT Plus, Pro and Team users on Tuesday, as well as users of the chatbot's free tier when they use OpenAI's 4o model. ChatGPT Enterprise and Edu users will receive access next week, the company said. Images of anime-style renderings of users' uploaded photos have been going viral on X and other social media apps since the feature's Tuesday launch. Altman, for example, changed his X profile photo to an image generated by the new feature. One of the company's first hit products was the Dall-E model launched in 2021. That was one of the first artificial intelligence image generators, and was integrated into ChatGPT in 2023. Users of ChatGPT's free tier will soon be able to generate three images per day, Altman wrote.
[11]
OpenAI Unveils New Image Generator for ChatGPT
Chatbots were originally designed to chat. But they can generate images, too. On Tuesday, OpenAI beefed up its ChatGPT chatbot with new technology designed to generate images from detailed, complex and unusual instructions. For instance, if you describe a four-panel comic strip, including the characters who appear in each panel and what they are saying to one another, the technology can instantly generate an elaborate cartoon. Though previous versions of ChatGPT could generate images, they could not reliably create images by blending such a wide array of concepts. The new version of ChatGPT is indicative of a wider change in artificial intelligence technology. After beginning as systems that merely generated text, chatbots are morphing into tools that combine chatting with various other abilities. The technology that underpins the new version of ChatGPT -- called GPT-4o -- also allows the chatbot to receive and respond to voice commands, images and videos. It can even speak. Released at the end of 2022, the original ChatGPT learned its skills by analyzing enormous amounts of text from across the internet. It learned to answer questions, write poetry and generate computer code. It could not generate images. But about a year later, OpenAI released a new version of ChatGPT that could generate images called DALL-E. But ChatGPT and DALL-E were separate systems. Now, OpenAI has built a single system that learns a wide range of skills from both text and images. In generating its own images, this system can draw on everything ChatGPT has learned from the internet. "This is a completely new kind of technology under the hood," said Gabriel Goh, an OpenAI researcher. "We don't break up image generation and text generation. We want it all to be done together." Traditionally, A.I. 
image generators have struggled to create images that were markedly different from any existing image. If you asked an image generator to create an image of a bicycle with triangular wheels, for instance, it struggled. Mr. Goh said that the new ChatGPT could handle this kind of request. OpenAI said that, beginning Tuesday, this new version of ChatGPT would be available to people using both the free and paid versions of the chatbot. This includes both ChatGPT Plus, a $20-a-month service, and ChatGPT Pro, a $200-a-month service that provides access to all of the company's latest tools. (The New York Times sued OpenAI and its partner, Microsoft, in December for copyright infringement of news content related to A.I. systems.)
[12]
ChatGPT Can Finally Generate Images With Legible Text
GPT-4o image generation is now available in ChatGPT. The new image generation model, which replaces DALL-E 3, is most notable for its accurate text rendering, improved "binding" capabilities, and ease of use. Unlike traditional diffusion image generation methodology, which "paints" details on top of random noise, GPT-4o utilizes a top-to-bottom, side-to-side autoregressive system. It's slower than diffusion, but the benefits of autoregression are as clear as day. GPT-4o is capable of spitting out images with perfectly legible text -- something that AI models like DALL-E 3 have continually failed to achieve. Not only that, but you can specify textual content for generated images. Write out a prompt like "give me a photorealistic image of a girl writing on a whiteboard with messy handwriting," tell the AI whatever words you want to see on the whiteboard, and it'll give you something fairly accurate. And, perhaps more importantly, the model is quite good at writing 2D stylized text for restaurant menus, advertisements, or other items that may be useful to businesses or hobbyists. The autoregressive approach also seems to help with "binding," which is a fancy way of saying that the AI doesn't get confused by prompts that contain multiple subjects. If you ask DALL-E 3 to draw a red circle, a blue triangle, a green heart, a pink star, and a purple square, it may trip over itself and spit out the wrong shapes or colors. GPT-4o, on the other hand, can accurately handle up to 20 different objects. When paired with the model's text rendering capabilities, improved binding clearly creates some interesting opportunities for corporate art or advertising, though it's also just a generally useful thing that makes image generation easier to use. Of course, GPT-4o image generation is just "better" than DALL-E 3. Photorealistic images look more true to life, digital art looks less soupy or grainy, and new inferencing techniques reduce the need to type out long, complicated prompts.
The model also boasts improved "character consistency," meaning that a character or object generated in one prompt can be accurately carried over to subsequent prompts -- if you tell the AI to reuse a cyborg cat that it created, it won't change the color of the cat, and so on. OpenAI admits that its new image generation model is imperfect. It still struggles with hallucinations, mathematical representations (like charts or graphs), multilingual text, and more. Still, it's clearly an improvement over the company's previous image generation models. OpenAI says that GPT-4o image generation contains safeguards to prevent misuse, plus advanced watermarking techniques to help people differentiate AI-generated content from real, human-made stuff. But I'll go out on a limb and assume that these safeguards can, with effort, be circumvented. And OpenAI is still using C2PA watermarking, which is just metadata. It takes very little effort to remove this metadata from an image -- C2PA is ineffective at preventing the spread of misinformation. The new GPT-4o image generator won't alleviate concerns about copyright or fair use, either. It was trained on a mix of "publicly available" data and licensed data, according to a statement provided to The Wall Street Journal. AI companies are known to brazenly defy basic copyright law, and OpenAI does not share its training data with the public, so feel free to draw your own conclusions on this matter. (For what it's worth, OpenAI does care about copyright when its work is stolen.) GPT-4o image generation is available today. Open ChatGPT in your browser, ask the AI to generate an image, and enjoy.
Note that the rollout is not complete, so some users may still encounter the old DALL-E 3 model. The best way to tell the difference is to observe how a generated image loads. DALL-E 3 loads images with a spinning wheel, while GPT-4o images load with a pleasant top-down side-to-side flatbed scanner-ish animation. All ChatGPT users can access GPT-4o image generation, including free users. However, free users face usage limits, just as they did when using DALL-E 3. By the way, DALL-E 3 will remain available in custom GPTs for those who want to use it. Source: OpenAI
[13]
OpenAI releases impressive 4o image generator for free and paid users
Earlier this week, OpenAI released their "most advanced image generator yet" and made it available through ChatGPT using the GPT-4o model. ChatGPT previously relied on Dall-E to generate images. According to OpenAI, the improved 4o model is able to produce precise, accurate, and photorealistic results. They claim that it's also particularly good at rendering text, following instructions precisely, and even understanding the context of a chat. All of this includes the transformation of uploaded images or using uploaded images as visual inspiration. GPT-4o can also continually build on existing images to ensure the subject remains consistent across images. On top of that, GPT-4o is now able to handle 10 to 20 different objects in images (compared to most other AI image generators, which struggle to include 5 to 8 different objects). At the same time, OpenAI warns that GPT-4o may have problems with cropping, hallucinations, too many elements, precise graphs, rendering text in non-Latin alphabets, precise editing, and displaying lots of dense text on a small surface. The new and improved 4o image generator is rolling out now for ChatGPT Free, Plus, Pro, and Team users. (Enterprise and Edu users will gain access at a later time.) If you think Dall-E made better images, you have the option to switch and use that instead.
[14]
ChatGPT promises image generation for free tier as feature is 'melting' GPUs
OpenAI launched GPT-4o image generation some days ago, and it has been met with a barrage of users trying out the AI model's new image generation features. The company announced that the new surge of users is "melting" GPUs, and free-tier ChatGPT users will soon get limited image generation access. ChatGPT 4o brings a new image generation feature that's actually very good, in terms of AI generation. The addition gained popularity as users have been sending each other Studio Ghibli versions of themselves. The trend has garnered even more attention towards the chatbot, and what attention had waned for the AI model has come back in full force. OpenAI CEO Sam Altman took to Twitter/X to express appreciation for the sudden surge of activity and the positive feedback. With that, he also announced that the internal GPUs used for processing image generations are "melting." This is likely hyperbole, though overheating could be a real concern; the post doesn't go into detail. What it does mean is that so many users are asking ChatGPT to create Studio Ghibli versions of themselves that the company's servers can't keep up. One nice announcement to come out of the post was that users on the free ChatGPT tier will soon get 3 image generations per day, as opposed to none. Currently, users need to pay $20/month to try ChatGPT Plus. With that, you get access to ChatGPT 4o and all its image-generation prowess. Users even get access to Sora. The post goes on to note that rate limits will be imposed temporarily. It's unclear if those limits have cleared or if they'll stay in place for some time. There is no date just yet for the free tier expansion, but it could be expected relatively soon if the post is to be believed.
[15]
OpenAI just unveiled new ChatGPT image generator powered by Sora -- here's what you can do now
OpenAI today announced the GPT-4o image generator, introducing advanced image generation capabilities integrated within the ChatGPT-4o language model. The company says that GPT-4o represents a significant leap forward in image generation that should create images that are not only visually stunning but practically useful. The GPT-4o image model focuses on 'useful image generation,' which means users can now use the AI model for everyday needs such as logos, diagrams, and infographics. Unlike previous generative models that often produce surreal but impractical visuals, GPT-4o was designed to deliver more contextually relevant and accurate imagery. Key features of GPT-4o include upgraded text rendering, allowing seamless integration of textual information into images. This capability supports visual communication, elevating the utility of generated images. Additionally, GPT-4o supports multi-turn generation, enabling users to refine and adjust images through natural conversational interactions, maintaining consistency throughout iterative design processes. The image generation from GPT-4o is capable of managing complex prompts involving up to 20 distinct objects, which is an improvement over existing systems. Through in-context learning, GPT-4o can analyze user-uploaded images, seamlessly incorporating these details into subsequent image generations, thus creating a more personalized and contextually informed visual output. Built upon extensive multimodal training on vast online image and text datasets, GPT-4o has developed sophisticated visual fluency, allowing the model to produce images that are contextually aware, stylistically diverse, and photorealistically convincing. Despite its advanced capabilities, OpenAI acknowledges certain limitations, such as occasional cropping issues, hallucinated details, difficulties rendering dense information at small scales, and precision editing challenges. 
Multilingual text rendering, especially for complex non-Latin scripts, remains an area under active development. Safety continues to be a paramount focus, with rigorous measures in place to block harmful content, including explicit materials or images that violate content policies. Provenance tools, such as C2PA metadata tagging and internal reverse search, ensure transparency and accountability in generated visuals. GPT-4o image generation is available now across all ChatGPT platforms, including Plus, Pro, Team, and Free tiers, with Enterprise and Education access anticipated soon. Regardless of tier, users can specify detailed image requirements -- from exact colors and aspect ratios to transparent backgrounds -- making professional-quality image creation as straightforward as a simple chat interaction. OpenAI's GPT-4o signifies a major advancement in AI-driven visual communication, turning generative image creation into an accessible, practical and powerful tool for everyday users and professionals alike.
[16]
I just went hands-on with ChatGPT-4o's enhanced image generator -- and I can't believe this is free
Can this version provide more realistic and detailed images? OpenAI just enhanced its image generating capabilities with a new model integrated within ChatGPT-4o. Promising more realistic, richly detailed images from the simplest prompts, the model sounded too good to be true, especially since I haven't been all that impressed with ChatGPT's image generating capabilities in the past. I have to admit I was skeptical. OpenAI's new image generator is designed to be fast and intuitive, so I wanted to try it with 7 prompts. I tested everything from cinematic landscapes to mock product concepts to see how well it works and what sets it apart from other AI image generator tools. Here's what happened when I put this ChatGPT-4o image generator to the test. Prompt: Create a photograph of a suburban house in early spring. These houses are beautiful, and the chatbot catches the season correctly, but these do not look like photographs. They are pretty close and, for a free chatbot, definitely eye-catching and clear. But they still give off that AI-glow. I could see images like this being used for a brochure or travel company, but in terms of fooling someone that it's a photograph of your house, nah. The driveway alone gives it away. Prompt: Create a logo for an ice cream shop on the boardwalk. Knowing that it is this easy to create a logo makes me wish I owned a business. With a simple prompt, I had several logos to choose from in seconds. After the first few logos that had the same aesthetic, I asked the chatbot to switch the style. It went from whimsical to bland really fast, but with a little prompt tweaking, I'm sure the chatbot would come up with a usable one. I did notice a few misspellings and blurred letters, so be sure to do a spell check after you generate your logo. Prompt: Create a cartoon with two cats chatting about the weather. I can see myself playing around with these prompts a lot. Maybe Tom's Guide needs a weekly AI comic strip? 
When I first prompted the chatbot, it generated just one comic. From there, I asked for a comic strip. I wasn't incredibly impressed with the layout. I was expecting something more like what you find in the Sunday paper. I then prompted the chatbot to do a third comic strip in the style of Disney. Definitely not "The Aristocats," but I thought it was pretty close. Prompt: Create the package renderings for a chocolate bar called Amanda. This was pretty cool and the most realistic prompt yet. I thought it was interesting that the chatbot went a "fancy" direction for the chocolate bar, but that's probably because I used my name. Had I used "Crazy 4 chocolate," it probably would have done something simpler. But what's interesting is when I asked the chatbot to generate two more images, it gave me an error message: "I wasn't able to generate the two additional chocolate bar package designs due to a content policy restriction. It might've flagged something in the description, even though the request seems fine on the surface. "If you'd like, I can rephrase the prompt and try again with slightly different wording -- just let me know if you want to stick with a luxury theme or explore something more fun, playful, or eco-friendly!" After that, I tried again, but the chatbot was done with that prompt. Prompt: Create a photograph of New York City in the summer. The first image the model generated looked more like a painting than a photograph. I keep hoping ChatGPT will have the ability to create photorealistic images, but I'm finding it fails most of the time. I asked the chatbot to be "more realistic," but even the second image didn't cut it for me. I also find it interesting that the chatbot focused on the greener aspects of NYC in the summer rather than generating an image of the busy sidewalks. Because I have AI Madness on the brain and OpenAI has said that the ChatGPT-4o image generator can create images with little prompting, I decided to try it here. 
The chatbot did a nice job coming up with some realistic-looking awards. The first one is a little busy, but once I prompted the chatbot to go for a simpler style, the image was much cleaner. This would be great for parents who want to give their kids awards for chores, office parties, and similar events for recognizing a job well done. Prompt: Create a realistic image of a dinosaur, a potato, and a cat. The potato was the most impressive of the three images. The dinosaur looked like a generic image of a prehistoric creature and the cat was downright comical. I purposely did a mixed bag here to test the AI's ability to understand the word "realistic." The AI seems to be somewhat unpredictable with its style, so my biggest recommendation is to be as specific as possible when making your prompts. OpenAI says you don't need to, but I highly recommend it. Overall, the ChatGPT-4o image generator feels like an upgrade for several reasons. It is easy to use directly within the chat window, the chatbot pivots when prompted or given more information, and anyone can use it, even without an account and for free. That last point is a huge plus. Although the chatbot tends to have trouble with on-image text and occasionally needs extra prompting, the enhanced image generating tool is something I plan to keep in my AI toolbox.
[17]
ChatGPT's new image generator is now 'delayed for a while' for free users
Images created by ChatGPT's latest AI image generator have been doing the rounds online. But while they were originally meant to be available in all versions of ChatGPT, both free and paid, that plan has now been delayed. Sam Altman, CEO of OpenAI, announced on Wednesday via a post on X that the feature had been much more popular than first expected. This unexpected popularity has resulted in a delay in the rollout of the plan. No date was announced for when this feature will come to the free version of ChatGPT, with Altman simply stating it will be "delayed for a while." However, anyone can use it on ChatGPT Plus (for $20 a month) or on the more expensive Pro version. This new image tool utilizes OpenAI's GPT-4o - an upgraded model that was announced for ChatGPT last year. That was originally only available for text-based requests until now, with the introduction of the 4o image generation. This brought about a variety of new and improved features, but the focus here is an increase in realism and detail in your images. While it takes slightly longer to generate the images, OpenAI has claimed to improve background details and let the model spend more time on finer details. Equally, this latest model has an increased focus on communication. It can hold the information you have discussed with the model, allowing for fluid conversations to result in a more detailed image at the end. Equally, improvements have been made to the generation of texts and words in images -- a task that has proved challenging for AI models in the past. In our initial tests with the model, we were impressed by its ease of use and ability to pivot on style and input based on your conversations with it. However, even with the improvements in text detection, it does still have a habit of messing up words in images.
[18]
'Insane': OpenAI introduces GPT-4o native image generation and it's already wowing users
We're coming up on the one year anniversary since OpenAI released its first "omni" or multimodal model, GPT-4o back in May 2024, but that old standby still has some tricks up its sleeve. Case-in-point, today OpenAI finally turned on the native multimodal image generation capabilities of GPT-4o for users of its hit chatbot ChatGPT on the Plus, Pro, Team, and Free usage tiers, though the company said it would also soon be made available for Enterprise, Edu, and through its application programming interface (API). Unlike the previous generative AI image model available in ChatGPT -- OpenAI's DALL-E 3, a classic diffusion transformer model that was trained to reconstruct images from text prompts by removing noise from pixels -- this new image generator is part of the same model that spits out text and code, as OpenAI trained the entire model to understand all these forms of media at once. OpenAI president Greg Brockman had long ago previewed this native capability of GPT-4o back in May 2024, but for reasons that still remain unknown publicly, the company held onto it until now -- following the public release of what many AI power users saw as a similar feature from Google AI Studio with its Gemini 2 Flash Experimental model. This has resulted in a much higher quality image generator that produces far more lifelike images and accurate text baked in, and it's already impressing users -- one of whom calls the quality "insane." By the same token (pun intended), OpenAI still hasn't said precisely what data GPT-4o's image generation capabilities were trained on -- and given the history of the company and other model providers, it likely includes many artworks scraped from the web, some of which are presumably copyrighted, which is likely to anger the artists behind them. 
Bringing Image Generation to ChatGPT and Sora
OpenAI has long aimed to make image generation a core capability of its AI models. With GPT-4o, users can now generate images directly in ChatGPT, refining them through conversation and adjusting details on the fly. The model also integrates into Sora, OpenAI's video-generation platform, further expanding multimodal capabilities. In an announcement on X, OpenAI confirmed that GPT-4o's image generation is designed to:
Users can describe an image in ChatGPT, specifying details such as aspect ratio, color schemes (hex codes), or transparency, and GPT-4o will generate it within a minute. As independent AI consultant Allie K. Miller wrote on X, it's a "Huge leap in text generation," and is "the best" AI image generation model she's seen.
Key capabilities and use cases
GPT-4o is designed to make image generation not just visually stunning but also practical. Some of the key applications include:
How GPT-4o improves generative images over DALL-E
According to OpenAI's official thread on X, GPT-4o introduces several improvements over previous models:
Limitations
Despite its advancements, GPT-4o still has some known challenges: OpenAI is actively addressing these issues through ongoing model refinements.
Safety and labeling measures
As part of OpenAI's commitment to responsible AI development, all GPT-4o-generated images include C2PA metadata, allowing users to verify their AI origin. Moreover, OpenAI has built an internal search tool to help detect AI-generated images. Strict safeguards are in place to block harmful content and prevent misuse, such as prohibiting explicit, deceptive, or harmful imagery. OpenAI also ensures that images featuring real people are subject to heightened restrictions. 
OpenAI CEO Sam Altman described the release as a "new high-water mark for creative freedom", emphasizing that users will be able to create a wide range of visuals, with OpenAI observing and refining its approach based on real-world usage. As AI-generated images become more precise and accessible, GPT-4o represents a significant step forward in making text-to-image generation a mainstream tool for communication, creativity, and productivity.
[19]
ChatGPT's New AI Image Generator Looks Scarily Good
OpenAI has launched a new AI image generator that is a technological step forward, and some of the examples the company shared achieve a frightening degree of verisimilitude. Called "Images in ChatGPT", the feature differs from DALL-E -- OpenAI's previous image generator which seems like it's being retired -- because the images come from within ChatGPT-4o. Describing the model as a "step change", research lead Gabriel Goh tells The Verge that GPT-4o is "omnimodal" -- a model that can generate any kind of data like text, image, audio, and video. This new type of model is indicative of a wider change in the AI industry where systems combine all types of data. Yesterday, PetaPixel reported on Google's "Project Astra" which can see the world around it via a smartphone camera and answer questions. In a blog post revealing Images in ChatGPT, OpenAI shared some impressive examples. The pictures of an "OpenAI researcher" working on a whiteboard in a room "overlooking the Bay Bridge" with the photographer's reflection are scarily good. OpenAI also shared other examples which showcase the model's ability to generate photorealistic images. Images in ChatGPT doesn't have a visual watermark the way DALL-E did. However, ChatGPT multimodal product lead Jackie Shannon tells The Verge that "all of our generated images will include standard C2PA metadata to mark the image as having been created by OpenAI." The new version of ChatGPT started rolling out yesterday (Tuesday) and will be available to people using the free and paid versions of the chatbot.
[20]
'Our GPUs are melting' - OpenAI puts limits on image creation and delays rollout to free accounts
Amid the growing controversy over its AI mimicking the artistic style of Studio Ghibli, OpenAI is being forced to limit how many images ChatGPT can produce on the free tier to 3 a day because it's proving too popular. In a recent tweet on X, Sam Altman, CEO of OpenAI, said "It's super fun seeing people love images in ChatGPT, but our GPUs are melting. We are going to temporarily introduce some rate limits while we work on making it more efficient. Hopefully won't be long! ChatGPT free tier will get 3 generations per day soon." ChatGPT's new image generation capabilities are clearly a step up in the development of AI image generation, proving superior in our tests to DALL-E 3, which is the model previously used by ChatGPT, and also what it will still default to once you've run out of generations in the new model. In his X-thread Sam Altman also goes on to say that "(also, we are refusing some generations that should be allowed; we are fixing these as fast we can.)" This could explain the frustrations I've been experiencing getting ChatGPT to produce text in images. It's quite possible that you don't have access to ChatGPT's image creation tools quite yet anyway. While ChatGPT Plus and Pro users all seem to have access, not all free-tier users do. On March 26 Altman tweeted that rollout to the free tier was going to be delayed: "Images in ChatGPT are way more popular than we expected (and we had pretty high expectations). Rollout to our free tier is, unfortunately, going to be delayed for a while." As we've said in our testing, even on the Plus tier, ChatGPT is already very slow when it comes to generating images, and when the rollout to the free tier is complete we would expect it to be even slower. The move to limit the free tier to 3 images on a temporary basis, while understandable, will inevitably lead to people feeling frustrated with the company. Have you been able to sample ChatGPT's new image creation abilities yet? 
Let us know what you think in the comments below.
[21]
OpenAI's latest model creates life like images and readable text, try it free
OpenAI has introduced its 4o model into ChatGPT to enable native image generation within the chatbot atmosphere. This upgrade makes it so you don't have to use OpenAI's Dall-E image generation model as a separate entity, though Dall-E remains available for those as a preference. The AI brand has also enabled its Sora AI video generator within ChatGPT. The new features are currently available for ChatGPT free users, as well as for ChatGPT Plus, Team, and Pro users. Availability will be coming to enterprise and education users next week. Previously, Dall-E 3 was the image generation plug-in for paid ChatGPT subscribers. Meanwhile, those who wanted to try the generator for free could do so through the basic tier of Microsoft Copilot. The model has been lauded as one of the top image generators available, particularly in its paid version. Despite the benefit of all ChatGPT users being able to use image generation natively with the 4o model, those using the free tier of ChatGPT should be prepared to run into some limitations, such as maximums for file uploads and data analysis, CNET noted. Even so, ChatGPT will benefit from having more realistic images with more legible text after OpenAI spent a year having GPT-4o go through a post-launch training effort called "reinforcement learning from human feedback" (RLHF), according to the Wall Street Journal. After announcing GPT-4o in May 2024, OpenAI had a team of over 100 "human trainers" scouring the model for typos, as well as common errors in hands and faces, the project's lead researcher, Gabriel Goh told the publication. The GPT-4o model will also bring to ChatGPT the ability to create transparent backgrounds. This should be a major benefit for business users and creatives, as it will allow them to create logos or other iconography, ChatGPT multimodal product lead, Jackie Shannon also noted to WSJ. 
Despite the improvements that OpenAI has made, the updated GPT-4o model as a whole still has its shortcomings. It still has a propensity toward hallucinations, which is a common AI feature that has yet to be resolved. Maintaining editing consistency remains a challenge within the ChatGPT atmosphere; however, OpenAI has promised rapid updates, as early as next week. Another ongoing issue for OpenAI is the matter of ethics and legality. The brand insists its model was trained on "publicly available data," and through proprietary data it owns via partnerships with brands including Shutterstock, WSJ noted. Images generated through ChatGPT based on the 4o model won't have AI watermarks. However, the brand has indicated images will include C2PA metadata denoting them as AI-generated. This remains the industry standard.
[22]
OpenAI's New Image Generator Can Do Near-Perfect Text
Until now, the chatbot used the company's separate DALL-E model to dream up pictures. With this latest update, users will be able to access a new feature dubbed "Images in ChatGPT," leveraging OpenAI's flagship GPT-4o model, which has underpinned the chatbot for nearly a year. The upgrade is also available in Sora, OpenAI's video generation tool. "This model is a step change above previous models," research lead Gabriel Goh told The Verge. The most noticeable change is how the model handles text, something that it and its competitors have long struggled with. Words tended to come out looking like gobbledygook, and the text that was legible looked sloppy, filled with formatting errors and misspellings. Not anymore, according to OpenAI. One example shared by the company shows an employee writing out the pros and cons of the ChatGPT image update on a whiteboard, following to the letter what was specified in the prompt; ditto for a four-panel comic strip about a snail -- all with cleanly rendered text. "This was just like a process of iteration that took many, many months to get right," Goh told The Verge. "It's been just many months of small improvements." The model still struggles with very small lettering, but overall, the text quality is consistently usable, Goh said. Unlike image generators like DALL-E, which use a diffusion model, GPT-4o uses an autoregressive approach that produces images from left to right and top to bottom, per The Verge, similar to how text -- at least in English -- is written. Beyond improved penmanship, OpenAI says the model will now follow instructions better, as a common issue with older iterations was that they'd ignore certain details in lengthier prompts. It's also been fine-tuned to be able to generate more photorealistic images. There are caveats. For one, it'll take longer to generate the outputs. And like all generative models, it's still prone to making up information, or hallucinating. 
It also struggles with generating non-Latin scripts, hallucinating characters when trying to write out languages like Korean. With greater capabilities come greater safety and misinformation concerns. To this end, OpenAI stressed that it has particularly "robust safeguards" in place around nudity, violence, and depictions of real people. Moreover, all images that the AI model generates will be embedded with C2PA metadata identifying that it was made with GPT-4o. But this hidden watermark of sorts can easily be stripped -- in fact, many social media platforms automatically remove an image's metadata once it's uploaded. "Ultimately, no system is perfect for this type of thing, but we're continuously improving our safeguards and we think of this as a starting point," ChatGPT multimodal product lead Jackie Shannon told The Verge. For now, GPT-4o image generation is only available to subscribers of OpenAI's ludicrous $200 per month Pro subscription tier, with plans to roll out the feature to Plus and free users in the near future.
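The point that provenance metadata "can easily be stripped" can be illustrated with a small, self-contained Python sketch. It builds a one-pixel PNG carrying a plain `tEXt` chunk as a stand-in for provenance data, then rewrites the file keeping only the critical chunks. The stand-in is an assumption made for illustration: real C2PA manifests use their own signed container format, which this sketch does not model, and the helper names below are hypothetical.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def make_chunk(chunk_type: bytes, data: bytes) -> bytes:
    """Serialize one PNG chunk: length, type, data, CRC over type + data."""
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)


def build_png_with_metadata() -> bytes:
    """Build a 1x1 grayscale PNG with a text chunk standing in for provenance data."""
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)  # 1x1, 8-bit grayscale
    idat = zlib.compress(b"\x00\x7f")  # filter byte 0, then one gray pixel
    text = b"provenance\x00made-by-hypothetical-generator"  # illustrative label only
    return (
        PNG_SIGNATURE
        + make_chunk(b"IHDR", ihdr)
        + make_chunk(b"tEXt", text)
        + make_chunk(b"IDAT", idat)
        + make_chunk(b"IEND", b"")
    )


def iter_chunks(png: bytes):
    """Yield (type, data) for each chunk after the signature."""
    pos = len(PNG_SIGNATURE)
    while pos < len(png):
        (length,) = struct.unpack(">I", png[pos:pos + 4])
        chunk_type = png[pos + 4:pos + 8]
        yield chunk_type, png[pos + 8:pos + 8 + length]
        pos += 12 + length  # 4 length + 4 type + data + 4 CRC


def strip_ancillary_chunks(png: bytes) -> bytes:
    """Keep only critical chunks (type starts with an uppercase letter)."""
    kept = [make_chunk(t, d) for t, d in iter_chunks(png) if t[:1].isupper()]
    return PNG_SIGNATURE + b"".join(kept)


if __name__ == "__main__":
    original = build_png_with_metadata()
    stripped = strip_ancillary_chunks(original)
    print([t for t, _ in iter_chunks(original)])
    print([t for t, _ in iter_chunks(stripped)])
```

The pixel data is untouched by the rewrite; only the side-channel label disappears, which is why metadata-based labeling alone cannot guarantee that an image's origin travels with it.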
[23]
OpenAI halts free GPT-4o image generation after Studio Ghibli viral trend
After only one day, OpenAI has put a halt on the free version of its in-app image generator, powered by the GPT-4o reasoning model. The update is intended to improve realism in images and text in AI-generated context; however, users have already created a runaway trend that has caused the AI company to rethink its rollout strategy. Not long after the update became available on ChatGPT, users began sharing images they had fashioned to social media platforms in the style of Studio Ghibli, the popular Japanese animation studio. Creations ranged from Studio Ghibli-based personal family photos to iconic scenes from the 2024 Paris Olympics, scenes from movies including "The Godfather" and "Star Wars", and internet memes including distracted boyfriend and disaster girl. While OpenAI CEO, Sam Altman, also joined in on the fun on Wednesday by changing his X profile picture to a Studio Ghibli style image of himself, which pundits assume was fashioned using GPT-4o, the executive also followed up with a prompt response, stating on the social media platform that the image generator update would be delayed for the ChatGPT free tier for the time being. He didn't give any indication of whether or when the feature would return. Currently, paid tiers of ChatGPT, including Plus, Pro, and Team, still have access to the update. The Verge noted that the GPT-4o reasoning model uses a method called "autoregressive approach" to generate images, which develops the image from left to right and top to bottom, instead of all at once. This aids accuracy and realism when generating images. AI text on older models is known for being either blurry or gibberish, while GPT-4o text can generate sharp and coherent text. OpenAI also told the Wall Street Journal that it put the reasoning model through human-directed post-launch training to find common errors in text and image designs. Ethics and legality is a long-standing issue when it comes to AI. 
While the brand attempted to cover its bases, divulging to the WSJ the sources of its data for training, it was impossible to know how the public will respond to the product once it's available. Neal & McDevitt intellectual property lawyer Evan Brown told TechCrunch that ChatGPT users utilizing the GPT-4o-powered product to generate Studio Ghibli-style photos technically is not breaking the law because "style is not explicitly protected by copyright." However, it does lie in a legally gray area. An OpenAI spokesperson also told the publication in a statement that ChatGPT can emulate a studio's style, but not the style of an individual artist. The AI company does seem to be aiming to cover its bases by removing the GPT-4o update from its ChatGPT free tier until it figures out its next move. There have been many AI trends based on popular designs. One older trend from late 2023 allowed users to generate Funko Pop! figures in Microsoft Designer with a few prompts. While the main premise of this trend was to make figures of yourself, many people made Funko Pop! figures of notable people, such as Elon Musk, and Beyonce. Similarly, this trend brought to light the legal restraints that can arise while trying to have fun with AI online. You had to input certain nickname workarounds, such as "Elon X" and "Queen Bey," with proper descriptions of the characters to generate your desired results.
[24]
ChatGPT delays roll out of image generator for free users because it's too 'popular'
OpenAI is delaying the rollout of ChatGPT's latest image generation features, Images in ChatGPT, for users on its free subscription tier, CEO Sam Altman said in an X post on Wednesday. The reason why? It's too "popular." "images in chatgpt are wayyyy more popular than we expected (and we had pretty high expectations). rollout to our free tier is unfortunately going to be delayed for awhile," Altman wrote in an X post on Wednesday. OpenAI announced on Tuesday that it will be integrating its latest image-generation technology into its GPT-4o model chatbots, making them a one-stop shop for artificially generated content. At the time of the release, OpenAI said one of the biggest advancements in the chatbot's image ability is in the quality of its text rendering abilities. The company claims its systems can now generate images with text that is meaningful and readable -- not warped and typo-ridden. OpenAI also boasted in the press release that the image generator is now better at following instructions, especially when users upload images as visual inspirations. This latter feature ended up being the real star of the show for users. ChatGPT had a viral moment on Wednesday as people started sharing pictures of themselves in the distinct style of Japanese animation company Studio Ghibli. Co-created by famous Japanese filmmaker Hayao Miyazaki, Studio Ghibli is behind famous animation films like Spirited Away and My Neighbor Totoro. While he hasn't commented on the recent trend, Miyazaki has in the past spoken out against AI-generated animation. In a documentary feature published in 2016, Miyazaki said he was "utterly disgusted" and called the technology "an insult to life itself." And when told that the goal of AI animation is to "draw pictures like humans do," Miyazaki said that he felt "like we are nearing to the end of the times. We humans are losing faith in ourselves." 
Even Sam Altman himself jumped on the viral trend, changing his X profile picture to a ChatGPT-generated photo of himself in the style of Ghibli, as social media users flocked to ChatGPT to create their own versions. While the feature is no longer available for free users, at least for now, it is still accessible for ChatGPT's Plus, Pro, and Team subscription tiers. The company said the feature will also roll out soon for enterprise and educational users, as well as for developers using the API.
[25]
OpenAI Launches GPT-4o's New Image Generation Into ChatGPT, Showing 'Unbelievably Better' Results - Decrypt
OpenAI is integrating image generation capabilities directly into ChatGPT, allowing users to create images without leaving the chat interface. The company announced the feature Tuesday as part of its broader push to make AI tools more useful and accessible across different media, staying relevant in the AI art scene. The feature is an evolution of DALL·E 3, OpenAI's image generator, which launched in September 2023 but fell out of favor among AI enthusiasts who preferred the next generation of models, including Flux, MidJourney v6, SD 3.5, Recraft, and Reve. Before this release, OpenAI offered two different models on the same platform, with GPT generating text and DALL·E 3 handling image generation. Now, GPT-4o will do everything on its own, and DALL·E 3 will disappear. "GPT‑4o image generation excels at accurately rendering text, precisely following prompts, and leveraging 4o's inherent knowledge base and chat context -- including transforming uploaded images or using them as visual inspiration," OpenAI claimed in an official blog post. The integration of DALL·E 3 continues to make good on the company's plan to make GPT-4o an "omni" model, trained with multimodal data and capable of handling all tasks. The result is a model that is much more capable, accurate and intelligent than its predecessors. "We know we've made you wait, but we think it's really worth it, and we think you're going to love it," Sam Altman, OpenAI's CEO, said in a video showing GPT-4o's new capabilities. "It's such a huge step forward that the best way to explain it to you is just to show it." In the video, the company showed off the system's capabilities with several examples, including manga pages explaining the theory of relativity -- with inputs in English and Mandarin -- custom trading cards based on personal and real photos, commemorative coins combining multiple images with transparent backgrounds, and a very accurate image based on an extraordinarily long and detailed prompt.
The model is slow at generating images, but it seems to be highly accurate. Altman pointed to the significant quality upgrade as worth the longer waiting time. "Images are much slower than our previous image generation (model), but unbelievably better. We think it's super worth the wait," Altman said during the demonstration. "We also will be able to make it faster over time." The rollout appears to be happening gradually, and we weren't able to get our hands on the new model as of press time. Users can tell which system they're using based on how images appear: Besides the apparent quality gap, DALL·E 3 images pop up fully formed after a loading screen, while the new GPT-4o renders images progressively from top to bottom in real time. The company emphasized that the technology extends beyond creating fancy images. "What's really exciting about this release is that now these models can actually visualize what they know and externalize it in a visual way," explained a research scientist at OpenAI, invited by Sam Altman to talk about this new feature. This capability allows for educational applications like detailed scientific diagrams or informational posters with accurately rendered text and even image editing with subject consistency. OpenAI has also implemented guardrails to prevent the generation of deepfakes, illegal content, and the removal of watermarks. While the generated images won't have visible watermarks, they will include C2PA metadata to identify them as AI-created. The company is also developing tools to track image provenance. The company plans to bring the feature to its API, allowing developers to integrate the technology into their own applications. OpenAI's Terms of Use also say that users will retain ownership of images they generate, subject to the company's usage policies.
[26]
ChatGPT Just Got a Huge Image-Generation Upgrade
OpenAI has significantly leveled up the image generating capabilities of ChatGPT, adding the update as part of the GPT-4o model introduced last May. The new and improved AI generator is rolling out now for all ChatGPT users, across paid plans and the free tier (though free users are more restricted in how much they can use it). It's been possible to generate images through the ChatGPT interface for a while now, though behind the scenes the work was farmed out to the DALL-E 3 image model. Now, everything will be handled by GPT-4o, for a more consistent and native experience. There are lots of improvements here, which cover some of the areas that AI image creator tools have typically struggled with: rendering text, keeping characters consistent across pictures, and drawing diagrams. OpenAI says you can now expect more "precise, accurate, [and] photorealistic" results from your prompts. Images made with AI often come with an artificial sheen that tells you they've been dreamt up by algorithms, and that should be less obvious with GPT-4o images. One of the demo pictures shown off by OpenAI has a woman writing on a whiteboard, with a view reflected in it -- all pretty life-like, though note the small caption at the bottom that tells you this was the best of eight attempts ChatGPT had at the prompt. The AI art users create should also stick more closely to the prompts given, OpenAI says. So, if you want specific objects in specific places, or you need people in certain positions, then these instructions will apparently be carried out more faithfully. One of the more impressive example images shows a four-panel comic strip rendered by ChatGPT, without any obvious errors or inconsistencies. I tried to get ChatGPT to turn an Austen novel into a comic strip, and produce a photorealistic image of a stately home with a garden, and the results were impressive -- if not quite perfect. 
They're certainly significantly better than the images ChatGPT was previously producing, although the rendering takes longer to complete (typically minutes rather than seconds). Trying to get AI to render text and diagrams accurately has long been a challenge: The way these tools are built means they're much better at inventing and remixing the images they've been trained on, rather than reproducing an exact copy of the alphabet or a series of rectangles and arrows. The new GPT-4o model can render text and diagrams to a high level of detail and accuracy, so you shouldn't see as many strange mistakes and inconsistencies. OpenAI's showreel included a menu, an invitation, a boarding pass, and a diagram explaining Newton's prism experiment, all generated from a single text prompt. When I asked ChatGPT to produce an infographic explaining DNA in simple terms, and a book cover with a specified title and author, it followed the brief pretty exactly -- the graphic was basic but accurate (as per the prompt), and the book cover looked like something you might see in a store. Just as importantly, there were no weird artifacts or inconsistencies in the images. I've written before about the limitations of ChatGPT image editing, and this is another area that's been upgraded. It's now easier to keep characters and scenes consistent between images, to only tweak parts of a picture and leave the rest untouched, and to build up different layers of an image. You can even create transparent backgrounds, if needed, or specify colors using hex codes. Other improvements come in the way ChatGPT can accept and remix your own images, and incorporate other information (from the web and its training data): So one of the demo OpenAI pictures was built from the prompt "make a visual infographic describing why SF is so foggy" and ChatGPT did just that (well, best of three). 
In my own tests, I found ChatGPT much better at editing images, and pretty competent at remixing pictures in different styles. It still struggles to some extent keeping consistency between images -- especially with complex objects and characters. It's definitely better than it was at this, but there's still a tendency to overdo the edits, making the AI less useful for tweaking images or making a series of several images that need to match. As with any generative AI announcement, issues around copyright, misuse, and energy demands are once again brought to mind. OpenAI is on record as saying it's impossible to build these tools without training on copyrighted images, though it has recently started signing content deals with providers such as Shutterstock. Brad Lightcap, OpenAI's chief operating officer, told the Wall Street Journal that the GPT-4o image generator will reject requests to mimic the work of any living artist. When it comes to safety, OpenAI says generated images all come with C2PA metadata to identify them as AI-generated -- though this metadata can be easily removed with something as simple as a screenshot. The AI generator is also built to rebuff any attempts to create "child sexual abuse materials and sexual deepfakes" OpenAI says, as well as other prompts that violate its content policies. This is clearly a major step forward for AI images: The upgraded technology is genuinely jaw-dropping at times, and a lot of the tell-tale signs of AI and the errors made by the tech are vanishing. It does raise some big questions about the future we're all barreling towards though, one where fakes are so easily made, where creative work is done by robots rather than people -- and where we collectively lose our ability to sketch a picture, craft a sentence, or write a line of code. And then how will generative AI find more training data?
[27]
OpenAI upgrades ChatGPT's image generation capabilities - SiliconANGLE
OpenAI today rolled out what it describes as a major upgrade to ChatGPT's built-in image generation tool. Until now, the feature was powered by an algorithm called DALL-E 3 that debuted in 2023. It's the third iteration of a text-to-image model that first debuted two years earlier. The model's original version was a modified edition of GPT-3 adapted to rendering tasks. As part of today's update, OpenAI is switching ChatGPT's image generation tool from DALL-E to GPT-4o. The latter algorithm is a multimodal large language model that launched last May. OpenAI says that the upgrade will significantly enhance ChatGPT's graphic design skills. The chatbot's image generator can now take on more complex tasks than before. In one internal test, OpenAI asked ChatGPT to visualize an early physics experiment carried out by Isaac Newton. In response, the chatbot generated a detailed illustration complete with explanatory text. ChatGPT can customize the images it generates based on user instructions. After creating the illustration of Newton's experiment, OpenAI engineers asked the chatbot to overlay the drawing on a notebook. The chatbot successfully completed the task, which involved both changing the angle of the illustration and adding a complex background. According to OpenAI, competing AI image generators struggle with prompts that ask them to draw more than a handful of objects. The company says that GPT-4o can accurately draw up to 20 different items specified by the user. That includes text, which the model generates more reliably than DALL-E 3. Users can optionally supply ChatGPT with reference images. An interface designer, for example, could upload a dropdown menu template and ask the chatbot to make improvements. Another selling point of ChatGPT's upgraded image generator is that it can create objects with transparent backgrounds. A transparent background makes it easier to combine visual assets with one another.
That simplifies tasks such as integrating a newly created logo into an existing application interface. According to the Wall Street Journal, OpenAI trained GPT-4o using publicly available data and assets licensed from partners such as Shutterstock Inc. "We trained our models on the joint distribution of online images and text, learning not just how images relate to language, but how they relate to each other," OpenAI staffers wrote in a blog post. After the initial training phase, the company used a method called reinforcement learning from human feedback (RLHF) to further refine ChatGPT's output quality. It's a variation of reinforcement learning, an industry-standard approach to developing AI models. In reinforcement learning projects, an AI model's training process is coordinated by a second neural network. RLHF, the machine learning method OpenAI used to build GPT-4o, enhances that second neural network using feedback from human experts. The improvements the experts make help increase the quality of the AI being trained.
[28]
Users In Awe of OpenAI's GPT-4o Native Image Generation Feature
'Creating and customising images is as simple as chatting using GPT‑4o'. OpenAI, the company behind the GPT family of AI models, unveiled native image generation capabilities in GPT-4o on Tuesday. This makes it possible for GPT-4o to generate images of varied natures, like infographics, comic strips, signboards, graphics, menus, memes, street signs, and more. It is also possible to refine and edit generated images with follow-up prompts. OpenAI has introduced native image generation features for users with Plus, Pro, Team, and Free plans. Access to Enterprise and Edu plans will be available shortly. Access to the API will be rolled out in the next few weeks. Native image generation indicates that GPT-4o can generate images using its inherent knowledge, meaning it doesn't have to rely on any external diffusion models, such as the company's very own DALL-E. OpenAI also mentioned that users can continue to use DALL-E as usual. "Creating and customising images is as simple as chatting using GPT‑4o - just describe what you need, including any specifics like aspect ratio, exact colors using hex codes, or a transparent background," said the company. In no time, users were blown away by its capabilities. Tobias Lutke, CEO of Shopify, shared in a post on X how the model could describe the anatomy of an unknown animal on his son's t-shirt. After he saw the results, he remarked, "How is this even real?". The model can also generate text without any distortions or errors, and it can generate user interfaces based on details in a prompt without any reference images. Users have also been experimenting with style transformations on existing photos. Grant Slatton, a founding engineer at Row Zero, showcased an example of how GPT-4o could convert a regular photo into a 'Studio Ghibli'-style anime image. His post quickly gained traction, inspiring many others to share their own AI-generated creations.
In another instance, users could reproduce advertisement images, including the copy material. A user on X shared an ad image as a reference and asked GPT-4o to recreate it for their app. He also requested that the app screenshot in the original ad be replaced with a screenshot of their app. "Within minutes, it had almost perfectly replicated it," he said. People are also amazed by the model's ability to generate photorealistic images. OpenAI's announcement comes a few days after Google introduced native image generation in the Gemini 2.0 Flash AI model. Initially introduced to trusted testers in December, this feature is now accessible across all regions supported by Google AI Studio. "Developers can now test this new capability using an experimental version of Gemini 2.0 Flash (gemini-2.0-flash-exp) in Google AI Studio and via the Gemini API," Google said.
[29]
'Our GPUs are melting' -- OpenAI puts limiter in after Ghibli-tsunami
OpenAI CEO Sam Altman says while it's been "super fun" seeing everyone's image creations using ChatGPT, some limits need to be put in place. ChatGPT creator OpenAI has introduced rate limits after a viral social media trend that saw nearly everything "Ghiblifyied" -- turned into AI art in the style of the famous Japanese animation studio. OpenAI CEO Sam Altman was one of the first to take part in the trend, posting a portrait of himself generated by the model on March 25, but he said in a subsequent post two days later that all the image requests have started to tax the firm's infrastructure. "It's super fun seeing people love images in ChatGPT but our GPUs are melting. We are going to temporarily introduce some rate limits while we work on making it more efficient," he said. "Also, we are refusing some generations that should be allowed; we are fixing these as fast we can," he added. OpenAI launched the upgraded image generation offering in ChatGPT-4o on March 25, resulting in users splashing images across social media in the art style of Studio Ghibli -- known for its anime films Spirited Away and My Neighbor Totoro. Altman didn't give a definitive timeline on how long the rate limits would last but said, "Hopefully, it won't be long! ChatGPT free tier will get three generations per day soon." Rate limits are generally applied to help OpenAI manage the aggregate load on its infrastructure, according to OpenAI. "If requests to the API increase dramatically, it could tax the servers and cause performance issues. By setting rate limits, OpenAI can help maintain a smooth and consistent experience for all users," OpenAI says on its rate limit explanation page. Meanwhile, Bloomberg reported on March 26 that OpenAI expects to more than triple its revenue this year to $12.7 billion, citing a person familiar with the matter. Altman said on Feb. 12 his firm wants to ship GPT-4.5 and GPT-5 in the coming weeks or months.
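The rate limiting described above can be illustrated with a small sketch. This is a fixed-window daily quota in Python, assuming a per-user allowance of three image generations per day (the figure Altman mentions for the free tier). The class name `DailyQuota`, its fields, and the injected `now` timestamp are hypothetical, and nothing here reflects how OpenAI actually enforces its limits.

```python
from dataclasses import dataclass, field


@dataclass
class DailyQuota:
    """Fixed-window limiter: at most `limit` requests per user per window."""

    limit: int = 3                      # assumed free-tier allowance per day
    window_seconds: int = 24 * 60 * 60  # length of one window (one day)
    # Maps user id -> (window index, requests counted in that window).
    _usage: dict = field(default_factory=dict)

    def allow(self, user: str, now: float) -> bool:
        """Return True and count the request if `user` is under the limit."""
        window = int(now // self.window_seconds)
        seen_window, count = self._usage.get(user, (window, 0))
        if seen_window != window:
            count = 0  # a new window has started, so the count resets
        if count >= self.limit:
            return False
        self._usage[user] = (window, count + 1)
        return True


if __name__ == "__main__":
    quota = DailyQuota()
    # Four requests in the same day: the fourth is refused.
    print([quota.allow("alice", now=0.0) for _ in range(4)])
    # The next day the allowance is available again.
    print(quota.allow("alice", now=86400.0))
```

Passing the timestamp in explicitly keeps the sketch deterministic; a real service would read a clock and would likely keep its counters in shared storage rather than in process memory.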
[30]
ChatGPT's Latest Image Generation Update Is Wild -- But Some Users Have to Wait
OpenAI's latest ChatGPT update is one of its best, bringing a massive upgrade to GPT-4o image generation. Now, instead of switching to DALL-E 3, you can generate extremely high-quality images with GPT-4o without starting a new chat. However, there's a catch for ChatGPT's free users -- at least, there is for now.
ChatGPT's Image Generation Receives an Enormous Upgrade
OpenAI's GPT-4o model is the go-to for most default conversations on ChatGPT. It works for a huge range of questions, with OpenAI labelling the model "Great for most questions." And it is; it hits a wide range of information with speed and accuracy -- but only text. Following its March 25, 2025, update, GPT-4o can generate high-quality images with amazing detail, exceeding the quality of the model it basically replaces (DALL-E 3). Furthermore, GPT-4o can now perform editing and image adjustments far beyond the capabilities of DALL-E 3. Image editing, color and style adjustments, subtle changes, foreground and background objects, and more are all possible. Text generation is also startlingly improved. DALL-E 3's text generation was never a strong point (though it's not a strong point for most AI models). Now, GPT-4o can create detailed infographics with detailed explanations on nearly any topic. Text placement within images is now precise, using specific text styles and direction without the risk of creating blurry AI-slop. Detail retention is also another clear improvement, with specific features now easily transferable from one image to another without ChatGPT completely altering the scene or generating a completely new image. Part of the upgraded image quality comes from OpenAI's partnerships with companies like Shutterstock and a group of trainers that spent long hours specifying mistakes GPT-4o made during image generation.
Talking to the Wall Street Journal, lead researcher Gabriel Goh explained that this reinforcement learning technique enables GPT-4o to accurately follow human instructions over a vast range of image generation requests but that it only required "a little more than 100 human workers" to complete the process. ChatGPT's new image generation really is amazing, and I'd advise anyone to try it on any topic you desire. Now, it's not perfect; no AI image generation tool is. But it's a remarkable jump in quality from an AI image generation model like DALL-E 3, which was useful but not amazing, to the quality GPT-4o is now capable of.
ChatGPT's Amazing Image Update Is Rolling Out to (Nearly) Everyone
Unfortunately, due to the unprecedented demand on GPT-4o, ChatGPT's free users are once again forced to wait for this new feature. On March 26, 2025, OpenAI CEO Sam Altman confirmed the delay without adding a specific period for its end. A surge in interest was expected, but not one quite this large. Unfortunately, that means you'll have to wait a little longer or upgrade to ChatGPT Plus for 20 bucks per month. In fairness, with upgrades like this, upgrading to ChatGPT Plus becomes worth the money -- more so than the launch of OpenAI's previous model upgrade, GPT-4.5. It's also worth noting that as part of the GPT-4o upgrade, DALL-E 3 becomes "OpenAI's legacy image generation model." This means OpenAI will eventually remove DALL-E 3 from ChatGPT, though there is currently no specified timeline for this change.
[31]
ChatGPT Can Now Edit Photos Better and Create Charts for Work
OpenAI is trying to incorporate more advanced image features
OpenAI is making it easier to edit images in ChatGPT and create visuals for work that include lengthy, legible text, potentially broadening the chatbot's appeal for businesses and everyday users. During a livestreamed event on Tuesday, the San Francisco-based company showed how ChatGPT users will be able to refine images by having a series of conversations with the chatbot. A user might, for example, request an image of a snail in a city, then ask to change part of the backdrop and add a hat. ChatGPT will also be able to better create images with coherent text, according to the company, making it more adept at spitting out diagrams, infographics and logos for professional uses. A person might prompt the chatbot to generate a photorealistic image of a custom menu, for instance, or a map. It will also be able to carry out more complicated instructions from users regarding the composition of an image, the company said. OpenAI has increasingly tried to position ChatGPT as a kind of everything app that offers a search engine, voice assistant and video generator. With the latest improvements, OpenAI is trying to incorporate more advanced image features that could be applicable in a mix of personal and professional uses. It also may help ChatGPT keep ahead of chatbots from rivals like Elon Musk's xAI, which have also added image generators. As with other AI uses, however, ChatGPT may make things up when generating images, such as including text with fake country names on a picture. In a blog post, the company said these errors can happen with less detailed prompts from users. OpenAI also said its AI can have a hard time creating small-sized text and text in non-Latin alphabets, among other issues. It can also take up to a minute to produce images with the updated feature, according to a company blog post.
During the livestream, OpenAI Chief Executive Officer Sam Altman said it takes longer because the images are more detailed. The new capabilities will be available Tuesday via OpenAI's GPT-4o model, which is open to free and paid users. The company said it will roll out the new features over the next few weeks to software developers who use its application programming interface.
[32]
OpenAI Adds Image Generation to GPT-4o, But Free Tier Will Have to Wait
OpenAI added image generation capability to its existing GPT-4o artificial intelligence (AI) model on Tuesday. The San Francisco-based AI firm released the 4o Image Generation model and integrated it into GPT-4o. The company said that the focus of this image generator is on usefulness instead of decorativeness. It comes with accurate text rendering, high prompt adherence, character consistency, and it offers image editing capability via text prompts. OpenAI has also taken several steps to mitigate the risk of deepfakes and the generation of harmful content. Even before this new addition, ChatGPT could generate images powered by one of the DALL-E models. However, this was a basic image-generation experience where character consistency and text generation were sub-par. In a blog post, the company explained that it now intends to add the image-generation function as a primary capability of language models. This means that the company's large language models (LLMs) will now be able to inherently generate images and make edits to generated outputs. Due to the large parameter size of these models and post-training efforts, these models are well suited to understand the context behind user prompts to provide exactly what they're looking for. Also, since these are language models, they can better process and render text accurately. The new image generator was trained on the joint distribution of online images and text. OpenAI claims that the model understands how images relate to language and how images relate to other images. As a result, it now comes with enhanced character consistency, and users can generate multiple images with the same character without much back-and-forth. Additionally, it can also generate images with a large volume of accurate text. This means it can accurately generate images with signboards, restaurant menus, and text written on a whiteboard.
Users can also share an image as input, and the chatbot can recreate it in different styles and make edits to it. ChatGPT will also offer multi-turn generation with the latest image generator. Users will be able to ask the AI chatbot to make changes and additions to a generated image with prompts, and it can refine the output without changing other elements. OpenAI claimed that the model can handle up to 10-20 different objects in a single image and add these elements accurately. These features are currently available to ChatGPT Plus, Team, and Pro subscribers. While it was initially available to the free tier as well, OpenAI CEO Sam Altman stated in a post on X (formerly known as Twitter) that due to high request volume, rollout to the free tier is being delayed indefinitely. Notably, several users have taken to social media platforms to share Ghibli-styled recreations of their images and popular memes generated using GPT-4o. Altman also changed his profile picture on X to a Ghibli-style rendition of his image. Ghibli was also trending globally on the social platform. Coming to safety, OpenAI is adding Coalition for Content Provenance and Authenticity (C2PA) information into the metadata of all the AI-generated images so that they can easily be distinguished from authentic images. The AI firm has also built an internal search tool that can verify if an image was generated by the company's model. Apart from this, the company blocks requests for images that include harmful content such as child sexual abuse material and sexual deepfakes. Additionally, when users are editing images of real people, the company has added restrictions to the kind of imagery that can be created.
[33]
OpenAI unveils new image generator for ChatGPT
OpenAI has updated ChatGPT with new technology allowing it to generate images from detailed instructions, such as creating a comic strip. The upgraded version, GPT-4o, combines text, image, voice, and video processing. This new feature is available to people using both the free and paid versions of the chatbot. Chatbots were originally designed to chat. But they can generate images, too. On Tuesday, OpenAI beefed up its ChatGPT chatbot with new technology designed to generate images from detailed, complex and unusual instructions. For instance, if you describe a four-panel comic strip, including the characters who appear in each panel and what they are saying to one another, the technology can instantly generate an elaborate cartoon. Though previous versions of ChatGPT could generate images, they could not reliably create images by blending such a wide array of concepts. The new version of ChatGPT is indicative of a wider change in artificial intelligence technology. After beginning as systems that merely generated text, chatbots are morphing into tools that combine chatting with various other abilities. The technology that underpins the new version of ChatGPT -- called GPT-4o -- also allows the chatbot to receive and respond to voice commands, images and videos. It can even speak. Released at the end of 2022, the original ChatGPT learned its skills by analyzing enormous amounts of text from across the internet. It learned to answer questions, write poetry and generate computer code. It could not generate images. But about a year later, OpenAI released a new version of ChatGPT that could generate images called DALL-E. But ChatGPT and DALL-E were separate systems. Now, OpenAI has built a single system that learns a wide range of skills from both text and images. In generating its own images, this system can draw on everything ChatGPT has learned from the internet. "This is a completely new kind of technology under the hood," said Gabriel Goh, an OpenAI researcher. 
"We don't break up image generation and text generation. We want it all to be done together." Traditionally, AI image generators have struggled to create images that were markedly different from any existing image. If you asked an image generator to create an image of bicycle with triangular wheels, for instance, it struggled. Goh said that the new ChatGPT could handle this kind of request. OpenAI said that, beginning Tuesday, this new version of ChatGPT would be available to people using both the free and paid versions of the chatbot. This includes ChatGPT Plus, a $20-a-month service, and ChatGPT Pro, a $200-a-month service that provides access to all of the company's latest tools.
[34]
OpenAI Launches New Powerful 4o Image Generator in ChatGPT and Sora
OpenAI has introduced a new update to its GPT-4o model, incorporating native image generation capabilities directly into ChatGPT and Sora. This enhancement represents a significant advancement in artificial intelligence, seamlessly merging text, images, and other modalities into a unified platform. With this update, you can now create, edit, and interact with images in ways that were previously limited to specialized tools. The result is a platform that unlocks a wide range of creative and practical possibilities, making AI more versatile and accessible than ever before. Imagine being able to bring your ideas to life with just a few words -- no specialized tools, no steep learning curve, just you and your imagination. Whether it's designing a custom graphic for your business, creating a unique meme for your social media, or even drafting a manga page, the possibilities are endless. For many of us, the thought of producing professional-quality visuals has always felt out of reach, reserved for those with technical expertise or expensive software. But what if that barrier no longer existed? OpenAI's latest update to its GPT-4o model is here to change the game, introducing native image generation directly within ChatGPT and Sora. It's a leap forward in AI technology, blending text and visuals seamlessly to make creativity more accessible than ever. With this new feature, you don't need to be a designer or a tech wizard to create something extraordinary. OpenAI's multimodal capabilities allow you to interact with the AI naturally, combining text prompts with images to refine and customize outputs to your exact needs. Whether you're a teacher looking to create engaging visuals for your lessons, a small business owner crafting marketing materials, or simply someone exploring a creative hobby, this tool adapts to you.
And while the possibilities are exciting, what's even more remarkable is how this innovation levels the playing field, putting advanced creative tools into the hands of anyone who wants to use them. These capabilities make it easier to create exactly the image you envision, helping you communicate more effectively through visuals and advancing image generation into a practical tool with precision and power. This unlocks a new class of AI-generated visuals designed for both personal and professional applications, including: At the heart of this update lies the ability to generate images natively within ChatGPT. Unlike earlier AI tools that required external platforms for visual creation, this feature enables you to produce high-quality, contextually accurate visuals directly within the chat interface. Whether you're designing professional graphics, crafting memes, or rendering intricate manga pages, the system delivers outputs tailored to your specific needs. By integrating this functionality into a single platform, OpenAI has eliminated the need to switch between tools, significantly streamlining workflows for both casual users and professionals. This innovation not only saves time but also enhances productivity, making it easier to bring your ideas to life. GPT-4's multimodal capabilities extend far beyond text and images, incorporating audio and other formats to create a seamless user experience. For instance, you can combine text prompts with existing images to refine outputs or generate entirely new content. This flexibility allows you to interact with the AI in a natural and intuitive manner, making sure that your creative vision is accurately realized. By bridging multiple modalities, OpenAI has developed a tool that adapts to a wide range of applications, from educational resources to marketing campaigns. The ability to work across formats makes GPT-4 a versatile solution for users with diverse needs, fostering innovation across industries. 
One of the standout features of this update is the enhanced level of creative control it offers. You can specify styles, adjust intricate details, and iteratively refine outputs to align with your vision. For example, if you're designing a trading card, you can dictate the layout, color scheme, and text placement to ensure the final product meets your expectations. This customization enables users to fully use AI-driven creativity, whether you're an experienced designer or a beginner exploring new possibilities. The ability to fine-tune outputs ensures that the tool caters to both personal projects and professional requirements, making it a valuable resource for a wide audience. The practical applications of GPT-4's image generation capabilities span numerous industries, offering solutions tailored to specific needs. Some key examples include: These examples highlight how GPT-4's capabilities can be adapted to meet the demands of various fields, making it a versatile tool for innovation and efficiency. OpenAI has prioritized accessibility in this update, making sure that advanced image generation tools are available to users of all skill levels. The intuitive interface and responsive outputs make it possible for even those without technical expertise to create high-quality visuals. This widespread access to AI-driven creativity levels the playing field, allowing individuals and small teams to compete with larger organizations in terms of content quality and innovation. By removing barriers to entry, OpenAI has made it easier for anyone to harness the power of AI for creative and practical purposes. Looking ahead, OpenAI is committed to expanding the capabilities of GPT-4's image generation features. Plans include API integration, which will allow businesses and developers to embed these tools into their own platforms and applications. 
This development will further broaden the reach of GPT-4, allowing its use in a variety of customized environments. Additionally, OpenAI is working to improve the system's speed and efficiency, making sure it can meet the demands of an ever-growing user base. These advancements reflect OpenAI's dedication to continuous innovation, paving the way for new possibilities in AI-driven creativity. The introduction of native image generation in GPT-4 marks a pivotal moment in the evolution of artificial intelligence. By uniting text, images, and other modalities into a single, cohesive platform, OpenAI has created a tool that is both powerful and accessible. Whether you're a creative professional, an educator, or a small business owner, this update opens up new opportunities to harness AI for innovation and efficiency. As the technology continues to evolve, its potential applications will expand, shaping a future where AI-driven creativity becomes an integral part of everyday life.
[35]
OpenAI 4o vs MidJourney : AI Image Generation Comparison
Have you ever found yourself wishing for a tool that could bring your creative ideas to life without the hassle of juggling multiple platforms? Whether you're a designer, a marketer, or just someone with a spark of imagination, the struggle to find a seamless, user-friendly solution for creating and editing visuals is all too real. The new OpenAI image generator might just be the answer you've been looking for. By integrating advanced image generation and editing capabilities directly into ChatGPT 4o, OpenAI has created a tool that's not only powerful but also accessible to everyone, even those using the free tier. Imagine being able to craft hyperrealistic images, tweak them to perfection, and align them with your unique vision, all in one place. What sets this tool apart isn't just its innovative technology; it's the way it simplifies the creative process for users of all skill levels. From handling complex text prompts with ease to offering brand-specific customization and even exporting images with transparent backgrounds, OpenAI's 4o model redefines what's possible in AI-powered creativity, whether you're prototyping a logo, designing marketing materials, or crafting a comic strip. AI Advantage explores how this innovation is reshaping the world of image generation and compares it to other AI image generators. A defining feature of OpenAI's new tool is its unmatched accessibility. Unlike many competitors that restrict advanced functionalities to premium users, OpenAI ensures that this technology is available to everyone. Whether you are a free-tier user or a subscriber, you can generate or edit images simply by typing prompts like "create an image of..." or "edit this image." This intuitive approach ensures that even individuals with minimal technical expertise can fully use the platform's capabilities. By providing widespread access to innovative AI tools, OpenAI enables users from all backgrounds to explore their creative potential. 
OpenAI's image generation model offers a comprehensive suite of tools designed to cater to a wide range of creative applications. Its standout features include: These features collectively position OpenAI's model as a versatile and powerful tool for both casual users and professionals seeking to streamline their creative workflows. OpenAI's model not only competes with but often surpasses its rivals in several critical areas. Its strengths and limitations are as follows: By addressing the needs of a broad audience while maintaining high performance, OpenAI's model demonstrates its ability to stand out in a competitive landscape. The versatility of OpenAI's image generation and editing model makes it an invaluable resource across multiple industries. Its ability to handle diverse creative tasks ensures its relevance for professionals and hobbyists alike. Key applications include: These use cases highlight the model's ability to adapt to the needs of marketers, designers, content creators, and other professionals, making it a versatile tool for modern creative workflows. OpenAI's model distinguishes itself through its seamless integration with GPT-4, combining advanced image generation, editing capabilities, and language processing into a single platform. This integration eliminates the need for multiple tools, offering users a streamlined and efficient workflow. Here's how it compares to other platforms: By offering a comprehensive solution that integrates multiple functionalities, OpenAI's model provides a distinct advantage over its competitors, catering to a wide range of creative and professional needs. The true strength of OpenAI's image generation and editing model lies in its seamless integration into ChatGPT. 
By combining advanced image creation, precise editing tools, and GPT-4's language processing capabilities, the platform eliminates the need for switching between multiple applications. This integration not only saves time but also enhances productivity, allowing users to focus on their creative vision without technical interruptions. Whether you are a casual user exploring creative possibilities or a professional streamlining your workflow, OpenAI's model offers a unified solution that simplifies and enhances the creative process.
[36]
OpenAI Adds New Image Generation Capabilities to GPT-4o | PYMNTS.com
"GPT-4o image generation excels at accurately rendering text, precisely following prompts and leveraging 4o's inherent knowledge base and chat context -- including transforming uploaded images or using them as visual inspiration," the company said in a Tuesday (March 25) blog post. The model's newly improved capabilities include text rendering that is better able to blend precise symbols with imagery, multiturn generations that allow users to refine and experiment with images through natural conversation, and instruction following that enables GPT-4o to follow detailed prompts and handle as many as 20 different objects, according to the post. GPT-4o can also learn from user-uploaded images to inform the image generation, link its knowledge of text and images to work more efficiently, and produce images with photorealism and a variety of styles, the post said. All images generated by GPT-4o will include C2PA metadata to identify them as being artificial intelligence (AI) generated, and OpenAI will continue to block requests that violate its content policies, per the post. OpenAI began rolling out GPT-4o image generation to Plus, Pro, Team and Free users Tuesday, and said it will soon add Enterprise and Edu users. "Creating and customizing images is as simple as chatting using GPT-4o -- just describe what you need, including any specifics like aspect ratio, exact colors using hex codes, or a transparent background," the company said in the post. OpenAI launched GPT-4o in May, saying that the AI model featured text, vision and audio processing improvements and would be free for all users, with paid subscribers getting up to five times the capacity limits. "OpenAI's ChatGPT was already the most impressive chatbot for humanlike conversations given the range of voices and its ability to understand nuance," Antony Cousins, who was executive director of AI strategy at Cision at the time, told PYMNTS. 
"The speed enhancements now make this next to indistinguishable from a human conversation."
[37]
OpenAI unveils GPT‑4o Image Generation with contextual learning
On Tuesday, OpenAI announced the release of GPT‑4o Image Generation, its most advanced image generation feature to date. Integrated into GPT‑4o, this capability aims to create visuals that are both "beautiful and useful," according to the company. Text Rendering Precision: GPT‑4o has been designed to seamlessly incorporate symbols and text into imagery, allowing users to communicate with clarity and precision. Interactive Refinement: Users can engage in multi-turn interactions, refining images through conversation. For instance, when designing a video game character, GPT‑4o ensures that traits and features remain consistent across iterations. Detailed Prompt Following: OpenAI highlighted that while earlier systems struggled with rendering ~5-8 objects, GPT‑4o can accurately process prompts involving up to 10-20 objects, offering better control over traits, relationships, and details. Contextual Awareness: The system analyzes and learns from user-uploaded images, integrating their details to inform and enhance its image generation. Stylistic Variety and Realism: With training on a vast array of styles, GPT‑4o is capable of producing photorealistic images or transforming visuals into artistic representations tailored to user preferences. Despite its advancements, OpenAI acknowledged certain shortcomings of GPT‑4o Image Generation. For example, the model "occasionally crops longer images, like posters, too tightly, especially near the bottom." OpenAI emphasized plans to address these issues through subsequent updates. OpenAI reiterated its commitment to ethical and responsible AI use, citing the following measures: The company explained that humans have long used visual tools -- from cave paintings to modern infographics -- to communicate and analyze information. GPT‑4o bridges the gap between artistic expression and practical utility, enabling the creation of visuals such as logos, diagrams, and informational designs that communicate precise meanings. 
The rollout began on March 25, 2025, for Plus, Pro, Team, and Free users of ChatGPT. Access for Enterprise and Edu users is expected to follow soon. Additionally, Sora users now have access to GPT‑4o's image generation capabilities. OpenAI noted that developers would gain API access within the coming weeks. Users can generate customized visuals by simply describing their requirements through GPT‑4o. The system supports detailed specifications, such as aspect ratios, color hex codes, and transparent backgrounds. However, OpenAI highlighted that rendering these highly detailed images may take up to one minute.
[38]
OpenAI launches GPT‑4o image generation: Availability, features, and more
GPT-4o image generation is currently rolling out to Plus, Pro, Team, and Free users as the default image generator in ChatGPT. OpenAI has introduced GPT‑4o image generation, making it easier for users to create high-quality visuals directly within ChatGPT. According to OpenAI, GPT‑4o image generation "excels at accurately rendering text, precisely following prompts, and leveraging 4o's inherent knowledge base and chat context -- including transforming uploaded images or using them as visual inspiration." "Our model isn't perfect. We're aware of multiple limitations at the moment which we will work to address through model improvements after the initial launch," OpenAI explained. OpenAI has implemented strong safety measures to prevent misuse. All generated images include C2PA metadata, marking them as AI-generated. Additionally, OpenAI has developed an internal search tool to help verify if an image was created by GPT‑4o. The company also blocks harmful content, including explicit images and deepfakes. These efforts align with OpenAI's broader commitment to responsible AI development. GPT-4o image generation will soon be available to Enterprise and Edu users. Sora users can also access it.
OpenAI integrates GPT-4o's powerful image generation features into ChatGPT, offering improved accuracy, detail, and practical applications for creative professionals.
OpenAI has announced a significant upgrade to ChatGPT's image generation capabilities, integrating the powerful GPT-4o model directly into its flagship chatbot. This move marks a major advancement in AI-powered image creation and editing, positioning OpenAI to compete more effectively in the rapidly evolving field of generative AI [1].
The new GPT-4o image generation model boasts several improvements over its predecessor, DALL-E 3:
Improved Accuracy and Detail: GPT-4o generates more accurate and detailed images, albeit with slightly longer processing times [1].
Advanced Editing: Users can now edit existing images, including those containing people, through transformation and "inpainting" of foreground and background elements [1].
Better Text Rendering: The model shows significant progress in generating readable text within images, addressing a long-standing challenge in AI image generation [3].
Improved "Binding": GPT-4o demonstrates enhanced ability to correctly identify and place objects within images, a crucial feature for professional use [3].
The new feature is currently available to ChatGPT Pro subscribers, with plans to extend access to Plus and free users in the near future [2]. However, due to unexpectedly high demand, the rollout to free users has been temporarily delayed [2].
OpenAI is positioning this tool to cater to creative professionals such as graphic designers, advertisers, and social media managers [3]. The focus on practical applications, such as creating infographics, diagrams, and work-related images, sets it apart from purely artistic AI image generators [4].
While OpenAI has implemented safety measures, including content policies and metadata tagging for AI-generated images, the company has also loosened some restrictions:
Expanded Content Range: GPT-4o can now create more risqué content "within reason," potentially competing with less restricted models like Elon Musk's Grok [5].
User Control: OpenAI aims to give users more intellectual freedom and control over content generation, while still maintaining safeguards against extreme violations [5].
As OpenAI continues to refine GPT-4o's image generation capabilities, the company plans to address current limitations such as precise region editing, cropping issues, and multilingual text rendering [4]. The integration of this feature into ChatGPT represents a significant step in OpenAI's efforts to create more versatile and powerful AI tools for both casual users and professionals alike.
Reference
[3]
MIT Technology Review | OpenAI's new image generator aims to be practical enough for designers and advertisers
OpenAI has integrated DALL-E 3 image generation into ChatGPT, allowing free users to create up to two AI-generated images per day. This move expands access to advanced AI tools and enhances the ChatGPT experience.
10 Sources
OpenAI's new image generation feature in ChatGPT goes viral with Studio Ghibli-style creations, raising concerns about copyright, AI ethics, and content moderation.
64 Sources
OpenAI integrates a powerful new image generation capability into ChatGPT, surpassing previous models in quality and versatility, while raising concerns about media manipulation and copyright.
9 Sources
OpenAI releases GPT-4.5, its latest AI model, with limited availability due to GPU shortages. The update brings incremental improvements but raises questions about the company's focus on AGI versus practical applications.
14 Sources
OpenAI has updated its GPT-4o model with enhanced problem-solving abilities, improved instruction following, and increased creativity. The company cites GPU shortages as the reason for not updating GPT-4.5.
2 Sources
© 2025 TheOutpost.AI All rights reserved