OpenAI Upgrades ChatGPT with Advanced AI Image Generation Capabilities

ChatGPT finally gets a much better image generator - how to try it for free

Say goodbye to warped and misspelled text in your image generations - and hello to looser safeguards. OpenAI has continually expanded its ChatGPT offerings, adding an AI voice assistant, file and image understanding, advanced research capabilities, AI agents, and more. However, there's been one glaring omission -- a really capable image generator. On Tuesday, OpenAI launched 4o image generation. This image model is significantly better -- albeit slower -- than the DALL-E models previously offered by OpenAI. It tackles very difficult prompts such as realistic images and, most impressively, accurate text. For example, in the livestream demo, OpenAI CEO Sam Altman, joined by researchers Gabriel Goh and Prafulla Dhariwal, prompted 4o to create a photo from a specific POV with a flyer that included lots of text. After loading for a few seconds, it got the cinematic direction right and accurately printed all the text. It also boasts many other capabilities OpenAI's previous image generator didn't have, such as image referencing, which can be used to render a new version of the image (such as an anime version or a selfie), or as inspiration for creating a completely new work. Because this tool is meant to integrate into creatives' workflows, it can generate images on transparent backgrounds, use specific colors from HEX codes, or implement the chatbot's advanced conversational capabilities in the generations. For example, when prompted to include "humor" in the photo during the demo, it included text that met that criterion. Because the image generator is accessible in ChatGPT, users can also refine images through a multi-turn conversation. This makes tweaking images easier and allows the model to use the context of previous generations to create new ones. Since GPT-4o has access to the web, that context also feeds into the images it creates.
According to the company, GPT-4o's image generation also has strong instruction adherence. It can handle 10-20 different objects, which means you can prompt it to generate a high volume of objects in one go. Another new aspect of the image generator is that it can now create more risque content, something Elon Musk's Grok model is known for. During the livestream, Altman shared that you will be able to use GPT-4o's image generation to create offensive content "within reason." In an X post after the livestream, Altman added: "What we'd like to aim for is that the tool doesn't create offensive stuff unless you want it to, in which case within reason it does. As we talk about in our model spec, we think putting this intellectual freedom and control in the hands of users is the right thing to do, but we will observe how it goes and listen to society." The blog post announcing the model noted that it will block requests that violate content policies, including child sexual abuse materials and sexual deepfakes. Another safeguard in place is limiting what can be created when real people are in the context, including "particularly robust safeguards around nudity and graphic violence." Users can visit the System Card for all the safety information in the 4o image generation model. The updated image generation features are rolling out today in ChatGPT and Sora. All users, including those on the free tier, will have access to GPT-4o image generation as the default. If users still want to access DALL-E, they can do so through a dedicated DALL-E GPT. Enterprise and Education users will be given access soon, with developer access via the API slated for the coming weeks. When DALL-E first launched, it lived on its standalone website; at the time, it felt like the greatest and latest.
Since then, it has been moved to only reside in ChatGPT; there, the model paled compared to more advanced image generation models from competitors such as Midjourney, Google, and Adobe. This update now helps level the playing field, enabling it to compete better with other models.

[6]

ChatGPT's new image generator creates stunning images - for some users

Say goodbye to warped and misspelled text in your image generations - and hello to looser safeguards. OpenAI has continually expanded its ChatGPT offerings, adding an AI voice assistant, file and image understanding, advanced research capabilities, AI agents, and more. However, there was one glaring omission -- a really capable image generator. On Tuesday, OpenAI launched 4o image generation. This image model is significantly better -- albeit slower -- than the DALL-E models previously offered by OpenAI. It tackles very difficult prompts such as realistic images and, most impressively, accurate text. For example, in the livestream demo, OpenAI CEO Sam Altman, joined by researchers Gabriel Goh and Prafulla Dhariwal, prompted 4o to create a photo from a specific POV with a flyer that included lots of text. After loading for a few seconds, it got the cinematic direction right and accurately printed all the text. It also boasts many other capabilities OpenAI's previous image generator didn't have, such as image referencing, which can be used to render a new version of the image (such as an anime version or a selfie), or as inspiration for creating a completely new work. Because this tool is meant to integrate into creatives' workflows, it can generate images on transparent backgrounds, use specific colors from HEX codes, or implement the chatbot's advanced conversational capabilities in the generations. For example, when prompted to include "humor" in the photo during the demo, it included text that met that criterion. Because the image generator is accessible in ChatGPT, users can also refine images through a multi-turn conversation. This makes tweaking images easier and allows the model to use the context of previous generations to create new ones. Since GPT-4o has access to the web, that context also feeds into the images it creates.
According to the company, GPT-4o's image generation also has strong instruction adherence. It can handle 10-20 different objects, which means you can prompt it to generate a high volume of elements in one go. Another new aspect of the image generator is that it can now create more risque content, something Elon Musk's Grok model is known for. During the livestream, Altman shared that you will be able to use GPT-4o's image generation to create offensive content "within reason." In an X post after the livestream, Altman added: "What we'd like to aim for is that the tool doesn't create offensive stuff unless you want it to, in which case within reason it does. As we talk about in our model spec, we think putting this intellectual freedom and control in the hands of users is the right thing to do, but we will observe how it goes and listen to society." The blog post announcing the model noted that it will block requests that violate content policies, including child sexual abuse materials and sexual deepfakes. Another safeguard in place is limiting what can be created when real people are in the context, including "particularly robust safeguards around nudity and graphic violence." Users can visit the System Card for all the safety information in the 4o image generation model. The updated image generation features are rolling out now in ChatGPT and Sora. At launch, the model was announced to be coming to all users (including free), with GPT-4o image generation becoming the new default. However, due to high demand, Altman announced a day later that the rollout to the free tier would now be "delayed for awhile." This means that to access the image generation, you now have to be subscribed. For individual users, the best option is ChatGPT Plus, which costs $20 per user per month and comes with many other perks, including OpenAI's Sora video generator.
At the time of writing this article, I was able to access the image generator from my Plus account. Enterprise and Education users will be given access soon, with developer access via the API slated for the coming weeks. When DALL-E first launched, it lived on its standalone website; at the time, it felt like the greatest and latest. Since then, it has been moved to only reside in ChatGPT; there, the model paled compared to more advanced image generation models from competitors such as Midjourney, Google, and Adobe. This update now helps level the playing field, enabling it to compete better with other models. However, if users still want to access DALL-E, they can do so through a dedicated DALL-E GPT.

[7]

PC Magazine

ChatGPT Gets Native AI Image Generation

Users can drop prompts (and images) and get AI-generated images within the regular ChatGPT window, thanks to GPT-4o. It's available now to Plus, Pro, Team, and Free users with limitations. OpenAI has added AI image generation capabilities to ChatGPT. Users can now select the GPT-4o model, provide prompts, and get desired images within the regular ChatGPT window. Previously, ChatGPT was dependent on OpenAI's DALL-E model for images. Now, it uses the 4o model's native multimodal capabilities to provide "precise, accurate, photorealistic outputs." OpenAI touts GPT‑4o's skill for "accurately rendering text, precisely following prompts, and leveraging 4o's inherent knowledge base and chat context -- including transforming uploaded images or using them as visual inspiration." Translation: Expect fewer weird results. This was achieved by training the models on "the joint distribution of online images and text, learning not just how images relate to language, but how they relate to each other," OpenAI says. GPT-4o can also handle more objects within an image than usual. While other chatbots can generate up to eight objects for an image, GPT-4o can produce up to 20, according to OpenAI. It can also edit and improve user-uploaded images. In a demo video, an OpenAI researcher is seen uploading a hand-drawn sketch for a comic book page and getting a full-colored digital version delivered by ChatGPT. Still, OpenAI warns, "Our model isn't perfect. We're aware of multiple limitations at the moment, which we will work to address through model improvements after the initial launch." OpenAI will embed each output with C2PA metadata. This will allow AI image detectors to identify images generated by GPT-4o accurately. Additionally, ChatGPT will reject requests for child sexual abuse materials (CSAM) and sexual deepfakes. 
"When images of real people are in context, we have heightened restrictions regarding what kind of imagery can be created, with particularly robust safeguards around nudity and graphic violence," OpenAI says. In an addendum added later, OpenAI said it won't block GPT-4o from generating images of adult public figures, but those "who wish for their depiction not to be generated can opt out." At launch, ChatGPT's native image generation is available for all Plus, Pro, Team, and Free users, with support for Enterprise and Edu customers coming soon. The feature is also available on OpenAI's video-generation tool, Sora. OpenAI hasn't announced a daily limit for free users but tells The Verge that it will mirror DALL-E, which limits users to three free images per day. However, these numbers "may change over time based on demand," a spokesperson adds. None of this means DALL-E is going away. "For those who hold a special place in their hearts for DALL-E, it can still be accessed through a dedicated DALL-E GPT," OpenAI says.

[0]

ChatGPT's new image generator shattered my expectations - and now it's free to try

The newly released model can finally compete with Midjourney, Google's Imagen 3, and Adobe's Firefly. OpenAI may have kicked off the text-to-image generation craze with its DALL-E model, but since those earlier glory days, the AI company's offering has been lapped by much more capable image models. As a result, when OpenAI released its latest and greatest GPT-4o image generation model, I was skeptical. After testing it, I have changed my mind entirely. When DALL-E first launched, it lived on its standalone website; since then, it has moved to ChatGPT. The move came with many benefits, including the ability to ask the AI chatbot for an image you want in the same interface where you are already chatting about something else, thereby eliminating the need for constant context switching. With the release of GPT-4o image generation, OpenAI kept this convenient format, switching the default image generator from DALL-E to GPT-4o for paid subscribers. As a result, it was super easy to start creating new images from my ChatGPT Plus account. All I had to do was enter the prompt for what I wanted to see, and then it generated them. Users can also access it from the Sora interface. You can also generate images if you are a free user. At launch, the model was announced to be coming to all users, including free ones, but then OpenAI CEO Sam Altman announced a day later that the rollout to the free tier would now be "delayed for awhile," only to make it available to free users again a week later. However, if you are unimpressed when you try it in the free version, it is because the only method that activates the use of GPT-4o is typing in the shortcut "/create image." If you simply type a request such as "Create an image of XYZ," it will default to the DALL-E model, which renders significantly lower-quality photos.
OpenAI does not explicitly state limits, but after generating three images from my free account, I hit my daily limit. Therefore, ChatGPT Plus is still a good option for higher access to image generation. The moment you have been waiting for -- the images. After you insert a prompt, the AI outputs the generation in under a minute. The process does take a bit longer than it used to, but the images are worth the wait, delivering lots of details, texture, realism, and even text accuracy. Instead of describing it, I will include examples below so you can see for yourself. Prompt: Can you generate a realistic image of a chameleon, up close, shot as if it were in National Geographic in 16:9 ratio? Prompt: Can you generate an image of a laptop open on a desk that says, "This model is so good that it can even get text and hands right, which are usually major challenges for AI models," with hands typing on a keyboard in 16:9 ratio? Prompt: Can you generate a realistic photo of a close-up of a woman in a crowd in Times Square looking at the camera and smiling, with the quality of one taken on a DSLR? As seen above, the image generator does a great job of adhering to the prompt and delivering high-quality, realistic images. However, when testing an AI model, one of the true performance metrics is how it compares to competitors on the market. To give you a good indicator of this, I made it generate the same prompt I tested across all of the major AI image generators, including Midjourney, Google's Imagen 3, Adobe Firefly, and more. I am attaching GPT-4o's rendition below. You can see how it fares against all of the other AI image generators in this article, including DALL-E's rendition, which clearly is far behind what the new model can do. Prompt: Can you generate an image of a vibrant, realistic hummingbird perched on a tree? Even though the quality of the images is perhaps one of the model's biggest wins, there are other benefits as well. 
One of the biggest is that it lives in the chatbot's interface, which makes it easy to tweak the generations with simple natural language prompts. Also, because the chatbot has the context of what you just asked it, it can consider that in building the image. For example, if you are chatting with it about throwing a birthday party, you may be able to say, "Can you now create an invite that has the information above on it?" instead of having to retype. For example, I started chatting with ChatGPT about throwing a housewarming, and when asking it to create an invite, I did not have to repeat the information I previously provided. You can also upload reference images and then ask ChatGPT to create a different version or use them as elements of a new one. For example, you can input it as a selfie and have it generated in anime style, as seen in Altman's new X post. All of these customization features make it a really strong offering for creatives, who can also request that it be rendered on a transparent background or incorporate brand style guides such as hex codes or logos. Speaking of Altman, I was able to generate an image of him wearing a party hat. I could do so because the new model has much looser safeguards, meant to allow users to lean into their creative freedom. The blog post announcing the model noted that it limits what can be created when real people are in the context, including "particularly robust safeguards around nudity and graphic violence." I cannot tell if there is a practical use case for this feature, but it is a notable change I needed to try out for myself. When I tried to create an image of Mickey Mouse, it said it could not due to copyright implications, so it seems not all public figures are fair game. Overall, the GPT-4o image generator is a big win over the DALL-E models and perhaps among the best of the many I have tested. Is it worth the $20 per month? 
If you are just interested in high-quality image generation, there are still free versions you can explore that are really capable, such as Adobe Firefly or Google's Imagen 3. Having said this, the updated image generation features are rolling out now in ChatGPT and Sora, and all users, including free ones, can access the model. However, free users must type the shortcut "/create image" to activate GPT-4o, or else the system defaults to the lower-quality DALL-E model. If you are a frequent ChatGPT user, the upgrade to ChatGPT Plus becomes significantly more enticing. You will have access to all of OpenAI's latest and greatest chatbot features, as well as high-quality image and video generation, all for $20 a month, which is not a bad deal, especially considering other offerings on the market. For example, Midjourney's subscription starts at $10 per month and only offers image generation.

[8]

Reuters

Ghibli effect: ChatGPT usage hits record after rollout of viral feature

April 1 (Reuters) - The frenzy to create Ghibli-style AI art using ChatGPT's image-generation tool led to a record surge in users for OpenAI's chatbot last week, straining its servers and temporarily limiting the feature's usage. The viral trend saw users from across the globe flood social media with images based on the hand-drawn style of the famed Japanese animation outfit, Studio Ghibli, founded by renowned director Hayao Miyazaki and known for movies such as "Spirited Away" and "My Neighbor Totoro". Average weekly active users breached the 150 million mark for the first time this year, according to data from market research firm Similarweb. "We added one million users in the last hour," OpenAI CEO Sam Altman said in an X post on Monday, comparing it with the addition of one million users in five days following ChatGPT's red-hot launch more than two years ago. Active users, in-app subscription revenue and app downloads reached an all-time high last week, according to SensorTower data, after the AI company launched updates to its GPT-4o model, enabling advanced image generation capabilities. Global app downloads and weekly active users on the ChatGPT app grew 11% and 5%, respectively, from the prior week, while in-app purchase revenue increased 6%, the market intelligence firm said. "It's super fun seeing people love images in ChatGPT. But our GPUs are melting," Altman said last week in response to the viral trend. LEGAL UNCERTAINTY The extensive usage of the AI tool for the Ghibli effect has also led to questions about potential copyright violations. "The legal landscape of AI-generated images mimicking Studio Ghibli's distinctive style is an uncertain terrain. Copyright law has generally protected only specific expressions rather than artistic styles themselves," said Evan Brown, partner at law firm Neal & McDevitt. 
OpenAI did not immediately respond to a request for comment on the data used to train its AI models and the legality surrounding its latest feature. Studio Ghibli co-founder Miyazaki's comments from 2016 on AI-generated images resurfaced after the trend blew up last week. "I am utterly disgusted," Miyazaki had said after being shown an early render of an AI-generated. "I would never wish to incorporate this technology into my work at all." Reporting by Akash Sriram and Kritika Lamba in Bengaluru; editing by Arpan Varghese and Anil D'Silva

[0]

ChatGPT's stunning new image generator is now free for everyone

OpenAI has continually expanded its ChatGPT offerings, adding an AI voice assistant, file and image understanding, advanced research capabilities, AI agents, and more. However, there was one glaring omission: a really capable image generator. Last week, OpenAI launched 4o image generation. This image model is significantly better -- albeit slower -- than the DALL-E models previously offered by OpenAI. It tackles very difficult prompts, such as realistic images and, most impressively, accurate text. For example, in the livestream demo, OpenAI CEO Sam Altman, joined by researchers Gabriel Goh and Prafulla Dhariwal, prompted 4o to create a photo from a specific POV with a flyer that included lots of text. After loading for a few seconds, it got the cinematic direction right and accurately printed all the text. It also boasts many other capabilities that OpenAI's previous image generator did not have, such as image referencing, which can be used to render a new version of the image (such as an anime version or a selfie) or as inspiration for creating a completely new work. Because this tool is meant to integrate into creatives' workflows, it can generate images on transparent backgrounds, use specific colors from HEX codes, or implement the chatbot's advanced conversational capabilities in the generations. For example, when prompted to include "humor" in the photo during the demo, it included text that met that criterion. The image generator is accessible in ChatGPT, so users can refine images through a multi-turn conversation. This makes tweaking images easier and allows the model to use the context of previous generations to create new ones. Since GPT-4o has access to the web, that context also feeds into the images it creates. According to the company, GPT-4o's image generation also has strong instruction adherence.
It can handle 10-20 different objects, which means you can prompt it to generate a high volume of elements in one go. Another new aspect of the image generator is that it can now create more risque content, something Elon Musk's Grok model is known for. During the livestream, Altman shared that you will be able to use GPT-4o's image generation to create offensive content "within reason." In an X post after the livestream, Altman added: "What we'd like to aim for is that the tool doesn't create offensive stuff unless you want it to, in which case within reason it does. As we talk about in our model spec, we think putting this intellectual freedom and control in the hands of users is the right thing to do, but we will observe how it goes and listen to society." The blog post announcing the model noted that it will block requests that violate content policies, including child sexual abuse materials and sexual deepfakes. Another safeguard in place is limiting what can be created when real people are in the context, including "particularly robust safeguards around nudity and graphic violence." Users can visit the System Card for all the safety information in the 4o image generation model. The updated image generation features are rolling out now in ChatGPT and Sora. All users, including free ones, can access the model. However, if you're unimpressed when you try it in the free version, it's because the only method that activates the use of GPT-4o is typing in the shortcut "/create image." If you just type in a request such as "Create an image of XYZ," it will default to the DALL-E model, which will render significantly lower-quality photos. OpenAI doesn't explicitly state the limits to image generation, but after creating three images from my free account, I was told I hit my limit for the day. This means that to get higher access to image generation, you need to be subscribed.
For individual users, the best option is ChatGPT Plus, which costs $20 per user per month and comes with many other perks, including OpenAI's Sora video generator. At the time of writing this article, I was able to access the image generator from my Plus account. Enterprise and Education users will be given access soon, with developer access via the API slated for the coming weeks. When DALL-E first launched, it lived on its standalone website; at the time, it felt like the greatest and latest. Since then, it has been moved to only reside in ChatGPT; there, the model paled compared to more advanced image generation models from competitors such as Midjourney, Google, and Adobe. This update now helps level the playing field, enabling it to compete better with other models. However, if users still want to access DALL-E, they can do so through a dedicated DALL-E GPT.

[9]

Now you can generate images directly from ChatGPT and Sora

OpenAI just announced that all users will soon be able to generate images directly inside of ChatGPT. It's rolling out to ChatGPT Plus, Pro, Team and, most importantly, Free users. This will be the default image generation tool in 4o, so there will be no need to open DALL-E whenever you want to whip up a picture of a cat in space eating lasagna or whatever. The feature's also coming to Sora. The company says that the platform will "generate high-quality images based on your prompt, conversation and uploaded files." To the latter point, it'll be able to transform pre-existing images based on prompts. OpenAI is also boasting about significant improvements in text rendering and contextual understanding. These new tools are intended for both personal and professional use. As such, OpenAI gives a number of examples as to where this type of image generation could come in handy. These include the creation of infographics, social media promotional graphics and images with plenty of text. This being a modern generation tool, it can also handle high-end visuals. The company says it offers a "strong capability for photorealism, including light, shadow, and texture accuracy." The ability to understand context could also be useful, as OpenAI says this could be used to create a "poster of birds found in Central Park" or a "visualization of an art history era discussed previously in the conversation." It's built on GPT-4o, an AI model whose "o" stands for "omni", a reference to its multimodal capabilities. This is what allows many of the aforementioned features, like being able to iterate on uploaded files. Today's news looks like another step on the long road toward "one AI to rule them all" functionality.

[10]

CNBC

ChatGPT's viral image-generation AI is 'melting' OpenAI's GPUs

OpenAI CEO Sam Altman on Thursday announced that viral use of ChatGPT's new image-generation AI, introduced earlier this week, is overloading the company's servers. While it's "super fun seeing people love images" in ChatGPT, "our GPUs are melting," Altman posted on X Thursday, adding that the company will temporarily limit the feature's usage as it works to make it more efficient. The rate limits affect OpenAI's Tuesday debut of native image generation within ChatGPT. The company debuted the "high-quality" image-generation tool as a way to produce everything from diagrams, infographics and logos to business cards and stock photos. The feature can also use an image as a starting point for art, such as a custom painting of a pet or editing a professional headshot. The image-generation feature began rolling out to ChatGPT Plus, Pro and Team users on Tuesday, as well as users of the chatbot's free tier when they use OpenAI's 4o model. ChatGPT Enterprise and Edu users will receive access next week, the company said. Images of anime-style renderings of users' uploaded photos have been going viral on X and other social media apps since the feature's Tuesday launch. Altman, for example, changed his X profile photo to an image generated by the new feature. One of the company's first hit products was the DALL-E model launched in 2021. That was one of the first artificial intelligence image generators, and was integrated into ChatGPT in 2023. Users of ChatGPT's free tier will soon be able to generate three images per day, Altman wrote.

[11]

ChatGPT gained one million new users in an hour today

The new image generation feature from OpenAI is so popular it's causing issues with new signups. OpenAI has been doubling its audience for ChatGPT at a rapid rate, and the addition of its latest image generation feature has increased the AI assistant's popularity. Today, CEO Sam Altman posted to X that the service "added one million users in the last hour," calling it a "biblical demand" for the image generation. When the company announced the rollout of image generation in ChatGPT last week, the tool was meant to be available to all user tiers. However, the high degree of interest meant that access for free users was walked back. Now, the company is reporting "issues with new signups," which has been its status for more than a day. While ChatGPT's ability to create original images has been a delight (and occasional horror show) for users, the AI platform continues to draw the ire of artists and creatives. Many of these people have their work taken, either as training material for the large language model scraped from the internet or improperly taken and modified by ChatGPT users, without any credit or compensation. One of the more visible examples has been the many meme images inspired by the animation style of Studio Ghibli. Altman is currently using one such image as his X profile photo, but other users have used ChatGPT to apply the look of the studio's films to violent or triggering content, raising a fresh round of questions about how creators can protect their work from misuse.

[12]

OpenAI delays rollout of ChatGPT's image generator to free users

It's currently available to those with Plus, Pro and Team accounts. Free ChatGPT users will have to wait a while longer to be able to use its built-in image generation capability. OpenAI has just launched a feature that will allow users to generate images directly inside of ChatGPT, and it was supposed to roll out to all Plus, Pro, Team and Free users. But according to company CEO Sam Altman, it has been way more popular than OpenAI had expected, even though the company already had high expectations to begin with. As such, its rollout to the free tier is "unfortunately going to be delayed for a while." People have been posting ChatGPT's output all over social media. One popular trend that emerged from the launch is turning existing photos into Studio Ghibli-style illustrations. Even Altman poked fun at it and used a Ghibli-fied photo of himself as his profile picture on X/Twitter. Users have been creating Ghibli versions of everything from personal images to photos of political figures and memes. As VentureBeat has noted, Ghibli creator Hayao Miyazaki likely wouldn't approve of them, seeing as he reacted quite negatively to AI-powered animation years ago, but we doubt that would stop the trend until people get tired of it. Altman didn't give a timeline for when the feature will make its way to free users. For now, those who want to use the chatbot's built-in generator will have to log into ChatGPT with a Plus, Pro or Team account and then select GPT-4o as their AI model in the lower-left corner.

[13]

NYT

OpenAI Unveils New Image Generator for ChatGPT

Chatbots were originally designed to chat. But they can generate images, too. On Tuesday, OpenAI beefed up its ChatGPT chatbot with new technology designed to generate images from detailed, complex and unusual instructions. For instance, if you describe a four-panel comic strip, including the characters who appear in each panel and what they are saying to one another, the technology can instantly generate an elaborate cartoon. Though previous versions of ChatGPT could generate images, they could not reliably create images by blending such a wide array of concepts. The new version of ChatGPT is indicative of a wider change in artificial intelligence technology. After beginning as systems that merely generated text, chatbots are morphing into tools that combine chatting with various other abilities. The technology that underpins the new version of ChatGPT -- called GPT-4o -- also allows the chatbot to receive and respond to voice commands, images and videos. It can even speak. Released at the end of 2022, the original ChatGPT learned its skills by analyzing enormous amounts of text from across the internet. It learned to answer questions, write poetry and generate computer code. It could not generate images. But about a year later, OpenAI released a new version of ChatGPT that could generate images called DALL-E. But ChatGPT and DALL-E were separate systems. Now, OpenAI has built a single system that learns a wide range of skills from both text and images. In generating its own images, this system can draw on everything ChatGPT has learned from the internet. "This is a completely new kind of technology under the hood," said Gabriel Goh, an OpenAI researcher. "We don't break up image generation and text generation. We want it all to be done together." Traditionally, A.I. image generators have struggled to create images that were markedly different from any existing image. If you asked an image generator to create an image of a bicycle with triangular wheels, for instance, it struggled. Mr. Goh said that the new ChatGPT could handle this kind of request. OpenAI said that, beginning Tuesday, this new version of ChatGPT would be available to people using both the free and paid versions of the chatbot. This includes both ChatGPT Plus, a $20-a-month service, and ChatGPT Pro, a $200-a-month service that provides access to all of the company's latest tools. (The New York Times sued OpenAI and its partner, Microsoft, in December for copyright infringement of news content related to A.I. systems.)

[0]

OpenAI's built-in image generator for ChatGPT is now available to free users

Due to high demand, it was limited to paying subscribers upon launch. ChatGPT's built-in image generation feature is now available to everyone. OpenAI CEO Sam Altman said last week that the company was delaying its rollout to the free tier "for a while," because the tool was way more popular than they had expected. But the company made the feature available to free users over the weekend, allowing them to generate images from within ChatGPT and without having to switch to OpenAI's DALL-E generator. Prior to its rollout to the free tier, the tool was only available to Plus, Pro and Team subscribers. Altman previously said that free users will get a limit of three images per day. Based on our experience trying it out, some accounts are limited to just one a day. Those who find themselves able to generate more than one will still have to wait anywhere between a few minutes and a few hours between generations. One of the tool's most popular uses that we've seen so far is the creation of Ghibli-style images using real-life photos. Users simply have to upload the photo they want to use and then instruct ChatGPT in natural language to create a Ghibli-style version of it. The trend has raised concerns, yet again, about the legality of using copyrighted work as training data for artificial intelligence. It also resurfaced the reaction of Hayao Miyazaki, Studio Ghibli's founder, upon being shown AI-generated animation years ago. "I strongly feel that this is an insult to life itself," the director said. In a tweet, Altman said that the tool could still exhibit erratic behavior and could refuse some generations that should be allowed. OpenAI is still fixing the issue. The company also told TechCrunch that it "takes action" if a generated image violates company guidelines, though it didn't detail exactly what those actions are.

[14]

TechSpot

ChatGPT image tool is "melting" GPUs, OpenAI lands $40 billion in new funding

The big picture: Only days ago we were reporting that ChatGPT's updated image generator was sparking a wave of Studio Ghibli-style recreations across social media. But the wave was just getting started and in mere days it hit the mainstream, becoming a full-blown fad. While CEO Sam Altman is celebrating the newfound attention, the massive growth in GPT's user base, and a fresh round of funding, the tool's explosive popularity is putting serious strain on OpenAI's infrastructure. Altman revealed that demand has been so intense it's been "melting" their GPUs. While Apple faces delays with its Apple Intelligence rollout and Google struggles to impress with Gemini - delivering sometimes underwhelming and mostly annoying AI search results - OpenAI is charging ahead. The company announced plans to release its first "open" language model since GPT-2, promising a version users can run on their own hardware later this year. Also, OpenAI just closed a staggering $40 billion funding round, the largest private tech deal on record, cementing its position at the forefront of the AI race. OpenAI's latest image-generation tool has taken the internet by storm this past week, although its runaway popularity is putting significant strain on the company's infrastructure. Altman acknowledged the overwhelming response on X. "It's super fun seeing people love images in ChatGPT. But our GPUs are melting," he wrote. To manage the strain, OpenAI has temporarily limited usage while working to optimize efficiency. Altman assured users that the free tier of ChatGPT will soon allow up to three image generations per day, though he did not specify the exact rate limits being implemented. This latest update to ChatGPT's image generation capabilities launched last week and has quickly gained traction for its ability to produce highly realistic visuals with improved text rendering. 
However, much of the buzz has centered on users leveraging the tool to create memes and portraits in the style of Studio Ghibli, the iconic Japanese animation studio known for films like My Neighbor Totoro and Spirited Away. Social media platforms have been inundated with Ghibli-inspired reinterpretations of people, animals, historical events, and even podcasts. Altman also joined in on the trend, updating his X profile picture to a Ghibli-style version of himself. But this viral phenomenon has reignited debates over copyright infringement in AI-generated art. Critics argue that AI models are often trained on datasets containing copyrighted works without explicit permission from creators. OpenAI is already facing several lawsuits related to this issue. To address these concerns, OpenAI stated that it has implemented safeguards to prevent users from generating images in the style of living artists. However, this policy has raised questions about consistency. For example, Studio Ghibli's co-founder Hayao Miyazaki - who famously denounced AI as an "insult to life itself" in a 2016 documentary - is still alive. While the free version of ChatGPT explicitly refuses to produce images mimicking Ghibli's style, it appears to allow broader replication of the studio's aesthetic. When I asked the tool to create a Ghibli-style image of the movie Thelma & Louise, it generated the image successfully (above). A few minutes later, I requested an image of Batman and Robin in the same style, and the tool responded with an error. However, many others prompting ChatGPT to create a Ghibli-style image report that ChatGPT sometimes declines, citing content policy guidelines and instead offering an alternative suggestion: "Perhaps a symbolic representation of technology and art clashing in a fantasy setting." OpenAI clarified to TechCrunch that while individual artists' styles are off-limits, broader studio styles remain permissible within its guidelines. 
Meanwhile, the tool's meteoric rise has drawn commentary from prominent tech and venture capital figures. David Sacks, a venture capitalist advising President Donald Trump on cryptocurrency and AI policy, responded to Altman's announcement with a link to a blog post stating: "The next big thing will start out looking like a toy." Altman replied humorously: "Yeah, I just didn't think it would be this toy :)."

[15]

How-To Geek

ChatGPT Can Finally Generate Images With Legible Text

GPT-4o image generation is now available in ChatGPT. The new image generation model, which replaces DALL-E 3, is most notable for its accurate text rendering, improved "binding" capabilities, and ease of use. Unlike traditional diffusion image generation methodology, which "paints" details on top of random noise, GPT-4o utilizes a top-to-bottom, side-to-side autoregressive system. It's slower than diffusion, but the benefits of autoregression are as clear as day. GPT-4o is capable of spitting out images with perfectly legible text -- something that AI models like DALL-E 3 have continually failed to achieve. Not only that, but you can specify textual content for generated images. Write out a prompt like "give me a photorealistic image of a girl writing on a whiteboard with messy handwriting," tell the AI whatever words you want to see on the whiteboard, and it'll give you something fairly accurate. And, perhaps more importantly, the model is quite good at writing 2D stylized text for restaurant menus, advertisements, or other items that may be useful to businesses or hobbyists. The autoregressive approach also seems to help with "binding," which is a fancy way of saying that the AI doesn't get confused by prompts that contain multiple subjects. If you ask DALL-E 3 to draw a red circle, a blue triangle, a green heart, a pink star, and a purple square, it may trip over itself and spit out the wrong shapes or colors. GPT-4o, on the other hand, can accurately handle up to 20 different objects. When paired with the model's text rendering capabilities, improved binding clearly creates some interesting opportunities for corporate art or advertising, though it's also just a generally useful thing that makes image generation easier to use. Of course, GPT-4o image generation is just "better" than DALL-E 3. Photorealistic images look more true to life, digital art looks less soupy or grainy, and new inferencing techniques reduce the need to type out long, complicated prompts. 
The model also boasts improved "character consistency," meaning that a character or object generated in one prompt can be accurately carried over to subsequent prompts -- if you tell the AI to reuse a cyborg cat that it created, it won't change the color of the cat, and so on. OpenAI admits that its new image generation model is imperfect. It still struggles with hallucinations, mathematical representations (like charts or graphs), multilingual text, and more. Still, it's clearly an improvement over the company's previous image generation models. OpenAI says that GPT-4o image generation contains safeguards to prevent misuse, plus advanced watermarking techniques to help people differentiate AI-generated content from real, human-made stuff. But I'll go out on a limb and assume that these safeguards can, with effort, be circumvented. And OpenAI is still using C2PA watermarking, which is just metadata. It takes very little effort to remove this metadata from an image -- C2PA is ineffective at preventing the spread of misinformation. The new GPT-4o image generator won't alleviate concerns about copyright or fair use, either. It was trained on a mix of "publicly available" data and licensed data, according to a statement provided to The Wall Street Journal. AI companies are known to brazenly defy basic copyright law, and OpenAI does not share its training data with the public, so feel free to draw your own conclusions on this matter. (For what it's worth, OpenAI does care about copyright when its work is stolen.) GPT-4o image generation is available today. Open ChatGPT in your browser, ask the AI to generate an image, and enjoy. 
Note that the rollout is not complete, so some users may still encounter the old DALL-E 3 model. The best way to tell the difference is to observe how a generated image loads. DALL-E 3 loads images with a spinning wheel, while GPT-4o images load with a pleasant top-down side-to-side flatbed scanner-ish animation. All ChatGPT users can access GPT-4o image generation, including free users. However, free users face usage limits, just as they did when using DALL-E 3. By the way, DALL-E 3 will remain available in custom GPTs for those who want to use it. Source: OpenAI
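To make the "top-to-bottom, side-to-side" description above more concrete, here is a toy Python sketch of raster-order autoregressive generation: each value is produced one at a time, conditioned only on the values already generated. This is an illustration of the general idea, not OpenAI's implementation; `toy_next_value` is a hypothetical stand-in for a real model, and the image is just a small grid of grayscale numbers.

```python
import random


def toy_next_value(context, rng):
    """Hypothetical stand-in for a model: predict the next value from prior ones."""
    if not context:
        return rng.randint(0, 255)
    recent = context[-3:]                      # look only at the last few values
    base = sum(recent) // len(recent)
    return max(0, min(255, base + rng.randint(-10, 10)))


def generate_raster(width, height, seed=0):
    """Generate a grid one value at a time, left to right, then top to bottom."""
    rng = random.Random(seed)
    flat = []    # every value generated so far, in raster order
    order = []   # (row, col) positions in the order they were filled
    for row in range(height):
        for col in range(width):
            flat.append(toy_next_value(flat, rng))
            order.append((row, col))
    image = [flat[r * width:(r + 1) * width] for r in range(height)]
    return image, order


if __name__ == "__main__":
    image, order = generate_raster(width=4, height=3)
    print(order[:5])   # the first row is completed before the second begins
    for row in image:
        print(row)
```

Because every value depends on the ones before it, the loop cannot be parallelized across positions the way a single denoising pass over a whole image can, which is consistent with the article's remark that this approach is slower, and with the scanner-like loading animation it describes.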

[16]

Interesting Engineering

AI Ghibli art boom is "melting" OpenAI GPUs, Altman tells users to relax

However, this viral phenomenon has come with an unexpected consequence: OpenAI's GPUs struggle to keep up, leading to temporary rate limits and longer processing times. CEO Sam Altman has jokingly remarked that the company's GPUs are "melting" under the intense demand. But what's happening behind the scenes? And why is AI-generated art putting such a strain on computing resources? OpenAI's image-generation feature has quickly become a sensation, allowing users to create breathtaking Ghibli-inspired art with only a text prompt. People have eagerly reimagined countless subjects, from historical events to modern pop culture icons, in this whimsical anime style. The nostalgia and emotional depth associated with Studio Ghibli's works have made this AI-driven art especially appealing. Users love the dreamlike visuals, lush backgrounds, and expressive characters, which have made films like Spirited Away and My Neighbor Totoro so iconic. The AI's ability to replicate this aesthetic with impressive accuracy has fueled an overwhelming surge in demand. However, this surge has also put OpenAI's infrastructure under immense pressure. The phrase "melting GPUs" may sound like an exaggeration, but it reflects the strain on OpenAI's computing power. AI-generated images, particularly those with intricate details like Ghibli-style art, require enormous computational resources. Unlike traditional image processing, AI models must analyze vast datasets, apply complex artistic techniques, and render thousands of tiny details pixel by pixel. A single request to generate an AI image triggers billions of calculations across specialized processors known as GPUs (Graphics Processing Units). Unlike CPUs, which handle general computing tasks, GPUs are designed for parallel processing, making them ideal for AI applications.

[17]

PCWorld

OpenAI releases impressive 4o image generator for free and paid users

Earlier this week, OpenAI released their "most advanced image generator yet" and made it available through ChatGPT using the GPT-4o model. ChatGPT previously relied on Dall-E to generate images. According to OpenAI, the improved 4o model is able to produce precise, accurate, and photorealistic results. They claim that it's also particularly good at rendering text, following instructions precisely, and even understanding the context of a chat. All of this includes the transformation of uploaded images or using uploaded images as visual inspiration. GPT-4o can also continually build on existing images to ensure the subject remains consistent across images. On top of that, GPT-4o is now able to handle 10 to 20 different objects in images (compared to most other AI image generators, which struggle to include 5 to 8 different objects). At the same time, OpenAI warns that GPT-4o may have problems with cropping, hallucinations, too many elements, precise graphs, rendering text in non-Latin alphabets, precise editing, and displaying lots of dense text on a small surface. The new and improved 4o image generator is rolling out now for ChatGPT Free, Plus, Pro, and Team users. (Enterprise and Edu users will gain access at a later time.) If you think Dall-E made better images, you have the option to switch and use that instead.

[18]

9to5Google

ChatGPT promises image generation for free tier as feature is 'melting' GPUs

ChatGPT launched GPT 4o some days ago, and it has been met with a barrage of users trying out the AI model's new image generation features. The company announced that the new surge of users is "melting" GPUs, and free-tier ChatGPT users will soon get limited image generation access. ChatGPT 4o brings a new image generation feature that's actually very good, in terms of AI generation. The addition gained popularity as users have been sending each other Studio Ghibli versions of themselves. The trend has garnered even more attention towards the chatbot, and what attention had waned for the AI model has come back in full force. OpenAI CEO Sam Altman took to Twitter/X to express appreciation for the sudden surge of activity and the positive feedback. With that, he also announced that the internal GPUs used for processing image generations are "melting." Though this is likely hyperbolic, there could be some truth to that with overheating concerns, but the post doesn't go into detail. What it does mean is that so many users are asking ChatGPT to create Studio Ghibli versions of themselves that the company's servers can't keep up. One nice announcement to come out of the post was that users on the free ChatGPT tier will soon get 3 image generations per day, as opposed to none. Currently, users need to pay $20/month to try ChatGPT Plus. With that, you get access to ChatGPT 4o and all its image-generation prowess. Users even get access to Sora. The post goes on to note that rate limits will be imposed temporarily. It's unclear if those limits have cleared or if they'll stay in place for some time. There is no date just yet for the free tier expansion, but it could be expected relatively soon if the post is to be believed.

[19]

OpenAI just unveiled new ChatGPT image generator powered by Sora -- here's what you can do now

OpenAI today announced the GPT-4o image generator, introducing advanced image generation capabilities integrated within the ChatGPT-4o language model. The company says that GPT-4o represents a significant leap forward in image generation that should create images that are not only visually stunning but practically useful. The GPT-4o image model focuses on 'useful image generation,' which means users can now use the AI model for everyday needs such as logos, diagrams, and infographics. Unlike previous generative models that often produce surreal but impractical visuals, GPT-4o was designed to deliver more contextually relevant and accurate imagery. Key features of GPT-4o include upgraded text rendering, allowing seamless integration of textual information into images. This capability supports visual communication, elevating the utility of generated images. Additionally, GPT-4o supports multi-turn generation, enabling users to refine and adjust images through natural conversational interactions, maintaining consistency throughout iterative design processes. The image generation from GPT-4o is capable of managing complex prompts involving up to 20 distinct objects, which is an improvement over existing systems. Through in-context learning, GPT-4o can analyze user-uploaded images, seamlessly incorporating these details into subsequent image generations, thus creating a more personalized and contextually informed visual output. Built upon extensive multimodal training on vast online image and text datasets, GPT-4o has developed sophisticated visual fluency, allowing the model to produce images that are contextually aware, stylistically diverse, and photorealistically convincing. Despite its advanced capabilities, OpenAI acknowledges certain limitations, such as occasional cropping issues, hallucinated details, difficulties rendering dense information at small scales, and precision editing challenges. 
Multilingual text rendering, especially for complex non-Latin scripts, remains an area under active development. Safety continues to be a paramount focus, with rigorous measures in place to block harmful content, including explicit materials or images that violate content policies. Provenance tools, such as C2PA metadata tagging and internal reverse search, ensure transparency and accountability in generated visuals. GPT-4o image generation is available now across all ChatGPT platforms, including Plus, Pro, Team, and Free tiers, with Enterprise and Education access anticipated soon. Regardless of tier, users can specify detailed image requirements -- from exact colors and aspect ratios to transparent backgrounds -- making professional-quality image creation as straightforward as a simple chat interaction. OpenAI's GPT-4o signifies a major advancement in AI-driven visual communication, turning generative image creation into an accessible, practical and powerful tool for everyday users and professionals alike.
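The article notes that users can specify exact colors, aspect ratios, and transparent backgrounds in plain conversation. As a minimal sketch of how someone might assemble such a request consistently, the Python helper below validates HEX codes and composes a natural-language prompt. The function names and the prompt wording are assumptions made for illustration; ChatGPT accepts free-form text and does not require any particular syntax.

```python
import re

# Six-digit HEX color such as "#FF8800" (an assumption: shorthand like "#F80" is rejected).
HEX_RE = re.compile(r"#[0-9A-Fa-f]{6}")


def hex_to_rgb(code):
    """Convert '#RRGGBB' to an (r, g, b) tuple, raising ValueError if malformed."""
    if not HEX_RE.fullmatch(code):
        raise ValueError(f"not a 6-digit hex color: {code!r}")
    return tuple(int(code[i:i + 2], 16) for i in (1, 3, 5))


def build_prompt(subject, colors, aspect_ratio="1:1", transparent=False):
    """Compose a plain-text image request (hypothetical wording)."""
    for code in colors:
        hex_to_rgb(code)   # validate each color before it goes into the prompt
    parts = [
        subject,
        f"Use only these colors: {', '.join(colors)}.",
        f"Aspect ratio {aspect_ratio}.",
    ]
    if transparent:
        parts.append("Transparent background.")
    return " ".join(parts)


if __name__ == "__main__":
    print(build_prompt("A logo for an ice cream shop.", ["#FF8800", "#FFFFFF"],
                       aspect_ratio="16:9", transparent=True))
```

Validating the colors up front catches typos such as a missing digit before the request is sent, which matters when each generation counts against a usage limit.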

[0]

Android Police

ChatGPT's native image generator isn't paywalled anymore

Summary: ChatGPT, credited with mainstreaming AI chatbots and reaching 100 million users in two months, has now rolled out its native GPT-4o-powered image generation capabilities to all free users, after an initial period limited to paid subscribers. The new free image generation feature, especially popular for producing Studio Ghibli-style art, among other styles, is currently rate-limited due to a large influx of users, with the daily generation cap for free users appearing to be two, possibly due to traffic. This expansion of free features follows significant financial growth for OpenAI, including a 30 percent revenue surge, reaching 20 million paid subscribers, and securing a $40 billion funding round that values the company at $300 billion.

OpenAI's ChatGPT single-handedly kicked off the AI chatbot race. I'm not saying that AI tech wasn't thriving before ChatGPT, but OpenAI's tool clearly made the tech mainstream, with ChatGPT reaching a staggering 100 million users in only two months. Since October 2023 at the very least, the AI giant has used its Dall-E 3 model to generate images right within ChatGPT, but that changed exactly a week ago when OpenAI rolled out native ChatGPT image generation capabilities. The Japanese Ghibli art-style images you've been seeing floating around on social media, alongside the subsequent backlash and concerns about copyright issues, are all thanks to ChatGPT's new native image generator that is powered by GPT-4o. Originally limited to users with a paid ChatGPT plan, native GPT-4o-powered image generation capabilities are now rolling out to non-paying users, and no, contrary to what our inner voice is whispering, this isn't an elaborate April Fools' Day joke. 
Let the initial traffic subside before you give the tool a shot

Announced by CEO Sam Altman on X (Twitter), image generation support for the free tier has indeed rolled out -- I see it on my account. With the large influx of users now trying to generate images, however, the capability is currently rate limited. Additionally, while Altman had previously suggested that free-tier users will get three free image generations per day, users currently report being limited to two. It is unclear if that is the current traffic-induced rate limitation kicking in or the actual daily cap. This comes on the heels of ChatGPT's revenue surging by 30 percent, the AI tool reaching the 20 million paid subscriber mark (paywalled), and it securing a new $40 billion raise at a $300 billion valuation.
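The article distinguishes between a temporary, traffic-induced rate limit and a fixed daily cap. The Python sketch below shows one way those two checks could differ in principle. It is purely illustrative: OpenAI has not published how its limits work, and the class name, the five-minute cooldown, and the status strings are all assumptions. Only the three-per-day figure comes from Altman's statement as reported above.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class ImageQuota:
    """Hypothetical per-user limiter combining a daily cap with a short cooldown."""
    daily_cap: int = 3                           # figure reported for the free tier
    cooldown: timedelta = timedelta(minutes=5)   # assumed throttle window
    history: list = field(default_factory=list)  # timestamps of accepted requests

    def check(self, now):
        """Return 'ok', 'throttled', or 'daily_cap' without recording anything."""
        used_today = [t for t in self.history if t.date() == now.date()]
        if len(used_today) >= self.daily_cap:
            return "daily_cap"
        if self.history and now - self.history[-1] < self.cooldown:
            return "throttled"
        return "ok"

    def record(self, now):
        """Check the request and, if allowed, count it against the quota."""
        status = self.check(now)
        if status == "ok":
            self.history.append(now)
        return status


if __name__ == "__main__":
    quota = ImageQuota()
    start = datetime(2025, 4, 1, 9, 0)
    for minutes in (0, 1, 10, 20, 30):
        print(minutes, quota.record(start + timedelta(minutes=minutes)))
```

From the outside, a user who is refused after two images cannot tell which branch fired, which is the ambiguity the article describes.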

[20]

I just went hands-on with ChatGPT-4o's enhanced image generator -- and I can't believe this is free

Can this version provide more realistic and detailed images? OpenAI just enhanced its image generating capabilities with a new model integrated within ChatGPT-4o. Promising more realistic, richly detailed images from the simplest prompts, the model sounded too good to be true, especially since I haven't been all that impressed with ChatGPT's image generating capabilities in the past. I have to admit I was skeptical. OpenAI's new image generator is designed to be fast and intuitive, so I wanted to try it with 7 prompts. I tested everything from cinematic landscapes to mock product concepts to see how well it works and what sets it apart from other AI image generator tools. Here's what happened when I put this ChatGPT-4o image generator to the test. Prompt: Create a photograph of a suburban house in early spring. These houses are beautiful, and the chatbot catches the season correctly, but these do not look like photographs. They are pretty close and, for a free chatbot, definitely eye-catching and clear. But they still give off that AI-glow. I could see images like this being used for a brochure or travel company, but in terms of fooling someone that it's a photograph of your house, nah. The driveway alone gives it away. Prompt: Create a logo for an ice cream shop on the boardwalk. Knowing that it is this easy to create a logo makes me wish I owned a business. With a simple prompt, I had several logos to choose from in seconds. After the first few logos that had the same aesthetic, I asked the chatbot to switch the style. It went from whimsical to bland really fast, but with a little prompt tweaking, I'm sure the chatbot would come up with a usable one. I did notice a few misspellings and blurred letters, so be sure to do a spell check after you generate your logo. Prompt: Create a cartoon with two cats chatting about the weather. I can see myself playing around with these prompts a lot. Maybe Tom's Guide needs a weekly AI comic strip? 
When I first prompted the chatbot, it generated just one comic. From there, I asked for a comic strip. I wasn't incredibly impressed with the layout. I was expecting something more like what you find in the Sunday paper. I then prompted the chatbot to do a third comic strip in the style of Disney. Definitely not "The Aristocats," but I thought it was pretty close. Prompt: Create the package renderings for a chocolate bar called Amanda. This was pretty cool and the most realistic prompt yet. I thought it was interesting that the chatbot went a "fancy" direction for the chocolate bar, but that's probably because I used my name. Had I used "Crazy 4 chocolate," it probably would have done something simpler. But what's interesting is when I asked the chatbot to generate two more images, it gave me an error message: "I wasn't able to generate the two additional chocolate bar package designs due to a content policy restriction. It might've flagged something in the description, even though the request seems fine on the surface. "If you'd like, I can rephrase the prompt and try again with slightly different wording -- just let me know if you want to stick with a luxury theme or explore something more fun, playful, or eco-friendly!" After that, I tried again, but the chatbot was done with that prompt. Prompt: Create a photograph of New York City in the summer. The first image the model generated looked more like a painting than a photograph. I keep hoping ChatGPT will have the ability to create photorealistic images, but I'm finding it fails most of the time. I asked the chatbot to be "more realistic," but even the second image didn't cut it for me. I also find it interesting that the chatbot focused on the greener aspects of NYC in the summer rather than generating an image of the busy sidewalks. Because I have AI Madness on the brain and OpenAI has said that the ChatGPT-4o image generator can create images with little prompting, I decided to try it here. 
The chatbot did a nice job coming up with some realistic-looking awards. The first one is a little busy, but once I prompted the chatbot to go for a simpler style, the image was much cleaner. This would be great for parents who want to give their kids awards for chores, office parties, and similar events for recognizing a job well done. Prompt: Create a realistic image of a dinosaur, a potato, and a cat. The potato was the most impressive of the three images. The dinosaur looked like a generic image of a prehistoric creature and the cat was downright comical. I purposely did a mixed bag here to test the AI's ability to understand the word "realistic." The AI seems to be somewhat unpredictable with its style, so my biggest recommendation is to be as specific as possible when making your prompts. OpenAI says you don't need to, but I highly recommend it. Overall, the ChatGPT-4o image generator feels like an upgrade for several reasons. It is easy to use directly within the chat window, the chatbot pivots when prompted or given more information, and anyone can use it, even without an account and for free. That last point is a huge plus. Although the chatbot tends to have trouble with on-image text and occasionally needs extra prompting, the enhanced image generating tool is something I plan to keep in my AI toolbox.

[21]

ChatGPT's new image generator is now 'delayed for a while' for free users

Images created by ChatGPT's latest AI image generator have been doing the rounds online. But while they were originally meant to be available in all versions of ChatGPT, both free and paid, that plan has now been delayed. Sam Altman, CEO of OpenAI, announced on Wednesday via a post on X that the feature had been much more popular than first expected. This unexpected popularity has resulted in a delay in the rollout of the plan. No date was announced for when this feature will come to the free version of ChatGPT, with Altman simply stating it will be "delayed for a while." However, anyone can use it on ChatGPT Plus (for $20 a month) or on the more expensive Pro version. This new image tool utilizes OpenAI's GPT-4o - an upgraded model that was announced for ChatGPT last year. That was originally only available for text-based requests until now, with the introduction of the 4o image generation. This brought about a variety of new and improved features, but the focus here is an increase in realism and detail in your images. While it takes slightly longer to generate the images, OpenAI has claimed to improve background details and let the model spend more time on finer details. Equally, this latest model has an increased focus on communication. It can hold the information you have discussed with the model, allowing for fluid conversations to result in a more detailed image at the end. Equally, improvements have been made to the generation of texts and words in images -- a task that has proved challenging for AI models in the past. In our initial tests with the model, we were impressed by its ease of use and ability to pivot on style and input based on your conversations with it. However, even with the improvements in text detection, it does still have a habit of messing up words in images.

[22]

VentureBeat

'Insane': OpenAI introduces GPT-4o native image generation and it's already wowing users

We're coming up on the one year anniversary since OpenAI released its first "omni" or multimodal model, GPT-4o back in May 2024, but that old standby still has some tricks up its sleeve. Case-in-point, today OpenAI finally turned on the native multimodal image generation capabilities of GPT-4o for users of its hit chatbot ChatGPT on the Plus, Pro, Team, and Free usage tiers, though the company said it would also soon be made available for Enterprise, Edu, and through its application programming interface (API). Unlike the previous generative AI image model available in ChatGPT -- OpenAI's DALL-E 3, a classic diffusion transformer model that was trained to reconstruct images from text prompts by removing noise from pixels -- this new image generator is part of the same model that spits out text and code, as OpenAI trained the entire model to understand all these forms of media at once. OpenAI president Greg Brockman had long ago previewed this native capability of GPT-4o back in May 2024, but for reasons that still remain unknown publicly, the company held onto it until now -- following the public release of what many AI power users saw as a similar feature from Google AI Studio with its Gemini 2 Flash Experimental model. This has resulted in a much higher quality image generator that produces far more lifelike images and accurate text baked in, and it's already impressing users -- one of whom calls the quality "insane." By the same token (pun intended), OpenAI still hasn't said precisely what data GPT-4o's image generation capabilities were trained on -- and given the history of the company and other model providers, it likely includes many artworks scraped from the web, some of which are presumably copyrighted, which is likely to anger the artists behind them. 
Bringing Image Generation to ChatGPT and Sora

OpenAI has long aimed to make image generation a core capability of its AI models. With GPT-4o, users can now generate images directly in ChatGPT, refining them through conversation and adjusting details on the fly. The model also integrates into Sora, OpenAI's video-generation platform, further expanding multimodal capabilities. In an announcement on X, OpenAI confirmed that GPT-4o's image generation is designed to:

Users can describe an image in ChatGPT, specifying details such as aspect ratio, color schemes (hex codes), or transparency, and GPT-4o will generate it within a minute. As independent AI consultant Allie K. Miller wrote on X, it's a "Huge leap in text generation," and is "the best" AI image generation model she's seen.

Key capabilities and use cases

GPT-4o is designed to make image generation not just visually stunning but also practical. Some of the key applications include:

How GPT-4o improves generative images over DALL-E

According to OpenAI's official thread on X, GPT-4o introduces several improvements over previous models:

Limitations

Despite its advancements, GPT-4o still has some known challenges:

OpenAI is actively addressing these issues through ongoing model refinements.

Safety and labeling measures

As part of OpenAI's commitment to responsible AI development, all GPT-4o-generated images include C2PA metadata, allowing users to verify their AI origin. Moreover, OpenAI has built an internal search tool to help detect AI-generated images. Strict safeguards are in place to block harmful content and prevent misuse, such as prohibiting explicit, deceptive, or harmful imagery. OpenAI also ensures that images featuring real people are subject to heightened restrictions. 
OpenAI CEO Sam Altman described the release as a "new high-water mark for creative freedom", emphasizing that users will be able to create a wide range of visuals, with OpenAI observing and refining its approach based on real-world usage. As AI-generated images become more precise and accessible, GPT-4o represents a significant step forward in making text-to-image generation a mainstream tool for communication, creativity, and productivity.

[23]

PetaPixel

ChatGPT's New AI Image Generator Looks Scarily Good

OpenAI has launched a new AI image generator that is a technological step forward, and some of the examples the company shared achieve a frightening degree of verisimilitude. Called "Images in ChatGPT", the feature differs from DALL-E -- OpenAI's previous image generator which seems like it's being retired -- because the images come from within ChatGPT-4o. Describing the model as a "step change", research lead Gabriel Goh tells The Verge that GPT-4o is "omnimodal" -- a model that can generate any kind of data like text, image, audio, and video. This new type of model is indicative of a wider change in the AI industry where systems combine all types of data. Yesterday, PetaPixel reported on Google's "Project Astra" which can see the world around it via a smartphone camera and answer questions. In a blog post revealing Images in ChatGPT, OpenAI shared some impressive examples. The pictures of an "OpenAI researcher" working on a whiteboard in a room "overlooking the Bay Bridge" with the photographer's reflection are scarily good. OpenAI also shared other examples which showcase the model's ability to generate photorealistic images. Images in ChatGPT doesn't have a visual watermark the way DALL-E did. However, ChatGPT multimodal product lead Jackie Shannon tells The Verge that "all of our generated images will include standard C2PA metadata to mark the image as having been created by OpenAI." The new version of ChatGPT started rolling out yesterday (Tuesday) and will be available to people using the free and paid versions of the chatbot.

[24]

TechRadar

'Our GPUs are melting' - OpenAI puts limits on image creation and delays rollout to free accounts

Amid the growing controversy over its AI mimicking the artistic style of Studio Ghibli, OpenAI is being forced to limit how many images ChatGPT can produce on the free tier to 3 a day because it's proving too popular. In a recent tweet on X, Sam Altman, CEO of OpenAI, said "It's super fun seeing people love images in ChatGPT, but our GPUs are melting. We are going to temporarily introduce some rate limits while we work on making it more efficient. Hopefully won't be long! ChatGPT free tier will get 3 generations per day soon." ChatGPT's new image generation capabilities are clearly a step up in the development of AI image generation, proving superior in our tests to DALL-E 3, which is the model previously used by ChatGPT, and also what it will still default to once you've run out of generations in the new model. In his X thread, Sam Altman also goes on to say that "(also, we are refusing some generations that should be allowed; we are fixing these as fast we can.)" This could explain the frustrations I've been experiencing getting ChatGPT to produce text in images. It's quite possible that you don't have access to ChatGPT's image creation tools quite yet anyway. While ChatGPT Plus and Pro users all seem to have access, not all free-tier users do. On March 26 Altman tweeted that rollout to the free tier was going to be delayed: "Images in ChatGPT are way more popular than we expected (and we had pretty high expectations). Rollout to our free tier is, unfortunately, going to be delayed for a while." As we've said in our testing, even on the Plus tier, ChatGPT is already very slow when it comes to generating images, and when the rollout to the free tier is complete we would expect it to be even slower. The move to limit the free tier to 3 images on a temporary basis, while understandable, will inevitably lead to people feeling frustrated with the company.

[25]

Futurism

OpenAI's New Image Generator Can Do Near-Perfect Text

Until now, the chatbot used the company's separate DALL-E model to dream up pictures. With this latest update, users will be able to access a new feature dubbed "Images in ChatGPT," leveraging OpenAI's flagship GPT-4o model, which has underpinned the chatbot for nearly a year. The upgrade is also available in Sora, OpenAI's video generation tool. "This model is a step change above previous models," research lead Gabriel Goh told The Verge. The most noticeable change is how the model handles text, something that it and its competitors have long struggled with. Words tended to come out looking like gobbledygook, and the text that was legible looked sloppy, filled with formatting errors and misspellings. Not anymore, according to OpenAI. One example shared by the company shows an employee writing out the pros and cons of the ChatGPT image update on a whiteboard, following to the letter what was specified in the prompt; ditto for a four-panel comic strip about a snail -- all with cleanly rendered text. "This was just like a process of iteration that took many, many months to get right," Goh told The Verge. "It's been just many months of small improvements." The model still struggles with very small lettering, but overall, the text quality is consistently usable, Goh said. Unlike image generators like DALL-E, which use a diffusion model, GPT-4o uses an autoregressive approach that produces images from left to right and top to bottom, per The Verge, similar to how text -- at least in English -- is written. Beyond improved penmanship, OpenAI says the model will now follow instructions better, as a common issue with older iterations was that they'd ignore certain details in lengthier prompts. It's also been fine-tuned to be able to generate more photorealistic images. There are caveats. For one, it'll take longer to generate the outputs. And like all generative models, it's still prone to making up information, or hallucinating. 
It also struggles with generating non-Latin scripts, hallucinating characters when trying to write out languages like Korean. With greater capabilities come greater safety and misinformation concerns. To this end, OpenAI stressed that it has particularly "robust safeguards" in place around nudity, violence, and depictions of real people. Moreover, all images that the AI model generates will be embedded with C2PA metadata identifying that it was made with GPT-4o. But this hidden watermark of sorts can easily be stripped -- in fact, many social media platforms automatically remove an image's metadata once it's uploaded. "Ultimately, no system is perfect for this type of thing, but we're continuously improving our safeguards and we think of this as a starting point," ChatGPT multimodal product lead Jackie Shannon told The Verge. For now, GPT-4o image generation is only available to subscribers of OpenAI's ludicrous $200 per month Pro subscription tier, with plans to roll out the feature to Plus and free users in the near future.
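The left-to-right, top-to-bottom ordering described above can be illustrated with a toy sketch. This is not OpenAI's implementation: the `toy_next_token` function is a hypothetical stand-in for a real model, and the small integer "tokens" stand in for whatever image representation GPT-4o actually uses, which the article does not specify. The sketch only shows the control flow of autoregressive generation, where each new piece of the image is chosen after, and conditioned on, everything generated before it.

```python
import random


def raster_order(height, width):
    """Yield (row, col) positions top to bottom, left to right within each row."""
    for row in range(height):
        for col in range(width):
            yield row, col


def toy_next_token(previous_tokens, vocab_size, rng):
    """Hypothetical stand-in for a model: pick a token conditioned on the prefix."""
    # A real model would score every candidate token given the full prefix.
    # Here we simply tend to repeat the last token, to show the dependency.
    if previous_tokens and rng.random() < 0.7:
        return previous_tokens[-1]
    return rng.randrange(vocab_size)


def generate_grid(height, width, vocab_size=4, seed=0):
    """Fill a height x width grid one position at a time in raster order."""
    rng = random.Random(seed)
    tokens = []  # everything generated so far, in generation order
    grid = [[None] * width for _ in range(height)]
    for row, col in raster_order(height, width):
        token = toy_next_token(tokens, vocab_size, rng)
        tokens.append(token)
        grid[row][col] = token
    return grid


grid = generate_grid(3, 4)
for line in grid:
    print(" ".join(str(token) for token in line))
```

By contrast, a diffusion model such as DALL-E refines the whole image at once over many denoising steps, which is why, per the reporting above, its results appear fully formed rather than row by row.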

[0]

ChatGPT's updated image generator is now free for all -- but there's a catch

Free users can generate only a limited number of images per day

OpenAI's latest GPT‑4o image generation tool can now be accessed by all ChatGPT users, regardless of whether they have a subscription or not. However, there is a limit to how many images you can generate in a day. The new image generator was initially only available to ChatGPT Plus subscribers. However, in a recent X post, OpenAI CEO Sam Altman announced that the tool is now available to all users. Altman has previously stated on X that users of ChatGPT's free tier could generate up to three images a day before they have to stop, or upgrade to one of the paid subscriptions. However, when I tested it, I could only create one before running into complications. When I tried to create a second image, I was told that I needed to wait for the limit to reset before I could create another. When it comes to image generation, there's little doubt that ChatGPT's GPT-4o image generation tool is one of the best. In our tests, we found seven great examples of just how detailed it could be when generating images. While access to the feature is great, the issue with limits indicates that OpenAI is still looking for a solution to its recent issues regarding the app. While it wasn't a universal issue, many stated that the fault was due to how many people were trying to generate images at once. As such, our issues with the limit numbers might be a holdover from that. While access to more powerful image generation might be great for some, others might despair at the possibility of even more generated art that arguably steals from other creators. Recently, we've seen a wealth of AI-generated art that apes the style of Hayao Miyazaki, the animator and filmmaker behind Studio Ghibli. This has left many fans upset, and has fed into the continued debate around AI generation, creator rights, and whether it is good or bad for artists. 
We've seen similar arguments in the past regarding OpenAI, especially regarding how and what it uses to train its AI. For instance, last year, we saw OpenAI drop its Sky voice after a legal confrontation with actor Scarlett Johansson, who claimed the voice was copied from hers without her permission. As it stands, ChatGPT's GPT-4o tool remains one of the best means for generating images with ease. However, its existence only adds fuel to the debate around AI-generated images while AI companies, creatives and users try to negotiate a consensus.

[26]

OpenAI halts free GPT-4o image generation after Studio Ghibli viral trend

After only one day, OpenAI has put a halt on the free version of its in-app image generator, powered by the GPT-4o model. The update is intended to improve realism in images and text in AI-generated content; however, users have already created a runaway trend that has caused the AI company to rethink its rollout strategy. Not long after the update became available on ChatGPT, users began sharing images they had fashioned in the style of Studio Ghibli, the popular Japanese animation studio, on social media platforms. Creations ranged from Studio Ghibli-based personal family photos to iconic scenes from the 2024 Paris Olympics, scenes from movies including "The Godfather" and "Star Wars", and internet memes including distracted boyfriend and disaster girl. While OpenAI CEO Sam Altman joined in on the fun on Wednesday by changing his X profile picture to a Studio Ghibli-style image of himself, which pundits assume was fashioned using GPT-4o, the executive also followed up with a prompt response, stating on the social media platform that the image generator update would be delayed for the ChatGPT free tier for the time being. He didn't give any indication of whether or when the feature would return. Currently, paid tiers of ChatGPT, including Plus, Pro, and Team, still have access to the update. The Verge noted that the GPT-4o model uses an "autoregressive approach" to generate images, which develops the image from left to right and top to bottom, instead of all at once. This aids accuracy and realism when generating images. AI text on older models is known for being either blurry or gibberish, while GPT-4o can generate sharp and coherent text. OpenAI also told the Wall Street Journal that it put the model through human-directed post-launch training to find common errors in text and image designs. Ethics and legality are long-standing issues when it comes to AI. 
While the brand attempted to cover its bases, divulging to the WSJ the sources of its data for training, it was impossible to know how the public would respond to the product once it was available. Neal & McDevitt intellectual property lawyer Evan Brown told TechCrunch that ChatGPT users utilizing the GPT-4o-powered product to generate Studio Ghibli-style photos are technically not breaking the law because "style is not explicitly protected by copyright." However, the practice does lie in a legally gray area. An OpenAI spokesperson also told the publication in a statement that ChatGPT can emulate a studio's style, but not the style of an individual artist. The AI company does seem to be aiming to cover its bases by removing the GPT-4o update from its ChatGPT free tier until it figures out its next move. There have been many AI trends based on popular designs. One older trend from late 2023 allowed users to generate Funko Pop! figures in Microsoft Designer with a few prompts. While the main premise of this trend was to make figures of yourself, many people made Funko Pop! figures of notable people, such as Elon Musk and Beyonce. Similarly, this trend brought to light the legal restraints that can arise while trying to have fun with AI online. You had to input certain nickname workarounds, such as "Elon X" and "Queen Bey," with proper descriptions of the characters to generate your desired results.

[27]

OpenAI's latest model creates lifelike images and readable text, try it free

OpenAI has introduced its 4o model into ChatGPT to enable native image generation within the chatbot. This upgrade means you don't have to use OpenAI's Dall-E image generation model as a separate entity, though Dall-E remains available for those who prefer it. The AI brand has also enabled its Sora AI video generator within ChatGPT. The new features are currently available for ChatGPT free users, as well as for ChatGPT Plus, Team, and Pro users. Availability will be coming to enterprise and education users next week. Previously, Dall-E 3 was the image generation plug-in for paid ChatGPT subscribers. Meanwhile, those who wanted to try the generator for free could do so through the basic tier of Microsoft Copilot. The model has been lauded as one of the top image generators available, particularly in its paid version. Despite the benefit of all ChatGPT users being able to use image generation natively with the 4o model, those using the free tier of ChatGPT should be prepared to run into some limitations, such as maximums for file uploads and data analysis, CNET noted. Even so, ChatGPT will benefit from having more realistic images with more legible text after OpenAI spent a year having GPT-4o go through a post-launch training effort called "reinforcement learning from human feedback" (RLHF), according to the Wall Street Journal. After announcing GPT-4o in May 2024, OpenAI had a team of over 100 "human trainers" scouring the model for typos, as well as common errors in hands and faces, the project's lead researcher, Gabriel Goh, told the publication. The GPT-4o model will also bring to ChatGPT the ability to create transparent backgrounds. This should be a major benefit for business users and creatives, as it will allow them to create logos or other iconography, ChatGPT multimodal product lead Jackie Shannon also noted to WSJ. 
Despite the improvements that OpenAI has made, the updated GPT-4o model as a whole still has its shortcomings. It still has a propensity toward hallucinations, a common AI problem that has yet to be resolved. Maintaining editing consistency remains a challenge within ChatGPT; however, OpenAI has promised rapid updates, as early as next week. Another ongoing issue for OpenAI is the matter of ethics and legality. The brand insists its model was trained on "publicly available data," and through proprietary data it owns via partnerships with brands including Shutterstock, WSJ noted. Images generated through ChatGPT based on the 4o model won't have AI watermarks. However, the brand has indicated images will include C2PA metadata denoting them as AI-generated. This remains the industry standard.

[28]

Quartz

ChatGPT delays roll out of image generator for free users because it's too 'popular'

OpenAI is delaying the rollout of ChatGPT's latest image generation features, Images in ChatGPT, for users on its free subscription tier, CEO Sam Altman said in an X post on Wednesday. The reason why? It's too "popular." "images in chatgpt are wayyyy more popular than we expected (and we had pretty high expectations). rollout to our free tier is unfortunately going to be delayed for awhile," Altman wrote in an X post on Wednesday. OpenAI announced on Tuesday that it will be integrating its latest image-generation technology into its GPT-4o model chatbots, making them a one-stop shop for artificially generated content. At the time of the release, OpenAI said one of the biggest advancements in the chatbot's image ability is in the quality of its text rendering abilities. The company claims its systems can now generate images with text that is meaningful and readable -- not warped and typo-ridden. OpenAI also boasted in the press release that the image generator is now better at following instructions, especially when users upload images as visual inspirations. This latter feature ended up being the real star of the show for users. ChatGPT had a viral moment on Wednesday as people started sharing pictures of themselves in the distinct style of Japanese animation company Studio Ghibli. Co-created by famous Japanese filmmaker Hayao Miyazaki, Studio Ghibli is behind famous animation films like Spirited Away and My Neighbor Totoro. While he hasn't commented on the recent trend, Miyazaki has in the past spoken out against AI-generated animation. In a documentary feature published in 2016, Miyazaki said he was "utterly disgusted" and called the technology "an insult to life itself." And when told that the goal of AI animation is to "draw pictures like humans do," Miyazaki said that he felt "like we are nearing to the end of the times. We humans are losing faith in ourselves." 
Even Sam Altman himself jumped on the viral trend, changing his X profile picture to a ChatGPT-generated photo of himself in the style of Ghibli, as social media users flocked to ChatGPT to create their own versions. While the feature is no longer available for free users, at least for now, it is still accessible for ChatGPT's Plus, Pro, and Team subscription tiers. The company said the feature will also roll out soon for enterprise and educational users, as well as for developers using the API.

[0]

TechRadar

ChatGPT 4o's image generation is now free for everyone - 3 ways to use the new AI tool without following the Studio Ghibli herd

There's a daily limit for free users which can be quite infuriating

ChatGPT's viral 4o image generation features are now available to everyone for free, albeit with some limits that can make the new AI tool a little infuriating. Announced by OpenAI CEO, Sam Altman, on X, the new native image generation has taken the world by storm thanks to its impressive ability to create images far better than Dall-E (the previous model used by ChatGPT) ever could. You might've seen the controversy surrounding the image generation tool's blatant copy of Studio Ghibli-style anime over the last week, but if not we've covered the drama at length here. In this article, I'm going to show you three ways to use ChatGPT 4o's image generation that aren't following the anime recreation trend. Why only three you might ask? Well, unfortunately, the daily limit for free users is only three image generations, so it only felt fitting to limit the ideas just like OpenAI has limited the use of the tool. ChatGPT's image generation is very impressive when it comes to replacing the background of your images. In the first example above, I uploaded an image of my French Bulldog, Kermit, and asked 4o to "change the background to the beach." Like magic, Kermit was now on the beach, although you can tell there have been slight modifications to his likeness. For example, in the original photo, he had a pimple under his mouth, which is now completely removed. ChatGPT 4o's image generation is very good at determining the outline of a subject and creating a realistic background to match. While I've not yet tried Galaxy AI's version of background editing, from my testing with ChatGPT I can't imagine a better AI background editor is available for free on the market. After taking Kermit to the beach, I thought I'd try recreating his namesake's iconic opening scene of the 1979 Muppet Movie. In the movie, Kermit the Frog sits on a log in a swamp playing the banjo. 
While Kermit the Dog can't play the banjo, ChatGPT managed to take the source material and edit the background so Kermit was indeed on a log in a swamp. Pretty cool! ChatGPT 4o's image generation is also very good at using a source image to create anything you can imagine. After feeding the AI tool multiple images of my dog, I asked "Can you imagine a french bulldog riding a skateboard while dressed as a frog?" After waiting about 20 seconds (yes, the image generation can be pretty slow), I had an excellent image of a French Bulldog that looks like my boy Kermit, in a frog costume riding a skateboard. Eagle-eyed readers will notice his back paws only have three digits, but I guess ChatGPT thought the dog costume should mimic an amphibian rather than have enough space to be comfortable for the mammal inside. Unfortunately for ChatGPT, a frog has 18 toes altogether... 10 at the back and eight at the front. Last but not least, I decided to upload an image of Kermit in a cow costume. Why is he wearing a cow costume, you might ask? Well, for the purpose of this article I needed an image where he's clearly a bit miffed. You see I live in Scotland so Kermit needs a fleece to keep himself warm in the winter. Unfortunately for him, he hates wearing a fleece. That said, ChatGPT's new image generation tools can help with that, so I asked AI to "Make this dog happy." The result here is bizarre because it takes everything from the previous image and recreates it perfectly, just with a smiling dog with its tongue out. I would argue a French Bulldog always looks miserable, and a tongue sticking out like this would actually imply it's thirsty. But AI deems this a happy dog, and to be fair, it has done a very good job of recreating the image with the mood tweaks I asked for. So there you have it, ChatGPT's image generation is now free for everyone to try and it's capable of some fantastic photo editing. 
The image limitations are pretty annoying, however, as I was only able to generate three images without loading up my paid ChatGPT Plus work account. I also experienced upload limitations, where ChatGPT stopped allowing me to add source material to edit after providing two images. So it's worth noting ChatGPT image generation is very powerful, but on the free tier you'll need to be patient as you'll quickly reach a daily limit.

[29]

Viral trend drives ChatGPT to 500 million users

OpenAI's flagship service ChatGPT remains as popular as ever, with the brand having hit a 500 million active user milestone in recent days amid the Studio Ghibli viral trend that came with the brand introducing its GPT-4o-powered image generation. The company's CEO, Sam Altman, shared on X on Monday that ChatGPT gained "one million users in the last hour." He compared the user spike to the burgeoning interest in OpenAI during its early days in 2022, when the chatbot gained one million users in five days, VentureBeat noted. The twenty-six-month-old brand has evolved quickly since its introduction to the industry. Its most recent developments have left users excited to use the chatbot to create images based on the popular Japanese animation style of Studio Ghibli and share them on social media. The brand recently enabled access to image generation for its free users once more after halting it last week due to the massive demand overworking its servers. Altman sarcastically mentioned that the GPUs at OpenAI's data centers were "melting" due to the fad. This is just one instance of the brand continuing to stand out in the industry. OpenAI notably hit a 400 million weekly active user milestone in February, while comparisons to the Chinese AI company DeepSeek were still fresh. Reports at the time indicated that the brand saw a 33% increase in less than three months from 300 million users in December 2024, coming from primarily organic growth. Before that, a December report from the Financial Times indicated that OpenAI had a goal of hitting 1 billion users in 2025 as it aimed to secure financial backing and shift from a private organization to a for-profit company. Now, OpenAI is on track to meet several of those goals. It is not only midway to 1 billion users, but it has also secured a funding round led by SoftBank, with a $300 billion valuation. This is the largest private equity investment to date, according to VentureBeat. 
Despite this achievement, the company still has a lot of work to do to meet its goals. If it is to become a for-profit company by the end of this year, OpenAI will likely be running at a substantial loss for many years before it starts to see tangible profits. It is likely the company won't break even until 2029, the publication noted. OpenAI is known for its consumer-facing products, such as ChatGPT and its associated AI models; however, the brand may have to focus more on enterprise projects to earn money moving forward.
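The user figures quoted above can be checked with a few lines of arithmetic. This is only a sanity check on the numbers as reported in the article (300 million, 400 million, 500 million, and the 1 billion goal); the variable names are illustrative, and the article mixes "weekly active users" and "active users", which this sketch treats as comparable.

```python
def percent_increase(old, new):
    """Return the percentage increase from old to new."""
    return (new - old) / old * 100


dec_2024_users = 300_000_000   # reported for December 2024
feb_2025_users = 400_000_000   # reported weekly active users in February
current_users = 500_000_000    # reported milestone in this article
goal_users = 1_000_000_000     # reported 2025 goal

growth = percent_increase(dec_2024_users, feb_2025_users)
progress = current_users / goal_users

print(f"{growth:.1f}% growth, {progress:.0%} of goal")
# prints "33.3% growth, 50% of goal"
```

The result matches the article's "33% increase" and its description of ChatGPT as "midway to 1 billion users."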

[30]

Decrypt

OpenAI Launches GPT-4o's New Image Generation Into ChatGPT, Showing 'Unbelievably Better' Results - Decrypt

OpenAI is integrating image generation capabilities directly into ChatGPT, allowing users to create images without leaving the chat interface. The company announced the feature Tuesday as part of its broader push to make AI tools more useful and accessible across different media, staying relevant in the AI art scene. The feature is an evolution of DALL·E 3, OpenAI's image generator, which launched in September 2023 but fell out of favor among AI enthusiasts who preferred the next generation of models, including Flux, MidJourney v6, SD 3.5, Recraft, and Reve. Before this release, OpenAI offered two different models on the same platform, with GPT generating text and DALL·E 3 handling image generation. Now, GPT-4o will do everything on its own, and DALL·E 3 will disappear. "GPT‑4o image generation excels at accurately rendering text, precisely following prompts, and leveraging 4o's inherent knowledge base and chat context -- including transforming uploaded images or using them as visual inspiration," OpenAI claimed in an official blog post. The integration of DALL·E 3 continues to make good on the company's plan to make GPT-4o an "omni" model, trained with multimodal data and capable of handling all tasks. The result is a model that is much more capable, accurate and intelligent than its predecessors. "We know we've made you wait, but we think it's really worth it, and we think you're going to love it," Sam Altman, OpenAI's CEO, said in a video showing GPT-4o's new capabilities. "It's such a huge step forward that the best way to explain it to you is just to show it." In the video, the company showed off the system's capabilities with several examples, including manga pages explaining the theory of relativity -- with inputs in English and Mandarin -- custom trading cards based on personal and real photos, commemorative coins combining multiple images with transparent backgrounds, and a very accurate image based on an extraordinarily long and detailed prompt. 
The model is slow at generating images, but it seems to be highly accurate. Altman pointed to the significant quality upgrade as worth the longer waiting time. "Images are much slower than our previous image generation (model), but unbelievably better. We think it's super worth the wait," Altman said during the demonstration. "We also will be able to make it faster over time." The rollout appears to be happening gradually, and we weren't able to get our hands on the new model as of press time. Users can tell which system they're using based on how images appear: Besides the apparent quality gap, DALL·E 3 images pop up fully formed after a loading screen, while the new GPT-4o renders images progressively from top to bottom in real time. The company emphasized that the technology extends beyond creating fancy images. "What's really exciting about this release is that now these models can actually visualize what they know and externalize it in a visual way," explained a research scientist at OpenAI, invited by Sam Altman to talk about this new feature. This capability allows for educational applications like detailed scientific diagrams or informational posters with accurately rendered text and even image editing with subject consistency. OpenAI has also implemented guardrails to prevent the generation of deepfakes, illegal content, and the removal of watermarks. While the generated images won't have visible watermarks, they will include C2PA metadata to identify them as AI-created. The company is also developing tools to track image provenance. The company plans to bring the feature to its API, allowing developers to integrate the technology into their own applications. OpenAI's Terms of Use also say that users will retain ownership of images they generate, subject to the company's usage policies.

[0]

Mashable

OpenAI's new image generator is now free for everyone

OpenAI's new image generator is available for all users in ChatGPT and Sora, whether you're a subscriber or using the free version. However, the move comes just days after widespread backlash to the online wave of Studio Ghibli-style AI art generated by the tool. CEO Sam Altman announced the rollout in a post on X on Tuesday, April 1, a date that instantly makes journalists think everything is a prank. However, the news is indeed true, with the GPT-4o model-driven image generation feature now open for anyone to use. I was able to access the generator without a ChatGPT Plus subscription at the time of writing. Altman also mentioned in an X post that users without ChatGPT Plus will soon get rate limits of three image generations per day. OpenAI's native image generation tool was announced on March 25, with Altman leading a livestream demonstration of its capabilities. However, after Studio Ghibli-inspired AI art generated by OpenAI's tool spread across the internet after launch, large-scale backlash erupted online, with social media users calling out both the potential copyright issues and Studio Ghibli co-founder Hayao Miyazaki's longstanding public hatred of AI. OpenAI responded to the criticism, leaning on the company's insistence on "creative freedom." The definition of such freedom in AI image generation could cause issues down the track, particularly where OpenAI draws the line on "offensive" images. As Mashable tech reporter Cecily Mauran pointed out, Altman explained both during the demo of the AI model and later in an X post, that the company wants to "really let people create what they want." "What we'd like to aim for is that the tool doesn't create offensive stuff unless you want it to, in which case within reason it does," Altman's post read. 
"As we talk about in our model spec, we think putting this intellectual freedom and control in the hands of users is the right thing to do, but we will observe how it goes and listen to society." Now the tool is available for everyone to use for free, the problems OpenAI is already facing could increase in scale. I'd rather not see how this turns out, but here we are.

[31]

Decrypt

The 'Ghibligeddon': People Are 'Melting' OpenAI's GPUs With Studio Ghibli Memes - Decrypt

OpenAI's GPUs are buckling under unprecedented demand for its new image generation feature, forcing the company to implement emergency limits as users create millions of anime-inspired illustrations. The company's computing infrastructure has been pushed to the breaking point just days after rolling out the capability in its latest GPT-4o model. The image generator, which transforms text prompts or existing photos into detailed artwork, has sparked particular enthusiasm for Studio Ghibli-style transformations. Over the last few days, users have flooded social media with pastel-colored landscapes and whimsical character portraits reminiscent of films like "Spirited Away" and "My Neighbor Totoro." "Can y'all please chill on generating images. This is insane, our team needs sleep," OpenAI CEO Sam Altman whine-tweeted early today. That came on the heels of his initial complaint Thursday -- "our GPUs are melting" -- after the new image-generating capabilities had rolled out and some genius discovered that you could take any photo and turn it into a cutesy Ghibli-style image. During the ensuing weekend, more and more people tried to Ghiblify everything from 9/11 photos to Adolf Hitler. The viral phenomenon spread beyond tech enthusiasts to the White House, which published a Ghibli-style meme making fun of a Latin American woman arrested for drug trafficking. Finally, OpenAI shut off access to its image generator in some areas, while imposing rate limits of three generations per hour for others. That didn't go over well with users, who complained about arbitrary limits and even hallucinated restrictions. The situation seemed to get a bit more under control Monday, with OpenAI announcing that free users -- who had more generous limits with the previous DALL·E 3 image generator -- would be limited to just three images per day, if and when the feature becomes publicly available. Altman may be crying all the way to the bank. 
Ghibligeddon apparently drove more than 1 million people to sign up for ChatGPT in one hour Monday, though how many of those subscribed to the premium service was not disclosed. Search data shows that as soon as the new model was released last week, the term "ChatGPT" briefly overtook "Google" in Google Trends. So why is Ghibligeddon "melting" OpenAI's servers? GPT-4o's image generation differs fundamentally from previous AI art tools, using more computing power. That's because most image generators use "diffusion models" that start with a bunch of random pixels, and change them until they form a real representation of an image. GPT-4o employs an autoregressive approach that generates images pixel by pixel in sequence -- predicting what pixel comes next -- in the same way it does text-based reasoning. For images, this technique produces more accurate details and better text rendering. But it's a huge lift for OpenAI's GPUs. It's kind of like the difference between creating an oil painting and creating a tile mosaic. Each image request triggers complex mathematical operations across many GPU cores working in parallel, consuming power and generating heat that threatens hardware if run continuously at maximum capacity. The computing crunch comes at a challenging time for OpenAI financially. Despite Altman's claims of the company "doing really great," the AI unicorn reportedly loses billions annually, with projections suggesting these losses could reach $14 billion by 2026. The infrastructure demands of maintaining image generation capacity while continuing to develop new AI capabilities put additional pressure on the company's resources. And it's unlikely that dealing with memes is the revolutionary use case that could make a multibillion dollar business profitable, but who are we to judge? For now, users eager to transform their photos into Ghibli-style artwork may need to wait their turn as OpenAI works to strengthen its infrastructure. 
The company hasn't specified when full capacity might be restored, saying only that the optimization work "hopefully won't be long."
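
The contrast the article draws between the two approaches can be sketched in a few lines of Python. This is a toy illustration only, not OpenAI's architecture: the function names, the random "model," and the step counts are all assumptions made up for the example. It shows just one thing, which is that a sequential, one-value-at-a-time sampler needs as many dependent steps as there are values to produce, while a diffusion-style sampler makes a fixed number of passes over the whole image.

```python
import random


def autoregressive_generate(width, height, seed=0):
    """Toy autoregressive sampler: one sequential step per pixel value."""
    rng = random.Random(seed)
    pixels = []
    steps = 0
    for _ in range(width * height):
        # Hypothetical stand-in for a model: the next value depends on the
        # previously generated one, so steps cannot run in parallel.
        prev = pixels[-1] if pixels else 128
        nxt = max(0, min(255, prev + rng.randint(-16, 16)))
        pixels.append(nxt)
        steps += 1
    return pixels, steps


def diffusion_generate(width, height, num_steps=20, seed=0):
    """Toy diffusion-style sampler: a fixed number of passes over all pixels."""
    rng = random.Random(seed)
    # Start from random noise.
    pixels = [rng.randint(0, 255) for _ in range(width * height)]
    target = 128  # stand-in for whatever a real model would steer toward
    steps = 0
    for _ in range(num_steps):
        # Each pass nudges every pixel at once toward the target.
        pixels = [p + (target - p) // 2 for p in pixels]
        steps += 1
    return pixels, steps


if __name__ == "__main__":
    _, ar_steps = autoregressive_generate(64, 64)
    _, diff_steps = diffusion_generate(64, 64)
    print("autoregressive sequential steps:", ar_steps)  # 4096
    print("diffusion sequential steps:", diff_steps)     # 20
```

In this toy, the sequential cost of the first sampler grows with image size while the second stays at a fixed number of passes. Real systems differ in many ways (for instance, each diffusion pass is itself expensive), so the step counts here should not be read as a measure of actual GPU load.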

[32]

Lifehacker

ChatGPT Just Got a Huge Image-Generation Upgrade

OpenAI has significantly leveled up the image generating capabilities of ChatGPT, adding the update as part of the GPT-4o model introduced last May. The new and improved AI generator is rolling out now for all ChatGPT users, across paid plans and the free tier (though free users are more restricted in how much they can use it). It's been possible to generate images through the ChatGPT interface for a while now, though behind the scenes the work was farmed out to the DALL-E 3 image model. Now, everything will be handled by GPT-4o, for a more consistent and native experience. There are lots of improvements here, which cover some of the areas that AI image creator tools have typically struggled with: rendering text, keeping characters consistent across pictures, and drawing diagrams. OpenAI says you can now expect more "precise, accurate, [and] photorealistic" results from your prompts. Images made with AI often come with an artificial sheen that tells you they've been dreamt up by algorithms, and that should be less obvious with GPT-4o images. One of the demo pictures shown off by OpenAI has a woman writing on a whiteboard, with a view reflected in it -- all pretty life-like, though note the small caption at the bottom that tells you this was the best of eight attempts ChatGPT had at the prompt. The AI art users create should also stick more closely to the prompts given, OpenAI says. So, if you want specific objects in specific places, or you need people in certain positions, then these instructions will apparently be carried out more faithfully. One of the more impressive example images shows a four-panel comic strip rendered by ChatGPT, without any obvious errors or inconsistencies. I tried to get ChatGPT to turn an Austen novel into a comic strip, and produce a photorealistic image of a stately home with a garden, and the results were impressive -- if not quite perfect. 
They're certainly significantly better than the images ChatGPT was previously producing, although the rendering takes longer to complete (typically minutes rather than seconds). Trying to get AI to render text and diagrams accurately has long been a challenge: The way these tools are built means they're much better at inventing and remixing the images they've been trained on, rather than reproducing an exact copy of the alphabet or a series of rectangles and arrows. The new GPT-4o model can render text and diagrams to a high level of detail and accuracy, so you shouldn't see as many strange mistakes and inconsistencies. OpenAI's showreel included a menu, an invitation, a boarding pass, and a diagram explaining Newton's prism experiment, all generated from a single text prompt. When I asked ChatGPT to produce an infographic explaining DNA in simple terms, and a book cover with a specified title and author, it followed the brief pretty exactly -- the graphic was basic but accurate (as per the prompt), and the book cover looked like something you might see in a store. Just as importantly, there were no weird artifacts or inconsistencies in the images. I've written before about the limitations of ChatGPT image editing, and this is another area that's been upgraded. It's now easier to keep characters and scenes consistent between images, to only tweak parts of a picture and leave the rest untouched, and to build up different layers of an image. You can even create transparent backgrounds, if needed, or specify colors using hex codes. Other improvements come in the way ChatGPT can accept and remix your own images, and incorporate other information (from the web and its training data): So one of the demo OpenAI pictures was built from the prompt "make a visual infographic describing why SF is so foggy" and ChatGPT did just that (well, best of three). 
In my own tests, I found ChatGPT much better at editing images, and pretty competent at remixing pictures in different styles. It still struggles to some extent keeping consistency between images -- especially with complex objects and characters. It's definitely better than it was at this, but there's still a tendency to overdo the edits, making the AI less useful for tweaking images or making a series of several images that need to match. As with any generative AI announcement, issues around copyright, misuse, and energy demands are once again brought to mind. OpenAI is on record as saying it's impossible to build these tools without training on copyrighted images, though it has recently started signing content deals with providers such as Shutterstock. Brad Lightcap, OpenAI's chief operating officer, told the Wall Street Journal that the GPT-4o image generator will reject requests to mimic the work of any living artist. When it comes to safety, OpenAI says generated images all come with C2PA metadata to identify them as AI-generated -- though this metadata can be easily removed with something as simple as a screenshot. The AI generator is also built to rebuff any attempts to create "child sexual abuse materials and sexual deepfakes" OpenAI says, as well as other prompts that violate its content policies. This is clearly a major step forward for AI images: The upgraded technology is genuinely jaw-dropping at times, and a lot of the tell-tale signs of AI and the errors made by the tech are vanishing. It does raise some big questions about the future we're all barreling towards though, one where fakes are so easily made, where creative work is done by robots rather than people -- and where we collectively lose our ability to sketch a picture, craft a sentence, or write a line of code. And then how will generative AI find more training data?

[33]

AIM

Users In Awe of OpenAI's GPT-4o Native Image Generation Feature

'Creating and customising images is as simple as chatting using GPT‑4o'. OpenAI, the company behind the GPT family of AI models, unveiled native image generation capabilities in GPT-4o on Tuesday. This makes it possible for GPT-4o to generate a wide variety of images, including infographics, comic strips, signboards, graphics, menus, memes, street signs, and more. It is also possible to refine and edit generated images with follow-up prompts. OpenAI has introduced native image generation features for users with Plus, Pro, Team, and Free plans. Access for Enterprise and Edu plans will be available shortly. Access to the API will be rolled out in the next few weeks. Native image generation means that GPT-4o can generate images using its inherent knowledge, so it doesn't have to rely on any external diffusion models, such as the company's very own DALL-E. OpenAI also mentioned that users can continue to use DALL-E as usual. "Creating and customising images is as simple as chatting using GPT‑4o - just describe what you need, including any specifics like aspect ratio, exact colors using hex codes, or a transparent background," said the company. In no time, users were blown away by its capabilities. Tobias Lutke, CEO of Shopify, shared in a post on X how the model could describe the anatomy of an unknown animal on his son's t-shirt. After he saw the results, he remarked, "How is this even real?". The model is also capable of generating text without distortions or errors, and of generating user interfaces based on details in a prompt without any reference images. Users have also been experimenting with style transformations on existing photos. Grant Slatton, a founding engineer at Row Zero, showcased an example of how GPT-4o could convert a regular photo into a 'Studio Ghibli'-style anime image. His post quickly gained traction, inspiring many others to share their own AI-generated creations. 
In another instance, users could reproduce advertisement images, including the ad copy. A user on X shared an ad image as a reference and asked GPT-4o to recreate it for their app. He also requested that the app screenshot in the original ad be replaced with a screenshot of their app. "Within minutes, it had almost perfectly replicated it," he said. People are also amazed by the model's ability to generate photorealistic images. OpenAI's announcement comes a few days after Google introduced native image generation in the Gemini 2.0 Flash AI model. Initially introduced to trusted testers in December, this feature is now accessible across all regions supported by Google AI Studio. "Developers can now test this new capability using an experimental version of Gemini 2.0 Flash (gemini-2.0-flash-exp) in Google AI Studio and via the Gemini API," Google said.

[34]

SiliconANGLE

OpenAI upgrades ChatGPT's image generation capabilities - SiliconANGLE

OpenAI today rolled out what it describes as a major upgrade to ChatGPT's built-in image generation tool. Until now, the feature was powered by an algorithm called DALL-E 3 that debuted in 2023. It's the third iteration of a text-to-image model that first debuted two years earlier. The model's original version was a modified edition of GPT-3 adapted to rendering tasks. As part of today's update, OpenAI is switching ChatGPT's image generation tool from DALL-E 3 to GPT-4o. The latter algorithm is a multimodal large language model that launched last May. OpenAI says that the upgrade will significantly enhance ChatGPT's graphic design skills. The chatbot's image generator can now take on more complex tasks than before. In one internal test, OpenAI asked ChatGPT to visualize an early physics experiment carried out by Isaac Newton. In response, the chatbot generated a detailed illustration complete with explanatory text. ChatGPT can customize the images it generates based on user instructions. After creating the illustration of Newton's experiment, OpenAI engineers asked the chatbot to overlay the drawing on a notebook. The chatbot successfully completed the task, which involved both changing the angle of the illustration and adding a complex background. According to OpenAI, competing AI image generators struggle with prompts that ask them to draw more than a handful of objects. The company says that GPT-4o can accurately draw up to 20 different items specified by the user. That includes text, which the model generates more reliably than DALL-E 3. Users can optionally supply ChatGPT with reference images. An interface designer, for example, could upload a dropdown menu template and ask the chatbot to make improvements. Another selling point of ChatGPT's upgraded image generator is that it can create objects with transparent backgrounds. A transparent background makes it easier to combine visual assets with one another. 
That simplifies tasks such as integrating a newly created logo into an existing application interface. According to the Wall Street Journal, OpenAI trained GPT-4o using publicly available data and assets licensed from partners such as Shutterstock Inc. "We trained our models on the joint distribution of online images and text, learning not just how images relate to language, but how they relate to each other," OpenAI staffers wrote in a blog post. After the initial training phase, the company used a method called reinforcement learning from human feedback (RLHF) to further refine ChatGPT's output quality. It's a variation of reinforcement learning, an industry-standard approach to developing AI models. In reinforcement learning projects, an AI model's training process is coordinated by a second neural network. RLHF, the machine learning method OpenAI used to build GPT-4o, enhances that second neural network using feedback from human experts. The improvements the experts make help increase the quality of the AI being trained.

[35]

TweakTown

OpenAI's Studio Ghibli image-generator adds more users than ChatGPT did in 5 days

TL;DR: OpenAI experienced a significant increase in ChatGPT users, adding one million new users in just an hour. This marks the most substantial growth since the platform's launch 26 months ago. The release of OpenAI's latest image-generation tool that's powered by the advanced GPT-4o model has taken the internet by storm, with millions of users flocking to the new AI tool to spawn countless new images inspired by the iconic Studio Ghibli art style. The new image-generation tool is able to take almost any image and recreate it with the art style inspired by Studio Ghibli, created by famed director Hayao Miyazaki, who helmed many highly rated animated movies such as Spirited Away, Howl's Moving Castle, and My Neighbor Totoro. These images have already taken over the internet with tsunamis of them being posted on multiple social media platforms as users are taking real-life historical events, iconic scenes from other movies, and famous people, and recreating them in the Studio Ghibli art style. With this massive increase in popularity comes a cost that OpenAI is paying, and that cost is in GPU processing power, which company CEO Sam Altman said a few days ago was so high that OpenAI's GPUs are "melting" under the demand. In fact, the demand was so high that Altman announced that rate limits had to be implemented, and in a follow-up announcement, he pleaded with users to "chill" on generating images as the OpenAI team "needs sleep." While I'm sure OpenAI's GPUs aren't actually physically melting under the demand, and that teams are getting some sleep but are certainly really busy, Altman gave some hard statistics to visualize this newly found popularity. According to the CEO, when ChatGPT launched 26 months ago, it added 1 million users in just five days, and following the launch of the new image-generation tool, one million users were added in just one hour.

[0]

LaptopMag

ChatGPT image generation now is free for all -- but will it show Elon Musk wielding a chainsaw?

It may take several minutes, but as of Monday night, all free users can generate just about any image they can dream up with ChatGPT. CEO Sam Altman posted the news on X that "chatgpt image gen now rolled out to all free users!" -- which presents a swift turn-around from last Thursday, when Altman announced that OpenAI would introduce limits on how many images one could create with the ChatGPT service because, as he put it, "our GPUs are melting." For a bit of background, any user can go to chatgpt.com, and, once they create an account, can ask the generative AI tool to create an image, so long as the image is not of a specific person -- sort of. If you ask ChatGPT to create an image of Elon Musk holding a chainsaw, you will be denied: "I can't generate an image of Elon Musk specifically, but I can create a generic illustration of a man resembling him holding a chainsaw in a specific style -- such as realistic, cyberpunk, or Studio Ghibli." However, if you ask it to create a Studio Ghibli-style illustration of the current U.S. President, or Donald Trump, it will produce one. And ChatGPT will also comply if you ask it to generate an image of President Trump holding a chainsaw in Studio Ghibli style. So, you can make an image of Donald Trump holding a chainsaw but not Elon Musk. At least when I tried. And maybe with good reason. Musk is a co-founder of OpenAI ("Elon Musk's OpenAI Can Be Muscle for Safe Artificial Intelligence Research" reads one headline from nearly a decade ago) but left the group in 2018. A year ago, the reasons, from OpenAI's point of view, unraveled in public. In the year since, Musk has become a vocal critic of OpenAI, culminating in a $97.4 billion bid by Musk and others to buy OpenAI. Altman dismissed the bid and parried with this post on X: "no thank you but we will buy twitter for $9.74 billion if you want." 
Given that Musk is OpenAI's persona non grata, it's perhaps not surprising that ChatGPT wouldn't generate an image of the Tesla CEO holding a chainsaw. But if you log in with a different account and make the same request, you might get lucky. I did. You'll get a wide-eyed Elon holding a chainsaw, looking straight down the camera. And once you've got Musk with a chainsaw, ChatGPT will also give you a smiling, chainsaw-less Musk. All of which is to say: ChatGPT is very buggy right now. Sometimes, you'll get error messages saying your history won't load. Sometimes you'll get a chainsaw pic, and other times you won't. If Altman's frequent posts on X haven't made it clear enough, the technology can't quite keep up with energy-intensive demand. The white-hot attention on ChatGPT makes it very unsurprising that on Tuesday the firm closed a $40 billion funding round. Japan-based SoftBank led the round, contributing $30 billion, with other investors contributing $10 billion. It's the largest private funding round ever for a tech company, reports CNBC. OpenAI's value is now north of $300 billion, the company says, adding that more than 500 million people use ChatGPT every week. The addition of ChatGPT image generation to the zeitgeist is only the beginning of how images and the concept of art itself are changing. Even Japan's copyright laws don't appear able to stop the Studio Ghibli-styled ChatGPT images. The only thing still holding back the explosion of image generation is technology. After a handful of images are created, you'll be asked to come back in about 24 hours because, even though it's free for all users and presumably the servers are no longer melting, OpenAI's got a world-changing hit on its hands that will no doubt evolve into video generation in the months ahead.

[36]

Cointelegraph

'Our GPUs are melting' -- OpenAI puts limiter in after Ghibli-tsunami

OpenAI CEO Sam Altman says while it's been "super fun" seeing everyone's image creations using ChatGPT, some limits need to be put in place. ChatGPT creator OpenAI has introduced rate limits after a viral social media trend that saw nearly everything "Ghiblified" -- turned into AI art in the style of the famous Japanese animation studio. OpenAI CEO Sam Altman was one of the first to take part in the trend, posting a portrait of himself generated by the model on March 25, but he said in a subsequent post two days later that all the image requests had started to tax the firm's infrastructure. "It's super fun seeing people love images in ChatGPT but our GPUs are melting. We are going to temporarily introduce some rate limits while we work on making it more efficient," he said. "Also, we are refusing some generations that should be allowed; we are fixing these as fast we can," he added. OpenAI launched the upgraded image generation offering in ChatGPT-4o on March 25, resulting in users splashing images across social media in the art style of Studio Ghibli -- known for its anime films Spirited Away and My Neighbor Totoro. Altman didn't give a definitive timeline on how long the rate limits would last but said, "Hopefully, it won't be long! ChatGPT free tier will get three generations per day soon." Rate limits are generally applied to help OpenAI manage the aggregate load on its infrastructure, according to OpenAI. "If requests to the API increase dramatically, it could tax the servers and cause performance issues. By setting rate limits, OpenAI can help maintain a smooth and consistent experience for all users," OpenAI says on its rate limit explanation page. Meanwhile, Bloomberg reported on March 26 that OpenAI expects to more than triple its revenue this year to $12.7 billion, citing a person familiar with the matter. Altman said on Feb. 12 his firm wants to ship GPT-4.5 and GPT-5 in the coming weeks or months.
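
A per-user daily cap like the "three generations per day" Altman describes can be sketched as a simple counter that resets when the calendar day changes. This is a minimal illustration under assumptions, not how OpenAI enforces its limits: the `DailyQuota` class, its method names, and the in-memory dictionary are all hypothetical.

```python
from datetime import date


class DailyQuota:
    """Hypothetical per-user daily cap (illustrative only)."""

    def __init__(self, limit=3):
        self.limit = limit
        self._usage = {}  # user_id -> (day, count used on that day)

    def allow(self, user_id, today=None):
        """Return True and record the request if the user is under the cap."""
        today = today or date.today()
        day, count = self._usage.get(user_id, (today, 0))
        if day != today:
            count = 0  # a new day resets the counter
        if count >= self.limit:
            return False
        self._usage[user_id] = (today, count + 1)
        return True


if __name__ == "__main__":
    quota = DailyQuota(limit=3)
    day_one = date(2025, 4, 1)
    # Three requests pass, the fourth on the same day is refused.
    print([quota.allow("free-user", day_one) for _ in range(4)])
    # The counter resets the next day.
    print(quota.allow("free-user", date(2025, 4, 2)))
```

A production service would more likely keep these counters in shared storage and use a rolling window, which would be consistent with the "come back in about 24 hours" behaviour reported elsewhere in this roundup, but the reset-and-count logic is the same idea.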

[0]