3 Sources
[1]
OpenAI prepares Sora 2 to take on Google's Veo 3
OpenAI has had enough of Google's Veo 3 dominating generative AI videos and is now working on Sora 2, the successor to Sora. As spotted on X, OpenAI's servers now include references to "Sora 2," which makes it obvious that OpenAI has developed a new generative video model. Sora shipped on December 9, 2024, and it's been more than six months since it was last updated. Sora is also no longer the talk of the town, but it was recently in the news when Microsoft added it to Bing Video Creator for free. BleepingComputer understands that Sora 2 isn't yet ready for public rollout, and we might learn more about it from OpenAI in the coming weeks. Meanwhile, Google has announced that Veo 3 is now free for college students with the Gemini AI Pro plan. If you're not a college student, you can use Google Cloud's free credit, then sign up for Vertex AI development mode and use Veo 3.
[2]
ChatGPT with GPT 5 tipped for Sora 2 upgrade -- here's everything we know
ChatGPT's GPT-5 upgrade is lurking right around the corner. OpenAI is hinting at it, and predictions are flooding in, both for the release day and what new features will come with it. And one new prediction could be the most exciting yet. Tibor Blaho, known for his AI predictions and insights on X, hinted at a possible update to OpenAI's Sora. He posted a screenshot of host names on OpenAI.org, which shows files with the names "Sora 2". Sora is OpenAI's video generator. It was originally launched back in December 2024 and, at the time, was one of the biggest jumps forward for AI video we'd seen. However, things have moved on since then, and Sora hasn't changed all that much. In fact, it has steadily been pushed to the back of the line by the likes of Google's Veo 3, as well as lesser-known competitors including Kling 2, Runway, and Moonvalley. Including an update to Sora alongside GPT-5 would seem logical. AI video is moving fast, and OpenAI won't want to get left behind. While it isn't yet confirmed, there are some features that we would expect to see from a new version of Sora. Sora does not understand human movement. Lots of clips that you'll see from the current version depict people melting into each other, arms disappearing and bodies warping through the ground. There are a lot of reasons for this, but the big one is that AI models don't completely understand physics. They've been trained on video clips, but not the underlying explanations of those clips. For example, footage of a man surfing makes sense in theory, but it can be hard to replicate this if you don't understand the way humans interact with water. Equally, training on video is a lot more complicated than images. Movement can be different every time you see it, and this can cause confusion for recreating the complexities of human movement. This doesn't mean it is impossible to fix. More advanced AI video models are getting close to nailing this problem, and Sora 2 could, in theory, do that too. 
One of the biggest factors that gave Veo 3 a lead in the market is its addition of audio with video clips. The earlier AI video generators (like Sora) were designed to just replicate video. However, more recent alternatives have begun to bring AI-generated audio into the fold as well. To make Sora 2 more competitive, OpenAI is likely to bring a feature like this into the mix. They have experimented with creating AI audio in the past but have been somewhat silent on this venture since. Where OpenAI could stand out with Sora 2 is in the length of its generated clips. Most generators are only able to produce videos between four and eight seconds at a time. However, some competitors are slowly starting to nudge this number upwards. OpenAI could be one of the companies to do this first. If they were able to do this, it would almost definitely be locked behind OpenAI's most expensive plan, considering how expensive longer video generation is and the energy costs it requires. A more recent change that is being noted in the world of AI video is an increase in the quality of the footage. Moonvalley, a company offering copyright-free AI video generation, is making waves with the quality of its footage. By training exclusively on high-quality videos, it can offer footage of only a certain resolution and above. While OpenAI is unlikely to go as far as this, improvements in both training and the technology powering the tool could see Sora 2 offering much higher quality in its video. Of course, this is tricky to blend with the longer length video mentioned above and would most likely end up being separate features.
[3]
Sora 2 is coming, but it will have to dazzle viewers to beat Google's Veo 3 model
Veo 3 already offers features that Sora does not, and OpenAI will need to enhance both what Sora can do and how easy it is to use to entice possible customers. OpenAI appears to be finalizing plans to release Sora 2, the next iteration of its text-to-video model, based on references spotted in OpenAI's servers. Nothing has been officially confirmed, but there are signs that Sora 2 will be a major upgrade aimed squarely at Google's Veo 3 AI video model. It's not just a race to generate prettier pixels; it's about sound and the experience of producing what the user is imagining when writing a prompt. OpenAI's Sora impressed many when it debuted with its high-quality images. They were silent films, however. But, when Veo 3 debuted this year, it showcased short clips with speech and environmental audio baked in and synced up. Not only could you watch a man pour coffee in slow motion, but you could also hear the gentle splash of liquid, the clink of ceramic, and even the hum of a diner around the digital character. To make Sora 2 stand out as more than just a lesser option to Veo 3, OpenAI will need to figure out how to stitch believable voices, sound effects, and ambient noise into even better versions of its visuals. Getting audio right, particularly lip-sync, is tricky. Most AI video models can show you a face saying words. The magic trick is making it look like those words actually came from that face. It's not that Veo 3 is perfect at matching sound to picture, but there are examples of videos with surprisingly tight audio-to-mouth coordination, background music that matches the mood, and effects that fit the intent of the video. Granted, a maximum of eight seconds per video limits the scope for success or failure, but fidelity to the scene is necessary before considering duration. And it's hard to deny that it can make videos that both look and sound like real cats jumping off high dives into a pool.
Though if Sora 2 can extend to 30 seconds or more with a steady quality, it's easy to see it attracting users looking for more room for creating AI videos. OpenAI's Sora can stretch up to 20 seconds or more of high-quality video. And as it's embedded into ChatGPT, you can make it part of a larger project. This flexibility is significant for helping Sora stand out, but the audio absence is notable. To compete directly with Veo 3, Sora 2 will have to find its voice. Not only find it, but weave it smoothly into the videos it produces. Sora 2 might have great audio, but if it can't outmatch the seamless way Veo 3's audio connects with its visuals, it might not matter. At the same time, making Sora 2 too good might cause its own issues. With every new generation of AI video model, there's more concern about blurring the line with reality. Sora and Veo 3 both don't allow prompts involving real people, violence, or copyrighted content. But adding audio offers a whole new dimension of scrutiny over the origin and use of realistic voices. The other big question is pricing. Google has Veo 3 behind the Gemini Advanced paywall, and you really need to subscribe to the $250 a month AI Ultra tier if you want to use Veo 3 all the time. OpenAI might bundle access to Sora 2 into the ChatGPT Plus and Pro tiers in a similar manner, but if it can offer more to the cheaper tier, it's likely to quickly expand its userbase. For the average person, the AI video tool they turn to will hinge on that price, as well as ease of use, as much as the features and quality of video. There's a lot OpenAI needs to do if Sora 2 is going to be more than a silent blip in the AI race, but it looks like we will find out how well it can compete soon.
OpenAI is preparing to launch Sora 2, a significant upgrade to its AI video generation model, aiming to compete with Google's Veo 3. The update is expected to bring improvements in video quality, audio integration, and human movement representation.
OpenAI is gearing up to release Sora 2, the next iteration of its text-to-video AI model, in a bid to challenge Google's dominant Veo 3. References to "Sora 2" have been spotted on OpenAI's servers, indicating that the company is actively developing a new generative video model [1][2].
Since its launch in December 2024, OpenAI's original Sora model has been overshadowed by competitors, particularly Google's Veo 3. While Sora impressed many with its high-quality visuals, it lacked audio capabilities and has not seen significant updates in over six months [1][3].
Google's Veo 3, on the other hand, has gained traction by offering features that Sora currently lacks. It can generate short video clips with synchronized speech and environmental audio, creating a more immersive experience [3].
To compete effectively with Veo 3, Sora 2 is anticipated to introduce several key enhancements:
Audio Integration: One of the most significant expected upgrades is the addition of AI-generated audio, including speech, sound effects, and ambient noise [2][3].
Enhanced Human Movement: The current version of Sora struggles to represent human movement accurately. Sora 2 is expected to address this issue, potentially through an improved understanding of physics and human interactions [2].
Longer Video Generation: While most AI video generators produce clips of four to eight seconds, Sora 2 might push this limit further, possibly offering videos of 30 seconds or longer [2][3].
Improved Video Quality: Following the trend set by competitors like Moonvalley, Sora 2 could offer higher resolution and better overall footage quality [2].
Despite these potential improvements, OpenAI faces several challenges with Sora 2:
Audio-Visual Synchronization: Achieving seamless lip-sync and audio-to-mouth coordination will be crucial for Sora 2 to compete effectively with Veo 3 [3].
Ethical Concerns: As AI-generated videos become more realistic, there are growing concerns about the potential for misuse and the blurring of lines with reality [3].
Pricing and Accessibility: The success of Sora 2 may depend on its pricing strategy and integration with existing OpenAI services like ChatGPT [3].
The release of Sora 2 could significantly impact the AI video generation market. Google has recently made Veo 3 free for college students with the Gemini AI Pro plan, indicating the growing competition in this space [1].
As AI video technology advances, it's becoming an increasingly important tool for content creators, marketers, and various industries. The outcome of this competition between OpenAI and Google could shape the future of AI-generated video content.
While no official release date has been announced, industry observers expect more information about Sora 2 to be revealed by OpenAI in the coming weeks [1][2][3].
Summarized by Navi