Curated by THEOUTPOST
On Tue, 17 Dec, 12:03 AM UTC
24 Sources
[1]
Google Veo2 vs OpenAI Sora Turbo: Which AI Video Generator Wins?
Google's Veo2 AI video generator is an innovative tool reshaping the landscape of AI-driven video creation. Designed to transform textual prompts into cinematic and highly realistic video content, Veo2 is currently available in early access through Google Labs. It features two primary modes: text-to-video and text-to-image-to-video generation. While the tool shows remarkable potential, it faces challenges in handling complex scenarios and maintaining consistency. Understanding its features, performance, and future prospects helps highlight its role in the evolving field of AI video generation.

Imagine turning your wildest ideas into stunning, cinematic videos with just a few words. Whether it's a robot detective wandering through a rain-soaked city or a dolphin leaping through a sea of honey, the possibilities seem endless. For anyone who has struggled to bring creative visions to life, Google's Veo2 AI video generator feels like a dream come true. This tool, currently in early access, promises to transform video creation by converting simple text prompts into visually rich, dynamic outputs. However, like any new technology, it comes with its quirks and challenges, raising the question: is it the breakthrough we've been waiting for?

In this guide, Matt Wolfe compares Google Veo2 with OpenAI Sora Turbo, revealing what makes Veo2 so exciting, from its innovative features to the areas where it still stumbles. Whether you're a seasoned creator or just curious about the future of AI-driven video, there's plenty to uncover. Explore how Veo2 compares to tools like OpenAI's Sora Turbo, its strengths in realism and creativity, and the limitations that remind us it's still a work in progress.

Veo2 stands out for its ability to generate high-quality videos from simple text prompts, offering two distinct modes of operation: text-to-video and text-to-image-to-video. For every input, Veo2 generates four video variations, giving you the flexibility to choose and refine the most suitable output. This feature is particularly useful for creative projects, where visual experimentation is key. For example, you could prompt Veo2 to create a scene of a robot detective navigating a rain-soaked city or a dog joyfully leaping into a pool.

Despite its impressive capabilities, Veo2 is not without limitations. The current version lacks a built-in library for saving prompts or outputs, a feature that would significantly improve workflow efficiency. This omission reflects its early developmental stage but also highlights areas for potential improvement in future updates.

Veo2 excels in generating visually stunning and imaginative content, particularly for abstract or creative prompts. For instance, it can effectively produce visuals like a dolphin breaching a sea of honey or a punk rock squirrel playing a guitar. These outputs showcase the tool's ability to handle artistic and unconventional ideas with remarkable realism and detail. However, the tool struggles with more intricate scenarios. Complex interactions, such as multiple moving elements or dynamic relationships between foreground and background, can lead to inconsistencies. In such cases, you might notice reversed sequences, fluctuating object details, or a lack of coherence in the final output. Additionally, Veo2 has limited success with highly specific prompts, such as those requiring real names, precise details, or intricate spatial arrangements. These challenges underscore the tool's current limitations but also point to areas where advancements could significantly enhance its functionality.
When compared to OpenAI's Sora Turbo, Veo2 distinguishes itself through its emphasis on realism and creativity. Its outputs are more cinematic and visually detailed, making it a preferred choice for projects that demand high-quality visuals. In contrast, Sora Turbo offers faster generation times but often sacrifices resolution and visual fidelity. Both tools face similar challenges when dealing with intricate prompts, particularly those requiring precise coordination between multiple elements. For example, scenarios involving synchronized movements or detailed interactions often result in less coherent outputs from either platform. Your choice between Veo2 and Sora Turbo will ultimately depend on your priorities. If you value speed and efficiency, Sora Turbo may be the better option. However, if your focus is on achieving visually stunning and realistic results, Veo2 is likely to be the more suitable choice.

As an early access tool, Veo2 is available through Google Labs, with a waiting list for broader access. The platform encourages iterative refinement of prompts, allowing you to enhance clarity and specificity by using tools like ChatGPT or Gemini. This iterative approach enables you to experiment and fine-tune your creative vision. Unlike some AI tools that showcase only the best results, Veo2 provides all four generated videos for your evaluation. This transparency fosters a more hands-on creative process, allowing you to compare outputs and refine your prompts for better results. However, the absence of a built-in library for saving or revisiting previous prompts may hinder workflow efficiency. This limitation reflects the tool's developmental phase but also suggests opportunities for future enhancements that could make the platform more user-friendly and versatile.

The rapid evolution of AI video generation technology signals an exciting future for tools like Veo2. As the technology matures, you can expect significant improvements in areas such as handling complex prompts, managing dynamic interactions, and ensuring overall coherence in video outputs. Google's commitment to expanding access to Veo2 is likely to bring advanced video creation to a much broader audience, empowering more users to explore creative possibilities and bring their ideas to life. Competition with platforms like Sora Turbo is expected to drive further innovation, resulting in more powerful and versatile tools. Both Veo2 and Sora Turbo are anticipated to release updated versions with enhanced capabilities, pushing the boundaries of what AI can achieve in video generation. These advancements will likely lead to more sophisticated, efficient, and user-friendly solutions for creators like you.
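The workflow complaint raised above, the lack of a built-in library for saving prompts and outputs, is easy to work around on the user side. The sketch below is a minimal, purely illustrative Python helper for keeping a local log of prompts and which of the four generated variations worked best; the file name and fields are assumptions of this example, not part of Veo2 or Google Labs.

```python
import json
from datetime import datetime, timezone
from pathlib import Path

# Hypothetical local log file; Veo2/VideoFX currently has no such feature built in.
LOG_PATH = Path("veo_prompt_log.json")

def log_prompt(prompt: str, mode: str, best_variation: int | None = None, notes: str = "") -> None:
    """Append a prompt, plus which of the four generated variations worked best, to a local JSON log."""
    entries = json.loads(LOG_PATH.read_text()) if LOG_PATH.exists() else []
    entries.append({
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "mode": mode,                      # "text-to-video" or "text-to-image-to-video"
        "prompt": prompt,
        "best_variation": best_variation,  # 1-4, or None if no output was usable
        "notes": notes,
    })
    LOG_PATH.write_text(json.dumps(entries, indent=2))

# Example: record a refined prompt after comparing the four outputs in VideoFX.
log_prompt(
    prompt="A robot detective walks through a rain-soaked city at night, neon reflections on wet asphalt",
    mode="text-to-video",
    best_variation=3,
    notes="Variation 3 kept the background coherent; retry with 'low-angle tracking shot'",
)
```

Logging the winning variation alongside the wording that produced it makes the iterative prompt refinement described above much easier to repeat.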
[2]
Google Unveiled Its New Veo 2 AI Video Generator, and It Looks Impressive
The rapid improvements in AI video generation don't appear to be slowing down any time soon, and Google has announced a major upgrade to its own video maker in the form of Veo 2. It follows the introduction of Veo in May, and OpenAI rolling out its Sora AI video generator to the public earlier this month. Google says Veo 2 beats Sora and other competitors in terms of user preferences and prompt adherence -- you can see the relevant charts for yourself on the Veo 2 website -- and that the new model brings with it "enhanced realism and fidelity" as well as better handling for motion and improved camera controls. What Veo 2 can do Veo 2 has boosted its video-generating capabilities through "an improved understanding of real-world physics and the nuances of human movement and expression," according to Google -- so it sounds like you'll see less weirdness in your clips. The new model also has a better handle on the language of cinema and film. That means prompts can include a specific movie genre, a certain type of lens, or a particular cinematic effect. There's now support for terms such as "low-angle tracking shot," "close-up shot," or "shallow depth of field," which can help generated videos fit more closely to a prompt. Videos can be created at resolutions up to 4K (an upgrade from 1080p), and can now run to several minutes in length. Some of the example clips Google has been showing off look impressive: We've got drinks being poured, dogs riding inflatables in pools, cartoon characters talking to the camera, and scientists peering down microscopes. It's not flawless, however, and Google admits Veo 2 still struggles with consistency in complex scenes. This is an issue with AI video generators as a whole -- they don't understand the physics and space of the real world, even if they can do a rough approximation of it based on a vast amount of training data. You'll notice that none of these videos show something going out of shot and then coming back in, most probably because the AI would forget about it or render it differently. How to try Veo 2 The process for getting at Veo 2 is the same as it already was for the first Veo model: You need to head over to Google's VideoFX tool on the web, which is part of Google Labs, and sign up for the waiting list. For now, you need to be in the U.S. to try out Veo 2, and you need to be 18 or older. Google is promising an expansion of Veo 2 availability over the course of next year -- including putting it into YouTube Shorts for creators -- but for the time being it's not available in other tools such as Google Gemini (which does offer access to the latest Imagen 3 art generator model). As I don't currently have access to Veo 2, I can't give you too many details on the limitations that are in place. Google says "each user is limited to a certain number of daily generations" but it's not clear how many videos you can whip up each day, or how long they can be. OpenAI is letting ChatGPT Plus users produce up to 50 videos per month, with the length capped at five seconds and the resolution limited to 720p. Any videos you produce will come with an "invisible SynthID watermark" that identifies them as having been AI-generated -- part of Google's attempts to minimize the risk of Veo 2 being used to deceive viewers and spread misinformation (safety concerns are also part of the reason for the slow rollout, Google says).
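For scale, the Sora allowance quoted above translates into a fairly small footage budget; the figures below are simply the caps stated in the article (50 clips a month, five seconds each, 720p), with nothing assumed beyond basic arithmetic.

```python
# Footage implied by the Sora caps cited above: 50 videos per month, 5 seconds each, at 720p.
videos_per_month = 50
seconds_per_video = 5

total_seconds = videos_per_month * seconds_per_video
print(f"{total_seconds} seconds = about {total_seconds / 60:.1f} minutes of 720p footage per month")
# Google has not said how Veo 2's "certain number of daily generations" compares.
```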
[3]
New Google VEO 2 AI Video Generator by DeepMind: The Future of Cinematic AI Movies
Google DeepMind has unveiled VEO 2, an advanced AI video generation model that is setting new standards in AI-driven content creation. With the ability to produce high-resolution, cinematic-quality videos, VEO 2 has captured widespread attention for its exceptional performance in user evaluations and benchmark tests. This article explores its capabilities, strengths, limitations, and potential applications, offering a comprehensive look at this innovative technology.

Imagine bringing your wildest ideas to life -- not just as sketches or static images, but as vivid, dynamic videos that feel like they've leapt straight out of your imagination. Whether it's a pirate cannon firing meatballs or penguins gracefully paragliding through a sunset sky, the possibilities are endless. For creators, storytellers, and innovators, this technology feels like a dream. With Google DeepMind's latest breakthrough, VEO 2, that dream is inching closer to reality. If you've ever struggled to find visuals that match your vision, this might be the tool you've been waiting for. VEO 2 isn't just another AI model -- it's a fantastic option for video generation, offering stunning realism, cinematic quality, and an uncanny ability to follow even the most complex prompts. From high-resolution underwater scenes to futuristic cityscapes, its capabilities are as versatile as they are impressive. Of course, no technology is without its quirks, and VEO 2 has its share of challenges. However, its potential to transform creative industries is undeniable.

VEO 2 has emerged as a leader in AI video generation, consistently outperforming competitors such as OpenAI's Sora Turbo and Meta's MovieGen. Its dominance is particularly evident on the widely recognized MovieGen Bench, where it leads in key evaluation areas such as overall preference and prompt adherence. These achievements underscore VEO 2's reliability for professional and creative applications, where precision and quality are essential. Its ability to deliver consistent results makes it a preferred choice for users seeking dependable AI tools for video production.

VEO 2 is equipped with a range of advanced features that cater to diverse video generation needs. One of its most notable is the ability to simulate realistic physics, including accurately depicting water dynamics, fire behavior, and complex object interactions. For instance, VEO 2 can render a car drifting through rain-soaked streets or a dancer performing intricate movements with stunning precision. These capabilities make it a versatile tool for creating both realistic and imaginative content.

The strengths of VEO 2 lie in its ability to combine technical precision with creative flexibility. Its realism in lighting and reflections enhances the cinematic quality of its outputs, while its physics simulation ensures believable interactions between objects. The model also excels in handling complex prompts, such as choreographed fight scenes or animations involving multiple moving elements. These attributes make it a powerful asset across a range of industries. By seamlessly blending technical accuracy with creative adaptability, VEO 2 enables users to push the boundaries of visual storytelling and content creation.

Despite its impressive capabilities, VEO 2 is not without its limitations. While its remaining issues are relatively minor, they highlight areas for future improvement. Addressing these challenges could further enhance VEO 2's performance, solidifying its position as a leader in AI video generation.

The versatility of VEO 2 makes it a valuable tool across a wide range of industries, and examples of its generated content showcase this versatility. Realistic scenarios include DJs performing in vibrant nightclubs, cars drifting through bustling city streets, and underwater scenes teeming with marine life. On the imaginative side, VEO 2 can create whimsical visuals, such as penguins paragliding or pandas riding pterodactyls. Its ability to produce abstract outputs, like hyperlapse cityscapes or futuristic soccer matches, further demonstrates its creative potential.

Currently, access to VEO 2 is limited to users on a waitlist, with broader availability anticipated in the near future. This restricted access reflects the model's innovative nature and the growing demand for high-quality AI video generation tools. As access expands, VEO 2 is expected to play a pivotal role in shaping the future of AI-driven content creation. Its potential to transform industries ranging from entertainment to advertising underscores its significance in the evolving landscape of AI technology.

In direct comparisons, VEO 2 consistently outperforms competitors like Sora Turbo and Meta MovieGen. Its ability to produce coherent, realistic outputs -- even in challenging scenarios -- sets it apart. While other models may excel in specific areas, VEO 2's balanced performance across multiple metrics makes it a preferred choice for users seeking reliability and quality. Its combination of technical precision and creative adaptability ensures it remains a frontrunner in the field of AI video generation.
[4]
Google's Veo 2 video generator takes on Sora Turbo - how to try it
Also now available: An improved version of Google's Imagen 3 image generator and a fun, new experiment. OpenAI upped the ante in the video generation space earlier this month, making Sora -- its state-of-the-art text-to-video generator model -- available to ChatGPT Plus users with Sora Turbo. Now, Google is gearing up to compete with the launch of its most advanced video generator. On Monday, Google launched Veo 2, a text-to-video generator that boasts improvements from the company's previous model, including a better understanding of real-world physics, which helps the AI produce better generations with more detail and realism, according to Google.

The videos generated can reach up to 4K resolution and, Google said, can tackle common video generator challenges -- including hallucinations such as extra fingers. When evaluated by human raters against other leading video models, including Sora Turbo, Kling v1.5, and Meta Movie Gen, Veo 2 was voted best on overall performance and prompt adherence. Veo 2 also understands cinematography language, such as a specific genre, lens, or angle. For example, if a user says "shallow depth of field," Veo 2 knows to blur out the subject's background to produce the effect. The video below was created with a prompt that specifically said, "Shot with a 35mm lens on Kodak Portra 400 film."

The model is available to the public and can be accessed in VideoFX in Google Labs. The early access waitlist form asks for basic information such as age, name, place of residence, relevant work, and how you heard about it. Google said submissions are reviewed on a rolling basis.

Google also shared it improved its Imagen 3 image-generation model to generate "brighter and better composed" images. The improved model can generate more diverse styles and output images with higher prompt fidelity, richer details, and textures, according to the company. This version of Imagen 3 is rolling out to the public via ImageFX in Google Labs starting today, and unlike VideoFX, it does not require a waitlist. The previous version of Imagen 3 was already very capable, ranking as the best AI image generator on ZDNET's 2024 roundup.

Lastly, Google unveiled Whisk, a new experiment that is also available in Labs. This tool allows users to create an image -- or input their own -- and transform it into a new image in the style of a plushie, pin, or sticker. It leverages Imagen 3 and Gemini, creating detailed captions for your image that are fed into Imagen 3 to create the final products.
[5]
Google debuts new AI video generator Veo 2 claiming better audience scores than Sora
The company also updated its image generation model Imagen 3 to produce richer, more detailed photos. Google said Veo 2 has "a better understanding of real-world physics and the nuances of human movement and expression." It is available on Google Labs' VideoFX platform -- but only on a waitlisted basis. Users will need to sign up through a Google Form and wait for access to be granted provisionally by Google at a time of its choosing. "Veo 2 also understands the language of cinematography: Ask it for a genre, specify a lens, suggest cinematic effects and Veo 2 will deliver - at resolutions up to 4K," Google said in a blog post. While Veo 2 is available only to select users, the original Veo remains available on Vertex AI. Videos created with Veo 2 will contain Google's metadata watermark SynthID to identify these as AI-generated. Google admits, though, that Veo 2 may still hallucinate extra fingers and the like, but it promises the new model produces fewer hallucinations. Veo 2 will compete against OpenAI's recently released Sora video generation model to attract filmmakers and content creators. Sora had been in previews for a while before OpenAI made it available to paying subscribers. Impressively, Google says that on its own internal tests gauging "overall preference" (which videos an audience liked better) and "prompt adherence" (how well the videos matched the instructions given by the human creator), Veo was preferred by human evaluators to Sora and other rival AI models. Google announced Veo in May this year during its Google I/O developer conference with a video made in partnership with actor-musician Donald Glover, aka Childish Gambino. AI video generation still needs some work AI video generation has long been an area of generative AI in which big model developers, like Google and OpenAI, regularly compete with and catch up with relatively smaller companies. RunwayML, one of the pioneers of AI video generation, recently launched advanced controls for its Gen-3 Alpha Turbo model. Pika Labs released Pika 2.0, allowing for more user control and adding their own characters to a video. Luma AI announced a partnership with AWS to bring its models to Bedrock for enterprise use. Luma also expanded its Dream Machine generation model. However, AI video generation still needs to convince both creators and viewers. After Sora's long-anticipated release, people remained skeptical of its capabilities when it continued to generate physics and anatomy-defying figures. Users felt it gave inconsistent results. A trailer from the recent Game Awards also showed people's distrust of what they perceive as "AI slop." Some filmmakers, though, have begun to embrace the possibilities AI video generators can provide. Famed director James Cameron joined the board of Stability AI while actor Andy Serkis announced he was building an AI-focused production company. However, Google said it's seeing interest from many users. The company said YouTube creators have been using VideoFX to make backgrounds for YouTube Shorts to save time. Updates to Imagen 3 Google also updated its image model Imagen 3, which it recently made available through its Gemini chatbot on the web, to be more realistic and offer brighter images. Imagen 3 can now render more art styles accurately, "from photorealism to impressionism, from abstract to anime." Google said the model will also follow prompts more faithfully.
[6]
Google's new Veo 2 beats OpenAI Sora with 4K AI video generation - here's how to try it
Google has announced its new AI video generation model, Veo 2, and you can join the waitlist to start creating videos of your own, today. The new AI video generator looks set to take on OpenAI's newly launched Sora in the battle for AI video supremacy in 2025, and if Veo 2 is as good as Google says it is, then the ChatGPT creator is under real pressure. In a blog post, Google says, "Veo 2 creates incredibly high-quality videos in a wide range of subjects and styles." The announcement also highlighted Veo 2's ability to capture "an improved understanding of real-world physics and the nuances of human movement and expression, which helps improve its detail and realism overall."

Meanwhile, OpenAI's Sora launched in the US last week, and one of the biggest criticisms so far is that the AI video generator can't capture movement in a way that looks truly natural to the eye. The Veo 2 press release hints that Google thinks its own video generator can outshine Sora in this department. Another major improvement in Veo 2 compared to OpenAI's Sora is the ability for Google's offering to output in resolutions up to 4K, a stark contrast to the 1080p maximum output of Sora Turbo (the premium tier of Sora for ChatGPT Plus subscribers). Veo 2 launches today in the US, although there is a waitlist, and Google plans to expand the video generation tool to YouTube Shorts and "other products" next year.

Google also claims that Veo 2 outperformed rivals in human testing, where participants viewed over 1,000 prompts and respective videos on MovieGenBench, a benchmark dataset. In those results, Veo 2 was preferred to Sora Turbo on 58.8% of occasions, compared to 14.5% who were neutral and 26.7% who preferred Sora Turbo. While one benchmark isn't enough to clearly indicate which AI video generator is better, it's a good sign for Google that Veo 2 passes the eye test. We won't know how Veo 2 compares to Sora until we get our hands on the video creation tool. That said, Google's emphasis on cinematographic understanding could lead to Veo 2 being deemed the best AI video generator. Like any AI product, the proof is in the pudding, and we're looking forward to putting the new model through its paces ourselves. To gain access to Veo 2, you'll need to be over 18 and reside in the US. If you meet those requirements, you can sign up for the waitlist through VideoFX in Google Labs.
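The MovieGenBench split quoted above is easier to read as a head-to-head margin. The short calculation below only restates the article's percentages; the "among raters with a preference" figure is a derived view, not a number Google published.

```python
# Human-rater split for Veo 2 vs Sora Turbo on MovieGenBench, as cited in the article.
preferred_veo2 = 58.8   # % of comparisons where Veo 2 was preferred
neutral = 14.5          # % with no preference
preferred_sora = 26.7   # % where Sora Turbo was preferred

print(f"Total accounted for: {preferred_veo2 + neutral + preferred_sora:.1f}%")            # 100.0%
print(f"Margin over Sora Turbo: {preferred_veo2 - preferred_sora:.1f} percentage points")  # 32.1
decided = preferred_veo2 + preferred_sora
print(f"Among raters with a preference, {100 * preferred_veo2 / decided:.1f}% chose Veo 2")  # ~68.8%
```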
[7]
Google's VEO-2 AI Video Generator: Features, Benefits, and Applications
Google has unveiled VEO-2, its latest AI-powered text-to-video generator, which sets a new standard in AI video generation. By integrating advancements in visual realism, physics simulation, and prompt accuracy, VEO-2 has surpassed competitors such as Meta's MovieGen and OpenAI's Sora Turbo. Alongside this breakthrough, Google introduced an improved Imagen 3, its text-to-image model, further cementing its position as a leader in AI innovation.

Imagine being able to bring your wildest ideas to life -- whether it's a cinematic masterpiece, a surreal dreamscape, or a scientific visualization -- with just a few words. For creators, educators, and innovators, the gap between imagination and execution often feels daunting. But what if technology could bridge that gap, turning abstract concepts into vivid, dynamic visuals? Enter Google's VEO-2 AI video generator, a tool that transforms what's possible in AI-driven media creation. Whether you're a filmmaker envisioning the next blockbuster or a curious mind experimenting with creative prompts, VEO-2 promises to make the process seamless, precise, and awe-inspiring. But that's not all -- Google didn't stop at video. Alongside VEO-2, the tech giant unveiled Imagen 3, a text-to-image model that's equally impressive. Together, these tools represent a major leap forward in AI, offering unparalleled realism, accuracy, and versatility. From simulating the delicate flow of water to crafting photorealistic or surreal imagery, Google's latest advancements are setting a new standard in the industry. So, what makes VEO-2 and Imagen 3 so innovative, and how can they transform the way we create and communicate? Let's explore the details and possibilities.

VEO-2 distinguishes itself as a frontrunner in AI text-to-video generation through its exceptional technical capabilities and user-focused design. It delivers visually realistic outputs, simulates intricate physics, and adheres closely to user prompts, making it a versatile tool for diverse applications. Whether you are crafting cinematic sequences or exploring creative concepts, VEO-2 ensures results that are both precise and captivating. By addressing common challenges in AI-generated video, such as object distortion and incoherent transitions, VEO-2 provides a seamless experience for users. Its ability to handle complex prompts with accuracy and consistency makes it a preferred choice for professionals and enthusiasts alike. The success of VEO-2 lies in its innovative features, which redefine the potential of AI-generated video content and collectively make it a powerful tool for creating high-quality, engaging video content that meets the demands of various industries.

The versatility of VEO-2 opens up a wide range of creative possibilities. Its ability to generate imaginative and diverse scenarios makes it suitable for both professional and experimental use. Whether you are envisioning a high-energy action sequence or a surreal concept like dancing robots, VEO-2 maintains coherence and consistency throughout the output. Additionally, the model supports long-form content creation, ensuring character continuity and narrative flow across extended sequences. By catering to these diverse applications, VEO-2 demonstrates its potential to transform how video content is conceptualized and produced.

Complementing VEO-2, Google's Imagen 3 model advances text-to-image generation by delivering high-quality visuals that align closely with user inputs. Imagen 3 is designed to provide precision, coherence, and versatility, making it a valuable tool for a variety of creative and professional applications. Its ability to deliver coherent and visually appealing outputs makes it a strong addition to Google's suite of AI tools, further enhancing the company's reputation for innovation.

The launch of VEO-2 and the updated Imagen 3 underscores Google's commitment to advancing AI technology. By addressing critical challenges such as image coherence, realistic physics simulation, and creative versatility, Google continues to lead the way in AI development. These innovations not only highlight technical expertise but also open up new possibilities for media creation across industries. Google's efforts in AI innovation are setting the stage for a future where technology seamlessly integrates with creativity. By providing tools that are both powerful and user-friendly, Google is enabling individuals and organizations to push the boundaries of what is possible in content creation.

The introduction of VEO-2 and the improved Imagen 3 marks a significant step forward in AI-driven media creation. These technologies have the potential to transform various fields by offering tools that are efficient, versatile, and accessible, and their impact is expected to grow across a wide range of fields as the models continue to evolve. By bridging the gap between imagination and reality, Google's innovations are paving the way for a new era of creativity and technological integration. These advancements promise to reshape how we approach media creation, making it more efficient, accessible, and inspiring.
[8]
Google strikes back with an answer to OpenAI's Sora launch
Google's DeepMind division unveiled its second-generation Veo video generation model on Monday, which can create clips up to two minutes in length and at resolutions reaching 4K quality -- that's six times the length and four times the resolution of the 20-second/1080p clips Sora can generate. Of course, those are Veo 2's theoretical upper limits. The model is currently only available on VideoFX, Google's experimental video generation platform, and its clips are capped at eight seconds and 720p resolution. VideoFX is also waitlisted, so not just anyone can log on to try Veo 2, though the company announced that it will be expanding access in the coming weeks. A Google spokesperson also noted that Veo 2 will be made available on the Vertex AI platform once the company can sufficiently scale the model's capabilities.

"Over the coming months, we'll continue to iterate based on feedback from users," DeepMind VP of product Eli Collins told TechCrunch, "and [we'll] look to integrate Veo 2's updated capabilities into compelling use cases across the Google ecosystem ... We expect to share more updates next year."

Veo 2 reportedly holds a number of advantages over its predecessor, including a better understanding of physics (think better fluid dynamics and better illumination/shadowing effects) as well as the capacity to generate "clearer" video clips, in that generated textures and images are sharper and less prone to blurring when moving. The new model also offers improved camera controls, enabling the user to position the virtual camera lens with greater precision than before. As TechCrunch notes, Veo 2 has not yet perfected the video generation process, though it does appear to hallucinate far less than rivals like Sora, Kling, Movie Gen, or Gen 3 Alpha. "Coherence and consistency are areas for growth," Collins said. "Veo can consistently adhere to a prompt for a couple minutes, but [it can't] adhere to complex prompts over long horizons. Similarly, character consistency can be a challenge. There's also room to improve in generating intricate details, fast and complex motions, and continuing to push the boundaries of realism."

Google also announced improvements to Imagen 3 on Monday, enabling the commercial image generation model to create "brighter, better-composed" outputs. The model, available on ImageFX, will also offer additional descriptive suggestions based on keywords in the user's prompt, with each keyword spawning a drop-down menu of related terms.
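To put the gap between the headline specs and today's preview in perspective, the quick calculation below compares the theoretical ceiling (two minutes at 4K) with the current VideoFX cap (eight seconds at 720p) in raw pixel-seconds. It assumes 4K means UHD (3840 x 2160, the figure another outlet in this roundup cites) and an equal frame rate at both settings, so the frame rate cancels out.

```python
# Theoretical Veo 2 ceiling vs. the current VideoFX preview cap, measured in pixel-seconds.
# Assumes 4K = UHD (3840 x 2160) and the same frame rate at both settings, so fps cancels out.
uhd_4k = 3840 * 2160       # frame area at 4K (UHD)
p720 = 1280 * 720          # frame area at 720p

ceiling = uhd_4k * 120     # up to two minutes at 4K
current = p720 * 8         # eight seconds at 720p, the current cap

print(f"The ceiling is roughly {ceiling / current:.0f}x the current cap in pixel-seconds")
# -> about 135x, which gives a sense of why Google is scaling the model before wider release.
```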
[9]
Google says Veo 2 AI can generate videos without all the hallucinations
Summary: Google's Veo 2 AI video generator creates highly realistic videos with minimal flaws or hallucinations. Veo 2 outputs videos with a SynthID watermark to combat misinformation and identify them as AI-generated. Veo 2 features improvements in realism, physics, cinematic effects, resolution, and longer clip outputs compared to the original.

OpenAI brought us Sora, an AI-powered video-creation tool that makes incredibly lifelike videos. Not to be outdone, Google just launched Veo 2, the latest model of its video generator. Google says it creates highly realistic videos and addresses flaws like hallucinations. Like the original Veo, the new Veo 2 outputs videos with an invisible SynthID watermark so they are identifiable as AI-generated. This helps combat misinformation, at least on paper.

Everything new in Veo 2
Google released a short video, created with Veo 2, to highlight how far it has come. It is a shockingly realistic video and shows a woman looking into a microscope, a cartoon child in a kitchen, and a dog swimming in a pool. All of these images look miles ahead of what the original Veo could produce. Google said Veo 2 includes significant improvements in key areas:
- Realism and physics: Veo 2 has a better understanding of human movement, facial expressions, and real-world physics.
- Cinematic improvements: Users can specify shot types, lenses, and cinematic effects such as wide-angle or shallow depth of field.
- Fewer hallucinations: Google claims Veo 2 reduces the number of weird things we've come to expect from AI videos, such as extra fingers or distorted faces.
- Higher resolution: Veo 2 can produce videos in up to 4K resolution.
- Longer outputs: The updated video generator supports clips of several minutes in length.

Part of Google's drive to become the biggest AI company
Veo 2 complements Google's bigger picture as the company pivots from being an internet company to an AI company. Google also announced a new Imagen 3 AI image generator, and Whisk, a tool that lets users drop multiple images in as prompts and will create a brand new image. Veo 2 is available now for select users via the Google Labs VideoFX platform. There's also a waiting list for the rest of the US public who want to try their hand at making a minute-long short video.
[10]
Google reveals Veo 2 -- its advanced AI video model to take on Sora
Now it's the turn of the more creative side of the search giant. DeepMind has unveiled a new version of its Veo video model. First announced at Google I/O earlier this year, Veo is a direct competitor to OpenAI's Sora to be one of the best AI video generators. Veo 2 brings with it improvements in visual realism as well as a better understanding of physics, ensuring movement is more accurately depicted. This is similar to updates made by Pika Labs in its new Pika 2.0 model. According to Google, Veo 2 achieves state-of-the-art results when compared to other leading models, especially when it comes to human expression. The model is available to try in VideoFX and a new lab experiment called Whisk that lets you visualize ideas using AI. It will also be available to developers and enterprise on Google Cloud. Google claims Veo 2 can understand real-world physics. This is something of a holy grail for AI video models and an area even the best struggle with, including Sora from OpenAI. I haven't tried Veo 2 myself but the videos shared by Google -- including one showing bees surrounding a beekeeper -- suggest they may have solved the problem. Veo 2 also has an understanding of different camera types. This is something image models have had for a while, and can be used effectively. According to Google you can: "Suggest '18mm lens' in your prompt and Veo 2 knows to craft the wide angle shot that this lens is known for," adding you can also put "shallow depth of field" to blur out the background. Veo 2 can generate clips up to a minute long and in 4k resolution. It has been trained on the "language of cinematography" and Google claims you'll get fewer extra fingers and unwanted objects. Veo 2 has been added to VideoFX but the service is still operating a waitlist. It is also going to be added to YouTube Shorts in the future to allow for AI content creation on the video platform.
[11]
Google Releases Veo 2 Video Generation Model to Rival Sora
Besides that, Google has introduced an experimental "Whisk" AI tool that remixes subjects, scenes, and styles to create unique images. Google has finally released its AI video generation model called Veo 2 to rival OpenAI's Sora model. Unlike Sora which can generate 20-second clips up to 1080p resolution, Google's Veo 2 model can produce minutes-long videos at stunning 4K resolution. Importantly, Google says Veo 2 understands "real-world physics" and "the nuances of human movement and expression" for improved realism. In our Sora hands-on testing, we noted that OpenAI's video generation model lacks a good understanding of physics. But Veo 2 promises better output as it hallucinates less. For example, with Veo 2, you are unlikely to get extra fingers in humans or unexpected objects in the final output. Apart from that, you can generate videos in different styles by defining the lens, genre, cinematic effects, close-up shots, low-angle tracking shots, and more. Veo 2 is being slowly rolled out via VideoFX, YouTube, and Vertex AI. To access Veo 2, you need to sign up for the waitlist. Google says Veo 2 is coming to YouTube Shorts next year. And all Veo 2-generated videos will include an invisible SynthID watermark to identify AI-generated videos. Besides video generation, Google has also announced Imagen 3, its AI image generation model. Imagen 3 can now generate "brighter, better composed images." It's also more accurate and can produce a wide range of art styles including abstract, anime, photorealism, and impressionism. It's widely rolling out on ImageFX in over 100 countries. Finally, Google has introduced a new experimental tool called "Whisk" which lets you add a subject, scene, and style to create a unique image by remixing them. For example, you can upload a photo as your subject, then you can add a scene or enter a prompt to describe the scene. Next, you can define the style and add a prompt to produce images you have in mind. It uses Imagen 3 and Gemini's visual understanding to blend all the inputs and generate a new image. You can access Whisk on Google Labs.
[12]
Google debuts Veo 2 video generator, upgraded Imagen 3 - SiliconANGLE
Google LLC today debuted Veo 2, an artificial intelligence model capable of generating high-resolution videos up to two minutes in length. The company is rolling out the algorithm alongside two other additions to its AI portfolio. The first is a new version of Imagen 3, Google's flagship AI image generator, that has received an output quality boost. The search giant also debuted a tool called Whisk that uses Imagen 3 to remix existing images.

The new Veo 2 model generates videos based on natural language prompts. Users can enter up to several sentences describing what objects a clip should depict, as well as the manner in which those objects should be rendered. It's optionally possible to enter instructions for specific points in time, such as a video's ending. Veo 2 enables users to customize a clip's cinematographic settings. The AI can simulate camera features such as a specific type of lens or film roll cartridge. For example, users could instruct Veo 2 to generate a video as if it were shot with an 18-millimeter lens optimized for capturing wide-angle footage. The model likewise supports cinematic effects. In one example, Google showed Veo 2 generating a video with volumetric lighting. This is a rendering method for generating realistic-looking beams of light.

Veo 2 is the successor to an eponymous AI video generator that Google debuted in May. Compared to its predecessor, the new model produces more realistic and detailed clips with up to 4K resolution, which corresponds to 3840 pixels by 2160 pixels. Google says that Veo 2 is also less prone to hallucinations. The search giant put the model to the test by evaluating it with MovieGenBench, a benchmarking tool that Meta Platforms Inc. open-sourced earlier this year. As part of the evaluation, 1,003 users compared Veo 2 with several other video generators. The model outperformed the competition, including OpenAI's newly released Sora Turbo, on "overall preference" and with respect to its ability to accurately follow prompts. "It brings an improved understanding of real-world physics and the nuances of human movement and expression, which helps improve its detail and realism overall," Google DeepMind research scientist Aäron van den Oord and Elias Roman, the senior director of product management for Google Labs, detailed in a blog post.

At the same time, Veo 2 has certain limitations. Google detailed that the model struggles to keep frames consistent with one another "throughout complex scenes or those with complex motion." The company plans to integrate Veo 2 into several products including YouTube Shorts and Vertex AI, Google Cloud's AI development toolkit. Initially, the model will be available in Google Labs, a service that provides early access to the search giant's newest AI features. Access is tied to a waitlist.

Veo 2 is rolling out to Google Labs alongside a new version of Imagen 3, the company's most advanced AI image generator. Compared to the original Imagen 3, it generates brighter images with "richer details and textures." It's also better at following user prompts. Imagen 3 powers Whisk, a new service that will likewise be accessible via Google Labs. It allows users to combine multiple existing images into a new one. Whisk can, for example, apply the style of one image to another image's background.
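The "4K" figures quoted by different outlets in this roundup differ slightly (3840 x 2160 here, 4096 x 2160 elsewhere), which is why the resolution advantage over 1080p is sometimes reported as exactly 4x and sometimes as a bit more. The arithmetic below just compares pixel counts.

```python
# Pixel-count comparison behind the "up to 4K" claims.
uhd = 3840 * 2160   # UHD 4K, the figure cited in this article
dci = 4096 * 2160   # DCI 4K, the figure some other coverage uses
fhd = 1920 * 1080   # 1080p (Full HD), Sora Turbo's stated maximum

print(f"UHD 4K vs 1080p: {uhd / fhd:.2f}x the pixels")   # exactly 4.00x
print(f"DCI 4K vs 1080p: {dci / fhd:.2f}x the pixels")   # about 4.27x
```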
[13]
Google's new AI video model sucks less at physics
But can Veo 2 beat Sora and generate a gymnast that moves like a human? Google may have only recently begun rolling out its Veo generative AI to enterprise customers, but the company is not wasting any time getting a new version of the video tool out to early testers. On Monday, Google announced a preview of Veo 2. According to the company, Veo 2 "understands the language of cinematography." In practice, that means you can reference a specific genre of film, cinematic effect or lens when prompting the model. Additionally, Google says the new model has a better understanding of real-world physics and human movement. Correctly modeling humans in motion is something all generative models struggle to do. So the company's claim that Veo 2 is better when it comes to both of those trouble points is notable. Of course, the samples the company provided aren't enough to know for sure; the true test of Veo 2's capabilities will come when someone prompts it to generate a video of a gymnast's routine. Oh, and speaking of things video models struggle with, Google says Veo will produce artifacts like extra fingers "less frequently." Separately, Google is rolling out improvements to Imagen 3. Of its text-to-image model, the company says the latest version generates brighter and better-composed images. Additionally, it can render more diverse art styles with greater accuracy. At the same time, it's also better at following prompts more faithfully. Prompt adherence was an issue I highlighted when the company made Imagen 3 available to Google Cloud customers earlier this month, so if nothing else, Google is aware of the areas where its AI models need work. Veo 2 will gradually roll out to Google Labs users in the US. For now, Google will limit testers to generating up to eight seconds of footage at 720p. For context, Sora can generate up to 20 seconds of 1080p footage, though doing so requires a $200 per month ChatGPT Pro subscription. As for the latest enhancements to Imagen 3, those are available to Google Labs users in more than 100 countries through ImageFX.
[14]
Google DeepMind unveils a new video model to rival Sora | TechCrunch
Google DeepMind, Google's flagship AI research lab, wants to beat OpenAI at the video generation game -- and it might just, at least for a little while. On Monday, DeepMind announced Veo 2, a next-gen video-generating AI and the successor to Veo, which powers a growing number of products across Google's portfolio. Veo 2 can create two-minute-plus clips in resolutions up to 4k (4096 x 2160 pixels). Notably, that's 4x the resolution -- and over 6x the duration -- OpenAI's Sora can achieve. It's a theoretical advantage for now, granted. In Google's experimental video creation tool, VideoFX, where Veo 2 is now exclusively available, videos are capped at 720p and eight seconds in length. (Sora can produce up to 1080p, 20-second-long clips.) VideoFX is behind a waitlist, but Google says it's expanding the number of users who can access it this week. Eli Collins, VP of product at DeepMind, also told TechCrunch that Google will make Veo 2 available via its Vertex AI developer platform "as the model becomes ready for use at scale." "Over the coming months, we'll continue to iterate based on feedback from users," Collins said, "and [we'll] look to integrate Veo 2's updated capabilities into compelling use cases across the Google ecosystem ... [W]e expect to share more updates next year." Like Veo, Veo 2 can generate videos given a text prompt (e.g. "A car racing down a freeway") or text and a reference image. So what's new in Veo 2? Well, DeepMind says the model, which can generate clips in a range of styles, has an improved "understanding" of physics and camera controls, and produces "clearer" footage. By clearer, DeepMind means textures and images in clips are sharper -- especially in scenes with a lot of movement. As for the improved camera controls, they enable Veo 2 to position the virtual "camera" in the videos it generates more precisely, and to move that camera to capture objects and people from different angles. DeepMind also claims that Veo 2 can more realistically model motion, fluid dynamics (like coffee being poured into a mug), and properties of light (such as shadows and reflections). That includes different lenses and cinematic effects, DeepMind says, as well as "nuanced" human expression. DeepMind shared a few cherry-picked samples from Veo 2 with TechCrunch last week. For AI-generated videos, they looked pretty good -- exceptionally good, even. Veo 2 seems to have a strong grasp of refraction and tricky liquids, like maple syrup, and a knack for emulating Pixar-style animation. But despite DeepMind's insistence that the model is less likely to hallucinate elements like extra fingers or "unexpected objects," Veo 2 can't quite clear the uncanny valley. Note the lifeless eyes in this cartoon dog-like creature: And the weirdly slippery road in this footage -- plus the pedestrians in the background blending into each other and the buildings with physically impossible facades: Collins admitted that there's work to be done. "Coherence and consistency are areas for growth," he said. "Veo can consistently adhere to a prompt for a couple minutes, but [it can't] adhere to complex prompts over long horizons. Similarly, character consistency can be a challenge. There's also room to improve in generating intricate details, fast and complex motions, and continuing to push the boundaries of realism." DeepMind's continuing to work with artists and producers to refine its video generation models and tooling, added Collins. 
"We started working with creatives like Donald Glover, the Weeknd, d4vd, and others since the beginning of our Veo development to really understand their creative process and how technology could help bring their vision to life," Collins said. "Our work with creators on Veo 1 informed the development of Veo 2, and we look forward to working with trusted testers and creators to get feedback on this new model." Veo 2 was trained on lots of videos. That's generally how AI models work: Provided with example after example of some form of data, the models pick up on patterns in the data that allow them to generate new data. DeepMind won't say exactly where it scraped the videos to train Veo 2, but YouTube is one possible source; Google owns YouTube, and DeepMind previously told TechCrunch that Google models like Veo "may" be trained on some YouTube content. "Veo has been trained on high-quality video-description pairings," Collins said. "Video-description pairs are a video and associated description of what happens in that video." While DeepMind, through Google, hosts tools to let webmasters block the lab's bots from extracting training data from their websites, DeepMind doesn't offer a mechanism to let creators remove works from its existing training sets. The lab and its parent company maintain that training models using public data is fair use, meaning that DeepMind believes it isn't obligated to ask permission from data owners. Not all creatives agree -- particularly in light of studies estimating that tens of thousands of film and TV jobs could be disrupted by AI in the coming years. Several AI companies, including the eponymous startup behind the popular AI art app Midjourney, are in the crosshairs of lawsuits accusing them of infringing on artists' rights by training on content without consent. "We're committed to working collaboratively with creators and our partners to achieve common goals," Collins said. "We continue to work with the creative community and people across the wider industry, gathering insights and listening to feedback, including those who use VideoFX." Thanks to the way today's generative models behave when trained, they carry certain risks, like regurgitation, which refers to when a model generates a mirror copy of training data. DeepMind's solution is prompt-level filters, including for violent, graphic, and explicit content. Google's indemnity policy, which provides a defense for certain customers against allegations of copyright infringement stemming from the use of its products, won't apply to Veo 2 until it's generally available, Collins said. To mitigate the risk of deepfakes, DeepMind says it's using its proprietary watermarking technology, SynthID, to embed invisible markers into frames Veo 2 generates. However, like all watermarking tech, SynthID isn't foolproof. In addition to Veo 2, Google DeepMind this morning announced upgrades to Imagen 3, its commercial image generation model. A new version of Imagen 3 is rolling out to users of ImageFX, Google's image-generating tool, beginning today. It can create "brighter, better-composed" images and photos in styles like photorealism, impressionism, and anime, per DeepMind. "This upgrade [to Imagen 3] also follows prompts more faithfully, and renders richer details and textures," DeepMind wrote in a blog post provided to TechCrunch. Rolling out alongside the model are UI updates to ImageFX. 
Now, when users type prompts, key terms in those prompts will become "chiplets" with a drop-down menu of suggested, related words. Users can use the chips to iterate on what they've written, or select from a row of auto-generated descriptors beneath the prompt.
[15]
Google DeepMind reveals new AI model to rival OpenAI's Sora
The announcement comes shortly after OpenAI launched its text-to-video AI generator, Sora. Google's artificial intelligence (AI) research lab DeepMind has revealed a video-generating AI model, which aims to rival OpenAI's Sora. Yesterday (16 December), DeepMind announced Veo 2, the successor to its Veo model, which powers a growing number of the tech giant's products. Google said Veo 2 can create clips in resolutions up to 4K that can extend to minutes in length. Examples of Veo 2-generated clips showcased online include a Dachshund dog wearing goggles swimming in a pool, flamingos drinking from a body of water and a man carrying out beekeeping duties. However, while Google said Veo 2 brings "an improved understanding of real-world physics and the nuances of human movement", it has also acknowledged its limitations and flaws. "While Veo 2 demonstrates incredible progress, creating realistic, dynamic or intricate videos and maintaining complete consistency throughout complex scenes or those with complex motion, remains a challenge. We'll continue to develop and refine performance in these areas." The announcement follows OpenAI's launch of Sora last week, which is an AI-powered text-to-video generator. According to OpenAI, Sora comes with Coalition for Content Provenance and Authenticity metadata (which refers to data that describes and gives information about other data), in an effort to facilitate increased transparency in identifying AI-generated videos.

Responsible use
With the rise of content created by AI in recent years, there have also been increasing concerns about how this will infringe on the creative industries. Authors and musicians have called out the dangerous effects AI could have on their industries, while thousands of creatives around the world signed a statement in October of this year warning AI companies that the unlicensed use of their work to train generative AI models is a "major, unjust threat" to their livelihoods. There are also concerns around how AI could be abused to create deepfake content and spread disinformation. For example, earlier this year, Elon Musk came under fire after he posted an AI-generated ad of US presidential candidate Kamala Harris on X, the social media platform that he owns. As a result, technology companies, including Google, are constantly having to moderate the content indexed on their sites. Referring to this moral quandary, Google emphasised that it is "critical" to bring technologies like Veo to the world in a responsible manner. "Videos created by Veo are watermarked using SynthID, our cutting-edge tool for watermarking and identifying AI-generated content and will be passed through safety filters and memorization checking processes that help mitigate privacy, copyright and bias risks."
[16]
Google Veo 2 is one of the best AI video models I've ever seen -- here's 5 examples of what it can do
Google has an advanced AI lab in the form of DeepMind, and it has been cooking over the past few weeks. The latest release is a new version of the Veo artificial intelligence video model that has the most accurate understanding of physics I've seen from any video tool so far. First announced at Google I/O earlier this year, Veo is a direct competitor to OpenAI's Sora to be one of the best AI video generators, and the new version takes things up to an entirely new level. Veo 2 brings with it improvements in visual realism as well as a better understanding of physics, ensuring movement is more accurately depicted. One example video shows someone accurately slicing a tomato, something no other video model can achieve -- including Sora. The new Veo model is currently still in the waitlist phase, but you can sign up to get access when it becomes available through Google Labs. Built into the VideoFX experiment, it will let you create 4K clips of up to a minute long. I haven't tried Veo 2 myself, but the videos shared by Google -- including one showing bees surrounding a beekeeper -- appear more real than anything I've tried. Even Pika 2.0, one of the best so far, doesn't solve the physics issue. While I wait to get access, I looked through social media and the Veo 2 website to gather some of the best examples of its capabilities I could find.

I picked the above video because of the way it handles the complex interplay between the individual bees and the beekeeper. The bees look and move naturally, and the beekeeper picks up a jar of honey. This might seem trivial, but each of those elements is something other models struggle with on its own. Prompt from Google: "The camera floats gently through rows of pastel-painted wooden beehives, buzzing honeybees gliding in and out of frame. The motion settles on the refined farmer standing at the center, his pristine white beekeeping suit gleaming in the golden afternoon light. He lifts a jar of honey, tilting it slightly to catch the light. Behind him, tall sunflowers sway rhythmically in the breeze, their petals glowing in the warm sunlight. The camera tilts upward to reveal a retro farmhouse with mint-green shutters, its walls dappled with shadows from swaying trees. Shot with a 35mm lens on Kodak Portra 400 film, the golden light creates rich textures on the farmer's gloves, marmalade jar, and weathered wood of the beehives."

A few years ago, when OpenAI first unveiled the DALL-E 3 image model, they used flamingos. I don't know if this was deliberate from Google, but there is more than one flamingo video in the examples. Here they capture the movement of the water, the physics involved in the weight of the dog, and the lighting. Prompt from Google: "A cinematic shot captures a fluffy Cockapoo, perched atop a vibrant pink flamingo float, in a sun-drenched Los Angeles swimming pool. The crystal-clear water sparkles under the bright California sun, reflecting the playful scene. The Cockapoo's fur, a soft blend of white and apricot, is highlighted by the golden sunlight, its floppy ears gently swaying in the breeze. Its happy expression and wagging tail convey pure joy and summer bliss. The vibrant pink flamingo adds a whimsical touch, creating a picture-perfect image of carefree fun in the LA sunshine."

This prompt just made me hungry. It led to me making a coffee. Weirdly, pouring liquid is something other models struggle with, but Veo 2 did it perfectly. Prompt from Google: "The sun rises slowly behind a perfectly plated breakfast scene. Thick, golden maple syrup pours in slow motion over a stack of fluffy pancakes, each one releasing a soft, warm steam cloud. A close-up of crispy bacon sizzles, sending tiny embers of golden grease into the air. Coffee pours in smooth, swirling motion into a crystal-clear cup, filling it with deep brown layers of crema. Scene ends with a camera swoop into a fresh-cut orange, revealing its bright, juicy segments in stunning macro detail."

Video models have got a lot better at depicting emotion, but they're not perfect and some are better than others. This video shows Veo 2 is one of the good ones. Prompt from Google: "An extreme close-up shot focuses on the face of a female DJ, her beautiful, voluminous black curly hair framing her features as she becomes completely absorbed in the music. Her eyes are closed, lost in the rhythm, and a slight smile plays on her lips. The camera captures the subtle movements of her head as she nods and sways to the beat, her body instinctively responding to the music pulsating through her headphones and out into the crowd. The shallow depth of field blurs the background. She's surrounded by vibrant neon colors. The close-up emphasizes her captivating presence and the power of music to transport and transcend."

Finally, this video just captivated me for its complexity. There are so many elements happening within the clip, and it largely holds visual clarity and motion. The reflection, the motion happening in the mirror -- even the candle reflected in it -- are all elements others may have struggled with. Prompt from Google: "The camera moves in a slow dolly shot, revealing the opulence of a Renaissance palace chamber adorned with gold-inlaid furniture, velvet drapes, and chandeliers casting soft, flickering light. A queen sits motionless at a gilded desk, her crimson silk gown cascading onto the floor like spilled blood. On the desk lies an unsigned letter, its edges curled with age. The camera frames her from behind, catching the reflection of her stoic face in a massive, ornate mirror. In the background, courtiers murmur, their silhouettes dancing like ghosts in the candlelight. The room feels heavy, every gilded detail amplifying an air of betrayal and paranoia. The color palette alternates between deep, regal reds and cold golds, with chiaroscuro lighting intensifying the drama. Shot on 70mm film for rich texture, evoking the grandeur of historical masterpieces."
[17]
Google Unveils Veo 2, Its Next-Generation AI Video Model
Google claims that prompt adherence of the AI model has also improved.

Google unveiled the successor to the Veo video generation model on Monday. Dubbed Veo 2, the artificial intelligence (AI) model improves on the capabilities of the older model, which was launched in early 2024. Veo 2 can now generate videos in up to 4K resolution and for a much longer duration. The new model was introduced just a week after the Mountain View-based tech giant added Veo to Vertex AI for its enterprise clients. Veo 2 is currently only available to a select number of creators on the VideoFX platform via Google Labs.

In a blog post, the tech giant detailed the new video-generation AI model. Currently, the older Veo AI model is available to users via Vertex AI and VideoFX, and it is also used in the video background generator feature for YouTube Shorts. Now the company has introduced Veo 2, which is said to have achieved state-of-the-art results during internal testing, beating video models such as Sora. The tech giant claimed that Veo 2 improves the detail and realism of generated videos. Additionally, the AI model can now understand the language of cinematography. This means users can specify a genre, lens, cinematic effects, and camera movements, and Veo 2 can generate videos to those exact specifications. Highlighting an example, the post said Veo 2 can understand complex prompts such as "a low-angle tracking shot with 18mm lens that glides through the middle of a scene". Mentioning the lens lets the AI know that the user wants the wide-angle look that lens is known for, the company added. Veo 2 can also extend video duration to "minutes" in length, though the company did not specify the maximum duration of videos that can be generated with the tool. The tech giant further claimed that the model hallucinates less often, so unwanted details such as extra fingers, morphed limbs, or unexpected objects will be seen less frequently.

On safety, Google stated that it has added the native SynthID watermark to generated videos to help distinguish AI-generated videos from real ones, reducing misuse for deepfakes. The company has also planned a gradual rollout of the AI model to control access and improve its quality and safety before global availability. Currently, Veo 2 is available to select users in VideoFX via Google Labs; those interested in trying the tool are asked to sign up for the waitlist. Further, the tech giant plans to expand Veo 2 to YouTube Shorts and other products next year.

Alongside the introduction of Veo 2, the tech giant also rolled out the AI image generation model Imagen 3 in ImageFX to more than 100 countries. The latest iteration of the model has received several upgrades, including the ability to render more diverse art styles, greater accuracy, and better-composed images. Imagen 3 also displays better prompt adherence and can render more details and textures in images, the company claimed. It is also said to excel at particular styles such as photorealism, impressionism, abstract art, and anime.
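To make the "language of cinematography" idea concrete, here is a minimal, hypothetical sketch of how a creator might assemble a shot-specific prompt from structured fields before pasting it into VideoFX or any other text-to-video tool. The ShotSpec fields and the build_prompt helper are illustrative assumptions, not part of any Google product or SDK.

```python
# Hypothetical prompt builder for cinematography-aware video models.
# None of these field names come from Google; they simply mirror the kinds
# of details (genre, lens, camera movement, lighting) the article says
# Veo 2 can interpret.
from dataclasses import dataclass

@dataclass
class ShotSpec:
    subject: str                                   # what the shot is about
    genre: str = "documentary"                     # overall style
    lens: str = "18mm lens"                        # implies a wide field of view
    camera_move: str = "low-angle tracking shot"   # camera movement vocabulary
    lighting: str = "golden afternoon light"       # lighting description

def build_prompt(spec: ShotSpec) -> str:
    """Compose a single free-text prompt from the structured fields."""
    return (
        f"A {spec.camera_move} with an {spec.lens}, in a {spec.genre} style, "
        f"glides through {spec.subject}, lit by {spec.lighting}."
    )

if __name__ == "__main__":
    spec = ShotSpec(subject="a busy farmers' market at dawn")
    print(build_prompt(spec))
```

The point is simply that cinematographic vocabulary carries implicit visual information: as the article notes, naming an 18mm lens implies a wide-angle framing without spelling it out in the prompt.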
[18]
Google DeepMind is rolling out AI video generator Veo 2. How it works and how to try it.
Google DeepMind has unveiled its next-generation AI tools, Veo 2 and Imagen 3, along with a playful new experiment, Whisk, aimed at remixing visuals in creative ways. Veo 2 is the newest version of the company's flagship video-generation tool, which according to Google will be capable of higher realism and an improved understanding of movement, physics, and cinematic techniques. In its press release, Google DeepMind states that Veo 2 can generate 4K videos and handle complex prompts -- like specific camera lenses or cinematic shots. "Veo 2 understands the unique language of cinematography," the press release states, emphasizing its ability to deliver stylistic choices like "low-angle tracking shots" or "shallow depth of field." Further, the company says that, unlike earlier models that often "hallucinate" strange visuals (like extra limbs), Veo 2 reduces these quirks, making results more natural. Imagen 3, released in August, has gotten some improvements with richer textures, brighter colors, and diverse art styles -- from photorealistic landscapes to anime. The upgrade will also stick more faithfully to user prompts. The most intriguing addition might be Whisk, a tool that allows users to remix images with creative flair. By combining Google's Imagen 3 and Gemini models, Whisk can turn rough image ideas into digital artworks like "plushies or enamel pins," all while automatically generating descriptive captions. Veo 2 is rolling out in Google's VideoFX, Imagen 3 hits ImageFX in over 100 countries, and Whisk is live for U.S. users. You can sign up for the Google Labs waitlist now.
[19]
How to Get Access and Use Google's Veo 2 AI Video Generator
Veo 2 is available on Google Labs for free, under the VideoFX tool. Google has released a phenomenal video generation model called Veo 2 that outperforms OpenAI's Sora model. Veo 2 can produce realistic 4K videos and understands physics and human movements really well. It can generate eight-second video clips and extend them into minutes-long videos as well. So if you want to access Google's Veo 2 model for AI video generation, you can follow our guide and sign up for the waitlist. Currently, Google has made Veo 2 available only to early-access users in the US who are over 18. Google says it's expanding access to Veo 2, so you are likely to make the cut this time around. So this is how you can sign up for Veo 2 and join the waitlist for quick access. Google is surprisingly allowing more users to access Veo 2 and approving accounts without much delay, so go ahead and try your luck. In case you want to access OpenAI's Sora video generator, you can follow our tutorial. And if you are facing any issues, let us know in the comments below.
[20]
Google unveils Veo 2 and Imagen 3 with advanced capabilities
Google on Monday unveiled two advanced AI models -- Veo 2 for video generation and Imagen 3 for image generation -- both designed to deliver state-of-the-art results. These models are now available through VideoFX, ImageFX, and the new Google Labs experiment, Whisk. Veo 2 creates high-quality videos across a wide range of subjects and styles, achieving exceptional results in human-rated comparisons against leading models. This model enhances realism by understanding real-world physics and human movement, which helps generate more detailed and lifelike videos. Users can request Veo 2 to produce specific cinematic effects, such as a low-angle tracking shot or a close-up of a scientist. It can generate videos in resolutions of up to 4K and durations lasting several minutes. Veo 2 outperformed other video generation models in head-to-head comparisons, as evaluated by human raters. It excelled in overall preference and accurately followed prompts, producing fewer unrealistic details (such as extra fingers or misplaced objects). Despite its advancements, Veo 2 still faces challenges in maintaining consistency during complex scenes with intricate motion. Google is continually working to refine these aspects. Veo 2 outputs feature an invisible SynthID watermark to identify them as AI-generated, reducing the risk of misinformation and misattribution. Google is also expanding access to Veo 2 carefully via VideoFX, YouTube, and Vertex AI. Imagen 3, the latest version of Google's image generation model, now produces brighter, more vibrant images with improved color balance and fidelity. It also expands its capability to render diverse art styles, from photorealism to abstract art and anime. Imagen 3 has achieved state-of-the-art results in human-rated comparisons against other image generation models. It received top scores for visual quality, prompt accuracy, and appeal. Google has implemented extensive filtering, data labeling, and red-teaming to ensure the safety and fairness of Imagen 3's outputs. SynthID watermarking is also applied to all images, providing an imperceptible but detectable identifier for AI-generated content.
[21]
How to get on the Veo 2 waitlist -- access Google's new AI video generator
In a week dominated by Google's AI announcements, including Gemini 2, Deep Research, and new NotebookLM features, the tech giant isn't done yet. Google's DeepMind lab has now unveiled Veo 2, the latest version of its AI video generation model first announced at Google I/O. Going head-to-head with OpenAI's Sora, Veo 2 represents a significant leap forward in AI video generation, offering the ability to create realistic 4K videos with impressive expressions and human movement modeling. Veo 2 can create initial 8-second clips that can be extended into longer videos. It is currently available only to early-access users in the U.S. over 18, so getting on the waitlist is your first step toward accessing this powerful tool. Ready to try it yourself? Here's how to join the waitlist. With Veo 2's waitlist growing rapidly, you might want to explore the 5 Best AI video generators while you wait. For more AI tools, check out our guide to the best ChatGPT alternatives, and if you're interested in OpenAI's iteration of AI video generation, check out how to use Sora AI in the EU and UK.
[22]
Google announces Veo 2 video generation model, expanding VideoFX access
Google today announced Veo 2, as well as updates to Imagen 3 and a new Whisk experiment that showcases the latest model and Gemini. Google announced Veo in May at I/O 2024, and the follow-up model features "improved understanding of real-world physics and the nuances of human movement and expression" to improve overall detail and realism. Meanwhile, your prompt can specify a genre, lens, and cinematic effect; Google's example prompts are reproduced below. Veo 2 also hallucinates "less frequently," and there's the invisible SynthID watermark.

Google is rolling out Veo 2 to VideoFX (in Google Labs) and "expanding the number of users who can access it," but there is still a waitlist. It will come to "YouTube Shorts and other products next year." Google adds: "We have been intentionally measured in growing Veo's availability, so we can help identify, understand and improve the model's quality and safety while slowly rolling it out via VideoFX, YouTube and Vertex AI."

Example prompts from Google:

"The camera floats gently through rows of pastel-painted wooden beehives, buzzing honeybees gliding in and out of frame. The motion settles on the refined farmer standing at the center, his pristine white beekeeping suit gleaming in the golden afternoon light. He lifts a jar of honey, tilting it slightly to catch the light. Behind him, tall sunflowers sway rhythmically in the breeze, their petals glowing in the warm sunlight. The camera tilts upward to reveal a retro farmhouse with mint-green shutters, its walls dappled with shadows from swaying trees. Shot with a 35mm lens on Kodak Portra 400 film, the golden light creates rich textures on the farmer's gloves, marmalade jar, and weathered wood of the beehives."

"This medium shot, with a shallow depth of field, portrays a cute cartoon girl with wavy brown hair, sitting upright in a 1980s kitchen. Her hair is medium length and wavy. She has a small, slightly upturned nose, and small, rounded ears. She is very animated and excited as she talks to the camera."

"A low-angle shot captures a flock of pink flamingos gracefully wading in a lush, tranquil lagoon. The vibrant pink of their plumage contrasts beautifully with the verdant green of the surrounding vegetation and the crystal-clear turquoise water. Sunlight glints off the water's surface, creating shimmering reflections that dance on the flamingos' feathers. The birds' elegant, curved necks are submerged as they walk through the shallow water, their movements creating gentle ripples that spread across the lagoon. The composition emphasizes the serenity and natural beauty of the scene, highlighting the delicate balance of the ecosystem and the inherent grace of these magnificent birds. The soft, diffused light of early morning bathes the entire scene in a warm, ethereal glow."

"A cinematic shot captures a fluffy Cockapoo, perched atop a vibrant pink flamingo float, in a sun-drenched Los Angeles swimming pool. The crystal-clear water sparkles under the bright California sun, reflecting the playful scene. The Cockapoo's fur, a soft blend of white and apricot, is highlighted by the golden sunlight, its floppy ears gently swaying in the breeze. Its happy expression and wagging tail convey pure joy and summer bliss. The vibrant pink flamingo adds a whimsical touch, creating a picture-perfect image of carefree fun in the LA sunshine."

Google today also announced an improved Imagen 3 model. Images are brighter, better composed, and feature "richer details and textures," with the ability to "render more diverse art styles with greater accuracy."
This latest model is rolling out globally to ImageFX. Finally, Google Labs has a new "Whisk" experiment that "lets you prompt with images." It's meant to showcase the updated Imagen 3 and Gemini's visual understanding and description capabilities.
[23]
Google's Sora Rival - DeepMind Introduces New Video Generation Tool - Alphabet (NASDAQ:GOOGL)
Imagen 3 generates diverse, detailed images in multiple styles. On Monday, Alphabet Inc's GOOG GOOGL Google significantly upgraded its artificial intelligence-powered content generation tools, unveiling the Veo 2 video and enhanced Imagen 3 image models. These advancements promise to revolutionize creative workflows, offering realism and customization for video and image creators. Veo 2, Google's video generation tool, produces high-quality videos across diverse subjects and styles. The company said in its blog that the model excels in realism, capturing details like human expressions and cinematographic effects. Its enhanced understanding of physics and cinematography enables users to generate stunning content, from tracking shots to wide-angle compositions. According to the search giant, Veo 2 achieves superior results compared to competing models. It offers a resolution of up to 4K and video lengths extending to several minutes. It incorporates advanced features like shallow depth of field, specific lens types, and detailed movement dynamics. Developers and creators can access the tool through Google Labs and can expect broader integration with platforms like YouTube Shorts in 2025. Veo 2 is designed with safety in mind. It embeds SynthID watermarks in generated videos to prevent misuse or misinformation. This focus on responsibility underpins Google's measured rollout of Veo 2 across its platforms. The Imagen 3 model offers enhanced image composition and detail accuracy, supporting a range of styles from photorealistic to abstract. It generates richer textures and adheres more faithfully to user prompts. Imagen 3 is now accessible in over 100 countries through Google Labs' ImageFX tool, allowing users worldwide to experiment with its cutting-edge capabilities. Google also launched Whisk, a creative tool that combines Imagen 3 with Gemini's visual analysis capabilities. Users can input images to generate detailed captions, remix styles, and create personalized designs like digital plushies or enamel pins. Google's latest tool rivals ChatGPT parent OpenAI's Sora AI video generator, which was first announced in February, but that model quickly faced challenges, including errors. Amazon.com Inc AMZN debuted its AI-powered video tool for advertisers in September, competing with Adobe's Firefly Video Model, while Meta Platforms Inc META introduced Movie Gen in October to generate videos, images, and audio from text prompts. Recently, Marc Andreessen praised OpenAI's Sora video model for addressing a key robotics challenge by enabling realistic video generation. He highlighted the model's ability to generate 3D-realistic video from text prompts, simulating complex elements like lighting and reflections essential for real-world navigation. Price Action: GOOGL stock is up 4.65% at $200.29 at the last check on Monday.
[24]
State-of-the-art video and image generation with Veo 2 and Imagen 3
Earlier this year, we introduced our video generation model, Veo, and our latest image generation model, Imagen 3. Since then, it's been exciting to watch people bring their ideas to life with help from these models: YouTube creators are exploring the creative possibilities of video backgrounds for their YouTube Shorts, enterprise customers are enhancing creative workflows on Vertex AI and creatives are using VideoFX and ImageFX to tell their stories. Together with collaborators ranging from filmmakers to businesses, we're continuing to develop and evolve these technologies. Today we're introducing a new video model, Veo 2, and the latest version of Imagen 3, both of which achieve state-of-the-art results. These models are now available in VideoFX, ImageFX and our newest Labs experiment, Whisk.
Google introduces Veo2, an advanced AI video generator that it claims offers superior performance over competitors like OpenAI's Sora Turbo, featuring enhanced realism, cinematic quality, and improved prompt adherence.
Google has unveiled Veo2, its latest AI video generator, marking a significant advancement in the field of artificial intelligence-driven content creation. This new tool, currently available in early access through Google Labs, promises to transform the landscape of video production with its enhanced capabilities and improved performance [1][2].
Veo2 boasts several impressive features that set it apart from its predecessors and competitors:
High-Resolution Output: The model can generate videos with resolutions up to 4K, a substantial improvement over previous versions [3].
Enhanced Realism: Veo2 demonstrates an improved understanding of real-world physics and nuances of human movement, resulting in more lifelike and coherent video outputs [2][4].
Cinematic Quality: The tool comprehends cinematographic language, allowing users to specify genres, lenses, and cinematic effects in their prompts [3][4].
Versatile Generation Modes: Veo2 offers both text-to-video and text-to-image-to-video generation capabilities, providing flexibility for various creative needs [1]; a minimal workflow sketch contrasting the two modes follows this list.
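To illustrate the difference between those two modes, here is a small, purely hypothetical Python sketch of a text-to-image-to-video pipeline. The generate_image and generate_video functions are placeholders standing in for whatever image- and video-generation calls a given service exposes; they are not real Veo or Imagen APIs.

```python
# Hypothetical sketch contrasting a direct text-to-video call with a
# two-stage text-to-image-to-video workflow. Both helpers below are
# placeholders, NOT real Google APIs.
from typing import Optional

def generate_image(prompt: str) -> bytes:
    """Placeholder for a text-to-image call (e.g. an Imagen-class model)."""
    return f"<image for: {prompt}>".encode()

def generate_video(prompt: str, reference_image: Optional[bytes] = None) -> bytes:
    """Placeholder for a video-generation call, optionally seeded with a still image."""
    seed = reference_image.decode() if reference_image else "none"
    return f"<video for: {prompt} | seeded with: {seed}>".encode()

# Mode 1 (text-to-video): one prompt straight to a clip.
clip_a = generate_video("a robot detective walks through a rain-soaked city at night")

# Mode 2 (text-to-image-to-video): lock the composition with a still first,
# then animate that still with a motion-oriented prompt.
still = generate_image("a robot detective under a neon sign, film-noir framing")
clip_b = generate_video("slow dolly-in as rain streaks the lens", reference_image=still)

print(clip_a.decode())
print(clip_b.decode())
```

The practical appeal of the two-stage mode is that a still image is cheap to iterate on, so a creator can settle the composition before committing to a full video generation.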
Google claims that Veo2 outperforms competitors such as OpenAI's Sora Turbo and Meta's MovieGen in user evaluations and benchmark tests:
User Preference: In internal tests, human evaluators preferred Veo2's outputs for overall quality and prompt adherence compared to other leading models [3][5].
Benchmark Performance: Veo2 has shown superior results on the MovieGen Bench, excelling in areas like visual quality, temporal consistency, and text alignment [4]; a toy example of how such pairwise win rates are tallied appears after this list.
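As a rough illustration of what these head-to-head human evaluations measure, here is a small, self-contained sketch that tallies per-model win rates from pairwise preference judgments. The model names and judgments are fabricated for illustration; this is not Google's evaluation code or the actual MovieGen Bench data.

```python
# Toy win-rate calculation over pairwise human preference judgments.
# Judgments and model names are fabricated for illustration only.
from collections import Counter

# Each record is (model_a, model_b, winner) for one prompt shown to one rater.
judgments = [
    ("model_x", "model_y", "model_x"),
    ("model_x", "model_y", "model_x"),
    ("model_x", "model_z", "model_z"),
    ("model_y", "model_z", "model_y"),
    ("model_x", "model_z", "model_x"),
]

wins, totals = Counter(), Counter()
for a, b, winner in judgments:
    totals[a] += 1          # each model in the pair was compared once
    totals[b] += 1
    wins[winner] += 1       # the preferred model gets the win

for model in sorted(totals):
    rate = wins[model] / totals[model]
    print(f"{model}: {wins[model]}/{totals[model]} pairwise wins ({rate:.0%})")
```

Real evaluations of this kind typically aggregate thousands of such judgments across many prompts and raters, and usually report confidence intervals alongside the raw win rates.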
Despite its advancements, Veo2 still faces some challenges:
Complex Scenarios: The model may struggle with maintaining consistency in intricate scenes involving multiple moving elements or dynamic interactions [1][4].
Specific Prompts: Highly detailed or precise requests can sometimes lead to less accurate outputs [1].
Accessibility: Currently, Veo2 is available only through a waitlist system, limiting its widespread use [3][5].
The versatility of Veo2 opens up numerous possibilities across various industries:
Entertainment: Film and video game production could benefit from rapid prototyping and visual effects creation [4].
Advertising: The tool's ability to generate high-quality, customized video content could revolutionize commercial production [4].
Education: Veo2 could be used to create engaging visual aids and educational content [4].
As Veo2 continues to develop, Google has outlined plans for its future:
Expanded Access: The company aims to increase availability over the next year, including integration into YouTube Shorts for creators [2][3].
Safety Measures: All videos generated by Veo2 will include an invisible SynthID watermark to identify them as AI-generated, addressing concerns about misinformation [3][5].
Ongoing Improvements: Google is likely to continue refining the model, addressing current limitations and expanding its capabilities [1][4].
The introduction of Veo2 represents a significant step forward in AI-driven video generation, potentially reshaping how visual content is created and consumed across various industries. As the technology evolves, it will be crucial to monitor its impact on creative processes, content authenticity, and the broader media landscape.