21 Sources
[1]
Google's AI videos get a big upgrade with Veo 3.1
It's getting harder to know what's real on the Internet, and Google is not helping one bit with the announcement of Veo 3.1. The company's new video model allegedly offers better audio and realism, along with greater prompt accuracy. The updated video AI will be available throughout the Google ecosystem, including the Flow filmmaking tool, where the new model will unlock additional features. And if you're worried about the cost of conjuring all these AI videos, Google is also adding a "Fast" variant of Veo. Veo made waves when it debuted earlier this year, demonstrating a staggering improvement in AI video quality just a few months after Veo 2's release. It turns out that having all that video on YouTube is very useful for training AI models, so Google is already moving on to Veo 3.1 with a raft of new features. Google says Veo 3.1 offers stronger prompt adherence, which results in better video outputs and fewer wasted compute cycles. Audio, which was a hallmark feature of the Veo 3 release, has allegedly gotten better, too. Veo 3's text-to-video was limited to 720p landscape output, but there's an ever-increasing volume of vertical video on the Internet. So Veo 3.1 can produce both landscape (16:9) and portrait (9:16) video. Google previously said it would bring Veo video tools to YouTube Shorts, which uses a vertical video format like TikTok's. The release of Veo 3.1 probably opens the door to fulfilling that promise. You can bet Veo videos will show up more frequently on TikTok as well now that the output fits the format. This release also keeps Google in its race with OpenAI, which recently released a Sora iPhone app with an impressive new version of its video-generating AI.

A focus on filmmakers

The Veo 3.1 model will be available across Google's AI ecosystem. You'll be able to create content with Veo 3.1 and Veo 3.1 Fast via the Gemini app, and developers will have access in Vertex AI and through the Gemini API.
Using the Fast variant will help keep costs down when paying per token. Presumably, users of the Gemini app will get more Fast video generations -- we've asked Google about limits and will report if we hear back. Veo is the underlying model in Google's Flow filmmaking tool, and it's getting a few new capabilities thanks to the updated model. The Ingredients to Video, Frames to Video, and Extend features are now all compatible with generated audio. So you can upload multiple images as a reference, or use images as a starting or end point, while also getting generated audio in the clip. These same capabilities are offered in the API, and the Gemini app continues to accept reference images for Veo outputs. The app doesn't get all the Flow features, though. There are a couple of entirely new video features coming with Veo 3.1, too. Google says Veo 3.1 is better able to replicate the look of a video while making "precision" edits. So you'll be able to add an object to a clip while keeping the rest of it unchanged (more or less). Likewise, you can remove an element without changing the rest of the scene. Adding objects will be available in Flow and the API immediately. Removing objects won't be available in Flow just yet, but Google says that will be coming soon. The new video model begins rolling out today, so be sure to cast a skeptical eye when scrolling vertical videos.
[2]
Google releases Veo 3.1, adds it to Flow video editor | TechCrunch
Google launched its new video model Veo 3.1 with improved audio output, granular editing controls, and better image-to-video output. It said that Veo 3.1 builds on May's Veo 3 release, generates more realistic clips, and adheres to prompts better. The model allows users to add an object to a video and have it blend into the clip's style, Google said. Soon, users will be able to remove an existing object from a video in Flow, too. Veo 3 already has edit features such as adding reference images to drive a character, providing the first and last frame to generate a clip using AI, and the ability to extend an existing video based on the last few frames. With Veo 3.1, Google is adding audio to all these features to make the clips more lively. The company is rolling out the model to its video editor Flow and the Gemini app, along with the Vertex and Gemini APIs. It said that since Flow's launch in May, users have created more than 275 million videos on the app.
[3]
Google Drops New Veo 3 AI Video Model Amid Sora Hype
Google wants you to take a break from OpenAI's Sora and try out its new AI video model. The newest version of its flagship AI video generator is here, named Veo 3.1, the company announced Wednesday. Veo 3.1 is available now for paying Gemini users and through Flow, the Gemini API and Vertex AI. The new version of Veo will have some features you may recognize if you've used Flow, Google's AI filmmaking program. Ingredients to Video, which first debuted on Flow, will let you upload separate assets that Veo will combine in the final video. You'll also be able to add objects to existing assets, with the much-needed ability to remove objects coming soon. You can now also give Veo a starting and ending still shot, and it will generate an AI transition to blend the two images into a short video clip. Short clips can now be extended to over a minute long, another way to smooth out transitions between clips. These transition tools will be helpful for creators, as AI video has previously relied on many hard jump cuts between short clips. Google's Veo 3 dropped earlier this year at its I/O developers conference and quickly found fans. It was the first AI video generator to include native, AI-generated synchronized audio. Google has been investing heavily in generative media this year. Its Nano Banana AI image model quickly gained popularity. But Google's dominance has been challenged by OpenAI. The ChatGPT maker dropped a new version of its AI video generator, Sora, and created a TikTok-like social media app. Sora has been the topic of much debate, with enthusiasts eagerly snatching up invite codes and, more concerningly, experts worried about its ability to create convincing deepfakes and further fill the internet with AI slop. AI video generators like Veo and Sora highlight the controversial role generative AI plays in creative industries.
Many videographers, filmmakers and creators are concerned about how AI is trained on their existing material and deployed by studios and streamers. While hotly contested debates rage on, many artists and authors are taking AI companies to court over alleged copyright infringement and other intellectual property issues. (Disclosure: Ziff Davis, CNET's parent company, in April filed a lawsuit against OpenAI, alleging it infringed Ziff Davis copyrights in training and operating its AI systems.)
[4]
Google's Veo 3.1 can turn separate images into a single video
Once upon a time, animators had to painstakingly work frame by frame, stitching together long strings of still images to create the illusion of motion. Today, they only need to upload a few images, and AI will do the rest. On Wednesday, Google DeepMind released its latest video-generating AI model, Veo 3.1, available now in Flow, Vertex AI, the Gemini API, the Gemini app, and Vids. The company also released a smaller, less powerful version of the model called Veo 3.1 Fast. Veo 3.1 specializes in blending disparate images into natural-looking videos, significantly reducing the time and resources that have historically been required for video production. Amazon also recently debuted an AI tool that allows brands to generate short video ads from still images of products in a matter of seconds. Google's new model arrives less than four months after the public launch of its predecessor, Veo 3, which quickly became a hit because of its ability to generate video with synchronized audio. Google also later upgraded that model with the ability to generate short videos from a single image. Veo 3.1 also comes with that feature and more. According to a promotional deck from Google shared with ZDNET, the model "offers richer audio and enhanced realism that captures true to life textures." It also has a more sophisticated "understanding of storytelling, cinematic styles, and character interactions," the company wrote. Veo 3.1 blends multiple images to create a single, natural-looking video, like an AI blender that takes separate assets and combines them into a single visual smoothie.
An image of a woman's face, another of a collection of clothing grouped together, and a third of an ornate-looking room could, for example, prompt the model to create a short video clip of the woman wearing the pictured clothes and strolling through the room (no obviously detectable extra fingers included). More interestingly, you can upload images which, at first glance, you'd never expect could be brought together in any kind of comprehensible way. This is where the "creativity" (to use a loaded term) of Veo 3.1 shines brightest. In one demo provided by Google, one image shows a decorated Christmas tree behind a pair of sliding doors, and another a psychedelic mixture of colors -- resembling a collection of various paint colors blended together. The resulting video shows the doors sliding open to release a flood of multicolored, Christmas ornament-sized balls, like a Surrealist reimagining of the blood-filled elevator in The Shining. Veo 3.1 also allows users to upload just two images -- the first and last in a sequence -- and the model will automatically fill in the intermediary blank spot with video. In another demo video, for example, Google shows an image of an old, rustic barn, with low sunlight pouring through the entryway, and another of a cowboy astride a horse, which appears to be casually trotting through tall grass. Veo 3.1 combines these two images by panning the camera through the barn's doorway until all we see is the (now actually moving) cowboy. The first and last image feature is available now in Flow, Vertex AI, and the Gemini API, but not the Gemini app. In that demo video and in others provided by Google, both the first and last images have similar lighting and artistic aesthetics.
Uploading two images that are completely distinct from and unrelated to one another -- a black and white image of a Ferrari paired with a color pencil sketch of an orange tree, say -- will yield less predictable results. Veo 3.1 also comes with a new scene extension feature, through which users can easily lengthen their AI-generated video clips, along with another capability that allows them to add or remove visual elements to and from existing videos.
[5]
Google's Veo 3.1 is better at generating videos from images
Google has released a new update to its Veo AI video generation model that should make it do a better job of sticking to prompts and converting images into videos. Veo 3.1 is available to try today through Google's Gemini API and is now also powering the company's Flow video editor. Veo 3.1 builds on the capabilities Google introduced with the launch of Veo 3 at Google I/O 2025. The new model offers better "prompt adherence," according to Google, and should have an easier time creating videos based on the image "ingredients" you upload alongside your written prompt. Veo 3.1 also makes it possible to convert images to video and generate audio at the same time, a capability that wasn't available with Veo 3. In Flow, Veo 3.1 supports a new feature that gives you finer control over the videos you generate. With what Google calls "Frames to Video," Flow lets you upload a first and last frame, and then generates the video in between. Adobe Firefly, which is powered by Veo 3, offers a similar feature, but Flow will be able to pull it off and create audio at the same time. Those added audio skills will also apply to the video editor's ability to extend clips and insert objects into existing footage. Based on the samples Google has shared, videos generated with Veo 3.1 still have an uncanny quality that seems to vary greatly depending on the prompt and subject. Even if it's missing some of the realism of OpenAI's Sora 2, though, the company's decision to try to make Veo more useful to people who actually work with video, rather than just a source of social media spam, is a welcome move.
[6]
Google's filmmaking tool is getting a power-up from the rollout of Veo 3.1
With these tools, you'll be able to more precisely edit your AI-generated videos. Veo 3 was first introduced earlier this year, around Google I/O in May. The AI video generator, capable of producing clips from text and images, powers the company's AI filmmaking tool, known as Flow. Google has now rolled out a newer version of the model, Veo 3.1, which brings some improvements to Flow, like stronger prompt adherence and improved audiovisual quality, and adds some new editing capabilities. In addition to being available in Flow, Veo 3.1 can also be accessed through the Gemini API or Vertex AI. The new video AI model will also be available in the Gemini app.
[7]
Google announces Veo 3.1 and updates Flow with more controls, tools
Google today announced Veo 3.1 as its latest video generation model, with Flow getting a number of updates to take advantage of the latest capabilities. Compared to Veo 3, which was announced at I/O 2025 in May, this new version offers richer audio and "enhanced realism that captures true-to-life textures." Veo 3.1 has a deeper understanding of storytelling, cinematic styles, and character interactions to give you more narrative control. The image-to-video capability benefits from improved audio-visual quality and better follows your prompt. Veo 3.1 and Veo 3.1 Fast are available in the Gemini app, as well as the Gemini API and Vertex AI, to power text-to-video and image-to-video for horizontal (16:9) and vertical (9:16) outputs. Meanwhile, Google is updating the Flow filmmaking tool to take advantage of Veo 3.1. Audio generation is coming to Ingredients to Video, Frames to Video, and Scene Extension. Flow is also getting new editing capabilities. You can insert elements like objects, characters, and details, with Google handling shadows, scene lighting, and other complex details to make everything look natural. Coming soon is the ability to remove objects and characters from a scene, with Flow working to reconstruct the background and surroundings to make the edit seamless.
[8]
Google Veo 3.1 Lets You Combine Different Photos into Seamless AI Video
Google has introduced Veo 3.1, the latest version of its AI video generation model, expanding on features first unveiled with Veo 3. Veo 3.1 promises improved prompt accuracy, smoother image-to-video conversion, and more advanced audio generation. It is available now through Google's Gemini app, Vertex AI, and the Gemini API, and it powers the company's Flow video editor. According to Google, Veo 3.1 delivers stronger "prompt adherence," producing videos that better reflect the written and visual inputs provided by users. The system can now generate synchronized audio alongside video and supports both landscape and portrait formats. These improvements are designed to make the model more practical for video creators rather than solely for social media content. The update enhances Flow's editing tools with greater control and flexibility. The new "Frames to Video" function allows users to upload a starting and ending frame, with Veo filling in the motion between them. Google says the tool can also generate audio for these clips, creating a more complete final product. Other Flow features, such as "Ingredients to Video" and "Extend," can now integrate generated sound as well. Veo 3.1 also introduces the ability to insert new objects into footage while seamlessly matching the clip's existing visual style, and Google says users will soon be able to remove objects, too. In addition to Flow, Veo 3.1 and its faster, lower-cost variant, Veo 3.1 Fast, are rolling out across Google's AI ecosystem. Developers can access the tools through the Gemini and Vertex APIs, while the Gemini app continues to support reference images for generating clips. Google says Flow users have already created more than 275 million videos since its launch in May.
The company indicates that Veo's expanded support for vertical video could help bring its AI tools to YouTube Shorts and other short-form platforms.
[9]
Google releases new AI video model Veo 3.1: what it means for enterprises
As expected after days of leaks and rumors online, Google has unveiled Veo 3.1, its latest AI video generation model, bringing a suite of creative and technical upgrades aimed at improving narrative control, audio integration, and realism in AI-generated video. While the updates expand possibilities for hobbyists and content creators using Google's online AI creation app, Flow, the release also signals a growing opportunity for enterprises, developers, and creative teams seeking scalable, customizable video tools. The quality is higher, the physics better, the pricing the same as before, and the control and editing features more robust and varied. My initial tests showed it to be a powerful and performant model that immediately delights with each generation. However, the look is by default more cinematic, polished, and a little more "artificial" than rivals such as OpenAI's new Sora 2, released late last month, which may or may not be what a particular user is going after (Sora excels at handheld, "candid"-style videos).

Expanded Control Over Narrative and Audio

Veo 3.1 builds on its predecessor, Veo 3 (released back in May 2025), with enhanced support for dialogue, ambient sound, and other audio effects. Native audio generation is now available across several key features in Flow, including "Frames to Video," "Ingredients to Video," and "Extend," which give users the ability to, respectively: turn still images into video; use items, characters, and objects from multiple images in a single video; and generate clips longer than the initial 8 seconds -- more than 30 seconds, or even a minute or more when continuing from a prior clip's final frame. Before, you had to add audio manually after using these features. This addition gives users greater command over tone, emotion, and storytelling -- capabilities that have previously required post-production work.
In enterprise contexts, this level of control may reduce the need for separate audio pipelines, offering an integrated way to create training content, marketing videos, or digital experiences with synchronized sound and visuals. Google noted in a blog post that the updates reflect user feedback calling for deeper artistic control and improved audio support. Gallegos emphasizes the importance of making edits and refinements possible directly in Flow, without reworking scenes from scratch.

Richer Inputs and Editing Capabilities

With Veo 3.1, Google introduces support for multiple input types and more granular control over generated outputs. The model accepts text prompts, images, and video clips as input, and also supports:

* Reference images (up to three) to guide appearance and style in the final output
* First and last frame interpolation to generate seamless scenes between fixed endpoints
* Scene extension that continues a video's action or motion beyond its current duration

These tools aim to give enterprise users a way to fine-tune the look and feel of their content -- useful for brand consistency or adherence to creative briefs. Additional capabilities like "Insert" (add objects to scenes) and "Remove" (delete elements or characters) are also being introduced, though not all are immediately available through the Gemini API.

Deployment Across Platforms

Veo 3.1 is accessible through several of Google's existing AI services:

* Flow, Google's own interface for AI-assisted filmmaking
* Gemini API, targeted at developers building video capabilities into applications
* Vertex AI, where enterprise integration will soon support Veo's "Scene Extension" and other key features

Availability through these platforms allows enterprise customers to choose the right environment -- GUI-based or programmatic -- based on their teams and workflows.

Pricing and Access

The Veo 3.1 model is currently in preview and available only on the paid tier of the Gemini API.
The cost structure is the same as Veo 3, the preceding generation of AI video models from Google:

* Standard model: $0.40 per second of video
* Fast model: $0.15 per second

There is no free tier, and users are charged only if a video is successfully generated. This model is consistent with previous Veo versions and provides predictable pricing for budget-conscious enterprise teams.

Technical Specs and Output Control

Veo 3.1 outputs video at 720p or 1080p resolution, with a 24 fps frame rate. Duration options include 4, 6, or 8 seconds from a text prompt or uploaded images, with the ability to extend videos up to 148 seconds (just under two and a half minutes) when using the "Extend" feature. New functionality also includes tighter control over subjects and environments. For example, enterprises can upload a product image or visual reference, and Veo 3.1 will generate scenes that preserve its appearance and stylistic cues across the video. This could streamline creative production pipelines for retail, advertising, and virtual content production teams.

Initial Reactions

The broader creator and developer community has responded to Veo 3.1's launch with a mix of optimism and tempered critique -- particularly when comparing it to rival models like OpenAI's Sora 2. Matt Shumer, founder of OthersideAI/HyperWrite and an early adopter, described his initial reaction as "disappointment," noting that Veo 3.1 is "noticeably worse than Sora 2" and also "quite a bit more expensive." However, he acknowledged that Google's tooling -- such as support for references and scene extension -- is a bright spot in the release. Travis Davids, a 3D digital artist and AI content creator, echoed some of that sentiment. While he noted improvements in audio quality, particularly in sound effects and dialogue, he raised concerns about limitations that remain in the system.
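The per-second rates above lend themselves to quick budgeting arithmetic. Here is a minimal sketch of a cost estimator; the function name and tier labels are our own, and only the rates and the durations come from the figures reported for the Gemini API:

```python
# Per-second Veo 3.1 rates quoted for the paid Gemini API tier.
# Tier labels and function name are illustrative, not an official API.
RATES_PER_SECOND = {
    "standard": 0.40,  # Veo 3.1
    "fast": 0.15,      # Veo 3.1 Fast
}

def veo_cost(seconds: float, tier: str = "standard") -> float:
    """Estimated cost in USD for one successfully generated clip.

    Per the reporting, failed generations are not billed, so this
    applies only to clips that actually render.
    """
    return round(seconds * RATES_PER_SECOND[tier], 2)

# An 8-second clip (the longest single generation from a prompt):
# standard: 8 * 0.40 = $3.20; fast: 8 * 0.15 = $1.20.
# A clip extended to the 148-second maximum on the standard tier:
# 148 * 0.40 = $59.20.
```

At these rates, a single fully extended standard-tier clip costs roughly as much as fifty 8-second Fast generations, which is why the Fast variant matters for iteration-heavy workflows.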
These include the lack of custom voice support, an inability to select generated voices directly, and the continued cap at 8-second generations -- despite some public claims about longer outputs. Davids also pointed out that character consistency across changing camera angles still requires careful prompting, whereas other models like Sora 2 handle this more automatically. He questioned the absence of 1080p resolution for users on paid tiers like Flow Pro and expressed skepticism over feature parity. On the more positive end, @kimmonismus, an AI newsletter writer, stated that "Veo 3.1 is amazing," though still concluded that OpenAI's latest model remains preferable overall. Collectively, these early impressions suggest that while Veo 3.1 delivers meaningful tooling enhancements and new creative control features, expectations have shifted as competitors raise the bar on both quality and usability.

Adoption and Scale

Since launching Flow five months ago, Google says over 275 million videos have been generated across various Veo models. The pace of adoption suggests significant interest not only from individuals but also from developers and businesses experimenting with automated content creation. Thomas Iljic, Director of Product Management at Google Labs, highlights that Veo 3.1's release brings capabilities closer to how human filmmakers plan and shoot. These include scene composition, continuity across shots, and coordinated audio -- all areas that enterprises increasingly look to automate or streamline.

Safety and Responsible AI Use

Videos generated with Veo 3.1 are watermarked using Google's SynthID technology, which embeds an imperceptible identifier to signal that the content is AI-generated. Google applies safety filters and moderation across its APIs to help minimize privacy and copyright risks. Generated content is stored temporarily and deleted after two days unless downloaded.
For developers and enterprises, these features provide reassurance around provenance and compliance -- critical in regulated or brand-sensitive industries.

Where Veo 3.1 Stands in a Crowded AI Video Model Space

Veo 3.1 is not just an iteration on prior models -- it represents a deeper integration of multimodal inputs, storytelling control, and enterprise-level tooling. While creative professionals may see immediate benefits in editing workflows and fidelity, businesses exploring automation in training, advertising, or virtual experiences may find even greater value in the model's composability and API support. The early user feedback highlights that while Veo 3.1 offers valuable tooling, expectations around realism, voice control, and generation length are evolving rapidly. As Google expands access through Vertex AI and continues refining Veo, its competitive positioning in enterprise video generation will hinge on how quickly these user pain points are addressed.
[10]
Google Veo 3.1 launches: See the upgrades
While OpenAI's Sora 2 has been getting all the attention recently, we found in our comparison that Google Veo 3 is a more capable video generation model altogether. Now, however, it looks like Google might be widening the gap with the Wednesday launch of Google Veo 3.1. On top of the basic Veo 3.1 model, there's also a new Veo 3.1 Fast model, which is basically a lighter-weight version of Veo 3.1. They're both now available in Gemini, the Vertex AI platform, and Google Flow, the AI video editing tool that's designed around generating AI videos and editing them together. So what's actually new with Google Veo 3.1? As you might expect, given the name, it's not necessarily a massive upgrade over Veo 3, which has only been out for a few months. According to Google, Veo 3.1 offers richer audio as well as better narrative comprehension, which should help individual clips make more sense in the wider scope of a video. Last but not least, Google says Veo 3.1 offers enhanced realism when it comes to true-to-life textures. Some of the features available with Veo have been improved too, notably by adding audio capabilities. For example, Ingredients to Video is a feature that lets users upload reference images for Veo to use when creating a video. Users could upload a picture of a character and a location, and have Veo generate a video of the character in that location. This feature is available with audio in Gemini, the Vertex AI platform, and Google Flow. Another feature that's been upgraded with audio is Scene Extension, which essentially allows users to extend a video clip. It's available with audio in the Gemini API, but not the Gemini app. It's also available in Flow. Another feature is First and Last Frame, which allows users to upload a static image of the first and final frame in a video and have Veo generate a transition between those two frames.
It can be used in the Gemini API, Vertex AI, and Flow. Additionally, Google is adding some precision features to Flow that could help users refine existing video clips. These include the ability to insert or remove objects in a way that looks realistic and natural, though we have yet to test it. Veo 3.1 is now available in Google Flow, Gemini, and Vertex AI. The other features vary in how widely available they are. You'll need a Google AI Pro subscription to use the new models.
[11]
Google Flow AI video creator adds creative controls that make Sora look tame
Merely days after OpenAI released its Sora 2 AI video generator app, which quickly went viral for some pretty morbid reasons, Google is giving a lift to its own AI video tools. The company's Flow video creation platform is getting upgraded to the new Veo 3.1 AI video model, with a bunch of new tricks in tow. The next-gen video AI model is now available to users in the Gemini app, as well, starting today. What's new? One of the biggest upgrades this time around is the ability to add audio, alongside a bunch of new creative controls in the Flow suite. For example, the new "Ingredients to Video" system lets users upload multiple images to ensure that they can achieve just the right style and character control in the scene. The overarching idea is that instead of describing every minute detail, users can take the easier route. For example, you can upload the image of a person, pick another image with the background of your choice, and a third image of a costume for the character. The AI will combine them all. It's not just convenient, but also offers far more granular control over the videos you want to create, instead of dealing with hit-or-miss AI prompts. It's almost like creating hybrid emojis using the Emoji Kitchen system in Google's Gboard app.

A whole new level of creative control

In addition to controlling the scene elements, Flow is also getting a new "Frames to Video" feature that lets users create a scene by simply supplying the start and end frames. Once the two images are uploaded, the AI will automatically stitch them together and create a seamless video. Next, we have the new "Extend" tool. As the name makes abundantly clear, it comes in handy for prolonging a scene without having to write a fresh prompt or add another image as the source material. "Flow creates a new video based on the final second of your original clip," Google explains.
And finally, we have the new "Insert" feature, which lets users add anything to a video naturally without disturbing the background continuity. Soon, Google will also let Flow users remove objects from videos, somewhat like the Magic Eraser feature in the Google Photos app. As far as the Veo 3.1 model goes, it adds "richer audio, more narrative control, and enhanced realism that captures true-to-life textures" to videos created in Flow.
[12]
Google Unveils Veo 3.1 to Rival OpenAI's Sora 2 -- But Does It Deliver? - Decrypt
Google positions Veo as a professional-grade alternative in the crowded AI video market. Google released Veo 3.1 today, an updated version of its AI video generator that adds audio across all features and introduces new editing capabilities designed to give creators more control over their clips. The announcement comes as OpenAI's competing Sora 2 app climbs app store charts and sparks debates about AI-generated content flooding social media. The timing suggests Google wants to position Veo 3.1 as the professional alternative to Sora 2's viral social feed approach. OpenAI launched Sora 2 on September 30 with a TikTok-style interface that prioritizes sharing and remixing. The app hit 1 million downloads within five days and reached the top spot in Apple's App Store. Meta took a similar approach with its own AI-video-powered social feed. Users can now create videos with synchronized ambient noise, dialogue, and Foley effects using "Ingredients to Video," a tool that combines multiple reference images into a single scene. The "Frames to Video" feature generates transitions between a starting and ending image, while "Extend" creates clips lasting up to a minute by continuing the motion from the final second of an existing video. New editing tools let users add or remove elements from generated scenes with automatic shadow and lighting adjustments. The model generates videos in 1080p resolution at horizontal or vertical aspect ratios. The model is available through Flow for consumer use, the Gemini API for developers, and Vertex AI for enterprise customers.
The AI video generation market has become crowded in 2025, with Runway's Gen-4 model targeting filmmakers, Luma Labs offering fast generation for social media, Adobe integrating Firefly Video into Creative Cloud, and updates from xAI, Kling, Meta, and Google targeting realism, sound generation, and prompt adherence. But how good is it? We tested the model, and these are our impressions. If you want to try it, you'd better have some deep pockets. Veo 3.1 is currently among the most expensive video generation models, on par with Sora 2 and behind only Sora 2 Pro, which costs more than twice as much per generation. Free users receive 100 monthly credits to test the system, which is enough to generate around five videos per month. Through the Gemini API, Veo 3.1 costs approximately $0.40 per second of generated video with audio, while a faster variant called Veo 3.1 Fast costs $0.15 per second. For those willing to use it at that price, here are its strengths and weaknesses. Veo 3.1 is a definite improvement over its predecessor. The model handles coherence well and demonstrates a better understanding of contextual environments. It works across different styles, from photorealism to stylized content. We asked the model to blend a scene that started as a drawing and transitioned into live-action footage. It handled the task better than any other model we tested. Surprisingly, without any reference frame, Veo 3.1 produced better results in text-to-video mode than it did using the same prompt with an initial image. The tradeoff is movement speed. Veo 3.1 prioritizes coherence over fluidity, making it challenging to generate fast-paced action. Elements move more slowly but maintain consistency throughout the clip. Kling still leads in rapid movement, although it requires more attempts to achieve usable results. Veo built its reputation on image-to-video generation, and the results still deliver -- with caveats. This appears to be a weaker area in the update.
When using different aspect ratios as starting frames, the model struggled to maintain the coherence levels it once had. If the prompt strays too far from what would logically follow the input image, Veo 3.1 finds a way to cheat. It generates incoherent scenes or clips that jump between locations, setups, or entirely different elements. This wastes time and credits, since these clips can't be edited into longer sequences because they don't match the format. When it works, the results look fantastic. Getting there is part skill, part luck -- mostly luck. This feature works like inpainting for video, letting users insert or delete elements from a scene. Don't expect it to maintain perfect coherence or use your exact reference images, though. For example, the video below was generated using these three references and the prompt: a man and a woman stumble upon each other while running in a futuristic city, where a Bitcoin sign hologram is rotating. The man tells the woman, "QUICK, BITCOIN CRASHED! WE MUST BUY MORE!!" As you can see, neither the city nor the characters are actually there. However, the characters wear the clothing from the references, the city resembles the one in the image, and the output conveys the idea of the elements rather than the elements themselves. Veo 3.1 treats uploaded elements as inspiration rather than strict templates. It generates scenes that follow the prompt and include objects that resemble what you provided, but don't waste time trying to insert yourself into a movie -- it won't work. A workaround: use Nanobanana or Seedream to upload elements and generate a coherent starting frame first. Then feed that image to Veo 3.1, which will produce a video where characters and objects show minimal deformation throughout the scene. This is Google's selling point. Veo 3.1 handles lip sync better than any other model currently available. In text-to-video mode, it generates coherent ambient sound that matches scene elements.
The dialogue, intonation, voices, and emotions are accurate and beat competing models. Other generators can produce ambient noise, but only Sora, Veo, and Grok can generate actual words. Of those three, Veo 3.1 requires the fewest attempts to get good results in text-to-video mode. This is where things fall apart. Image-to-video with dialogue suffers from the same issues as standard image-to-video generation. Veo 3.1 prioritizes coherence so heavily that it ignores prompt adherence and reference images. For example, this scene was generated using the reference shown in the elements to video section. As you can see, our test generated a completely different subject than the reference image. The video quality was excellent -- intonation and gestures were spot-on -- but it wasn't the person we uploaded, making the result useless. Sora's remix feature is the best choice for this use case. The model may be censored, but its image-to-video capabilities, realistic lip sync, and focus on tone, accent, emotion, and realism make it the clear winner. Grok's video generator comes in second. It respected the reference image better than Veo 3.1 and produced superior results. Here is one generation using the same reference image and prompt. If you don't want to deal with Sora's social app or lack access to it, Grok might be your best option. It's also uncensored but moderated, so if you need that particular approach, Musk has you covered.
[13]
Here's What's New With Veo 3.1, Google's Latest AI Video Model
Is it really a good thing that companies like Google are making it easy to generate hyper-realistic scenes? OpenAI's new Sora app has been the main focus of concern over hyper-realistic AI slop in the past few weeks. Sora makes it all too easy for users to generate short-form videos that seem real enough to most people, including videos that showcase real people's likenesses. But before Sora dropped, it was Google that was stirring fears about these realistic AI videos. With Veo 3, Google launched an AI model that not only produced lifelike videos, it also generated realistic audio synced to the action. Sound effects, environments, even dialogue could all be generated alongside the video itself, selling the effect entirely from one simple prompt. Now, Google is back with an upgrade to Veo, appropriately named Veo 3.1, which the company announced in a blog post on Wednesday. This isn't necessarily an overhaul or a revolutionary new video model. Instead, Veo 3.1 builds on top of Veo 3, adding "richer audio" and "enhanced realism" that Google says generates "true-to-life" textures. The new model also reportedly supports new narrative control tools, which pair with new upgrades to Flow, Google's AI video editor. Flow users now have more granular controls when editing, and can add audio to existing features like "Ingredients to Video," "Frames to Video," and "Extend." What does that mean in practice? According to Google, Ingredients to Video with Veo 3.1 lets users add reference images to their scenes, such as a specific person, clothing items, or an environment. The new Flow editor can then insert those elements into the finished product, as you can see in the demo video below. Building off of this new feature, Flow now lets you add new elements to an existing scene as well.
With "Insert," you can tell Veo 3.1 to add new characters, details, lighting effects, and more to the clip. Google says it is working on the opposite as well, which will allow users to remove any elements they don't like from a generation. Google also now has a new way for users to dictate how they'd like a scene to generate, called "First and last frame." Users can choose reference frames for the beginning and ending of a scene. Flow with Veo 3.1 will then fill in the gap, generating a scene that starts and ends based on those images. There's also now a way to create videos that are longer than previous iterations of Flow would generate. The new "Extend" feature lets you either continue the action of the current clip, or cut to a new scene that follows it, though Google says the feature is most useful for generating a longer establishing shot. According to the company, Extend can create videos that last over a minute. Veo 3.1 is available for users in the Gemini app as well as Vertex AI, as long as you have a Google AI Pro subscription. Developers can access it via the Gemini API. Google says Ingredients to Video, First and last frame, and Extend are coming to the Gemini API, but "Add object" and "Remove object" are not available. "Extend" is also not yet available in the Vertex AI API. Google sees all of these advancements as a boon for creatives and creativity, but I'm highly skeptical. I could see Veo 3.1 and Flow as a good tool for envisioning shots before filming or animating them (i.e., a storyboarding tool), or even a way for new and budding filmmakers to learn editing by seeing their ideas in a more realized form. However, as a whole, I don't think AI-generated content is the future -- or, at least, not a future most of us want. Sure, there's humor or novelty in some of these AI-generated videos, but I'd wager most of the people who enjoy them do so ironically, or only within the context of social media.
The idea of replacing human filmmakers and actors with AI generations seems absurd, especially when it puts all of us at risk of disinformation. Is it really so important for companies like Google and OpenAI to make it easy to generate hyper-realistic, fully rendered scenes, when those videos could so easily be used to trick the masses? This could be the ramblings of someone resistant to change, but I don't think most of us would like to see our favorite shows and movies, made with passion and emotion, replaced by realistic-looking people delivering muted and robotic performances.
[14]
Veo 3.1 Puts Google in Direct Competition with OpenAI's Sora 2 | AIM
The update introduces audio generation, new editing options, and greater control over video creation. Google DeepMind has announced Veo 3.1, the latest version of its generative video model, alongside expanded capabilities in Flow, its AI filmmaking tool. The update introduces audio generation, new editing options, and greater control over video creation. "Five months ago, we introduced Flow, our AI filmmaking tool powered by Veo, and have been inspired by the creativity it has sparked with over 275 million videos generated," said Google DeepMind in a statement. Veo 3.1 offers better narrative control and realism, building on the Veo 3 model with stronger prompt adherence and improved audiovisual quality when converting images into videos. The model is now integrated with Flow and available through the Gemini API, Vertex AI, and the Gemini app. The new release brings audio support to existing Flow tools such as Ingredients to Video, Frames to Video, and Extend, allowing creators to produce videos with synchronised sound. The company said the features are still in development and will be refined further based on user feedback. With Ingredients to Video, users can control the appearance of their scenes using multiple reference images. Frames to Video lets them generate smooth transitions between two static images, while Extend allows the creation of longer, continuous shots that build on previous clips. Flow also adds new editing tools. The Insert feature enables users to add new elements, such as characters or objects, into scenes, while an upcoming Remove function will allow unwanted objects to be deleted seamlessly. "Flow now handles complex details like shadows and scene lighting, making additions and removals appear natural," the company said. 
The Veo 3.1 launch comes just days after OpenAI released Sora 2, its latest video and audio generation model, along with a social iOS app called Sora that lets users create, remix, and insert themselves into generated videos.
[15]
Google's Veo 3.1 Will Now Give Users More Creative Control Over AI Videos
Veo 3.1 is currently available via the Flow app and the Gemini API. Google released the first major update to its artificial intelligence (AI) video generation model Veo 3 on Wednesday. Dubbed Veo 3.1, the updated model arrives less than five months after the release of Veo 3 and brings significant improvements to prompt adherence and granular control over the final output. Users can now add reference images to guide the video, and even set the first and final frames of the video to let the AI connect the dots. The model is currently not available on the Gemini app.
Veo 3.1 Comes With Big Improvements in Prompt Adherence
In a post on X (formerly known as Twitter), the official handle of Google DeepMind announced the release of the Veo 3.1 video AI model. The company said that in the five months since Veo 3's launch in May, users have generated more than 275 million videos, highlighting its popularity. The new update, the tech giant claims, focuses on user feedback and brings more artistic controls. Currently, Veo 3.1 is available via the Flow app and in the Gemini API for developers. Broadly, there are three new features. The first is dubbed "Ingredients to Video," which lets users upload multiple reference images while generating a video. The AI analyses the images and integrates them into the output. Google says this will allow users to generate videos that are closer to the creator's vision. The second feature is "Frames to Video," and with this, users can add a starting and an ending image. The AI model will then generate a video that connects the two points. This feature is intended to help users get their desired narrative from the AI-generated videos with creative transitions. Finally, the third feature is called Extend. Users can upload a clip, and Veo 3.1 continues the shot to build on top of it. Google says this mode will let users generate longer videos that last more than a minute. These videos are generated based on the final second of the uploaded clip to create continuity and help users when they want a longer establishing shot. Notably, while the Flow app is available to users with the Google AI Pro and Google AI Ultra subscriptions, developers can access Veo 3.1 via the Gemini API. The pricing has been kept the same as Veo 3, with each second of generated video charged at $0.40 (roughly Rs. 35). The Veo 3.1 Fast model will cost developers $0.15 (roughly Rs. 13) per second of generation.
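For developers budgeting against the Gemini API, the per-second rates reported above translate directly into per-clip costs. A minimal sketch of that arithmetic (the 8-second clip length is an illustrative value, not a figure from the article):

```python
# Per-second Gemini API pricing for Veo 3.1, as reported above (USD).
PRICE_PER_SECOND = {
    "veo-3.1": 0.40,       # standard model
    "veo-3.1-fast": 0.15,  # Fast variant
}

def clip_cost(model: str, seconds: float) -> float:
    """Return the USD cost of generating `seconds` of video with `model`."""
    return PRICE_PER_SECOND[model] * seconds

# An illustrative 8-second clip:
print(clip_cost("veo-3.1", 8))       # 3.2
print(clip_cost("veo-3.1-fast", 8))  # 1.2
```

At these rates, a minute-long Extend-built sequence would run about $24 on the standard model, which is why the Fast variant matters for iteration.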
[16]
Google's Veo 3.1 turns AI filmmaking into magic
Google has unleashed Veo 3.1 and major upgrades to its AI filmmaking platform Flow, enabling creators to make studio-quality videos directly from text, images, and sounds. With powerful audio generation, deeper scene control, and intuitive editing, Google's Veo 3.1 redefines creative storytelling in the AI era. This article explores how Flow's new features, like multi-image video generation, seamless extensions, and smart editing, are revolutionizing digital creation for artists, developers, and enterprises. When Google first introduced Flow, the AI filmmaking tool powered by Veo, it promised a new age of creativity. Just five months later, the results have been astounding: more than 275 million videos generated by artists, storytellers, and brands. Yet, as users experimented with AI-driven video creation, one common demand echoed loudly: more control. Google listened, and now it's answering with Veo 3.1 and a significantly enhanced version of Flow that bridges imagination and cinematic precision. Veo 3.1 represents Google's most advanced step in generative video yet. This upgraded model not only turns images and text into vivid, lifelike videos but also brings sound into the mix. The addition of audio generation completely transforms how creators craft their narratives. Each clip is no longer silent or dependent on manual post-editing. Instead, Veo 3.1 synthesises intelligent soundscapes that match the mood, motion, and texture of your visuals. From ambient echoes in a forest scene to the mechanical hum of a sci-fi world, the audio feels authentic and immersive. The biggest highlight lies in Flow's newfound creative freedom. Using tools like "Ingredients to Video," creators can combine multiple reference images, ranging from objects to characters, to guide the final shot's style and composition. It's like storyboarding on autopilot, but with far more precision. The "Frames to Video" feature makes transitions seamless, connecting two static images into a cinematic bridge.
Meanwhile, "Extend" empowers users to build full-length, continuous shots by adding new sequences that seamlessly flow from the previous clip's last second. For short film enthusiasts, marketers, and YouTube creators, these features make AI video production not just easier but artistically inspiring. Editing within Flow has also received a groundbreaking overhaul. The new "Insert" capability allows users to add new objects or characters into any scene, intelligently matching their lighting and shadows for a natural blend. The upcoming "Remove" tool works in reverse, letting creators erase any unwanted elements while Flow reconstructs the background flawlessly. These editing controls bring Flow closer to professional video editing suites, but with the simplicity and speed of text-based commands powered by AI. What's equally exciting is how Veo 3.1 scales beyond hobbyist use. The model is now integrated across the Gemini ecosystem, available through the Gemini API for developers, Vertex AI for enterprise-level projects, and directly via the Gemini app for individual creators. This means Veo's state-of-the-art realism and responsive narration tools can now be embedded in apps, studios, or corporate storytelling workflows, amplifying production speed and personal creativity alike. Veo 3.1 is more than just a technical update; it's a creative revolution. By merging visuals, audio, and story logic within one unified platform, Google has positioned Flow as the most complete AI filmmaking environment yet. In an era where visual storytelling defines digital communication, the ability to shape scenes, control emotion, and edit with AI precision could make Flow the go-to engine for next-generation creators. For anyone dreaming of producing Hollywood-level content without a studio, Veo 3.1 is that breakthrough. The boundary between imagination and production has never been thinner -- and Flow is the tool turning those dreams into motion.
[17]
Google Launches Veo 3.1 and Upgrades Its Flow AI Filmmaking Tool
The new Veo 3.1 model is already live in the Gemini app and the upgraded Flow AI tool is available starting today. Google has upgraded its powerful Veo 3 video generation model, and launched Veo 3.1, which delivers enhanced realism, better textures, lighting, and cinematic sound design. It's part of an update to Flow, Google's AI filmmaking tool, which brings granular scene editing tools, the ability to insert or remove objects, and longer, seamless clips. The Flow AI tool and Veo 3 AI model were launched five months ago during Google I/O 2025. Since then, video creators have asked Google to bring more artistic control and audio control across all features. Now, Google has launched an update to Flow with Veo 3.1, introducing many of those creative capabilities. Users now have greater narrative and visual control and can generate rich audio with improved realism. The Veo 3.1 AI model has better prompt adherence and can follow the storytelling intent much more accurately. You can also now use multiple reference images to define characters, objects, and style. The Flow AI tool can combine those images to create a cohesive scene. Not just that, the generated video now includes audio as well. Next, you can provide a starting and ending image, and Flow creates a seamless video between the two. Plus, you can now extend an existing clip to create longer shots, up to a minute or more. This can be great if you want to generate a continuous shot. It uses the last frame of your clip to continue motion. Finally, you can add new elements or objects into a scene and Flow adjusts lighting and shadows automatically to improve realism. Not to mention, you can remove unwanted objects or characters from a scene as well. You can use the new Veo 3.1 model in the Gemini app and Flow, starting today.
[18]
Google unveils Veo 3.1 AI video generator and new features in filmmaking tool Flow - The Economic Times
Google has introduced its latest video models, Veo 3.1 Standard and Veo 3.1 Fast, through its Gemini application programming interface (API). These new versions improve on the earlier Veo 3 model released in May, offering more realistic video output and following prompts more closely. According to Google's blog, the models now have better prompt adherence and can generate "richer native audio, from natural conversations to synchronised sound effects, and offer greater narrative control with an improved understanding of cinematic styles". While Veo 3 already supported the core generation features, Veo 3.1 now adds audio to all of them, making the videos more dynamic. Pricing remains unchanged from Veo 3: $0.40 per second for Veo 3.1 Standard and $0.15 per second for Veo 3.1 Fast. The models are available through Google AI Studio and Vertex AI via the Gemini API. Veo 3.1 is also accessible in the Gemini app and the AI filmmaking tool Flow. Editing in Flow: if users are not satisfied with the first take of a generated video, they can edit it using Flow's new tools.
[19]
Google rolls out Veo 3.1 in Flow for precise and immersive video editing
Google introduced Veo 3.1, enhancing its Flow platform with richer audio, improved realism, and stronger adherence to prompts. The update adds audio to existing features like Ingredients to Video, Frames to Video, and Extend, giving creators precise control over their scenes. Veo 3.1 integrates generated audio with existing Flow tools, allowing creators to craft scenes with greater precision. These experimental features are continuously improving, with user feedback guiding future iterations. Flow also introduces new editing options for scene refinement. Veo 3.1 is included in the Gemini API and Vertex AI, enabling high-quality, precise video storytelling across multiple platforms.
[20]
Google Veo 3.1 out: Five key AI video advancements it sports
In a bold step forward for AI-generated filmmaking, Google has unveiled Veo 3.1, the latest version of its generative video model, now rolling out through the Gemini API, Vertex AI, and Google's Flow video tool. Positioned as both faster and more capable than its predecessor, Veo 3.1 is designed to bring cinematic realism and creative control within reach for filmmakers, content creators, and developers alike. While earlier versions of Veo demonstrated Google's prowess in generating high-quality short clips from text, this update shifts the focus to refinement, consistency, and storytelling depth. Here are the five key advancements that make Veo 3.1 one of the most versatile AI video tools yet. Veo 3.1 is Google's first version to feature truly integrated audio generation. Beyond visuals, it can now create synchronized dialogue, ambient soundscapes, and sound effects that match the mood and motion of the video. Whether it's the quiet hum of a city street or a burst of cinematic tension, the model ensures the auditory layer complements the visuals seamlessly. This capability extends across all modes in Flow, enabling creators to produce near-finished videos directly from a prompt. Google has introduced a new "Ingredients to Video" feature that lets users supply up to three reference images to guide generation. Think of it as a visual blueprint: you can anchor a scene's character, object, or style, and Veo 3.1 will maintain that consistency throughout the clip. This is particularly valuable for projects that require character continuity, product visualizations, or stylized storytelling, where visual coherence was previously hard to achieve with text-only prompts. Bridging static and moving imagery, the "Frames to Video" feature allows users to define the opening and closing frames of a sequence, with Veo 3.1 generating the video that transitions between them. The result: smooth cinematic arcs that maintain continuity from start to finish.
It's an ideal tool for filmmakers wanting to experiment with narrative transitions, time lapses, or visual metaphors that evolve across a clip. In a significant leap toward long-form generation, Veo 3.1 can now extend existing clips using a "scene extension" feature. By analyzing the last second of a video, it generates new frames that continue the motion and atmosphere naturally, effectively stitching short clips into longer, coherent sequences. This ability marks a major improvement for creators aiming to build story-driven videos without jarring transitions or manual edits. Google's Flow, the creative tool that sits atop the Gemini API, has received a parallel upgrade alongside Veo 3.1. It now supports advanced Insert and Remove editing functions, letting users add or erase elements from a scene while preserving lighting, perspective, and motion continuity. Although these specific controls are still rolling out gradually, they demonstrate Google's ambition to make AI video editable at a professional level, not just generative. With Veo 3.1, Google isn't just improving quality, it's changing how people direct AI videos. From sound and style control to storytelling continuity, the update makes generative video a more practical tool for creators, not just a demo of AI's visual potential. Already, partners like Promise Studios and Latitude are experimenting with Veo 3.1 to prototype storyboards and animate user-driven narratives, hinting at how mainstream these tools could soon become. As Google continues refining Flow and expanding API access, Veo 3.1 sets a new benchmark for AI-assisted filmmaking, where creative intuition and machine intelligence meet in motion.
[21]
Google introduces Veo 3.1 AI model, adds new features to filmmaking tool Flow
Veo 3.1 offers "stronger prompt adherence and improved audiovisual quality when turning images into videos," according to Google. Google has introduced the Veo 3.1 AI model and expanded capabilities for its AI filmmaking tool Flow. These updates aim to give creators more control over their videos and enhance the realism of their scenes. Veo 3.1 builds on the Veo 3 model and offers "stronger prompt adherence and improved audiovisual quality when turning images into videos," according to the tech giant. This means creators can produce videos that are more true to life, with realistic textures and seamless transitions between scenes. A major update in Flow is the addition of audio to several key features. For the first time, creators can add sound when using Ingredients to Video, Frames to Video, and Extend. These features now give users more flexibility in designing their videos. Flow also gets new editing tools for greater precision. With a new Insert feature, you can add new elements to your scenes, and Flow automatically handles shadows and lighting for a natural look. Soon, a new removal feature will allow you to seamlessly take objects or characters out of a scene while reconstructing the background as if they were never there. Veo 3.1 is available through the Gemini API, Vertex AI for enterprise users, and the Gemini app. New capabilities are accessible in the Gemini API and Vertex AI.
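For developers reaching Veo 3.1 through the Gemini API, the settings an app would need to pin down follow from the capabilities described above: a prompt, a landscape or portrait aspect ratio, a resolution, and whether to generate audio. The sketch below assembles those settings as plain data; the model identifier and field names are illustrative assumptions, not the API's real schema, so consult the official Gemini API documentation before wiring this up:

```python
# Hypothetical request settings for a Veo 3.1 generation call.
# "veo-3.1" and the config field names are assumptions for illustration;
# the real Gemini API schema may differ.

def build_video_request(prompt: str,
                        aspect_ratio: str = "16:9",
                        resolution: str = "1080p",
                        with_audio: bool = True) -> dict:
    """Assemble a settings dict for a video-generation request."""
    # Veo 3.1 generates horizontal or vertical video, per the coverage above.
    if aspect_ratio not in ("16:9", "9:16"):
        raise ValueError("aspect_ratio must be '16:9' (landscape) or '9:16' (portrait)")
    return {
        "model": "veo-3.1",  # assumed identifier
        "prompt": prompt,
        "config": {
            "aspect_ratio": aspect_ratio,
            "resolution": resolution,
            "generate_audio": with_audio,
        },
    }

req = build_video_request("A quiet forest at dawn, birdsong in the distance",
                          aspect_ratio="9:16")
print(req["config"]["aspect_ratio"])  # 9:16
```

Keeping the settings as a validated dict like this makes it easy to swap in the real client call later without touching the rest of the pipeline.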
Google's latest AI video model, Veo 3.1, introduces enhanced features for video generation, including improved audio, better prompt adherence, and new editing capabilities. The update aims to revolutionize video creation across Google's ecosystem.
Google has unveiled its latest AI video model, Veo 3.1, marking a significant advancement in the realm of AI-generated content. This update builds upon the success of Veo 3, which was released earlier this year, and introduces a range of new features and improvements aimed at enhancing the quality and versatility of AI-generated videos [1].
Veo 3.1 boasts several key improvements over its predecessor. The new model offers stronger prompt adherence, resulting in better video outputs and fewer wasted compute cycles. Audio quality, a hallmark feature of Veo 3, has been further enhanced in this update [1]. The model now supports both landscape (16:9) and portrait (9:16) video formats, catering to the increasing demand for vertical video content on platforms like YouTube Shorts and TikTok [1].
One of the most notable additions to Veo 3.1 is its ability to blend multiple images into a single, cohesive video. This feature, called "Ingredients to Video," allows users to upload separate assets that Veo will combine into a final video [3][4]. The model also introduces a "Frames to Video" feature, enabling users to provide starting and ending frames, with Veo 3.1 generating the transition between them [3][5]. Additionally, Veo 3.1 allows for the extension of short clips to over a minute in length, providing smoother transitions between scenes [3].
Veo 3.1 is being integrated across Google's AI ecosystem. It will be available through the Gemini app, Vertex AI, and the Gemini API [2]. The model is also powering Google's Flow video editor, where it unlocks additional features such as adding objects to existing assets and generating audio simultaneously with video conversion [5].
The release of Veo 3.1 comes at a time of intense competition in the AI video generation space, with companies like OpenAI also making significant strides with their Sora model [3]. This advancement in AI-generated video technology has sparked discussions about its potential impact on creative industries, with concerns raised about copyright infringement and the changing landscape of video production [3].
Google has also announced a 'Fast' variant of Veo 3.1, aimed at reducing costs for users paying per token [1]. The company plans to introduce additional features in the near future, including the ability to remove objects from videos in the Flow editor [2].
As AI-generated video technology continues to evolve, it promises to revolutionize content creation, making sophisticated video production more accessible to a wider range of users. However, it also raises important questions about the authenticity of online content and the future of traditional video production methods.
Summarized by Navi