Curated by THEOUTPOST
On Wed, 4 Dec, 12:04 AM UTC
16 Sources
[1]
Google's Imagen 3 and Veo AI Models Are Now Available on Vertex AI
Google finally launched its latest artificial intelligence (AI) image and video generation models on Wednesday. Both of these AI models were unveiled at Google I/O as the tech giant's latest advancement in generative AI. Now, more than six months later, the Mountain View-based company has introduced it on Vertex AI for its enterprise clients. Notably, while Imagen 3 was not available as a standalone platform so far, it was being used in several platforms and tools such as Google Docs, Gemini, and an experimental tool called GenChess. In a blog post, the tech giant announced the introduction of the two new AI models in Vertex AI. Google's Vertex AI platform is a managed machine learning (ML) platform on Google Cloud that allows developers and enterprises to build, deploy, and manage AI models. It is similar to Amazon Bedrock and Microsoft Azure and offers integrated tools and solutions for AI workflows. The tech giant stated that the Veo video generation model is now available on Vertex AI in private preview and businesses can generate videos using text or image prompts. On the other hand, Imagen 3 will be made available starting next week. It takes text prompts and enterprises can use it to generate images that reflect their brand style and logos. Coming to the capabilities of Veo, Google says it can generate high-quality videos based on either text or image prompts. The generated videos can be in a wide range of cinematic and visual styles. Developed by DeepMind, the AI model is said to have high prompt adherence and can generate consistent footage of objects and people and even capture movements realistically. Imagen 3, which will be available in Vertex AI starting next week, can generate photorealistic images in a wide range of styles. Calling it "our most capable image generation model yet", Google stated that the image generation model can understand natural language prompts and users do not have to describe the technical elements to get the desired result. 
The Imagen 3 AI model will also be available with editing tools for inpainting and outpainting. Companies can also infuse their brand's colours, styles, logos, and other elements in the generated images. For privacy and safety, the tech giant has added several tools. SynthID, the watermarking technology developed by DeepMind, will be embedded into every image and frame of video that these AI models produce to combat instances of deepfakes and misinformation. Google also stated that the AI models will not be trained on customer data and the tools will operate following Google Cloud's data governance and privacy controls.
[2]
Google expands Vertex AI with video generator AI model Veo - SiliconANGLE
Google LLC's cloud division today announced that Veo, Google's artificial intelligence model that can generate lifelike video from text or images, will be available in private preview for customers via Vertex AI. The company also said that Imagen 3, Google's highest-quality text-to-image AI model, will be available to all Google Cloud customers globally on Vertex. Vertex AI is a managed platform that allows users to train and deploy machine learning models and customize generative LLMs for use in AI-powered applications. It enables users to build, customize, test and experiment with AI models with several tools that cater to both highly technical and less savvy users. It includes access to many foundation models from Google and other sources. Google unveiled Veo during its Google I/O conference this year and it is the company's most capable video generation AI model developed by DeepMind, Google's AI research division. It can create high-quality 1080p resolution video with 24 or 30 frames per second in numerous cinematic and visual styles that can go beyond a minute in length. "The model provides an unprecedented level of creative control and understands cinematic terms like 'timelapse' or 'aerial shots of a landscape,'" Google wrote about the model in its initial announcement. "Veo creates footage that's consistent and coherent, so people, animals and objects move realistically throughout shots." Veo can use its deep understanding of natural language to produce video from just text but users can also provide images to condition its outputs. By providing a reference image with a text prompt, Veo will generate a video based on the image and use the user's written text to animate it appropriately. For example, a user could give the model an image of a dog looking at the camera and the text: "The dog gets up, wags his tail, looking at the camera, happy and full of life." 
The model will produce a video with the dog hopping up, tail wagging joyfully, ears flapping, looking side-to-side as if expecting a treat. In September, YouTube announced that users would be able to generate six-second Shorts clips and backgrounds for their videos using the Veo model using simple text prompts. The company said the animated video backgrounds would launch this year, while six-second video clips would arrive in 2025. Now publicly available to all customers, Imagen 3 is Google's latest and best image generation foundation model that produces lifelike rendering from natural language prompts with numerous improvements over previous models. Imagen 3 produces high-quality images with good lighting and composition based on conversational prompts. It has been upgraded to accurately render fine details such as wrinkles on a person's hand or complex textures such as small pieces of grass or the cracks in a brick wall. It also has improved control over the placement of text in produced images. Text production by text-to-image models can be problematic as these models can sometimes generate gibberish or misunderstand prompts. Additionally, Google said, new features will be coming to Imagen 3 for customers on the allowlist that will help companies edit and customize images. These features include user-friendly ways to change parts of images using prompts such as inpainting, where a section can be selected and modified specifically, and outpainting, where the image can be expanded. In each case the user provides a description of what should appear in the area and the model will generate the appropriate change. Imagen 3 also has new customization controls that allows users to guide the model with brand, style, logo, subject and product features for generating new images. 
For example, the user can provide the model with an image of a product, such as a dress, shoes or other fashion items and ask it to generate people wearing them and it will create lifelike images of individuals with those products in the appropriate detail. This will make certain that brand, style and other marketable elements will be respected, the company said. "Our collaboration with Google has been instrumental in harnessing the power of generative AI, notably through Imagen 3, to revolutionize content production," said Jon Halvorson, senior vice president of consumer experience and digital commerce of Mondelez International Inc., the snack food business formerly called Kraft Foods Inc. "This technology has enabled us to produce hundreds of thousands of customized assets, enhancing creative quality while significantly reducing both time to market and costs."
[3]
Google's generative AI video model is available in private preview
Google has begun rolling out private access to its Veo and Imagen 3 generative AI models. Starting today, customers of the company's Vertex AI Google Cloud package can begin using Veo to generate videos from text prompts and images. Then, as of next week, Google will make Imagen 3, its latest text-to-image framework, available to those same users. With Veo's rollout, Google says it's the first hyperscale cloud provider to offer an image-to-video model. To that point, OpenAI's Sora model is still only available to select artists, academics and researchers -- though that could change quickly with the company teasing 12 days of product demos starting December 5. Of Veo, Google says the model creates 1080p footage "that's consistent and coherent" and can run "beyond a minute." The tool is also capable of working with both text prompts and images. In the latter case, it's possible to use either AI-generated or human-made pictures as the starting point for a video. Looking at the sample footage Google shared, it's evident Veo, like all AI models, can struggle with cause and effect. For example, in the clip of the roasting marshmallows, the treats don't yellow and char as they're exposed to the heat of a campfire flame. Artifacting is also an issue, as is apparent if you look closely at the hands in the concert footage. As for Imagen 3, Google says the model generates "the most realistic and highest quality images from simple text prompts, surpassing previous versions of Imagen in detail, lighting, and artifact reduction." Here again, however, you don't have to look too closely to see Google has more work to do. In the first example of a group of friends sitting on the trunk of a car, the original prompt includes mention of "flash photography," but the subjects are clearly backlit. 
One could argue that a flash was used to create intense backlighting, but if the idea behind the prompt was to create something representative of flash photography from the 1960s, this image isn't it. Still, Google is keen to get more of its enterprise customers using generative AI. Citing its own research, the tech giant says among companies using generative AI in production, 86 percent report an increase in revenue. However, a recent Appen survey found return on investment from AI projects fell by 4.6 percentage points from 2023 to 2024.
[4]
Google Cloud launches Veo AI video generator model on Vertex
As Amazon takes a major step into the AI space with its new Nova family of foundation models, Google is doubling down on its own multimodal AI capabilities. The tech giant's cloud division has announced that its latest video and image-generation models, Veo and Imagen 3, are now available on Vertex AI. This move empowers teams to integrate cutting-edge video and image-generation capabilities into their AI workflows, unlocking diverse use cases -- especially in marketing and advertising. It also makes Google Cloud the first hyperscaler to offer a video model to its customers. While the Veo model is currently in private preview, Imagen 3 will be generally available to all Vertex AI users starting next week. Notably, Imagen 3 also includes editing features, enabling users to refine generated images to meet specific creative needs.
What do Veo and Imagen 3 offer?
First unveiled at Google's I/O developer conference, Veo is Google DeepMind's response to competitors like Runway's Gen-3 and OpenAI's Sora, delivering a sophisticated video-generation experience. The model transforms text or image prompts into cinematic, high-definition videos in various visual styles, generating clips over 60 seconds long. What sets it apart is frame-level consistency, ensuring subjects move seamlessly within shots. Imagen 3, also from DeepMind, takes on the task of text-to-image generation, producing photorealistic visuals in a variety of styles. Google claims it surpasses its predecessors in detail, lighting accuracy and artifact reduction. Beyond generation, users on Google's allowlist can also access advanced customization options with Imagen 3. These include image upscaling, inpainting, outpainting and background replacement -- all guided by text prompts.
Additionally, users can provide reference images, enabling Imagen 3 to create content aligned with specific brand aesthetics, logos or product features.
Broader implications for industry
Vertex AI has long been Google Cloud's flagship platform for streamlining AI application development and deployment. By integrating Veo and Imagen 3, the platform offers organizations an even more comprehensive suite of tools to innovate in marketing, sales and beyond. Imagen 3, for instance, simplifies the creation of high-quality assets such as product images and social media content, while Veo extends this capability by offering teams an option to convert these visuals into polished videos. This speeds up production, cuts costs, and accelerates prototyping, allowing teams to iterate rapidly on their creative strategies. "Customers like Agoda are using the power of AI models like Veo, Gemini, and Imagen to streamline their video ad production, achieving a significant reduction in production time," said Warren Barkley, senior director of product management at Google, in a blog post. He also highlighted that both models include safety features like digital watermarking and content moderation guardrails to mitigate risks associated with generative AI. Other early adopters include Mondelez International -- owner of brands such as Oreo, Cadbury, and Milka -- and global marketing and communications service WPP. As Google's foundation models expand their reach, businesses across industries have a powerful opportunity to reimagine how they create and deliver visual content.
Competition continues to heat up
While all major cloud providers, including Google Cloud, Amazon Web Services and Microsoft Azure, have been providing image generation models on their respective AI orchestration platforms, video generation has been quite a rarity thus far. Google's move to launch Veo in private preview today changes that.
Interestingly, soon after the Veo announcement, AWS made a splash at re:Invent with the announcement of Nova Reel, a foundation model that generates six-second-long studio-quality videos from text and image prompts. This model, along with others in the Nova family, is set to become available via Amazon Bedrock, the company's fully managed service designed to simplify the creation and deployment of generative AI applications. Microsoft, on its part, appears to be lagging in this category at this stage. Its AI Foundry does not include models for video generation. However, we expect that to change as soon as OpenAI's Sora hits the market.
[5]
Google Expands its Generative AI Software Offerings with Veo and Imagen 3 - Phandroid
Regardless of what one might think of AI-generated content, it's clear that a lot of major brands and companies are leaning towards its integration within their operations moving forward. One big example is Google, which has been a major player in pioneering AI software for consumer and enterprise-grade solutions. Back in May, Google announced that it was working on Veo and Imagen 3 (via Google DeepMind), which are capable of generating AI-based content under its Vertex AI platform. With that said, both are finally official, and will allow users to generate images on Imagen 3, or videos based on images or directly via prompts using Veo. Veo is currently available in private preview on Vertex AI, which essentially makes Google the first hyperscaler to offer high-quality image-to-video content; Imagen 3, on the other hand, launches next week on Vertex AI. Google does reassure users that content created from either Veo or Imagen 3 will feature DeepMind's SynthID, which is a measure to prevent users from passing off content as authentic, real-life scenarios. Veo and Imagen 3 are just two of Google's many AI products, the most popular (and highly-marketed) being Gemini, which is available on different device platforms worldwide.
[6]
Meet Veo, Google's AI that creates high-def videos from text
Google has announced the launch of two generative AI models, Veo and Imagen 3, available for businesses using Vertex AI, its cloud platform for AI tools. Veo is designed to generate high-definition videos from images and text prompts, while Imagen 3 focuses on producing realistic images from simple text inputs. Veo, developed by Google DeepMind, generates videos featuring realistic-looking people and animals. Users can create content by uploading an image tied to a text prompt or by inputting text alone. Currently, Veo is accessible to select businesses through a private preview. It produces 1080p video clips lasting up to six seconds, supporting 24 or 30 frames per second. According to Warren Barkley, senior director of product management at Google Cloud, the enterprise response to generative AI has been overwhelmingly positive, with 86% of companies that have integrated these technologies reporting an increase in revenue.
[Video: Google. Prompt: "Timelapse of the northern lights dancing across the Arctic sky, stars twinkling, snow-covered landscape"]
Imagen 3, also newly launched, is touted as Google's highest quality image generation model. It can create photorealistic images and offers advanced editing capabilities, such as adding, removing, or extending elements within an image. Starting next week, all Vertex AI customers will have access to Imagen 3. Brands like Cadbury, Oreo, and Milka are among the first to utilize these models in their marketing strategies. Both models incorporate digital watermarks to prevent misinformation and misattribution, utilizing Google DeepMind's SynthID technology. Additionally, they include built-in safeguards to prevent misuse and the generation of harmful content. Importantly, neither model is trained on customer data. The availability of Veo in a private preview will allow businesses like Quora and Mondelez International to explore creative applications, such as generating video content for their platforms.
Veo's ability to create scenes with specific visual styles is one of its standout features. It can produce dynamic content, including landscape shots and time-lapse videos. However, the model is not without flaws. Issues like disappearing objects and unrealistic physics, such as reversing vehicles, highlight its current limitations.
[Video: Google. Prompt: "A fast-tracking shot down an suburban residential street lined with trees. Daytime with a clear blue sky. Saturated colors, high contrast"]
Veo has been trained on a diverse range of footage to enhance its capabilities. When asked about its training sources, Barkley mentioned that it "may" include content from YouTube, in line with agreements with content creators. He emphasized that Google focuses on using high-quality, curated data, adhering to safety and security standards. As with other AI models, concerns about copyright and proprietary content arise, especially with the potential for models to output nearly identical copies of existing work. Google asserts that it has implemented prompt-level filters to manage potentially harmful outputs. Additionally, the company plans to indemnify output from Veo on Vertex AI once it becomes generally available, offering some protection for businesses utilizing the tool. Google is gradually integrating Veo into its suite of products, as evidenced by its introduction into Google Labs earlier this year following initial announcements. In September, the model was incorporated into YouTube Shorts, allowing creators to produce background scenes and brief video clips easily.
[7]
Google Steals the Spotlight from OpenAI Sora, Launches Veo on Vertex AI
Developed by Google DeepMind, Veo includes safety features such as digital watermarking, safety filters, and data governance measures.
Google Cloud has introduced Veo, a video generation model, and Imagen 3, an advanced image generation tool, on its Vertex AI platform. Veo, currently in private preview, generates high-quality videos from text or image prompts. It enables businesses to create realistic and coherent footage efficiently, reducing production time and costs. "Veo, now available on Vertex AI in private preview, empowers companies to effortlessly generate high-quality videos from simple text or image prompts. As the first hyperscaler to offer an image-to-video model, we're helping companies transform their existing creative assets into dynamic visuals," the company said in its blog post. Imagen 3, now generally available, offers photorealistic image generation with improved detail and reduced visual artifacts compared to earlier versions. Imagen 3 incorporates editing tools and customisation options, allowing businesses to align output with brand requirements. Businesses like Mondelez International and WPP are using these models to accelerate content creation. Mondelez has utilised Imagen 3 for marketing campaigns and plans to adopt Veo for video production. WPP is integrating these tools into its AI-powered platform, WPP Open, to enhance creative workflows. Agoda, a digital travel platform, is experimenting with these technologies to develop customised visuals for promotions. Earlier, Google announced that YouTube is set to roll out advanced generative AI tools to creators over the coming months, enabling them to generate video content using AI models Veo and Imagen 3 through a feature called Dream Screen.
Meanwhile, OpenAI's popular text-to-video tool, Sora, recently became the talk of the internet after it was leaked on Hugging Face. Sora's API, which had been available to some artists for early testing, was leaked. Not long afterwards, however, the Hugging Face page began failing with a 502 error, apparently due to high traffic. The company learned of the incident soon enough and shut down access three hours after the leak surfaced. OpenAI has yet to release Sora officially.
Competition Galore
With the rise and impact of other tools like Runway, Pika, Midjourney and KlingAI over the past year, it has become difficult for creators to keep Sora's capabilities in mind. Runway recently partnered with top entertainment and media company Lionsgate to develop customised versions of Gen-3 Alpha. Unlike OpenAI, Runway has also made Gen-3 Alpha available to all users, though the model remains subscription-based. Meta also recently introduced its video generation model, Movie Gen, a 13B parameter model designed for video and text-to-audio generation. Its chief features include generating videos from text, editing videos with text, producing personalised videos, and creating sound effects. The model is not publicly available yet. Chinese companies have also taken this opportunity to emerge as major competitors, surpassing the capabilities of several existing platforms. Kuaishou, a Chinese competitor to TikTok, launched its powerful AI video tool, Kling, this year, which users have adopted as a direct alternative to Sora. Tencent also released its 13B open-source Hunyuan video generation model.
[8]
Google's AI video generator rolls out in private preview. Here's who can access
Veo, Google's Sora competitor, can now be used to create company assets, and the results are impressive. Take a look for yourself. Despite being late to the image generation space, Google's Imagen models have proven highly competitive, even powering ZDNET's overall top pick for best image generator. Now, the company is expanding into text-to-video generation and making its model Veo available to Google Cloud customers. On Tuesday, Google announced that Veo is available in private preview on Vertex AI, the Google Cloud platform where developers and data scientists can more easily access, build, train, and deploy large language models (LLMs) for their applications and use cases. Veo was first unveiled in May at Google I/O. According to Google, Veo is the company's most advanced video generation model, capable of creating realistic videos that adhere to a user's prompt at 24 or 30 fps. In the examples provided, the generations look impressive, tackling the issue of consistency between motions, which is a big challenge for video generators. Along with text prompts, the model can also use reference images to create videos that bring pictures to life, remaining consistent in style. Vertex AI users will also have access to Imagen 3, the company's most advanced text-to-image generator, which now has a customization feature that enables users to include a reference image, making it easier to create brand assets. Imagen 3 also adds a new editing feature that makes it easier for users to fine-tune generated images by inpainting parts of them or by outpainting, which expands the image further.
According to Google, potential customer use cases include generating images or videos for marketing and advertisement purposes, such as social media content and assets for blogs and events, and even creating film clips. To get started with Vertex AI, visit the webpage, which contains many educational materials, including tutorials, a glossary, and tips. You can also start a free trial or contact the sales team for more information.
[9]
Google Veo beats OpenAI Sora to the punch with a video generator that's actually available
Key Takeaways
Google's AI video generator, Veo, is now in private preview for Google Cloud customers.
Veo is being marketed to advertisers for making cheaper, quicker ads.
Imagen 3, Google's upgraded text-to-image generator, will also be available to all Google Cloud customers next week.
In 2023, AI image generators like Midjourney and Stable Diffusion were impressive, but they became a thing of the past when OpenAI introduced Sora, its AI video generator. The startup unveiled Sora in February 2024 but hasn't shared any meaningful updates on its release timeline since then. Google, however, has already made its AI video generator, Veo, available to customers, despite unveiling it three months after OpenAI at its developer conference in May. Google announced on Tuesday that Veo, its AI tool for making short video clips from images and prompts, is now available in private preview for customers using Vertex AI, Google Cloud's AI platform (via The Verge). Veo can churn out 1080p videos in different visual and cinematic styles based on text or image prompts. It's not clear how long the videos can be, though. At launch, Google hinted they'd be about a minute long, but it didn't mention any length limits for this preview release. Google is pitching Veo to advertisers as a way to make ads faster and cheaper. Companies like Mondelez International (the owner of Oreo), WPP, and travel agency Agoda are already using Veo to create marketing content. Google even shared an example of how Agoda's team used Veo to make a promotional video, and... well, it's not quite there yet. The videos look semi-realistic, which might work for some advertisers, but anyone serious about their marketing game might want to hold off on AI video generators for now.
The latest version of Google's Imagen 3 is also here
Google announced that the latest version of its Imagen 3 text-to-image generator will also be available to all Google Cloud customers via Vertex starting next week. This expands on its original US-only release through Google's AI Test Kitchen in August. Users on Google's allow list will also get access to new features like prompt-based photo editing and the ability to add their own brand, style, logo, or product features to the generated images. Google says Imagen 3 creates the most realistic and high-quality images from simple text prompts, with better detail, lighting, and fewer artifacts than previous versions. The images look pretty realistic at first glance, but if you look closely, you'll see there's still some work to be done.
[10]
Google's new generative AI video model is now available
Veo, Google's latest generative AI video model, is now available for businesses to start incorporating into their content creation pipelines. After first being unveiled in May -- three months after OpenAI demoed its competing Sora product -- Veo has beaten it to market by launching in a private preview via Google's Vertex AI platform. Veo is capable of generating "high-quality" 1080p resolution videos in a range of different visual and cinematic styles from text or image-based prompts. When the model was first announced these generated clips could be vaguely "beyond a minute" in length, but Google doesn't specify length restrictions for the preview release. Some new example clips in Google's announcement are on par with what we've already seen from Veo -- without a keen eye, it's extremely difficult to tell that the videos are AI-generated.
[11]
Google announces generative video model 'Veo' to compete with OpenAI's impending Sora AI model
Google has just introduced its new generative AI video model, Veo, beating OpenAI's text-to-video service Sora to market by launching Veo into private preview on Google's in-house Vertex AI platform. Google's new Veo model can generate "high-quality" 1080p resolution videos in multiple different visual and cinematic styles, all from text or image-based prompts. The search giant unveiled its text-to-video model a few months ago with generated clips that would be "beyond a minute" in length, but the company didn't specify a limit for this release. Still, the videos made by Veo are pretty astounding. The latest version of Google's Imagen 3 text-to-image generator will be online and available for all Google Cloud customers on Vertex "starting next week," says the company, expanding on its US-first release on Google's AI Test Kitchen in August 2024. Google is pitching its new Veo to advertisers as a way to make ads both faster and cheaper. Companies like Mondelez International (the owner of Oreo), WPP, and travel outlet Agoda are already using Veo to create marketing content. Google went as far as showing the video that Agoda's marketing team made using Veo, which looks good... but it has that obvious "this was made with AI" feel, and I don't like it at all. It's an impressive technological feat, but I don't like the future of AI-made marketing at all. I can see companies that want to spend the least amount of money possible using text-to-video services like Veo, and I know we'll see a flood of these videos in the months and years to come.
[12]
Veo, Google's video-generating AI, arrives at Google Cloud with Vertex AI - Softonic
You will be able to generate videos of up to 6 seconds in 1080p and with various styles
Google has announced that Veo, its artificial intelligence model designed to generate short videos from images and texts, will be available as a private preview for Google Cloud customers using the Vertex AI platform. This expansion will allow companies to integrate AI into their platforms or use it for multiple purposes, such as generating advertising clips. Presented in May, Veo generates videos of up to six seconds in 1080p resolution with a wide variety of visual styles, such as landscapes and time lapses. Google claims it has optimized the model for businesses, now allowing the creation of videos in 720p and in landscape or portrait formats. "Since its launch, we have improved Veo for enterprise customers," stated Warren Barkley, product director at Google Cloud, in a statement. "We will continue to add capabilities, as we have done with other models in Vertex AI." The model also stands out for understanding visual effects like explosions and basic notions of physics, although it is not perfect. Sometimes, objects inexplicably disappear or the physical dynamics are incorrect, representing, for example, cars that move in a completely unrealistic way (limitations that reflect the current challenges of generative AI). Additionally, Veo allows editing specific regions of a video and creating longer projects, competing with tools from Adobe, OpenAI and other companies. Google has trained Veo with large volumes of videos and claims that it uses publicly available data. However, this focus on legality does not put the company in a good position, given that it does not offer mechanisms for creators to remove their content from the datasets used for AI training. That said, to avoid other types of legal issues, Google has included filters for violent or explicit content and has implemented a compensation policy for Veo users in Vertex AI.
The model has already been integrated into YouTube Shorts and other platforms. However, the risks of this technology being used to create "deepfakes", as well as the impact on the creative industry, where thousands of jobs are expected to be affected by AI, make Google currently adopt a cautious approach with Veo.
[13]
Vertex AI customers can generate media with Imagen 3 and Veo.
Google Cloud customers can now use Imagen 3 and Veo to create images and videos for their businesses. Starting next week, all Google Cloud customers will be able to access Imagen 3 on Vertex AI. Imagen 3 generates high-quality images based on text prompts. Customers will get access to new tools that can help them edit and customize images generated by Imagen 3. Veo, which generates video based on text or image prompts, is now available to Vertex AI customers in private preview. This makes Google Cloud the first cloud provider of its scale to offer a generative AI service that takes still images and transforms them into videos.
[14]
Google's video generator comes to more customers | TechCrunch
Google's video generator is coming to a few more customers -- Google Cloud customers, to be precise. On Tuesday, Google announced that Veo, its AI model that can generate short video clips from images and prompts, will be available in private preview for customers using Vertex AI, Google Cloud's AI development platform. Google says that the launch will enable one customer, Quora, to bring Veo to its Poe chatbot platform, and another, Oreo owner Mondelez International, to create marketing content with its agency partners. "We created Poe to democratize access to the world's best generative AI models," Poe product lead Spencer Chan said in a statement. "Through partnerships with leaders like Google, we're expanding creative possibilities across all AI modalities." Unveiled in April, Veo can generate 1080p clips of animals, objects, and people up to six seconds in length at either 24 or 30 frames per second. Google says that Veo is able to capture different visual and cinematic styles, including shots of landscapes and time lapses, and make edits to already-generated footage. Why the long wait for the API? "Enterprise readiness," says Warren Barkley, senior director of product management at Google Cloud. "Since Veo was announced, our teams have augmented, hardened, and improved the model for enterprise customers on Vertex AI," he said. "As of today, you can create high definition videos in 720p, in 16:9 landscape or 9:16 portrait aspect ratios. Similar to how we have improved capabilities of other models such as Gemini on Vertex AI, we will continue to do this for Veo." Veo understands VFX reasonably well from prompts, says Google (think captions like "enormous explosion"), and has somewhat of a grasp on physics, including fluid dynamics. The model also supports masked editing for changes to specific regions of a video, and is technically capable of stringing together footage into longer projects. 
In these ways, Veo is competitive with today's leading video-generating models -- not only OpenAI's Sora, but models from Adobe, Runway, Luma, Meta, and others. That's not to suggest that Veo's perfect. Reflecting the limitations of today's AI, objects in Veo's videos disappear and reappear without much explanation or consistency. And Veo often gets its physics wrong. For example, cars will inexplicably, impossibly reverse on a dime. Veo was trained on lots of footage. That's generally how it works with generative AI models: provided with example after example of some form of data, the models pick up on patterns in the data that enable them to generate new data -- videos, in Veo's case. Google, like many of its AI rivals, won't say exactly where it sources the data to train its generative models. Asked about Veo specifically, Barkley would only say the model "may" be trained on "some" YouTube content "in accordance with [Google's] agreement with YouTube creators." (Google's parent company, Alphabet, owns YouTube.) "Veo has been trained on a variety of high-quality, video-description data sets that are heavily curated for safety and security," he added. "Google's foundational models are trained primarily on publicly available sources." Reporting by The New York Times in April revealed that Google broadened its terms of service last year in part to allow the company to tap more data to train its AI models. Under the old ToS, it wasn't clear whether Google could use YouTube data to build products beyond the video platform. Not so under the new terms, which loosen the reins considerably. While Google hosts tools to let webmasters block the company's bots from scraping training data from their websites, it doesn't offer a mechanism to let creators remove their works from its existing training sets. 
Google maintains that training models using publicly available data is fair use, meaning the company believes it isn't obligated to ask permission from -- or compensate -- data owners. (Google says it doesn't use customer data to train its models, however.) Thanks to the way today's generative models behave when trained, they carry certain risks, like regurgitation, which refers to when a model generates a mirror copy of training data. Tools like Runway's have been found to spit out stills substantially similar to those from copyrighted videos, laying a possible legal minefield for users of the tools. Google's solution is prompt-level filters for Veo, including for violent and explicit content. In the event those fail, the company says its indemnity policy provides a defense for eligible Veo users against allegations of copyright infringement. "We plan to indemnify Veo outputs on Vertex AI when it becomes generally available," Barkley said. Over the past few months, Google has slowly built Veo into more of its apps and services as it works to polish the model. In May, Google brought Veo to Google Labs, its early access program, for select testers. And in September, Google announced a Veo integration for YouTube Shorts, YouTube's short-form video format, to allow creators to generate backgrounds and six-second video clips. What about the deepfake risks of all this, you might be wondering? Google says that it's using its proprietary watermarking technology, SynthID, to embed invisible markers into frames that Veo generates. Granted, SynthID isn't foolproof against edits, and Google hasn't made the content ID piece available to third parties. These may be moot points if Veo doesn't gain meaningful traction. On the partnerships front, Google has ceded ground to generative AI rivals, who've moved quickly to woo producers, studios, and creative agencies with their tools. 
Runway recently signed a deal with Lionsgate to train a custom model on the studio's movie catalog, and OpenAI teamed up with brands and independent directors to showcase Sora's potential. Google at one point said it was exploring Veo's applications in collaboration with artists including Donald Glover (AKA Childish Gambino). The company gave no update on those outreach efforts today. Google's pitch for Veo -- a way to reduce costs and quickly iterate on video content -- runs the risk of alienating creatives. A 2024 study commissioned by the Animation Guild, a union representing Hollywood animators and cartoonists, estimates that more than 100,000 U.S.-based film, television, and animation jobs will be disrupted by AI by 2026. That might explain Google's cautious, "slow and steady" approach. When asked, Barkley wouldn't give an ETA for Veo's general availability in Vertex, nor would he say when Veo might come to additional Google platforms and services. "We typically release products in preview first, as it allows us to get real-world feedback from a select group of our enterprise customers before it becomes generally available for wider use," he said. "This helps improve functionality and ensure the product meets the needs of our customers." In a related announcement today, Google said that its flagship image generator, Imagen 3, is now available for all Vertex AI customers without a waitlist. It's gained new customization and image editing features -- but these are gated behind a separate waitlist for now.
[15]
Google Takes The Lead With Veo, Its Generative AI Video Model, While OpenAI's Sora Remains Missing From The Market
In the race for generative AI dominance, companies are constantly exploring new capabilities to establish themselves or take the lead. OpenAI has grown substantially and keeps introducing upgrades to its models. The company has, however, long promised users a video generation model, Sora, that still remains absent from the market. Google, on the other hand, has gained a notable head start by launching Veo in private preview. While OpenAI was the first to unveil its generative AI video model, Sora, back in February, and even showcased its ability to create 60-second videos from text prompts, Google has now taken the lead by introducing its competitor, Veo, in private preview through its Vertex AI platform. Google showcased its AI video model about three months after OpenAI demonstrated Sora. During the I/O developer conference, Google claimed that Veo could generate high-quality videos more than a minute long in a variety of visual and cinematic styles. While little information is available on whether clip length in the preview will match what was demonstrated previously, the new clips show impressively realistic AI video. The videos are of high quality, making them extremely difficult to tell apart from real footage without close scrutiny. Google's latest text-to-image generator, Imagen 3, will soon be available to Google Cloud customers, with the broader rollout beginning next week. New features will be available to users on the allowlist, giving them options for customization, prompt-based photo editing, and adjusting logos or other elements in generated images.
Google continues to highlight that both Veo and Imagen 3 come with built-in safeguards intended to prevent infringement and block inappropriate content. However, testing has shown that it is not impossible to bypass those safeguards. Content generated by the models carries an invisible digital watermark created with DeepMind's SynthID technology, making it easier to tell whether the content is in fact AI-generated. With Google launching Veo in private preview, the pressure is on OpenAI to release Sora, which users have long awaited and which was promised near the end of 2024.
[16]
Google AI Video Generator Veo is Made For Creating AI Slop
Google’s Veo is being marketed to advertisers for making cheaper, quicker ads, even though consumers may hate it. Today’s Mad Men aren’t smooth-talking suits trying to convince CEOs their brands need to reassure people “You are okay”; they’re tech companies like Google filling the airwaves with cheap, mindless AI slop. Google now has an AI video generator called Veo, and it can take any image, even an AI-generated one, and transform it into a mini-movie. The videos look semi-realistic, which is good enough for the advertisers Google is trying to sell its tool to. Google is the first to put its new model into private preview on its business-centric Vertex AI platform. Veo joins Meta Movie Gen and Runway’s video generation tool, though it also beats OpenAI’s Sora model to a wider launch. Last month, artists leaked a version of Sora to the internet in protest of using AI for artistic purposes. Veo can generate video from text prompts and/or images. Google is combining it with its existing Imagen 3 model for any business that wants to create brand-specific images. The AI video generator can create videos with different styles in mind. The company showcased a few examples, including a cartoon man looking up from his desk and smiling. There’s another video of a teddy bear strumming a guitar without any fingers to pluck the strings. Google says these images include invisible, digital watermarks, and the model should stop users from generating anything that might get the public mad if they see it pop up in their next cookies ad. Imagen 3 further allows users to modify images using its “Inpaint” generation to create new objects in an existing photo or its “Outpaint” tool to expand an image, with AI filling in the missing space. It’s akin to Adobe’s existing Firefly tools in programs like Photoshop. In its blog post, the Mountain View tech monolith said it was working with advertisers from Mondelez International. Never heard of them?
They’re the massive international conglomerate that owns brands like Chips Ahoy!, Oreo, Ritz, and Nabisco, to name a few. That’s a lot of brands, and the companies together think that consumers demand fast, cheap ads streamed straight to their eyeballs. Google wrote that its Veo model will allow “rapid development of consumer-ready visuals at scale” for its many brands in languages specific to all the countries where it sells its snacks. Google said it’s also working with travel app Agoda. The company claimed it is already working on AI ads, including this odd-looking travel ad that will make you break your mouse trying to click the “skip ad” button. Like most AI video generators, Veo may be limited in how much footage it can generate. In effect, the generated ads may appear like a jumble of random stock videos. Agoda’s chief marketer, Matteo Frigerio, said the company created images of “dream destinations” and animated them with Veo. The end result of these promotions is more ads of dubious quality. The modern advertising model relies on cheap ads made quickly, so much so that they can fill every ad space on your social media feed. Mondelez’s VP of consumer experience and digital commerce says the company already uses Imagen 3 to produce “hundreds of thousands of customized assets” to reduce “time to market and costs.” The visuals for AI video may improve over time, though that won’t necessarily change how samey, drab, and lifeless they’ll be. A survey from the technology research firm YouGov showed about half of consumers don’t like ads that use AI-generated images, whether for humans or products. Perhaps advertisers will learn the hard way that you can’t cheap out on human-created content, even for ads.
Google has introduced its advanced AI models, Veo for video generation and Imagen 3 for image creation, on its Vertex AI platform, marking a significant advancement in generative AI technology for enterprise clients.
Google has made a significant move in the generative AI space by launching its advanced AI models, Veo and Imagen 3, on the Vertex AI platform. This development positions Google Cloud as the first hyperscale cloud provider to offer an image-to-video model to its enterprise customers [1][4].
Veo, Google's state-of-the-art video generation model, is now available in private preview on Vertex AI. Developed by DeepMind, Veo can create high-quality 1080p videos at 24 or 30 frames per second, with durations extending beyond a minute [2]. The model accepts both text and image prompts, offering unprecedented creative control and understanding of cinematic terms [2].
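Key features of Veo include text-to-video and image-to-video generation, support for a range of visual and cinematic styles, masked editing of specific regions of a video, and output in 16:9 landscape or 9:16 portrait aspect ratios.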
Imagen 3, set to launch next week, is Google's most capable image generation model to date. It produces photorealistic images from natural language prompts, surpassing previous versions in detail, lighting, and artifact reduction [1][3].
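Imagen 3 offers photorealistic image generation across a wide range of styles and understands natural language prompts, so users do not need to spell out technical details to get the result they want.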
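For customers on the allowlist, Google is introducing additional features for Imagen 3, including customization, prompt-based image editing such as inpainting and outpainting, and the ability to adjust logos and other brand elements in generated images.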
The integration of Veo and Imagen 3 into Vertex AI expands the platform's capabilities, offering organizations a comprehensive suite of tools for AI-driven innovation in marketing, sales, and content creation [4]. This move is expected to streamline production processes, reduce costs, and accelerate prototyping for creative strategies.
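Google has implemented several safety features, including invisible digital watermarks embedded with DeepMind's SynthID technology, filters for violent and explicit content, and a planned indemnity policy covering eligible Veo users against copyright infringement claims.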
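Early adopters of these technologies include Quora, which is bringing Veo to its Poe chatbot platform; Mondelez International, which is creating marketing content with its agency partners; and the travel platform Agoda, which animates generated destination images with Veo.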
Google's launch of Veo and Imagen 3 comes amidst increasing competition in the AI space, with rival video generators from OpenAI (Sora), Runway, Luma, Adobe, and Meta (Movie Gen).
As generative AI continues to evolve, these advancements by Google represent a significant step forward in making sophisticated AI tools accessible to businesses, potentially transforming various industries through enhanced content creation capabilities.
Reference
[1]
[4]
[5]
Phandroid - Android News and Reviews
Google Expands its Generative AI Software Offerings with Veo and Imagen 3 - Phandroid
© 2024 TheOutpost.AI All rights reserved