Curated by THEOUTPOST
On Tue, 7 Jan, 8:03 AM UTC
5 Sources
[1]
Google is building its own 'world modeling' AI team for games and robot training
World models are a relatively new development within AI that could serve a variety of purposes, such as creating real-time interactive media environments for video games and movies, and realistic training scenarios for robots and other AI systems. The new team is also part of Google's push to achieve an artificial general intelligence system, or AGI, before its competitors. "DeepMind has ambitious plans to make massive generative models that simulate the world," Tim Brooks, who will lead the team, announced in an X post on Monday. Brooks included two open job listings for research engineers and scientists who will help to advance AI "world models" capable of simulating real-world scenarios by solving problems around training "at massive scale," curating training data, and studying how they can be integrated with multimodal language models.
[2]
Google is forming a new team to build AI that can simulate the physical world | TechCrunch
Google is forming a new team to work on AI models that can simulate the physical world. Tim Brooks, one of the co-leads on OpenAI's video generator, Sora, who left for Google's AI research lab, Google DeepMind, last October, will lead the new team, he announced in a post on X. It'll be a part of Google DeepMind. "DeepMind has ambitious plans to make massive generative models that simulate the world," Brooks wrote Monday morning. "I'm hiring for a new team with this mission." According to job listings Brooks linked to in his post, the new modeling team will collaborate with and build on work from Google's Gemini, Veo, and Genie teams to tackle "critical new problems" and scale models "to the highest levels of compute." Gemini is Google's flagship series of AI models for tasks like analyzing images and generating text, while Veo is Google's own video generation model. As for Genie, it's Google's take on a world model -- AI that can simulate games and 3D environments in real time. Google's latest Genie model, previewed last December, can generate a massive variety of playable 3D worlds. "We believe scaling [AI training] on video and multimodal data is on the critical path to artificial general intelligence," reads one of the job descriptions. Artificial general intelligence, or AGI, generally refers to AI that can accomplish any task a human can. "World models will power numerous domains, such as visual reasoning and simulation, planning for embodied agents, and real-time interactive entertainment." Per the description, Brooks' new team will look to develop "real-time interactive generation" tools on top of the models they build, and study how to integrate their models with existing multimodal models such as Gemini. A number of startups and big tech companies are chasing after world models, including influential AI researcher Fei-Fei Li's World Labs, Israeli upstart Decart, and Odyssey.
They believe that world models could one day be used to create interactive media, like video games and movies, and run realistic simulations like training environments for robots. But creatives have mixed feelings about the tech. A recent Wired investigation found that game studios like Activision Blizzard, which has laid off scores of workers, are using AI to cut corners, ramp up productivity, and compensate for attrition. And a 2024 study commissioned by the Animation Guild, a union representing Hollywood animators and cartoonists, estimated that over 100,000 U.S.-based film, television, and animation jobs will be disrupted by AI by 2026. Some startups in the nascent world modeling space, like Odyssey, have pledged to collaborate with creative professionals -- not replace them. We'll have to see if Google follows suit. There's also the unresolved matter of copyright. Some world models appear to be trained on clips of video game playthroughs, which could make the companies developing those models the target of lawsuits in cases where the videos were unlicensed. Google, which owns YouTube, asserts that it has permission to train its models on YouTube videos in accordance with the platform's terms of service. But the company hasn't said which specific videos it is sourcing for training.
[3]
Google Wants to Simulate the World With AI
Apparently not content with its grip on this world, Google is in the process of staffing up its DeepMind research lab to build generative models that are capable of simulating the physical world. The project, which will be headed up by Tim Brooks, one of the leads who helped build OpenAI's video generator, Sora, will be a critical part of the company's attempt to achieve artificial general intelligence, according to job listings related to the new team. Brooks, who joined DeepMind after fleeing from OpenAI back in October, and his team have "ambitious plans to make massive generative models that simulate the world." According to the role descriptions, the effort to build world models will "power numerous domains, such as visual reasoning and simulation, planning for embodied agents, and real-time interactive entertainment." If you're willing to take on one of these roles, maybe you can figure out what those vague phrases mean and get back to us. A world model, put as simply as possible, typically seeks to simulate how the world actually works. Generative models like Sora are able to replicate things that they have seen before within their training data, but they don't have any real understanding as to why those things happen. So such a model can successfully generate a video of a person throwing a baseball, but it doesn't have any understanding of the physics of what is happening. World models aim to arm the machine with enough information to actually parse through how an action happens and the likely outcome of it.
Meta's chief AI scientist Yann LeCun described world models this way during a speech at Hudson Forum earlier this year: "A world model is your mental model of how the world behaves...You can imagine a sequence of actions you might take, and your world model will allow you to predict what the effect of the sequence of action will be on the world." World models are difficult to build for a number of reasons, including the massive amount of compute needed to run a model and the lack of sufficient training data to create an accurate model, resulting in most world models working only for limited and specific contexts. DeepMind's team seems intent on taking the world model wider. The plan is to build "real-time interactive generation" tools on top of the models and potentially look into how they could integrate their world model into Google's large language model Gemini. One likely area that DeepMind will try to tackle is video games. The job description for the new team notes that they will collaborate with the Veo and Genie teams at Google. Veo is Google's Sora-like video generator and Genie is an existing world model that can simulate 3D environments in real time. The video game industry is already keen to adopt AI tools, displacing thousands of workers. A CVL Economics survey found that more than 86% of all gaming firms have already adopted generative AI tools and nearly 15% of all gaming jobs could be disrupted by 2026. Maybe improving this world would be a better use of time than modeling it.
[4]
Google's DeepMind is recruiting AI researchers to advance world model development - SiliconANGLE
Google LLC's DeepMind is trying to put together a team of specialist artificial intelligence researchers to expand its leadership in the field of "world models" that can simulate physical environments. DeepMind's new team will be led by former OpenAI researcher Tim Brooks, who joined the company in October. He announced the new team via a post on X, and the story was first picked up by TechCrunch. Brooks, who previously led the development of OpenAI's video generation model Sora, linked to a number of job listings in his tweet, and they reveal that his new team will collaborate with experts who led the development of Google's Gemini, Veo and Genie models. Gemini is Google's best known large language model for generating text and analyzing images, rivaling OpenAI's GPT-4o, while Veo is a video-generation model similar to Sora. Genie is not so well known. It's what's known in AI circles as a "world model" that's able to create and simulate playable 3D worlds in real time based on text prompts. DeepMind announced the latest version, Genie 2, last month, saying it can simulate virtual worlds along with realistic animations and physics, and also support interactions between all of those elements. Users can prompt Genie with text, an image, or a combination of the two. DeepMind showed various example worlds created using Genie, including a sailing simulation, a cyberpunk Western and others. Brooks' job listings suggest that his new team at DeepMind will tackle "critical new problems" while trying to scale AI models to "the highest levels of compute". It will work to create "real-time interactive generation" tools based on the models it creates, and also study how to integrate those models with existing LLMs like Gemini.
In one of the job descriptions, DeepMind states that world models are key to building artificial general intelligence or AGI, which refers to AI systems that can perform any task that a human could. "World models will power numerous domains, such as visual reasoning and simulation, planning for embodied agents, and real-time interactive entertainment," the listing notes. DeepMind's Genie puts the company at the forefront of world model development, but it faces a number of competitors in this area, such as World Labs Technologies Inc., a startup created by the renowned AI researcher Fei-Fei Li, as well as Odyssey Systems Inc. and Decart.AI Inc. One of the most obvious use cases of world models is the creation of interactive media that could be used in the making of movies and video games. Alternatively, they can also be used as the basis of realistic training environments for robots and other AI systems. World model technology has raised some concerns from creative workers, though. A 2024 study by the Animation Guild, which is a union that represents animators and cartoonists in Hollywood, estimates that AI has the potential to disrupt more than 100,000 U.S.-based film, TV and animation jobs within the next two years. Another concern pertains to copyright. Many of the world models created by Genie and its rivals look and feel a lot like those we find in expansive video games such as Grand Theft Auto and Fortnite, which suggests they may have been trained on video game walkthroughs posted online, or even the games themselves. That could lead to legal problems for world model developers down the road.
[5]
Move Over GenAI. Google Says Get Ready for GenWorld | PYMNTS.com
Google DeepMind announced Monday (Jan. 6) that it is creating a new team to work on "massive" generative models that would "simulate the world." These models represent the next stage of advancement in artificial intelligence (AI) capabilities in decision-making, planning and creativity. World models are computational frameworks that help AI systems understand and simulate the real or virtual world. They are key to helping teach AI systems to navigate an environment and have widespread applications in robotics, gaming and autonomous systems. For example, autonomous vehicles use these world models to simulate traffic and road conditions. They can also train generalist AI robots in different environments. A common problem is the lack of rich, diverse and safe training environments for so-called embodied AI. DeepMind's job posting on Monday said that scaling AI models is also important to the tech's evolution. "We believe scaling pretraining on video and multimodal data is on the critical path to artificial general intelligence. World models will power numerous domains, such as visual reasoning and simulation, planning for embodied agents, and real-time interactive entertainment," the job posting said. PYMNTS reached out to Google but has yet to receive a reply. Tim Brooks, who left OpenAI in October to join Google DeepMind, will lead the team. At OpenAI, Brooks co-led the development of Sora, its video generation model that went viral upon unveiling because of its sophistication. According to job listings for the team, the new hires will "collaborate with and build on" the work from Gemini, Google's flagship large multimodal model, Veo (video generation model), and Genie (world model) teams. Google DeepMind's focus on world models comes as AI startup World Labs said it raised $230 million when it came out of stealth last September. The startup is developing large world models. 
Led by Stanford AI pioneer Fei-Fei Li, the startup is funded by AI pioneer and Nobel laureate Geoffrey Hinton, Salesforce CEO Marc Benioff, LinkedIn co-founder Reid Hoffman, former Google Chairman Eric Schmidt as well as Andreessen Horowitz, NEA, NVentures and others. Google DeepMind has already developed several world models, including Genie and Genie 2. Genie 2 can turn text and images into 3D worlds that react according to a user's actions in this environment. (Genie created only 2D worlds.) Genie 2 is a powerful AI model that learns from a large video dataset and uses a process that compresses video frames into simpler, meaningful representations through an autoencoder. These compressed frames are then analyzed by a transformer model that predicts how the video should progress, step-by-step, using a method similar to how text-generating models like ChatGPT work. Trained on a large-scale video dataset, Genie 2 can display object interactions, complex character animation, physics (such as gravity and splashing water effects) and behavior modeling of other agents. The world it creates can last up to a minute, with most in the 10- to 20-second range. Google DeepMind's expanded focus on world models will further sharpen its AI systems' capabilities as it competes with OpenAI, Meta, Microsoft and Amazon in serving enterprises. The latest innovation adds to its already rich array of innovations, one of which most recently led to Nobel Prize nods for CEO Demis Hassabis and John M. Jumper: AlphaFold2. It is an AI model that predicted the nature of all known proteins, solving a 50-year biochemistry challenge. In a paper published in October, Google DeepMind researchers said they trained a large language model called the Habermas Machine to serve as an AI mediator that helped small U.K. groups find common ground on controversial issues such as Brexit or immigration. It did so by writing a "group statement" that captured their shared viewpoints.
Google's DeepMind is forming a new team to create AI models capable of simulating the physical world, aiming to advance artificial general intelligence and revolutionize various domains including gaming, robotics, and interactive media.
Google DeepMind is assembling a new team dedicated to developing "world modeling" AI systems. This initiative, led by former OpenAI researcher Tim Brooks, aims to create massive generative models capable of simulating real-world scenarios [1][2].
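World models represent a cutting-edge development in AI, designed to simulate the physical world and understand how it behaves. According to the job listings, these models could power domains including visual reasoning and simulation, planning for embodied agents, and real-time interactive entertainment [2].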
Google DeepMind believes that scaling AI training on video and multimodal data is crucial for achieving artificial general intelligence (AGI), a long-standing goal in the field of AI research [2].
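The new team will collaborate with and build upon the work of Google's existing AI projects, including Gemini, its flagship multimodal model family; Veo, its video generation model; and Genie, its world model [2].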
This collaborative approach aims to tackle critical new problems and scale models to the highest levels of compute, pushing the boundaries of what's possible in AI simulation [2].
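The development of advanced world models could have far-reaching implications across multiple industries, including interactive media such as video games and movies, and realistic training environments for robots and other AI systems [1][2].
While the potential of world models is immense, their development and implementation raise several concerns, chiefly the disruption of creative jobs in film, television, animation, and gaming, and unresolved copyright questions over training on video game footage [2][4].
Google DeepMind is not alone in pursuing world model technology. Other notable players in this space include Fei-Fei Li's World Labs, Decart, and Odyssey [2][4].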
As the race to develop more advanced world models intensifies, the coming years are likely to see rapid advancements in this transformative technology, potentially reshaping numerous industries and pushing the boundaries of artificial intelligence capabilities.
© 2025 TheOutpost.AI All rights reserved