Curated by THEOUTPOST
On Thu, 5 Dec, 12:03 AM UTC
19 Sources
[1]
Watch Google DeepMind's Genie 2 generate playable 3D worlds | TechCrunch
Last week, Google DeepMind announced Genie 2, a new video model that generates plausible, consistent, playable 3D environments based on a prompt image. DeepMind claims that Genie 2 has a slew of exciting emergent capabilities, including improved lighting and reflections, and that it can even generate worlds from real-world images. Within these generated worlds, the model can also create animated characters that can act as embodied agents for training purposes. The characters can interact with the world by doing things like popping balloons, opening doors, and even engaging with non-playable characters. The DeepMind team seems hopeful that these AI-generated video games will be a helpful step in training agents: "Genie 2 shows the potential of foundational world models for creating diverse 3D environments and accelerating agent research," they wrote in a recent post. Video games are a helpful tool in AI research. They are interactive and present a unique blend of challenges. They are also safe playgrounds to train, test, and measure agents that may end up in products used in the real world.
[2]
DeepMind's Genie 2 can generate interactive worlds that look like video games | TechCrunch
DeepMind, Google's AI research org, has unveiled a model that can generate an "endless" variety of playable 3D worlds. Called Genie 2, the model -- the successor to DeepMind's Genie, which was released earlier this year -- can generate an interactive, real-time scene from a single image and text description (e.g. "A cute humanoid robot in the woods"). In this way, it's similar to models under development by Fei-Fei Li's company, World Labs, and Israeli startup Decart. DeepMind claims that Genie 2 can generate a "vast diversity of rich 3D worlds," including worlds in which users can take actions like jumping and swimming by using a mouse or keyboard. Trained on videos, the model's able to simulate object interactions, animations, lighting, physics, reflections, and the behavior of "NPCs." Many of Genie 2's simulations look like AAA video games -- and the reason could well be that the model's training data contains playthroughs of popular games. But DeepMind, like many AI labs, wouldn't reveal many details about its data sourcing methods, likely for competitive reasons. One wonders about the IP implications. DeepMind -- being a Google subsidiary -- has unfettered access to YouTube, and Google has previously implied that its ToS gives it permission to use YouTube videos for model training. But is Genie 2 basically creating unauthorized copies of the games it "watched"? That's for the courts to decide, I suppose. Genie 2 can generate consistent worlds with different perspectives, like first-person and isometric views, for up to a minute, with the majority lasting 10-20 seconds. "Genie 2 responds intelligently to actions taken by pressing keys on a keyboard, identifying the character and moving it correctly," DeepMind explained in a blog post. "For example, our model [can] figure out that arrow keys should move a robot and not trees or clouds." 
Most models like Genie 2 -- world models, if you will -- can simulate games and 3D environments, but with artifacts, consistency problems, and hallucinations. For example, Decart's Minecraft simulator, Oasis, has a low resolution and quickly "forgets" the layout of levels. Genie 2, however, can remember parts of a simulated scene that aren't in view and render them accurately when they become visible again, DeepMind claims. (World Labs' models can do this too.) Now, games created with Genie 2 wouldn't be all that fun, really. Having your progress erased every minute would drive anyone up the wall. So DeepMind's positioning the model as more of a research and creative tool -- a tool for prototyping "interactive experiences" and evaluating AI agents. "Thanks to Genie 2's out-of-distribution generalization capabilities, concept art and drawings can be turned into fully interactive environments," DeepMind wrote. "And by using Genie 2 to quickly create rich and diverse environments for AI agents, our researchers can generate evaluation tasks that agents have not seen during training." DeepMind says that while Genie 2 is in the early stages, the lab believes it'll be a key component in developing AI agents of the future. Google has poured increasing resources into world models, which promise to be the next big thing in AI. In October, DeepMind hired Tim Brooks, who was heading development on OpenAI's Sora video generator, to work on video generation technologies and world simulators.
[3]
Google DeepMind's Genie 2 can generate interactive 3D worlds
World models -- AI algorithms capable of generating a simulated environment in real-time -- represent one of the more impressive applications of machine learning. In the last year, there's been a lot of movement in the field, and to that end, Google DeepMind announced Genie 2 on Wednesday. Where its predecessor was limited to generating 2D worlds, the new model can create 3D ones and sustain them for significantly longer. Genie 2 isn't a game engine; instead, it's a diffusion model that generates images as the player (either a human being or another AI agent) moves through the world the software is simulating. As it generates frames, Genie 2 can infer ideas about the environment, giving it the capability to model water, smoke and physics effects -- though some of those interactions can be very gamey. The model is also not limited to rendering scenes from a third-person perspective; it can also handle first-person and isometric viewpoints. All it needs to start is a single image prompt, provided either by Google's own Imagen 3 model or a picture of something from the real world. Notably, Genie 2 can remember parts of a simulated scene even after they leave the player's field of view and can accurately reconstruct those elements once they become visible again. That's in contrast to other world models like Oasis, which, at least in the version Decart showed to the public in October, had trouble remembering the layout of the Minecraft levels it was generating in real time. Still, there are limits to what Genie 2 can do in this regard. DeepMind says the model can generate "consistent" worlds for up to 60 seconds, with the majority of the examples the company shared on Wednesday running for significantly less time; most of the footage is about 10 to 20 seconds long. Moreover, artifacts are introduced and image quality softens the longer Genie 2 needs to maintain the illusion of a consistent world.
DeepMind didn't detail how it trained Genie 2 other than to state it relied "on a large-scale video dataset." Don't expect DeepMind to release Genie 2 to the public anytime soon, either. For the moment, the company primarily sees the model as a tool for training and evaluating other AI agents, including its own SIMA algorithm, and something artists and designers could use to prototype and try out ideas rapidly. In the future, however, DeepMind suggests world models like Genie 2 are likely to play an important part on the road to artificial general intelligence. "Training more general embodied agents has been traditionally bottlenecked by the availability of sufficiently rich and diverse training environments," DeepMind said. "As we show, Genie 2 could enable future agents to be trained and evaluated in a limitless curriculum of novel worlds."
[4]
Google's Genie 2 AI Model Can Generate Playable 3D Worlds
Google DeepMind unveiled the successor to the Genie artificial intelligence (AI) model, which could generate endless 2D game worlds, on Wednesday. Dubbed Genie 2, the new AI model is capable of generating unique action-controllable, playable 3D environments based on a single image prompt. Calling Genie 2 an AI "world model", the company stated that it can generate up to minute-long environments with consistent objects. The company said these generated worlds could be played by humans or can be used to train AI agents. In a blog post, the company detailed the new AI model and its capabilities. While its predecessor could only generate game worlds for 2D platformer games, the Genie 2 AI model can generate 3D worlds complete with consistent objects that can be interacted with. This means humans or AI agents can walk, run, swim, climb, and perform more actions in these environments. Genie 2's generative capabilities allow it to generate routes, buildings, and objects that cannot be seen in the input image. These elements are designed and rendered by the model from scratch. Additionally, the foundation model is also capable of maintaining consistency in these environments. This means even when a player moves away from one area and returns, the environment remains the same. Apart from this, Genie 2 is capable of generating different perspectives such as first-person views, isometric views, or third-person views. Further, users can also interact with the objects in the generated worlds and can perform actions such as opening a door, bursting a balloon, or climbing a ladder. The model can also be prompted to generate physics-related effects such as water ripples, smoke, gravity, directional lighting, reflections, and more. Coming to the technical details, DeepMind explained that Genie 2 is an autoregressive latent diffusion model and has been trained on a large video dataset.
The transformer architecture also includes an autoencoder which enables frame-by-frame generation of these worlds. Notably, DeepMind also released an AI model dubbed Scalable Instructable Multiworld Agent or SIMA earlier this year, which is essentially capable of agentic AI functions in 3D worlds. The company says Genie 2 is capable of providing unique environments to similar AI agents and training them for various real-life scenarios. Since the world model can generate unique environments, Google says this will eliminate the risk of data contamination and will allow developers to correctly assess an AI agent's capabilities.
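The "autoregressive latent diffusion" description above can be illustrated with a toy sketch of the generation loop: an autoencoder maps frames to and from a compact latent space, and each new latent is produced by iteratively denoising noise, conditioned on the latents generated so far plus the player's action. To be clear, this is not DeepMind's code or architecture -- every function body below is a stand-in, and all names (`encode`, `decode`, `denoise_step`, `rollout`) are hypothetical labels for illustration only.

```python
import random

random.seed(0)
LATENT_DIM = 8  # toy latent size; the real model uses learned video latents
ACTIONS = {"forward": 0, "back": 1, "left": 2, "right": 3, "jump": 4}

def encode(frame):
    # Stand-in for the autoencoder's encoder: compress a frame to a latent.
    return frame[:LATENT_DIM]

def decode(latent):
    # Stand-in for the decoder: expand a latent back to "pixel" space.
    return latent * 4

def denoise_step(noisy, history, action_id, t):
    # Stand-in for one diffusion denoising step, conditioned on the
    # previously generated latents (history) and the player's action.
    context = [sum(vals) / len(history) for vals in zip(*history)]
    return [n * (1 - 1 / t) + (c + 0.01 * action_id) / t
            for n, c in zip(noisy, context)]

def next_latent(history, action_id, steps=10):
    # Autoregressive step: start from noise and iteratively denoise it
    # into the latent for the next frame.
    z = [random.gauss(0, 1) for _ in range(LATENT_DIM)]
    for t in range(steps, 0, -1):
        z = denoise_step(z, history, action_id, t)
    return z

def rollout(prompt_frame, actions):
    # One generated frame per user action; each new latent is fed back
    # into the history so later frames depend on everything before them.
    history = [encode(prompt_frame)]
    frames = []
    for a in actions:
        z = next_latent(history, ACTIONS[a])
        history.append(z)
        frames.append(decode(z))
    return frames

frames = rollout([i / 31 for i in range(32)], ["forward", "forward", "jump"])
print(len(frames), len(frames[0]))  # 3 frames of 32 "pixels" each
```

The key structural point the articles describe is visible in `rollout`: generation is frame-by-frame, and every frame is conditioned on both the accumulated history and a fresh action input, which is what makes the world "action-controllable" rather than a fixed video.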
[5]
This is Genie 2, the new model from Google DeepMind capable of generating interactive 3D worlds - Softonic
AI is capable of generating interactive scenes in real time from a single image or text description

DeepMind, the artificial intelligence research division of Google, has unveiled Genie 2, an innovative model capable of creating an apparently infinite variety of playable three-dimensional worlds. This model, which follows Genie, launched earlier this year, stands out for generating interactive scenes in real-time from a single image or text description, such as "a humanoid robot in Ancient Egypt." Although it is reminiscent of developments by companies like World Labs and Decart, Genie 2 has features that make it unique. The DeepMind proposal promises an immense diversity of 3D worlds rich in details, where users can perform actions like jumping or swimming with keyboard and mouse. Thanks to its training with videos, Genie 2 can simulate object interactions, animations, lighting, physics, and even the behavior of non-playable characters (NPCs). Many of these worlds resemble triple-A video games, which raises serious questions about whether its training included sessions from popular titles. For now, DeepMind has avoided revealing details about how it collected the data. The model has also reignited the debate over intellectual property. As a subsidiary of Google, DeepMind can access YouTube videos, and the company itself has stated that its terms of service allow the use of these materials to train AI models. Despite its limitations, such as simulations lasting between 10 and 60 seconds, Genie 2 is more consistent than other similar models. For example, it avoids common issues of visual artifacts and scene forgetting, something that affects competitors like Oasis, from Decart. Additionally, it can remember and render objects that had gone out of the field of view. DeepMind does not see this model as a tool for traditional games, but as a creative and research resource.
According to the company, "Genie 2 turns conceptual art into complete interactive environments" and facilitates the evaluation of AI agents in completely new tasks, opening new possibilities for prototypes and experimentation.
[6]
Google's Genie 2 "world model" reveal leaves more questions than answers
In March, Google showed off its first Genie AI model. After training on thousands of hours of 2D run-and-jump video games, the model could generate halfway-passable, interactive impressions of those games based on generic images or text descriptions. Nine months later, this week's reveal of the Genie 2 model expands that idea into the realm of fully 3D worlds, complete with controllable third- or first-person avatars. Google's announcement talks up Genie 2's role as a "foundational world model" that can create a fully interactive internal representation of a virtual environment. That could allow AI agents to train themselves in synthetic but realistic environments, Google says, forming an important stepping stone on the way to artificial general intelligence. But while Genie 2 shows just how much progress Google's DeepMind team has achieved in the last nine months, the limited public information about the model thus far leaves a lot of questions about how close we are to these foundational world models being useful for anything but some short but sweet demos. Much like the original 2D Genie model, Genie 2 starts from a single image or text description and then generates subsequent frames of video based on both the previous frames and fresh input from the user (such as a movement direction or "jump"). Google says it trained on a "large-scale video dataset" to achieve this, but it doesn't say just how much training data was necessary compared to the 30,000 hours of footage used to train the first Genie. Short GIF demos on the Google DeepMind promotional page show Genie 2 being used to animate avatars ranging from wooden puppets to intricate robots to a boat on the water. Simple interactions shown in those GIFs demonstrate those avatars busting balloons, climbing ladders, and shooting exploding barrels without any explicit game engine describing those interactions. Perhaps the biggest advance claimed by Google here is Genie 2's "long horizon memory."
This feature allows the model to remember parts of the world as they come out of view and then render them accurately as they come back into the frame based on avatar movement. This kind of persistence has proven to be a persistent problem for video generation models like Sora, which OpenAI said in February "do[es] not always yield correct changes in object state" and can develop "incoherencies... in long duration samples."
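The "long horizon memory" idea described above comes down to how much past context the model still conditions on when a region of the world re-enters the frame. A minimal toy sketch (not DeepMind's mechanism; `revisit_consistency` and the tile-walk setup are invented for illustration) shows why a short context window produces the "forgetting" seen in models like the early Oasis demo, while a longer one keeps revisited areas consistent:

```python
from collections import deque

def revisit_consistency(path, horizon):
    # Each visited position yields a "frame"; the model keeps only the
    # last `horizon` frames as context. A revisited position can be
    # rendered consistently only if a frame showing it is still in memory.
    memory = deque(maxlen=horizon)
    seen = set()
    consistent = inconsistent = 0
    for pos in path:
        if pos in seen:              # the camera returns to a known spot
            if pos in memory:
                consistent += 1      # still in context: rendered as before
            else:
                inconsistent += 1    # fell out of context: "forgotten"
        seen.add(pos)
        memory.append(pos)
    return consistent, inconsistent

# Walk away from the start and come back, with different context sizes.
path = [0, 1, 2, 3, 0, 1, 2, 3, 4, 5, 6, 7, 0]
print(revisit_consistency(path, horizon=12))  # (5, 0): every revisit matches
print(revisit_consistency(path, horizon=3))   # (0, 5): every revisit is "forgotten"
```

The same walk through the world gives perfectly consistent revisits with a long memory horizon and none at all with a short one, which is the gap Genie 2's claimed up-to-a-minute consistency is meant to close.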
[7]
Google's Genie 2 creates playable 3D worlds from text prompts and sample images
Through the looking glass: The number of potential use cases for generative AI tools is growing fast. With its newly introduced model in the Genie line, Google now wants to provide "creatives" who have no worldbuilding skills with a straightforward and rapid way to prototype their ideas. Genie 2 is a new foundation model capable of generating an "endless" amount of 3D environments that can be controlled by the user, Google said. The generative AI can conjure up an entire virtual world based on a single textual prompt and sample image, which can be either user-made or generated by external AI models. Other companies are trying to turn generative AI tech into a worldbuilding wonder, though results may vary. The Oasis experiment was designed to generate a Minecraft-like experience frame by frame, but it just resembles a low-definition Minecraft world affected by dementia at this point. Genie 2 can maintain an apparently consistent world for up to a minute, Google assures, though we're still talking about a low-res, garbled, and very unpleasant-to-watch visual mess. Anyhow, Google doesn't seem interested in visual repulsion or uncanny valley issues in the slightest. Mountain View said that games play a key role in AI research, providing an ideal environment to test new capabilities. The Genie 2 worlds can be controlled by a puny human using a traditional keyboard and mouse combo, with the generative model simulating all the consequences of the players' actions. Unlike the recently unveiled SIMA, Genie 2 can provide "intelligent" visual reactions in an endlessly generated virtual environment. The AI model can generate different routes, or "counterfactual experiences" for training agents, starting from the same basic frame. With different actions taken by the human player, the world around changes and hallucinates accordingly.
Genie 2 can also remember previously generated parts of the virtual world that are outside the player's camera, and even render them "accurately" when they come back in the frame. The model can create different player perspectives, including first-person view, isometric views, and third-person driving cockpits. Complex 3D structures and object interactions are also part of the mix. Additional capabilities of the new foundation model include character animation, NPCs, physics, smoke, gravity, lighting, and reflections. Google said that Genie 2 and similar generative AI tech could be useful to prototype and experiment with interactive experiences, with gaming being the first potential application that comes to mind. The research is still in its early stages, which means that there is a lot of room for improvement in future training runs.
[8]
Genie 2 can build playable games from a sentence
DeepMind has introduced Genie 2, an advanced AI model capable of generating interactive 3D worlds from simple text descriptions and images. Announced on December 4, 2024, the successor to Genie 1 creates diverse virtual environments, allowing users to engage dynamically through keyboard and mouse controls, simulating actions like jumping and swimming. It is trained on various videos and can model interactions, physics, and NPC behavior. Genie 2 has the ability to visualize a vast range of 3D scenes, offering functionalities similar to AAA video games. By processing prompts, the model can produce consistent environments, including first-person and isometric perspectives, that last between 10 to 20 seconds, with some extending up to a minute. DeepMind asserts that the model remembers parts of a scene beyond immediate view, accurately rendering them upon reappearance. This contrasts with existing models like Decart's Minecraft simulator, which often struggles with consistency and resolution. The foundational element of Genie 2 involves processing a reference image paired with a text prompt, streamlining the generation of highly interactive environments. For instance, a prompt like "A cute humanoid robot in the woods" allows the model to create a corresponding scene. The inclusion of complex interactions, such as enabling players to pop balloons or navigate environments, showcases its simulated physics. Despite such capabilities, concerns arise regarding the model's interpretation of input, particularly whether it inadvertently replicates copyrighted game elements due to a lack of detail regarding training data sources. The intellectual property implications stemming from Genie 2's training data raise significant questions. DeepMind benefits from Google's extensive access to YouTube, possibly utilizing video content for model training.
This access leads to speculation about the generation of unlicensed reproductions of observed games, which may become a matter for legal examination. As the AI model blurs the lines of originality, the concern of unauthorized replication looms large. Findings from a WIRED investigation highlighted that major companies, including Activision Blizzard, are employing AI technologies to streamline processes, potentially leading to job losses. The introduction of Genie 2 may further complicate the landscape, where AI tools accelerate development while challenging traditional job roles. DeepMind emphasizes Genie 2's role in developing diverse evaluation tasks for AI agents. While the model's computing requirements and commercialization prospects remain undisclosed, ongoing advancements point to an evolving relationship between AI and game design. Various reports indicate an influx of investment into world model technologies, signifying a pivotal shift in both AI research and interactive gaming experiences. This trend resonates with a growing fascination for AI's potential to reshape creative industries. DeepMind's focused hiring efforts highlight the commitment to pioneering developments. Their strategic recruitment of talent from other leading AI firms reinforces their intent to remain at the forefront of generative AI technologies.
[9]
Google can generate playable 3D game worlds from AI prompts
It took just a few years for AI art to evolve from text to 2D imagery to generated 3D video. Today, it's taken the next step with Google's Genie 2, which can generate playable 3D game worlds that are constructed on the fly, all from a simple text prompt. Google's Genie 2 is the evolution of its Generative Interactive Environments, which uses AI to construct new, interactive environments on the fly. Genie 1, which Google released in February, could construct 2D environments. Now, Genie 2, which Google announced today, takes that into 3D space. Google calls Genie 2 a "world model," which means it can simulate virtual worlds, with animations, physics, and object interactions. It's a two-step process: Genie 2 requires a prompt image to extrapolate into a world, but that image can be itself generated by an ordinary text prompt. Want a cyberpunk Western? Genie 2 will create it. A sailing simulation? That, too. You just need a reference or a prompt to begin. In Google's case, it used an image generated by Imagen 3, as well as concept art hand-drawn by an artist. Within the world, the player -- either an AI or a human -- can interact with the environment. Google's demo showed a traditional WASD setup, with the arrow keys as alternatives. The problem, however, is consistency. For whatever reason, the model loses coherency after a short time, typically around 20 seconds or so. (The longest example Google created was a minute in length.) In part, that may be because the model can generate "counterfactuals," or the different paths and actions a player can choose from a fixed starting point -- turning left or right at a fork in the road, for example. The model has to take into account a "long horizon," or what happens when a player turns away from a scene, then toward it once again. Google said that Genie 2 can accommodate different perspectives, such as an isometric view, a third-person driving view, or a first-person perspective.
Water effects are taken into account, as are complex interactions with the environment. In one demonstration, a player was able to slash a balloon, which popped. Smoke, gravity, and reflections are all modeled, but Google isn't saying to what extent or resolution its models are rendered, or how many polygons are calculated per frame. The Genie 2 environments aren't just for humans. AI "players" can also be modeled, either as NPCs or as the player character. Google showed off how the AI could be told to go through a specific door with a text prompt, and how the AI could recognize the command, understand what it meant in the rendered environment, and then proceed. Google didn't divulge what computing resources Genie 2 requires, whether it will be released publicly, or even whether it plans to commercialize it. But with AI slowly creeping into games via AI-generated dialogue, it appears that AI-simulated games could eventually be real, too. Just not right away.
[10]
Google's Genie 2 AI Unveiled: Capable of Creating Fully Interactive 3D Gaming Worlds
Imagine being able to step into a fully realized 3D world crafted from nothing more than a single image prompt. A world where every action you take -- whether it's diving into water, opening a door, or rearranging furniture -- feels natural, responsive, and alive. For gamers and developers alike, this might sound like a distant dream, but with Google's Genie 2, that dream is becoming a reality. Building on the foundation of its predecessor, Genie 1, this innovative AI technology is redefining what's possible in gaming, offering tools to create immersive, interactive environments with unprecedented ease and precision. Whether you're a seasoned game developer or someone who simply loves exploring virtual worlds, Genie 2 opens up a realm of possibilities that were once unimaginable. From realistic physics and lifelike character animations to environments that adapt and evolve based on your actions, this innovation is more than just a technical upgrade -- it's a creative revolution. Google's Genie 2 introduces a new approach to gaming, using advanced artificial intelligence to generate fully interactive 3D environments from a single image prompt. With Genie 2, users can immerse themselves in adaptive, responsive worlds that react to their actions and decisions, showcasing the immense potential of AI in reshaping game development. At the heart of Genie 2 lies its ability to transform a simple image prompt into a rich, playable 3D environment. This process involves advanced AI algorithms that interpret the image and generate a fully interactive virtual world. Whether you are exploring the environment as a player or programming AI agents, the platform supports seamless interaction through keyboard and mouse inputs. Imagine stepping into a virtual space where every action -- whether it's jumping, swimming, or interacting with objects -- produces realistic and immediate consequences. 
For instance, jumping into water triggers natural splashes, while moving objects like chairs or doors feels intuitive and lifelike. This precision in simulating actions makes Genie 2 a powerful tool for creating immersive and engaging virtual experiences. Genie 2 marks a significant evolution from Genie 1, transitioning from 2D environments to fully realized 3D worlds. This leap forward brings several key advancements, and together these enhancements elevate Genie 2 beyond a simple gaming tool, positioning it as a versatile platform for exploring the creative possibilities of AI-driven design. By allowing developers to craft more realistic and interactive worlds, Genie 2 sets a new standard for virtual environments. One of the standout features of Genie 2 is its emergent AI capabilities, which allow the system to predict and adapt to the behavior of in-game AI agents. This means that gameplay outcomes can vary significantly, even when starting from the same initial conditions. For example, AI-controlled characters may respond differently to your actions depending on their programmed personalities or environmental changes. This adaptability is further enhanced by Genie 2's long-term memory, which ensures that modifications made to the environment persist throughout a session. If you alter the landscape, move objects, or interact with NPCs, these changes remain intact, creating a cohesive and immersive experience. This feature not only enhances gameplay but also opens up new possibilities for storytelling and world-building. Genie 2 offers a range of perspectives, including first-person, third-person, and isometric views, allowing users to tailor their experience to their preferences. The environments are designed to be both visually stunning and highly interactive, with every element contributing to a sense of realism.
Players can engage with non-playable characters (NPCs), manipulate dynamic elements within the world, and experience realistic physics such as gravity, water flow, and smoke behavior. Advanced lighting effects, reflections, and shadows further enhance the visual fidelity, making the virtual worlds feel alive and tangible. These features combine to create an experience that is not only immersive but also deeply engaging. For game developers, Genie 2 is an innovative tool for rapid prototyping. By generating 3D environments from simple image prompts, developers can quickly test and iterate on game concepts without the need for extensive manual design work. This capability significantly reduces development time, allowing creators to focus on refining gameplay mechanics and storytelling. Additionally, Genie 2 enables developers to bring static concept art to life, transforming it into playable experiences. Its ability to generalize beyond its training data ensures that it performs well even on tasks it wasn't explicitly designed for. This flexibility opens up new avenues for creative exploration, making it an invaluable resource for both indie developers and large studios. The capabilities of Genie 2 signal a future where AI-generated worlds become a cornerstone of gaming. By simplifying the creation process, this technology enables developers to focus on innovation and storytelling, while offering players unprecedented levels of interactivity and immersion. As AI continues to advance, it is likely to push the boundaries of what's possible in gaming, creating experiences that are more dynamic, adaptive, and engaging than ever before. Genie 2 serves as a glimpse into this future, where creativity and technology converge to redefine the gaming landscape.
[11]
I just saw the future of gaming -- Google's Genie 2 can turn text into a playable game in real-time
Google DeepMind, a leader in AI innovation, has unveiled Genie 2, a groundbreaking tool capable of creating limitless, diverse 3D environments for training and evaluating AI agents. Jack Parker-Holder, research scientist at DeepMind, today announced the cutting-edge foundation world model, which can generate playable games based on a single text prompt, controllable by humans or AI using standard inputs like a keyboard and mouse. Genie 2 represents a significant leap forward in creating adaptable training grounds for AI, enabling advancements while opening the door to new possibilities for gaming and interactive experiences. For decades, games have served as a cornerstone for AI research. Their immersive and controlled nature, along with the measurable challenges they present, provide ideal environments for testing and advancing AI. From mastering Atari games in the early days of AI development to AlphaGo's world-changing victories in Go and AlphaStar's domination in StarCraft II, DeepMind has continually demonstrated the potential of games as an AI proving ground. However, the progression toward training general embodied agents -- AI that can learn to interact with the physical and virtual world in versatile ways -- has faced a significant roadblock: the availability of varied training environments. Traditional training tools lack the variety and depth required to prepare AI agents for real-world complexity. Genie 2 aims to solve this problem by generating an unlimited curriculum of interactive worlds through gaming. Genie 2 stands apart due to its ability to create highly customizable games on demand. Using a simple input image as a prompt, the system can craft playable worlds that adapt to specific training or gameplay needs. This flexibility allows AI researchers to expose agents to a never-ending stream of challenges, helping them develop skills that are transferable to real-world scenarios.
Remarkably, Genie 2 environments can be controlled by both humans and AI agents, making it a valuable tool not just for AI training but also for human evaluation and interaction. This dual functionality has the potential to revolutionize how we test and refine AI systems, merging human creativity with AI's problem-solving possibilities. For gamers, the tech behind Genie 2 hints at a future where in-game environments are more dynamic, personalized, and immersive than ever before. Imagine video games that adapt to your skill level or preferences in real-time, providing a truly tailored experience. For developers, Genie 2 represents a powerful resource for innovation. The ability to rapidly prototype and test interactive worlds could streamline game development workflows and lead to more creative games. One of Genie 2's most exciting implications is its ability to facilitate general agent training. Unlike specialized agents, which excel at single tasks (e.g., playing chess or answering trivia), general agents can adapt to a wide variety of challenges, much like humans. By exposing these agents to new environments, Genie 2 equips them to tackle complex, real-world scenarios where adaptability and versatility are crucial. In addition to advancing AI research, Genie 2 paves the way for entirely new workflows in game development and interactive prototyping. Developers can use Genie 2 to quickly create unique, playable experiences, reducing the time and cost of traditional design processes. For indie developers, this could mean more time spent refining gameplay mechanics and less time building environments from scratch. The implications extend far beyond gaming. Genie 2 could serve as a platform for creative innovation in virtual reality, simulation, and robotics. For example, robots could train in gaming environments generated by Genie 2 to learn how to navigate unfamiliar terrains or interact with objects in new ways. 
Similarly, virtual assistants could refine their ability to understand and respond to real-world tasks by practicing in these environments.

DeepMind's focus on games as a training ground for AI isn't just about creating better agents -- it's about pushing the boundaries of what's possible in both AI and gaming. Genie 2 builds on the legacy of tools like AlphaStar by not only solving immediate training bottlenecks but also reimagining how we approach the creation of interactive virtual worlds.

While Genie 2 offers groundbreaking potential, it also raises important questions. Training AI agents in virtually infinite environments can amplify their use cases, but ensuring that these capabilities are applied ethically is vital. As AI continues to evolve, developers and researchers must address concerns around misuse, fairness, and transparency. DeepMind has consistently emphasized ethical AI development, and Genie 2 appears to be no exception. The tool provides a safe and controlled environment for testing, making it a valuable resource for refining AI agents before they interact with the real world.

Genie 2 is a bold step forward in AI and gaming, offering endless possibilities for training generalist agents while reimagining how gaming environments are created and explored.
[12]
Google claims AI model Genie 2 can generate interactive worlds
The new AI model can generate worlds akin to those seen in modern videogames, according to Google.

Google DeepMind has unveiled its latest artificial intelligence (AI) model, Genie 2. According to Google, Genie 2, which was trained on a large-scale video dataset, can create interactive worlds with just a prompt, similar to a mythological 'genie'. Specifically, Google claimed that the model is prompted with a single image. In the examples given in the blog post, the images were all generated by Imagen 3, Google DeepMind's text-to-image model. Genie 2 then produces "action-controllable, playable 3D environments".

"It can be played by a human or AI agent using keyboard and mouse inputs," Google explained. "This means anyone can describe a world they want in text, select their favourite rendering of that idea, and then step into and interact with that newly created world, or have an AI agent be trained or evaluated in it."

Examples of prompts being rendered into interactive settings include a robot walking around a futuristic city and a sailboat traversing a body of water.

Explaining the motivation behind the project, Google DeepMind emphasised its view that videogames play a key role in AI research: "Their engaging nature, unique blend of challenges and measurable progress make them ideal environments to safely test and advance AI capabilities."

The company also pointed out what it considers to be the improvements made between Genie 1 and Genie 2: "Until now, world models have largely been confined to modelling narrow domains. In Genie 1, we introduced an approach for generating a diverse array of 2D worlds. Today, we introduce Genie 2, which represents a significant leap forward in generality."

AI: improved technology, increasing concerns

The Genie 2 model comes at a time when Big Tech is grappling with the potential consequences of AI technology on creative industries and news reporting.
Last month, two news outlets lost a copyright lawsuit against OpenAI where they alleged that the ChatGPT-maker "intentionally" removed copyright management information from their work and used it to train its AI models. The plaintiffs, Raw Story Media and AlterNet Media, were unable to prove "concrete injury", the judge presiding over the case said, adding that the likelihood that ChatGPT - an AI model that processes large swaths of data from across the internet - would output plagiarised content from one of the plaintiffs is "remote". According to Dr Sean Ren, an associate professor of computer science at the University of Southern California and the CEO of Sahara AI, the news outlets' loss "highlights how hard it is for publishers to prove copyright infringement in the context of today's AI". And in October, The Guardian reported that UK ministers faced a backlash over plans to allow AI companies to train their models on content from publishers and artists by default unless they opt out. Earlier that month, thousands of creatives around the world signed a statement warning AI companies that the unlicensed use of their work to train generative AI models is a "major, unjust threat" to their livelihoods.
[13]
Google AI Lets You Design 3D Game Worlds with a Simple Prompt
The possibilities are vast: who wouldn't be interested in virtual space travel through a futuristic city, or a journey into a dark medieval castle? Genie 2 creates these worlds on the fly, giving players the chance to wander through them interactively. Google's demo was based on a regular WASD keyboard layout with an additional option for arrow keys, so humans or AI could interact with the environment.

Genie 2 is revolutionary, but like any new concept, it has its challenges. At the moment, the generated worlds are not consistent over longer durations, and most hold together for only about 20 seconds before coherence collapses. This is partly because the AI has to account for counterfactuals -- the other actions a player could have taken at a crossroads, like turning left instead of right. Maintaining a consistent "long horizon" view remains a technical hurdle.

Despite these limitations, Genie 2 holds a lot of potential to change the game development industry. It gives developers, designers, and tinkerers a simple world-creation tool that frees them from the time-consuming process of complex modeling and coding. With each advance in AI technology, Google's Genie 2 looks like an exciting gateway into a future of game design where worlds are limited only by the imagination. Stay tuned; this is just the beginning.
[14]
Google has developed an AI tool that generates fully playable 3D worlds
Google has developed a new AI tool that can generate a fully playable 3D game world from a single image prompt. Genie 2, as the tool is known, is described as a "world model". So where Genie 1 generated 2D worlds, this subsequent tool can create 3D worlds from a variety of perspectives and include the consequences of taking actions like jumping or swimming.

Google stresses the tool can be used for "rapid prototyping" as it "enables artists and designers to prototype quickly, which can bootstrap the creative process for environment design, further accelerating research".

It works using a single image prompt, with Google of course using examples from its own text-to-image AI tool Imagen 3, though real-world images can be used too. From there users can jump in and interact with the world, with the AI overcoming challenges like understanding control inputs and remembering the world it has created as the camera pans elements in and out of view. It then generates new plausible content on the fly for up to a minute of gameplay. It can further generate NPCs and interactions with them; the physics of water or gravity; lighting models and reflections; and multiple animations.

"While this research is still in its early stage with substantial room for improvement on both agent and environment generation capabilities, we believe Genie 2 is the path to solving a structural problem of training embodied agents safely while achieving the breadth and generality required to progress towards AGI [Artificial General Intelligence]," reads a blog post on the tool from Google.

So while the tool can be used for prototyping, it's also being used to train further AI models. Discussing how Google developed the technology responsibly, the blog post reads: "Genie 2 shows the potential of foundational world models for creating diverse 3D environments and accelerating agent research.
This research direction is in its early stages and we look forward to continuing to improve Genie's world generation capabilities in terms of generality and consistency.

"As with SIMA [Scalable Instructable Multiworld Agent], our research is building towards more general AI systems and agents that can understand and safely carry out a wide range of tasks in a way that is helpful to people online and in the real world."

As such, its purpose as yet doesn't seem to be direct video game development, though the potential is clear. Indeed, AI is already changing video game development, with uses in multiple areas from NPCs to scriptwriting and voice acting. Multiple studios have already stated their positions on AI technology, both for and against it. Most recently, PlayStation co-CEO Hermen Hulst stated that AI technology has the capacity to "revolutionise" the games industry, but it won't replace the "human touch".
[15]
Google is back at it with AI, this time generating interactive 3D game worlds -- but you won't catch me playing ball
AI is a genie that refuses to go back in the bottle -- no matter how much I plead, "Put that thing back where it came from, or so help me!" Enter Google's Genie 2, another generative model. However, rather than merely guessing which word might come next a la an LLM, or proffering a kind of sludgy-looking still image, this AI instead outputs 3D interactive environments. Well, at least it's not calling me 'a stain on the universe' like Gemini AI allegedly told one user recently.

Whereas Genie 1, revealed back in February, could only cobble together 2D scenes, the just-announced Genie 2 is a step up, offering somewhat explorable 3D game environments (via PC World). I say 'somewhat' because it isn't long before the wheels come off. For one thing, the player character model struggles to look consistent throughout movement. For another, playable worlds generated by Genie 2 don't last long; Google writes, "Genie 2 can generate consistent worlds for up to a minute," though admits that most of the examples shown lasted between 10 and 20 seconds. So, no, you won't feed Genie 2 a prompt regarding that long-awaited sequel and be satisfied by the results.

Google's prompting process is also not as straightforward as typing, "One Cyberpunk game, please," and diving into a Night City knockoff either. Genie 2 generates its game worlds in response to still images that were themselves generated by Imagen 3, a text-to-image model also from Google. That's a whole lot of snakes eating their own tails. Genie 2 itself is "an autoregressive latent diffusion model" that draws from "a large-scale video dataset" whose content and sourcing Google doesn't really detail. Google has also experimented with feeding photos of the real world to Genie 2, highlighting responses that "model grass blowing in the wind or water flowing in a river." The GIFs it shares of these responses are about as muddy looking as you'd expect, though.
Credit where some credit may be due, though: Genie 2 does make strides in a number of areas generative models often struggle with. While the player character model warps and blurs like a water-damaged printout, the environments one can trundle through remain surprisingly consistent -- definitely bland and generic, but consistently so. For instance, Genie 2's "long horizon memory" remembers aspects of the environment after they disappear from view. Say you're running between pyramids -- when you look back, the pyramids will stand in the same spot you last saw them, rather than teleporting around behind you like it's a game of 'What's the time, Mr. Wolf?' Similarly, Google touts Genie 2's 'counterfactual' capabilities, which is really just a fancy way of saying that multiple players can start from the same generated frame, take different actions, and each get a consistent experience.

Google has shared a mix of examples controlled by humans and by SIMA, its own AI player "designed to complete tasks in a range of 3D game worlds by following natural-language instructions." Many of the shared excerpts show digital avatars controlled with typical WASD keyboard controls. Google writes, "Genie 2 responds intelligently to actions taken by pressing keys on a keyboard, identifying the character and moving it correctly. For example, our model has to figure out that arrow keys should move the robot and not the trees or clouds." Good for you, Genie 2.

But besides walking around these occasionally wonky-looking levels, Genie 2 can also generate object interactions and even NPCs. Naturally, the NPCs have nothing of note to say, but Genie 2 can generate balloons to burst and barrels to explode -- though that's hardly the most compelling gameplay loop. While no one outside of Google can yet play around with Genie 2's output, the company is eager to tout its potential use cases, such as rapid prototyping based off of concept art.
This may sound appealing -- right up until you ask what happens when a developer wants to slightly adjust absolutely anything about their AI-generated prototype. While Genie 2's snatches of gameplay may spark the imagination of AI defenders, I'm far from convinced -- and honestly, I'm just a wee bit concerned about its potential labour implications for game development. Earlier this year, Take-Two CEO Strauss Zelnick dismissed the claim that AI would take away jobs, arguing, "It's not going to make people irrelevant. It's going to change the nature of certain forms of employment. And that's a good thing." But as Andy Chalk points out in his news piece, the industry's transition to large-scale automation is not always a painless process. It's my sincere hope that Google's little experiment doesn't end up motivating even more layoffs.
[16]
Genie 2: A large-scale foundation world model
Until now, world models have largely been confined to modeling narrow domains. In Genie 1, we introduced an approach for generating a diverse array of 2D worlds. Today we introduce Genie 2, which represents a significant leap forward in generality. Genie 2 can generate a vast diversity of rich 3D worlds.

Genie 2 is a world model, meaning it can simulate virtual worlds, including the consequences of taking any action (e.g. jump, swim, etc.). It was trained on a large-scale video dataset and, like other generative models, demonstrates various emergent capabilities at scale, such as object interactions, complex character animation, physics, and the ability to model and thus predict the behavior of other agents.

Below are example videos of people interacting with Genie 2. For every example, the model is prompted with a single image generated by Imagen 3, Google DeepMind's state-of-the-art text-to-image model. This means anyone can describe a world they want in text, select their favorite rendering of that idea, and then step into and interact with that newly created world (or have an AI agent be trained or evaluated in it). At each step, a person or agent provides a keyboard and mouse action, and Genie 2 simulates the next observation. Genie 2 can generate consistent worlds for up to a minute, with the majority of examples shown lasting 10-20 seconds.
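The step-by-step contract described above -- an action goes in, a simulated observation comes out, and each observation feeds the next step -- can be sketched in a few lines. The following toy is purely illustrative: Genie 2 has no public API, and every name here (`Action`, `ToyWorldModel`, `play`) is invented for the example. A real world model would predict pixels conditioned on the full interaction history; this stand-in just tracks a 2D position to show the data flow.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """A hypothetical keyboard/mouse input for one step."""
    keys: frozenset = frozenset()  # e.g. frozenset({"w"}) to move forward
    mouse_dx: int = 0              # horizontal camera movement
    mouse_dy: int = 0              # vertical camera movement


class ToyWorldModel:
    """Toy stand-in for a world model: tracks a 2D position instead of
    rendering frames, but follows the same step-by-step contract."""

    def __init__(self):
        self.history = []  # past (observation, action) pairs

    def step(self, observation, action):
        # A real model would condition on the full history, so that
        # off-screen content stays consistent when it scrolls back
        # into view ("long-horizon memory").
        x, y = observation
        if "w" in action.keys: y += 1
        if "s" in action.keys: y -= 1
        if "d" in action.keys: x += 1
        if "a" in action.keys: x -= 1
        self.history.append((observation, action))
        return (x, y)


def play(model, first_observation, actions):
    """Autoregressive rollout: each simulated observation is the input
    to the next step, for a human player or an AI agent alike."""
    obs = first_observation
    trajectory = [obs]
    for action in actions:
        obs = model.step(obs, action)
        trajectory.append(obs)
    return trajectory
```

Whether the controller is a person or an agent like SIMA, the loop is identical -- only the source of the `Action` objects changes.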
[17]
Google's Genie 2 Creates 3D Game Worlds From A Single Image - Alphabet (NASDAQ:GOOGL)
With features like dynamic physics and NPC interactions, Genie 2 offers up to a minute of immersive gameplay from a simple image prompt.

Alphabet Inc's (NASDAQ: GOOGL) Google has introduced Genie 2, an advanced AI tool capable of generating fully interactive 3D game worlds from a single image prompt. The technology, hailed as a "world model," has potential applications ranging from video game prototyping to broader AI research. Genie 2 builds upon its predecessor, Genie 1, which was limited to creating 2D worlds. This new iteration allows users to generate 3D environments with dynamic perspectives, physics simulations, and interactions. Notably, it incorporates user actions like jumping or swimming.

How Genie 2 Works

The process begins with an image prompt, often created through Google's text-to-image AI tool Imagen 3, though real-world images are also supported. Once the prompt is input, Genie 2 generates an interactive environment where users can move, explore, and interact. Key features include:

- Dynamic Physics: Realistic simulation of water, gravity, and other environmental elements.
- Lighting and Reflections: Advanced rendering techniques to replicate realistic illumination.
- NPC Interactions: Automatically generated non-playable characters with responsive behaviors.
- Live Content Updates: Continuous world generation that adapts as users explore, maintaining immersion.

The generated worlds allow up to one minute of gameplay, during which the AI adapts to control inputs and camera movements, creating new plausible content on the fly.

A Tool For Rapid Prototyping

In a blog post, the company explained that the tool "enables artists and designers to prototype quickly, which can bootstrap the creative process for environment design, further accelerating research."
For game developers, this could mean reducing the time required to design and test virtual environments. "Genie 2 shows the potential of foundational world models for creating diverse 3D environments and accelerating agent research... As with SIMA, our research is building towards more general AI systems and agents that can understand and safely carry out a wide range of tasks," the company said in a blog post.

The ethical implications of AI in creative industries remain a point of debate. AI tools like Genie 2 have sparked discussions about their role in gaming, particularly concerning potential impacts on jobs in scriptwriting, voice acting, and environment design.
[18]
Google AI Tool Could Fast-Track 3D Game Development and Retail | PYMNTS.com
Google's new Genie 2 AI system converts simple drawings and text descriptions into playable 3D game environments, potentially streamlining the construction of digital worlds. The technology could significantly impact game development by making 3D environment creation more accessible to creators without extensive technical expertise while offering established studios a new rapid prototyping tool. For businesses exploring virtual commerce, it may also provide a more efficient path to building digital retail spaces and interactive customer experiences.

"Google's Genie 2 is a foundation world model capable of generating an endless variety of action-controllable, playable 3D environments from a single image," Michael Jung, CFO of DeepBrainAI, told PYMNTS. "It can also generate different perspectives, such as first-person view, isometric views, or third-person driving videos, creating a dynamic environment for gamers and consumers alike.

"Imagine being able to generate an entire shopping mall or video game level from a single image," Jung added. "Genie 2's out-of-distribution generalization capabilities turn concept art and drawings into fully interactive environments. Interactive shopping experiences can be easily created through object affordances and interactions."

Game studios are increasingly integrating AI tools into development processes. The technology automates environment generation, rapidly creating textures, landscapes and assets. AI-powered testing systems run thousands of gameplay scenarios to identify bugs and balance issues. Natural language processing enhances non-playable character interactions, while procedural generation creates diverse levels and quests. Machine learning optimizes game performance and helps predict player behavior. Asset creation tools particularly benefit indie developers by reducing production costs. These AI implementations focus on accelerating technical tasks while maintaining creative quality.
As gamers' financial habits evolve with technology, a notable trend has emerged in their payment preferences. PYMNTS Intelligence has found that digital instant payment options are gaining popularity among gamers as alternatives to traditional cash payments.

Genie 2 creates interactive 3D worlds from single images. Users input commands, and the system generates responsive environments with physics, character movement and consistent graphics for up to a minute. Most demos run 10-20 seconds. The system uses video-trained AI to address a significant challenge in AI research: the need for more diverse training environments. While early tests with DeepMind's SIMA agent show promise -- like following verbal commands to explore generated buildings -- researchers acknowledge that both the AI agent and world generation still need significant work.

AI and gaming expert Kelsey Falter, co-founder and CEO of Mother Games, told PYMNTS that AI-generated worlds can fundamentally change how we interact with gaming and retail.

"Imagine an entire shop personalized with all of your favorite items in life -- from the aspirational to the obtainable -- a shop shaped by you," Falter said. "Within these AI retail environments, AI agents who offer personal shopping consultation will co-create an experience with the player, offering valuable perspective and a sense of connection during an otherwise transactional process. With these new capabilities, it's possible to see a product or apparel item in a variety of different settings, allowing for a level of intimacy that makes a digital good come to life."

Falter said shopping taps into our instinct to search for hidden gems and good deals. Genie 2's rich 3D environments could help games capture that feeling of spontaneous discovery. "Rather than navigating a static catalog, players could explore an AI-generated market, exploring an immersive landscape of items as part of the game lore," she said.
"This blurring of boundaries between play and shopping makes the discovery process immersive in a way that traditional eCommerce simply cannot replicate. Additionally, with the Genie 2 camera capabilities -- there is an opportunity to shift from first-person browsing to viewing one's avatar in third-person, seeing a 360-degree perspective come to life with environmental lighting and shadows."
[19]
The Morning After: Google DeepMind's Genie 2 can generate interactive 3D worlds
Google DeepMind has just revealed Genie 2, a world-modelling AI capable of creating 3D worlds and sustaining those environments for significantly longer than its predecessor. Genie 2 isn't a game engine. It's a diffusion model that generates images as the player (either a human being or another AI agent) moves through the world the software is simulating. All it needs to start is a single image prompt, either generated by AI or from a real-world photo. There are limitations: DeepMind says the model can generate "consistent" worlds for up to 60 seconds, with the majority of the examples the company shared on Wednesday running for significantly less time -- most videos are between 10 and 20 seconds long. Image quality also softens and comes undone the longer Genie 2 needs to maintain the illusion of a consistent world.
Google DeepMind unveils Genie 2, an advanced AI model capable of generating playable 3D environments from single images or text prompts, showcasing potential applications in AI research and creative prototyping.
Google DeepMind has announced Genie 2, a groundbreaking AI model that generates interactive, playable 3D environments from a single image or text prompt. This advancement marks a significant step forward in the realm of AI-generated content and world modeling [1][2].
Genie 2 demonstrates remarkable abilities in creating diverse and rich 3D worlds. Key features include action-controllable environments playable via keyboard and mouse, simulated physics, lighting, and reflections, long-horizon memory that keeps off-screen content consistent, and support for multiple perspectives such as first-person and isometric views.
The model can create environments resembling high-quality video games, complete with animated characters that can serve as embodied agents for training purposes [1][5].
Genie 2 is an autoregressive latent diffusion model with a transformer architecture, incorporating an autoencoder for frame-by-frame world generation. It has been trained on a large-scale video dataset, although specific details about the training data remain undisclosed [4][5].
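That architecture note implies a three-part data flow: an autoencoder compresses each frame into a compact latent, a dynamics model predicts the next latent conditioned on the latent history and the player's action, and the decoder turns predicted latents back into frames. The sketch below is a deliberately trivial arithmetic stand-in for that flow -- nothing here reflects DeepMind's actual implementation, and the "latent" is just a scalar average -- but it shows why the generation is called autoregressive: each predicted latent is fed back in to predict the next one.

```python
def encode(frame):
    # Encoder half of the autoencoder: compress a frame (here a list of
    # pixel values) into a single scalar "latent".
    return sum(frame) / len(frame)


def decode(latent, size):
    # Decoder half: reconstruct a frame of the original size from the latent.
    return [latent] * size


def predict_next_latent(latents, action):
    # Stand-in for the transformer/diffusion dynamics model: produce the
    # next latent conditioned on the latent history and the action
    # (modeled here as a simple additive control signal).
    return latents[-1] + action


def rollout(first_frame, actions):
    """Autoregressive generation: each predicted latent is decoded into a
    frame and also appended to the history used for the next prediction."""
    size = len(first_frame)
    latents = [encode(first_frame)]
    frames = [first_frame]
    for action in actions:
        z = predict_next_latent(latents, action)
        latents.append(z)
        frames.append(decode(z, size))
    return frames
```

Working in latent space rather than pixel space is what makes frame-by-frame generation tractable for a real model; the decoder only runs once per step to produce the frame the player actually sees.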
While not intended for creating traditional games, Genie 2 shows promise in several areas, including rapid prototyping for artists and designers, the training and evaluation of embodied AI agents, and broader world-model research:
DeepMind suggests that Genie 2 could accelerate the development of more general embodied agents by offering a limitless curriculum of novel training environments [3].
Despite its advancements, Genie 2 faces some limitations: generated worlds remain consistent for at most a minute, with most examples lasting only 10-20 seconds, and image quality degrades the longer a world must be sustained.
The impressive quality of Genie 2's outputs has raised questions about the nature of its training data. Speculation exists about whether the model was trained on popular video game footage, potentially accessed through YouTube. This has sparked discussions about intellectual property implications and the boundaries of fair use in AI training [2].
Google DeepMind positions Genie 2 as a key component in the journey towards artificial general intelligence. By providing rich, diverse environments for AI agent training, the model could play a crucial role in developing more advanced and versatile AI systems [3][5].
As world models like Genie 2 continue to evolve, they are expected to have far-reaching impacts on AI research, creative industries, and potentially even our understanding of intelligence itself.
Reference
[2]
[4]