4 Sources
[1]
Fei-Fei Li's World Labs speeds up the world model race with Marble, its first commercial product | TechCrunch
World Labs, the startup founded by AI pioneer Fei-Fei Li, is launching its first commercial world model product. Marble is now available via freemium and paid tiers that let users turn text prompts, photos, videos, 3D layouts or panoramas into editable, downloadable 3D environments. The launch of the generative world model, first released in limited beta preview two months ago, comes a little over a year after World Labs came out of stealth with $230 million in funding, and puts the startup ahead of competitors building world models. World models are AI systems that generate an internal representation of an environment, and can be used to predict future outcomes and plan actions. Startups like Decart and Odyssey have released free demos, and Google's Genie is still in limited research preview. Marble differs from these -- and even World Labs's own real-time model, RTFM -- because it creates persistent, downloadable 3D environments rather than generating worlds on-the-fly as you explore. This, the company says, results in less morphing or inconsistency, and lets users export worlds as Gaussian splats, meshes or videos. Marble is also the first model of its kind to offer AI-native editing tools and a hybrid 3D editor that lets users block out spatial structures before AI fills in the visual details. "This is a brand new category of model that's generating 3D worlds, and this is something that's going to get better over time. It's something we've already improved quite a lot," Justin Johnson, co-founder of World Labs, told TechCrunch. Last December, World Labs showed how its early models could generate interactive 3D scenes based on a single image. While impressive, the somewhat cartoonish scenes weren't fully explorable since movements were limited to a small area, and there were occasional rendering errors. 
In my trial of the beta preview, I found Marble generated impressive worlds from image prompts alone -- from game-like environments to photorealistic versions of my living room. Scenes morphed at the edges, though that's apparently been improved in today's launch. That said, a world I'd generated in the beta using a single prompt looked better and matched my intent more closely than the same prompt does now. I haven't yet tested the editing features, though Johnson says they make Marble practical for near-term gaming, VFX and virtual reality (VR) projects. "One of our main themes for Marble going forward is creative control," Johnson said. "There should always be a quick pathway to generate something, but you should be able to dive even deeper and get a lot of control over the things that you're generating. You don't want the machine to just take the wheel and pull all that creativity away from you." Marble's take on creative control starts with input flexibility. The beta only accepted single images, forcing the model to invent unseen details for a 360-degree view. With the full launch, users can now upload multiple images or short clips to show a space from different angles and have the model generate fairly realistic digital twins. Then we have Chisel, an experimental 3D editor that lets users block out coarse spatial layouts (think walls, boxes, or planes) and then add text prompts to guide the visual style. Marble generates the world, decoupling structure from style -- similar to how HTML provides the structure of a website and CSS adds in color. Unlike text-based editing, Chisel lets you directly manipulate objects. "I can just go in there and grab the 3D block that represents the couch and move it somewhere else," Johnson said. Another new feature that gives you more editing control is the ability to expand a world. "Once you generate a world, you can expand it up to once," Johnson said. 
"When you move to a piece of the world that's starting to break apart, you can basically tell the model to expand there or generate more world in the vicinity of where you currently are, and then it can add more detail in that region." Users who want to create extremely large spaces can combine multiple worlds with "composer mode." Johnson demonstrated this for me with two worlds he had already built -- a room made of cheese with grape chairs, and another of a futuristic meeting room in space. Marble is available via four subscription tiers: Free (four generations from text, image, or panorama), Standard ($20/month, 12 generations plus multi-image/video input and advanced editing), Pro ($35/month, 25 generations with scene expansion and commercial rights), and Max ($95/month, all features and 75 generations). Johnson thinks the initial use cases for Marble will be gaming, visual effects for film, and virtual reality. Game developers have mixed feelings about the tech. A recent Game Developers Conference survey found a third of respondents believed generative AI has a negative impact on the games industry -- 12% more than the survey indicated a year earlier. Intellectual property theft, energy consumption and a decrease in quality from AI-generated content were among the top concerns aired. And last year, a Wired investigation found game studios like Activision Blizzard are using AI to cut corners and combat attrition. In gaming, Johnson sees developers using Marble to generate background environments and ambient spaces and then importing those assets into game engines like Unity or Unreal Engine to add interactive elements, logic and code. "It's not designed to replace the entire existing pipeline for gaming, but to just give you assets that you can drop into that pipeline," he said. For VFX work, Marble sidesteps the inconsistency and poor camera control that plague AI video generators, per Johnson. 
Its 3D assets let artists stage scenes and control camera movements with frame-perfect precision, he said. While Johnson said World Labs isn't focusing on virtual reality (VR) applications right now, he noted the industry is "starved for content" and excited about the launch. Marble is already compatible with the Vision Pro and Quest 3 VR headsets, and every generated world can be viewed in VR today. Marble may also have potential use cases for robotics. Johnson noted that unlike image and video generation, robotics doesn't have the benefit of a large repository of training data. But with generators like Marble, it becomes easier to simulate training environments. According to a recent manifesto by Fei-Fei Li, CEO and co-founder of World Labs, Marble represents the first step towards creating "a truly spatially intelligent world model." Li believes "the next generation of world models will enable machines to achieve spatial intelligence on an entirely new level." If large language models can teach machines to read and write, Li hopes systems like Marble can teach them to see and build. She says the ability to understand how things exist and interact in three-dimensional spaces can eventually help machines make breakthroughs beyond gaming and robotics, and even into science and medicine. "Our dreams of truly intelligent machines will not be complete without spatial intelligence," Li wrote.
[2]
Fei-Fei Li's World Labs unveils its world-generating AI model
Marble can reconstruct, generate, and simulate 3D worlds -- think of it as a type of "world model." In an interview with Fast Company, Li describes world models as a "significant" evolution of the generative AI era. "The large world model is really a significant step towards unlocking AI's capability," a category she calls "spatial." Spatial intelligence refers to a system's ability to perceive, model, reason about, and take actions within physical or geometric space -- similar to how humans or animals choose their actions based on their understanding of their surroundings. World Labs launched in September of 2024, when it began working on the Marble model. Two months ago it released a preview of the model to a group of creatives, who began building worlds and giving feedback. This week, Li posted a sort of manifesto on Substack arguing that spatial intelligence is the next frontier in AI. For humans, she says, spatial intelligence of the physical world around us provides the scaffolding upon which we build our cognition. "Spatial intelligence will transform how we create and interact with real and virtual worlds -- revolutionizing storytelling, creativity, robotics, scientific discovery, and beyond," she writes. World Labs believes that endowing machines (including robots) with such "spatial intelligence" could be transformative for a number of industries in the coming years. Using a web interface, users can feed Marble a scene description, images or videos, or coarse 3D layouts and the model will generate a realistic 3D environment. A user might input a set of images from the bedroom where they grew up, then upload the images to Marble, which will then intelligently sew them together to create an immersive digital 3D version of the room.
[3]
World Labs launches Marble, a commercial world model for generating entire virtual environments - SiliconANGLE
World Labs Technologies, a company founded by AI pioneer Fei-Fei Li and focused on developing breakthrough artificial intelligence models, today announced the launch of its first commercial world model product: Marble. Marble lets users generate entire virtual worlds from text prompts, photos, panoramas or 3D models and download fully editable 3D environments. World Labs initially debuted the world model in limited beta mode two months ago. In the preview, the company showed that the model can generate 3D worlds that users can then explore as long as they want, with no morphing and no inconsistency. According to the company, Marble represents a leap over previous models by producing larger, more stylistically diverse worlds with cleaner 3D geometry. World models are useful because they allow AI models connected to the real world to understand and predict the world's behavior. This is critical for developing more capable AI systems, such as autonomous vehicles and robots, by producing realistic training data. Marble's world models can also be used for entertainment, such as generating entire, complex worlds for cinema and video games. Many video games, for example, use virtual worlds for players to participate in that rely on 3D editing tools to recreate realistic or semi-realistic environments for users to play the game. Marble can generate worlds in a broad variety of styles, including cartoon, science fiction, futuristic, fantasy, anime, realistic and retro-styled low poly-count (where the objects and walls appear to be "low graphics," as if rendered on an older computer). Li is well known for creating ImageNet in 2009, a landmark AI dataset that revolutionized the field of computer vision. ImageNet contains over 14 million images organized according to a hierarchy of English nouns and their relationships. 
Its creation transformed computer vision from a niche research pursuit into one of the most dynamic fields of AI and laid the groundwork for visual reasoning and, ultimately, for today's generative world models. World models and visual reasoning form the basis for spatial intelligence, Li said, a concept that will transform how users create and interact with real and virtual environments. "Today, leading AI technology such as large language models have begun to transform how we access and work with abstract knowledge," Li explained in a blog post. "Yet they remain wordsmiths in the dark; eloquent but inexperienced, knowledgeable but ungrounded." Building spatially intelligent AI, Li argued, requires creating world models that are generative and capable of understanding, reasoning, and producing the semantic context of not only objects but also their relationships. This mandates creating and reasoning in dynamic, complex worlds, real or virtual, beyond the current capabilities of modern LLMs. The current market contains several contenders working on world models, including Google LLC's Genie, Nvidia Corp.'s Cosmos and AI startup Decart AI Inc. Unlike many world foundation models in the industry currently, including World Labs' own Real-Time Frame Model, Marble allows users to generate persistent worlds and download them as 3D models rather than producing them on the fly. Like AI image editors, Marble also offers tools for users to modify virtual worlds. Chisel, an experimental 3D editor, allows users to define virtual spaces with layouts such as walls, rooms and terrain and then use a text prompt to refine how the rough "sketch" should be used. Another feature allows users to expand the world by extending already available portions or take pieces of the world and then bridge them together seamlessly. The model expands the world by generating more of the 3D space based on existing rules and style. 
Users who want to build extremely large spaces can combine already generated worlds with a "composer mode," allowing them to stitch together different styles. Marble is available in four pricing tiers: Free, with four virtual world generations from text, images or panoramas; Standard at $20 per month with 12 generations, multimedia support and extended editing; Pro at $35 per month with 25 generations and commercial rights; and Max at $95 per month with 75 generations and a full feature set.
[4]
Fei-Fei Li's Spatial A.I. Startup World Labs Unveils Its First Product
Spatial intelligence aims to teach A.I. systems physical concepts humans intuitively grasp, such as parking a car without bumping the curb. Last January, renowned A.I. researcher Fei-Fei Li took a leave of absence from Stanford to trade academia for startup life. Nearly two years later, her venture World Labs has unveiled its first commercial product: a world model, Marble. Marble can create 3D virtual worlds from text, images, video or even rough layouts. It builds on an earlier World Labs prototype that created 3D scenes from 2D images, but with limitations, such as restricted interactive areas. So-called world models like Marble are central to Li's vision of the future of A.I. Because these models can reason about and interact with complex environments, they are essential for building A.I. that understands not just language, but the physical world itself. World Labs aims to imbue its systems with spatial intelligence, teaching them physical concepts humans intuitively grasp, such as parking a car without bumping the curb, catching a tossed object, or pouring a drink without looking. "Today, leading A.I. technology such as large language models (LLMs) have begun to transform how we access and work with abstract knowledge," Li wrote in a Nov. 10 blog post. "Yet they remain wordsmiths in the dark; eloquent but inexperienced, knowledgeable but ungrounded." An emphasis on visual and spatial intelligence has long been Li's "North Star," said the researcher, who in 2006 played a role in the release of ImageNet, a database of 15 million images that spurred the rise of deep learning. 
Li also co-directs Stanford's Institute for Human-Centered A.I. and serves as a United Nations advisor on A.I. policy. These days, however, Li is focused on World Labs, which has raised $230 million to pursue its spatial intelligence vision. Its backers include Radical Ventures, Andreessen Horowitz and Nvidia, as well as prominent tech figures such as Geoffrey Hinton, Eric Schmidt, Marc Benioff and Reid Hoffman. Marble has been in beta for a few months and is now publicly available. It can create a full 3D world from a single image or text prompt. Users can also merge multiple environments by uploading several images within a prompt. According to World Labs, the model can combine photos or short videos of real-world spaces to generate immersive, realistic virtual worlds. The model includes a range of editing tools that let users customize their creations. A feature called Chisel allows users to sketch out a coarse 3D layout, while other tools make it possible to expand worlds or build entirely new scenes within the same environment. Looking ahead, World Labs plans to develop world models with more interactive capabilities for both humans and A.I. agents. While Li may be the most prominent figure developing world models, she isn't the only one in the field. Google DeepMind and Nvidia have explored similar technologies with their Genie and Cosmos models, respectively. Yann LeCun, Meta's chief A.I. scientist, is reportedly in the early stages of fundraising for his own world model startup. Li said the applications of spatial intelligence tools like Marble will "span varying timelines." The model is already being used by filmmakers, game designers and architects to enhance creative workflows. 
In the medium term, Li expects such technology to advance robotics, while future applications in science, healthcare, and education could enable breakthroughs in experiment simulation, drug discovery and immersive learning. "Spatial intelligence will transform how we create and interact with real and virtual worlds -- revolutionizing storytelling, creativity, robotics, scientific discovery, and beyond," said Li. "This is A.I.'s next frontier."
AI pioneer Fei-Fei Li's World Labs has launched Marble, its first commercial world model that generates persistent 3D environments from text, images, or videos. The product marks a significant milestone in spatial intelligence AI and puts World Labs ahead of competitors in the emerging world model market.
World Labs, the AI startup founded by renowned researcher Fei-Fei Li, has officially launched Marble, its first commercial world model product [1]. The launch comes just over a year after the company emerged from stealth mode with $230 million in funding, positioning it as a frontrunner in the rapidly evolving world model space [1].
Marble enables users to transform text prompts, photos, videos, 3D layouts, or panoramas into editable, downloadable 3D environments [1]. Startups such as Decart and Odyssey have released free demos, and Google's Genie is still in limited research preview; unlike these, Marble creates persistent, downloadable 3D environments rather than generating worlds on the fly as the user explores [1].
World models are AI systems that generate internal representations of environments and can be used to predict future outcomes and plan actions [1]. Li describes these models as a "significant" evolution of the generative AI era, a step toward the capability she calls "spatial" intelligence [2].
Spatial intelligence refers to a system's ability to perceive, model, reason about, and take actions within physical or geometric space, similar to how humans or animals choose their actions based on their understanding of their surroundings [2]. In a recent blog post, Li argued that current large language models "remain wordsmiths in the dark; eloquent but inexperienced, knowledgeable but ungrounded" [4].
Marble distinguishes itself through editing tools that give users creative control over generated environments. The platform introduces Chisel, an experimental 3D editor that allows users to block out coarse spatial layouts using walls, boxes, or planes, then add text prompts to guide visual style [1].
"I can just go in there and grab the 3D block that represents the couch and move it somewhere else," explained Justin Johnson, co-founder of World Labs [1]. This approach decouples structure from style, similar to how HTML provides a website's structure while CSS adds its visual styling.
The platform also supports world expansion: once a world has been generated, users can expand it once, directing the model to add more detail in a chosen region [1]. For larger projects, "composer mode" lets users combine multiple worlds [3].
Marble operates on a freemium model with four subscription tiers. The Free tier offers four generations from text, image, or panorama inputs, while the Standard tier ($20/month) provides 12 generations plus multi-image/video input and advanced editing features [1].
Professional users can choose the Pro tier ($35/month) with 25 generations, scene expansion, and commercial rights, or the Max tier ($95/month) with all features and 75 generations [1]. Johnson identifies initial use cases in gaming, visual effects for film, and virtual reality applications [1].
Summarized by Navi