2 Sources
[1]
Netflix's Void AI can remove objects from video and show how scenes evolve without them
Serving tech enthusiasts for over 25 years. TechSpot means tech analysis and advice you can trust. What just happened? Top-tier video editing suites can seamlessly remove objects from scenes, even generating realistic shadows and reflections for the freshly removed elements. However, these tools fall short when the deleted object involves significant interactions, such as collisions. In such cases, existing solutions often struggle to produce plausible results. Netflix is addressing this shortcoming with a new video object removal framework called Void. Short for Video Object and Interaction Deletion, the model can effectively delete an object from a scene and adjust for its absence. For example, erasing a car crash from a scene will also modify the remaining elements accordingly as if the accident never happened. This means that flying debris, fire, and damage to nearby props will be removed as if the crash never occurred. Similarly, in a scene involving someone cannonballing into a pool, removing the person would leave the pool water naturally undisturbed. To train the model, its creators used Kubric and Humoto to generate a new paired dataset of counterfactual object removals. During the inference stage, a vision-language model is used to identify parts of the scene impacted by a removed object which serves as a guide for the diffusion model to fill in the blanks with the counterfactual data. Void sounds like a powerful video editing tool that could afford producers lots of flexibility long after filming has wrapped up. Not having to reshoot a scene would save an immense amount of time and money - assuming of course that the effect is seamless and doesn't look like AI slop. Interested parties can learn more about Void over on GitHub. Its creators - Saman Motamed, William Harvey, Luc Van Gool, Benjamin Klein, Ta-Ying Cheng, and Zhuoning Yuan - have also published a 19-page pre-print (PDF) on the subject. 
As The Register highlights, the model isn't exclusive to Netflix. The streaming giant has also made it available on Hugging Face, meaning anyone can install and use it. And while Void isn't the first of its kind, it might be the best currently available. In a survey of 25 individuals cited by The Register, Void was reportedly preferred over rivals like ProPainter, Rose, DiffuEraser, and Generative Omnimatte nearly 65 percent of the time.
[2]
Netflix's new AI doesn't create videos -- it rewrites reality (and it's open source)
Netflix challenges Sora with its new open source AI that transforms real footage I've spent a lot of time testing every AI video tool that hits the market, from OpenAI's Sora to the latest Runway updates. Usually, the pitch is the same: "Type a prompt, get a movie." But Netflix just quietly released a research model called VOID, and it's doing something completely different. Instead of building new worlds and scenes from scratch, it rewrites the one you've already filmed -- and it's so good at it, you might never trust a "real" video again. What is Netflix VOID? VOID stands for Video Object and Interaction Deletion. At first glance, it looks like a high-end version of the "Magic Eraser" on your Pixel 8 or Galaxy S24. You select an object, and it disappears. But here's where it gets wild: VOID understands physics and causality. In other words, while most editing tools just "patch" the hole left behind with background textures, VOID actually rewrites the logic of the scene to account for the missing object. Several tests on GitHub highlight what the AI can do: * The Guitar Test: In a research demo, a person holding a guitar is deleted. In any other tool, the guitar would just float or vanish. VOID realizes the guitar is no longer supported, so it generates frames where it falls naturally to the ground. * The Crash Test: Remove one car from a head-on collision, and VOID doesn't leave a ghost-impact of fire and smoke. It "re-imagines" the path of the remaining car as if the accident never happened -- turning a wreck into a peaceful drive down an empty road. Why this is the "end of the reshoot" For a company like Netflix, this underscores a massive cost-saving trick in the movie industry. Think about the infamous "Game of Thrones" Starbucks cup moment. Usually, fixing that requires expensive frame-by-frame digital surgery. 
With VOID, a producer could simply remove the unwanted object and let the AI realistically simulate what should happen next -- whether that's water splashing, dust settling or nothing at all. It goes beyond small fixes, too. Instead of bringing a 100-person crew back for a reshoot, the AI could correct mistakes after filming wraps. It could even change a story detail by removing a key object and recalculating the scene so everything still looks natural. Can you try it? The most surprising part of this release is that Netflix open-sourced it. You can find the model right now on Hugging Face (under an Apache 2.0 license). However, don't expect to run this on your MacBook Air. VOID is a beast. It requires a GPU with at least 40GB of VRAM (think NVIDIA A100 or H100) to run inference comfortably. Plus, It's built on a 5-billion parameter version of CogVideoX and uses a proprietary "quadmask" system to tell the AI which parts of the physics need to be recalculated. The takeaway The "visual receipt" used to be the ultimate proof. Now it's starting to lose its power. Netflix has introduced a tool that can rewrite real footage so seamlessly it looks completely real. At the same time, AI "slop" is getting more convincing than ever -- flooding the internet with content that feels authentic but isn't. The result looks like a world where seeing something no longer means you can trust it. We've officially entered the era of editable reality. Follow Tom's Guide on Google News and add us as a preferred source to get our up-to-date news, analysis, and reviews in your feeds.
Netflix unveiled Void AI, an open-source model that doesn't just erase objects from video—it understands physics and causality to realistically adjust the scene. Unlike traditional editing tools, Void can remove a car from a crash and eliminate all evidence of impact, or delete a person diving into a pool while leaving the water undisturbed. The model is now available on GitHub and Hugging Face.
Netflix has released Void AI, an open-source AI model that changes how creators can remove objects from video footage [1]. While top-tier editing suites can already erase elements and generate realistic shadows and reflections, they struggle when deleted objects involve significant interactions like collisions or physical contact. Void, short for Video Object and Interaction Deletion, addresses this limitation by understanding not just what to remove, but how the scene should evolve without it [2].
Unlike tools from OpenAI or other generative AI platforms that create videos from scratch, Netflix's approach focuses on intelligently modifying existing video footage. The model can remove a car from a crash scene and eliminate all evidence of impact: no flying debris, no fire, no damage to nearby props. In another demonstration, removing a person holding a guitar doesn't leave the instrument floating in mid-air; Void AI generates frames where the guitar falls naturally to the ground because it understands the object is no longer supported [2].
The model's creators (Saman Motamed, William Harvey, Luc Van Gool, Benjamin Klein, Ta-Ying Cheng, and Zhuoning Yuan) used Kubric and Humoto to generate a paired dataset of counterfactual object removals on which Void was trained [1]. During inference, a vision-language model identifies the parts of the scene impacted by the removed object, and that result guides the diffusion model as it fills in the blanks with counterfactual content. This physics-aware approach means that when you remove a person cannonballing into a pool, the water remains naturally undisturbed as if they were never there.
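The idea of a paired counterfactual dataset can be illustrated with a toy example. The sketch below is not how Kubric or Humoto work and is not the authors' pipeline; it is a hypothetical, hand-rolled simulation that assumes a single object held at some height. It produces two clips of the same scene: one where the object stays supported, and one where the support is removed and the object falls under gravity. Every name (`Scene`, `simulate`, `make_counterfactual_pair`) and constant is an assumption made for illustration.

```python
from dataclasses import dataclass

GRAVITY = 9.81  # m/s^2 (standard gravity)
DT = 0.1        # seconds per frame (assumed frame step for this toy example)


@dataclass
class Scene:
    object_height: float  # initial height of the held object, in metres
    has_support: bool     # True if something (e.g. a person) is holding the object


def simulate(scene, n_frames=10):
    """Return the object's height in each frame of a toy 1-D simulation."""
    heights = []
    y, v = scene.object_height, 0.0
    for _ in range(n_frames):
        heights.append(round(y, 3))
        # Only an unsupported object that is still above the ground accelerates.
        if not scene.has_support and y > 0.0:
            v -= GRAVITY * DT
            y = max(0.0, y + v * DT)  # clamp at ground level
    return heights


def make_counterfactual_pair(height, n_frames=10):
    """Build one (factual, counterfactual) pair for the same starting scene."""
    factual = simulate(Scene(height, has_support=True), n_frames)
    counterfactual = simulate(Scene(height, has_support=False), n_frames)
    return factual, counterfactual


if __name__ == "__main__":
    with_person, without_person = make_counterfactual_pair(1.0)
    print("held:    ", with_person)
    print("released:", without_person)
```

A model trained on many pairs of this kind would see both what was filmed and what would have happened without the removed element, which is the intuition behind the falling-guitar demonstration.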
Built on a 5-billion-parameter version of CogVideoX, the open-source AI model uses its own "quadmask" system to mark which parts of the scene's physics need to be recalculated [2]. In a survey of 25 individuals, Void was preferred over rivals like ProPainter, Rose, DiffuEraser, and Generative Omnimatte nearly 65 percent of the time [1].
Netflix made the model available on both GitHub and Hugging Face under an Apache 2.0 license, allowing anyone to install and use it [1][2]. However, running Void requires substantial computational resources: specifically, a GPU with at least 40GB of VRAM, such as an NVIDIA A100 or H100 [2].
For the film industry, this technology could eliminate costly reshoots. Think of the infamous Game of Thrones Starbucks cup incident: instead of expensive frame-by-frame digital surgery, producers could simply remove the unwanted object and let the AI realistically adjust the scene [2]. Not having to bring a 100-person crew back for reshoots would save immense amounts of time and money, assuming the effect remains seamless.
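For readers who want to try the model, the 40GB VRAM guideline mentioned above can be checked before downloading anything. The sketch below is a minimal helper, not part of the Void release: it calls `nvidia-smi` with its `--query-gpu=memory.total` CSV query and compares each GPU against the threshold. Reading "40GB" as 40 × 10^9 bytes is an assumption, as are the function names and the sample values in the usage note.

```python
import subprocess

# Assumption: "40GB" is read as 40 * 10^9 bytes, converted here to MiB,
# the unit nvidia-smi reports with the "nounits" CSV option.
REQUIRED_MIB = 40 * 1000**3 / 1024**2


def parse_memory_totals(smi_output):
    """Parse one value per line (MiB); unparseable lines become None."""
    totals = []
    for line in smi_output.splitlines():
        value = line.strip()
        if not value:
            continue
        totals.append(int(value) if value.isdigit() else None)
    return totals


def gpus_meeting_requirement(totals_mib, required_mib=REQUIRED_MIB):
    """Return the positions of GPUs whose total memory meets the threshold."""
    return [
        i for i, total in enumerate(totals_mib)
        if total is not None and total >= required_mib
    ]


def query_nvidia_smi():
    """Return nvidia-smi's memory.total output, or '' if it cannot be run."""
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=memory.total",
             "--format=csv,noheader,nounits"],
            capture_output=True, text=True, check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return ""
    return result.stdout


if __name__ == "__main__":
    totals = parse_memory_totals(query_nvidia_smi())
    print("GPU memory totals (MiB):", totals)
    print("GPUs meeting the 40GB guideline:", gpus_meeting_requirement(totals))
```

Given illustrative output such as `40960`, `24576` and `81920` on three lines, the helper would flag the first and third GPUs. Passing the check only shows the card is large enough on paper; it says nothing about how much memory is free at the time.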
The ability to rewrite reality in video footage has profound implications for trust in visual media. When AI can remove a car from a collision and recalculate the remaining vehicle's path as if the accident never happened, the line between authentic and manipulated footage blurs significantly. A 19-page pre-print detailing the research is available for those interested in the technical foundations [1]. As visual evidence loses its traditional authority, industries from journalism to legal proceedings will need to adapt to this new landscape where seeing no longer guarantees believing.
Summarized by Navi