Netflix’s VOID model is now live on Hugging Face, giving developers and researchers a closer look at a video editing system built to remove objects from footage while preserving the logic of the scene around them. It appears to be Netflix’s first public model on the platform.
The Netflix VOID model is not a vague “creative AI” bundle. It is aimed at a specific and technically awkward task: deleting an object from a video while also handling the interactions that object caused in the scene.
Model: Netflix VOID model
Full name: Video Object and Interaction Deletion
Platform: Hugging Face
Code: GitHub repository
Paper: arXiv
The Netflix VOID model
VOID stands for Video Object and Interaction Deletion. According to the model card, the system is designed to remove objects from videos along with the effects and interactions they induce. That includes not only visual residue such as shadows or reflections, but also the physical consequences a removed object may have had in the scene.

Basic object removal is one thing. Video object and interaction deletion is another. It asks the system to preserve continuity rather than simply paint over a gap and hope the eye forgives it.
How the Netflix VOID model works
The documentation says the model is built on CogVideoX-Fun-V1.5-5b-InP and fine-tuned for interaction-aware video inpainting. The system uses a quadmask setup to distinguish the object being removed, overlap regions, affected regions and the background that should remain. That is a more disciplined framing of the task than the generic “edit video with AI” pitch now doing the rounds elsewhere.
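The quad-mask idea can be illustrated with a small sketch. The four class names and their encoding below are assumptions for illustration only, not the repository’s actual API; the point is simply that each pixel is assigned one of four roles before inpainting.

```python
import numpy as np

# Hypothetical labels for the four regions a quad-mask distinguishes.
# These names and values are illustrative assumptions, not VOID's real code.
OBJECT, OVERLAP, AFFECTED, BACKGROUND = 0, 1, 2, 3

def build_quadmask(object_mask, effect_mask):
    """Combine a binary object mask and a binary 'induced effect' mask
    (shadows, reflections, contact deformations) into one label map."""
    h, w = object_mask.shape
    quad = np.full((h, w), BACKGROUND, dtype=np.uint8)
    quad[object_mask & ~effect_mask] = OBJECT    # object pixels only
    quad[effect_mask & ~object_mask] = AFFECTED  # induced effects only
    quad[object_mask & effect_mask] = OVERLAP    # both at once
    return quad
```

Framing the task this way lets the model treat “erase this object” and “repair what the object changed” as separate sub-problems while leaving the true background untouched.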
The surrounding material is reasonably thorough. Netflix has published a GitHub repository, a project page, a demo and a paper. The quick-start notes are also unusually candid about hardware requirements, flagging the need for a GPU with 40GB or more of VRAM for the notebook path. So no, this is not one for the average laptop and a hopeful attitude.
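Given that 40GB figure, a pre-flight check is worth running before attempting the notebook. The helper below is not part of the VOID repository; it is a generic sketch that queries `nvidia-smi` (which ships with the NVIDIA driver) and degrades gracefully on machines without NVIDIA tooling.

```python
import shutil
import subprocess

# 40 GB comes from the quick-start notes; everything else here is illustrative.
MIN_VRAM_MB = 40 * 1024

def gpu_vram_mb():
    """Return total VRAM of GPU 0 in MiB, or None if nvidia-smi is absent."""
    if shutil.which("nvidia-smi") is None:
        return None
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return int(out.splitlines()[0].strip())

def meets_requirement():
    vram = gpu_vram_mb()
    return vram is not None and vram >= MIN_VRAM_MB
```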
This Hugging Face release
There are two reasons this release is worth noting. First, it opens up a piece of work from a company that is better known for using machine learning internally than for publishing public model artefacts. That makes the release notable in its own right.
Second, the use case is real. Video editing is still one of the more difficult corners of generative AI. Images are easier. Short clips are easier. Sustained temporal consistency and scene logic are not. A narrow model with a clear task can therefore be more useful than another broad platform promising to do everything badly but at speed.
The Netflix VOID model also fits a wider shift in AI research culture. Large firms are becoming more willing to expose selected pieces of their stack to public scrutiny, whether for credibility, recruitment, ecosystem influence or all three at once. Whatever the motive, visibility helps. Developers can inspect the methods, the limits and the surrounding tooling instead of relying on marketing copy.
What to watch next
The main question is whether this stays a one-off release or becomes part of a broader public pattern from Netflix. If more models follow, the company could become a more serious participant in open audiovisual AI than many people would have expected.
At minimum, the Netflix VOID model gives the market a concrete example of how one large company is approaching a difficult video editing problem. That is more useful than another polished teaser and easier to judge on its merits.