Meta has unveiled V-JEPA 2, a clever bit of AI that gives robots something approaching common sense about the physical world.
I’ve seen plenty of robotics breakthroughs, but this one tackles a fundamental problem we’ve all noticed with robots: they’re often remarkably bad at basic physics that even toddlers grasp.
The V-JEPA 2 system, which Meta has been cooking up as part of their broader push into advanced machine intelligence, essentially gives robots the ability to understand their surroundings and predict what might happen next. It’s that prediction bit that’s the real breakthrough here.
When you toss your keys onto the table, you know they’ll land there rather than float up to the ceiling. Such basic understanding of the physical world has been difficult to instil in machines.
“We achieve this physical intuition by observing the world around us and developing an internal model of it, which we can use to predict the outcomes of hypothetical actions,” Meta explains.
“V-JEPA 2 helps AI agents mimic this intelligence, making them smarter about the physical world. The models we use to develop this kind of intelligence in machines are called world models, and they enable three essential capabilities: understanding, predicting and planning.”
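To make that "understand, predict, plan" loop concrete, here is a minimal toy sketch of how a learned world model can drive planning. To be clear, this is not Meta's code or API: the `world_model` function is a stand-in for a trained predictor (V-JEPA 2 actually predicts in a learned embedding space, not raw coordinates), and the brute-force "shooting" planner is just one simple way to use such a model.

```python
import itertools
import numpy as np

def world_model(state, action):
    # Hypothetical stand-in for a learned predictor: here the "physics"
    # is simply next_state = state + action. A real world model would be
    # a trained network operating on learned video embeddings.
    return state + action

def plan(state, goal, actions, horizon=3):
    """Enumerate short action sequences, roll each one through the world
    model, and return the first action of the sequence whose predicted
    final state lands closest to the goal (a brute-force shooting planner)."""
    best_first, best_dist = None, float("inf")
    for seq in itertools.product(actions, repeat=horizon):
        s = state
        for a in seq:
            s = world_model(s, a)  # "imagine" the outcome before acting
        dist = np.linalg.norm(s - goal)
        if dist < best_dist:
            best_dist, best_first = dist, seq[0]
    return best_first

start = np.array([0.0, 0.0])
goal = np.array([3.0, 0.0])
moves = [np.array(m) for m in ([1.0, 0.0], [-1.0, 0.0],
                               [0.0, 1.0], [0.0, -1.0])]
first_action = plan(start, goal, moves)
print(first_action)
```

The point of the sketch is the structure, not the maths: the agent never acts blindly, it first simulates candidate futures with its internal model and picks the action whose predicted consequence best matches its goal.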
Meta taught V-JEPA 2 physical intuition by, essentially, having it watch a whole bunch of videos. Through this training, it picked up patterns about how people handle objects, how things move through space, and how objects interact with each other.
When Meta’s team plugged V-JEPA 2 into their lab robots, the machines could perform basic tasks like reaching for things, picking them up, and placing them elsewhere with a newfound understanding of physics.
What’s particularly clever is that the robots can handle unfamiliar objects and environments. Traditional robotics has always struggled with the unexpected: program a robot to pick up red squares, and it falls to pieces when presented with a blue triangle.


Meta has released three new benchmarks alongside V-JEPA 2 that will help researchers test how well their own AI systems understand and reason about the physical world through video. It’s a collaborative approach that acknowledges no single company will solve these complex challenges alone.
For robots in the real world, this kind of understanding could transform everything from warehouse automation to home helper robots. The warehouse robot that currently needs precisely placed items in predefined locations might soon handle the chaotic reality of actual stockrooms. Your future home robot might reliably grab a mug without sending it crashing to the floor.
The safety implications shouldn’t be overlooked either. Robots that can anticipate the consequences of their actions are far less likely to cause accidents. Nobody wants a delivery robot that can’t predict that rolling into a crowded playground might end badly.
For all the credit Meta deserves for V-JEPA 2, we’re still in the early days. The current demos focus on relatively simple manipulation tasks – we’re not seeing robots performing brain surgery, or crafting perfect soufflés – but the future potential is exciting. We’re watching the development of machines that don’t just follow rigid instructions but possess something approaching physical intuition.
In a few years, we might have robots that don’t seem baffled by the basic physics of the world around them. Wouldn’t that be something?

