9 Sources
[1]
Meta's V-JEPA 2 model teaches AI to understand its surroundings | TechCrunch
Meta on Wednesday unveiled its new V-JEPA 2 AI model, a "world model" that is designed to help AI agents understand the world around them. V-JEPA 2 is an extension of the V-JEPA model that Meta released last year, which was trained on over one million hours of video. This training data is supposed to help robots or other AI agents operate in the physical world, understanding and predicting how concepts like gravity will impact what happens next in a sequence. These are the kinds of common-sense connections that small children and animals make as their brains develop -- when you play fetch with a dog, for example, the dog will (hopefully) understand how bouncing a ball on the ground will cause it to rebound upward, or how it should run toward where it thinks the ball will land, and not where the ball is at that precise moment. Meta depicts an example in which a robot, seeing from a first-person point of view while holding a plate and a spatula, walks toward a stove with cooked eggs; the AI can predict that a very likely next action would be to use the spatula to move the eggs to the plate. According to Meta, V-JEPA 2 is 30x faster than Nvidia's Cosmos model, which also tries to enhance intelligence related to the physical world. However, Meta may be evaluating its own models according to different benchmarks than Nvidia. "We believe world models will usher in a new era for robotics, enabling real-world AI agents to help with chores and physical tasks without needing astronomical amounts of robotic training data," explained Meta's Chief AI Scientist Yann LeCun in a video.
[2]
Meta Says Its New AI Model Can Understand the Physical World
Meta says a new generative AI model it released Wednesday could change how machines understand the physical world, opening up opportunities for smarter robots and more. The new open-source model, called V-JEPA 2 for Video Joint Embedding Predictive Architecture 2, is designed to help AI understand things like gravity and object permanence, Meta said. Current models that allow AI to interact with the physical world rely on labeled data or video to mimic reality, but this approach emphasizes the logic of the physical world, including how objects move and interact. The model could allow AI to understand concepts like the fact that a ball rolling off of a table will fall. Meta said the model could be useful for devices like autonomous vehicles and robots by ensuring they don't need to be trained on every possible situation. The company called it a step toward AI that can adapt like humans can. One struggle in the space of physical AI has been the need for significant amounts of training data, which takes time, money and resources. At SXSW earlier this year, experts said synthetic data -- training data created by AI -- could help prepare a more traditional learning model for unexpected situations. (In Austin, the example used was the emergence of bats from the city's famed Congress Avenue Bridge.) Meta said its new model simplifies the process and makes it more efficient for real-world applications because it doesn't rely on all of that training data.
[3]
Meta Says Its New AI Model Understands Physical Rules Like Gravity
A new generative AI model Meta released this week could change how machines understand the physical world, opening up opportunities for smarter robots and more, the company said. The new open-source model, called Video Joint Embedding Predictive Architecture 2, or V-JEPA 2, is designed to help artificial intelligence understand things like gravity and object permanence, Meta said. "By sharing this work, we aim to give researchers and developers access to the best models and benchmarks to help accelerate research and progress," the company said in a blog post, "ultimately leading to better and more capable AI systems that will help enhance people's lives." Current models that allow AI to interact with the physical world rely on labeled data or video to mimic reality, but this approach emphasizes the logic of the physical world, including how objects move and interact. The model could allow AI to understand concepts like the fact that a ball rolling off of a table will fall. Meta said the model could be useful for devices like autonomous vehicles and robots by ensuring they don't need to be trained on every possible situation. The company called it a step toward AI that can adapt like humans can. One struggle in the space of physical AI has been the need for significant amounts of training data, which takes time, money and resources. At SXSW earlier this year, experts said synthetic data -- training data created by AI -- could help prepare a more traditional learning model for unexpected situations. (In Austin, the example used was the emergence of bats from the city's famed Congress Avenue Bridge.) Meta said its new model simplifies the process and makes it more efficient for real-world applications because it doesn't rely on all of that training data. The next steps for world models include training models that are capable of learning, reasoning and planning across different time and space scales, making them better at breaking down complicated tasks. Multimodal models, which can use other senses like audio and touch in addition to vision, will also help future AI models understand the real world.
[4]
Meta launches AI 'world model' to advance robotics, self-driving cars
Mark Zuckerberg, CEO of Meta Platforms. Artificial intelligence has been an integral focus for the tech giant's leader amid competition from players like OpenAI, Microsoft and Google. Meta on Wednesday announced it's rolling out a new AI "world model" that can better understand the 3D environment and movements of physical objects. The tech giant, which owns popular social media apps Facebook and Instagram, said its new open-source AI model V-JEPA 2 can understand, predict and plan in the physical world. Known as a world model, these systems take inspiration from the logic of the physical world to build an internal simulation of reality, allowing AI to learn, plan, and make decisions in a more human-like manner. For example, in the case of Meta's new model, V-JEPA 2 can recognize that a ball rolling off a table will fall, or that an object hidden out of view hasn't just vanished. Artificial intelligence has been a key focus for Meta CEO Mark Zuckerberg as the company faces competition from players like OpenAI, Microsoft and Google. Meta is set to invest $14 billion into artificial intelligence firm Scale AI and hire its CEO Alexandr Wang to bolster its AI strategy, people familiar with the matter tell CNBC.
[5]
Meta's new AI helps robots learn real-world logic from raw video
"V-JEPA 2 represents meaningful progress toward our ultimate goal of developing advanced machine intelligence (AMI)," Meta stated in its official announcement. Unlike traditional AI models that require extensive annotations, V-JEPA 2 extracts patterns from raw video. This allows it to generalize across different contexts and handle new situations with greater ease. Meta has already tested the model on lab-based robots. These machines used V-JEPA 2 to pick up unfamiliar objects, reach for targets, and place items in new locations. This marks a step forward in enabling robots to function in unpredictable environments. The company sees major potential for V-JEPA 2 in autonomous machines like delivery robots and self-driving vehicles. These systems need to quickly interpret physical surroundings in order to avoid obstacles and make real-time decisions. With world models like V-JEPA 2, machines can start anticipating the outcomes of their actions in much the same way humans do. Meta joins other tech leaders in pushing world models forward. Google's DeepMind has been developing its own version, Genie, which can simulate entire 3D environments.
[6]
Meta releases V-JEPA 2 AI model that understands the world through video - SiliconANGLE
Meta Platforms Inc.'s AI research division today released a new artificial intelligence model that can improve training and AI understanding of the physical world for robots and AI agents by interpreting video information, similar to how humans understand the world. The model, named V-JEPA 2, or Video Joint Embedding Predictive Architecture 2, builds on the company's previous work on V-JEPA, which allows AI agents and robots to "think before they act." "As humans we think that language is very important for intelligence, but in fact that's not the case," said Yann LeCun, vice president and chief AI scientist at Meta. "Humans and animals navigate the world by building mental models of reality. What if AI could develop this kind of common sense, an ability to make predictions of what is going to happen in some kind of abstract representation of space?" It is a state-of-the-art AI world model, trained on video, that enables robots and other AI models to understand the physical world and predict how it will respond to their actions. World models allow AI agents and robots to build a concept of the physical world and understand the consequences of actions in order to plan a course of action for a given task. With a world model, a company or organization does not need to run a million trials with an AI in the real world, because a world model can simulate the world for an AI model -- often within minutes -- for training with an understanding of how the world works. A world model can also be used to understand and predict what will happen after a certain action is taken, allowing a robot or AI attached to a sensor to anticipate the next event that might happen. Humans do this all the time when planning next steps, such as when walking through a crowd in an unfamiliar place or when playing hockey. An AI model could use this kind of planning to help prevent accidents in the workplace by guiding robots on safe paths around the other robots and humans working alongside them, reducing potential hazards. V-JEPA 2 helps AI agents understand the physical world and its interactions by learning patterns of how people interact with objects, how objects move in the physical world and how objects interact with other objects. The company said that when the model was deployed on robots in its labs, it found that robots can use V-JEPA 2 to perform tasks such as reaching, picking up an object and placing an object in a new location with ease. "Of course, world models are essential for autonomous cars and robots," said LeCun. "In fact, we believe world models will usher in a new era for robotics, enabling real-world AI agents to help with chores and physical tasks without needing astronomical amounts of robotic training data." In addition to the release of V-JEPA 2, Meta released three new benchmarks for the research community to evaluate existing reasoning models that use video to understand the world.
[7]
Meta releases V-JEPA 2 to train AI on real-world physics
Meta introduced V-JEPA 2 on Wednesday, a new AI "world model" designed to enhance an AI agent's comprehension of its environment. V-JEPA 2 expands upon the original V-JEPA model released last year, which was trained using over 1 million hours of video footage. This training aims to assist AI agents, particularly robots, in navigating the physical world by predicting outcomes based on concepts such as gravity. Meta gives the example of a robot holding a plate and spatula while walking toward a stove with cooked eggs. The AI should predict that the next likely action would be transferring the eggs to the plate using the spatula. Meta reports that V-JEPA 2 operates 30 times faster than Nvidia's Cosmos model, which also aims to improve intelligence related to the physical world, though Meta noted that these measurements may have been evaluated using benchmarks different from Nvidia's. The company's chief AI scientist, Yann LeCun, stated in a video, "We believe world models will usher in a new era for robotics, enabling real-world AI agents to help with chores and physical tasks without needing astronomical amounts of robotic training data."
[8]
Meta Debuts AI to Help Robots 'Understand the Physical World' | PYMNTS.com
The tech giant says these capabilities are key to developing AI agents that think before acting, with V-JEPA 2 marking progress toward the company's goal of creating advanced machine intelligence (AMI). "As humans, we have the ability to predict how the physical world will evolve in response to our actions or the actions of others. For example, you know that if you toss a tennis ball into the air, gravity will pull it back down," the company said. "V-JEPA 2 helps AI agents mimic this intelligence, making them smarter about the physical world. The models we use to develop this kind of intelligence in machines are called world models, and they enable three essential capabilities: understanding, predicting and planning." Meta said it trained V-JEPA 2 using video, which helped it discover important patterns in the physical world, such as how people interact with objects, how objects move in the physical world or interact with other objects. The launch of V-JEPA 2 comes one day after reports that Meta CEO Mark Zuckerberg was personally recruiting experts to assist in his goal of turning Meta into a leader in the field of artificial general intelligence (AGI), a term for machines that can carry out tasks at the same level as humans.
[9]
Meta introduces new AI model for physical reasoning By Investing.com
Investing.com -- Meta Platforms (NASDAQ:META) has unveiled V-JEPA 2, a new world model that improves AI's ability to understand and predict physical interactions. The company announced Thursday that this state-of-the-art model enables robots and other AI agents to better comprehend the physical world and anticipate how it will respond to actions. These capabilities are crucial for developing AI systems that can "think before they act." V-JEPA 2 builds upon Meta's first video-trained model released last year. The new version enhances understanding and prediction capabilities, allowing robots to interact with unfamiliar objects and environments to complete tasks. The model was trained using video to learn important patterns in the physical world, including human-object interactions, object movement, and object-to-object interactions. When tested on robots in Meta's labs, the model demonstrated abilities to perform tasks such as reaching, picking up objects, and placing objects in new locations. Meta has also released three new benchmarks to help researchers evaluate how well existing models learn and reason about the physical world using video. By sharing these resources, Meta aims to accelerate research progress toward developing more capable AI systems. The company emphasized that physical reasoning is essential for building AI agents that can operate in the physical world and for achieving advanced machine intelligence (AMI).
Meta unveils V-JEPA 2, an advanced AI 'world model' designed to understand and predict physical interactions, potentially revolutionizing robotics and autonomous systems.
Meta has taken a significant leap forward in artificial intelligence with the release of V-JEPA 2, a sophisticated "world model" designed to help AI agents comprehend and interact with their physical surroundings [1]. This open-source model, an extension of its predecessor V-JEPA, represents a paradigm shift in how machines understand and predict real-world phenomena.
V-JEPA 2 is trained on over one million hours of video data, enabling it to grasp fundamental concepts such as gravity and object permanence [1][2]. Unlike traditional AI models that rely heavily on labeled data or video to mimic reality, V-JEPA 2 emphasizes the logic of the physical world, including how objects move and interact [2].
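To make the architecture's distinguishing idea concrete, here is a minimal sketch in PyTorch (illustrative names and toy shapes, not Meta's actual code): instead of reconstructing future pixels, an encoder maps frames to embeddings and a small predictor is trained to match the embedding of a future frame, so the loss lives in representation space.

import torch
import torch.nn as nn

class TinyEncoder(nn.Module):
    def __init__(self, dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, dim),
        )

    def forward(self, x):
        return self.net(x)

encoder = TinyEncoder()
predictor = nn.Sequential(nn.Linear(128, 128), nn.ReLU(), nn.Linear(128, 128))

context = torch.randn(8, 3, 64, 64)  # batch of "current" frames (random stand-in data)
future = torch.randn(8, 3, 64, 64)   # the frames the model should anticipate

z_context = encoder(context)
with torch.no_grad():                # target embeddings receive no gradient
    z_target = encoder(future)

loss = nn.functional.mse_loss(predictor(z_context), z_target)
loss.backward()                      # train encoder and predictor to agree in embedding space

The real system adds masking, a video backbone, and far larger scale, but predicting in latent space rather than pixel space is the design choice that lets it learn from unlabeled video.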
Meta's Chief AI Scientist, Yann LeCun, explained, "We believe world models will usher in a new era for robotics, enabling real-world AI agents to help with chores and physical tasks without needing astronomical amounts of robotic training data" [1].
The potential applications of V-JEPA 2 are vast and promising:
Autonomous Vehicles: The model could enhance the decision-making capabilities of self-driving cars, allowing them to better interpret their surroundings and make real-time decisions [4][5] (see the planning sketch after this list).
Everyday Tasks: Meta showcased examples where the AI could predict likely next actions, such as using a spatula to move eggs from a stove to a plate [1].
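The "understand, predict, plan" loop behind these applications can be sketched in a few lines. The toy example below (hypothetical names and a made-up dynamics function, not Meta's API) uses random-shooting model-predictive control: roll a learned latent dynamics model forward over candidate action sequences, score each by its distance to a goal embedding, and execute only the best first action before replanning.

import torch

def plan(dynamics, z_now, z_goal, action_dim=4, horizon=10, candidates=256):
    # sample candidate action sequences: (candidates, horizon, action_dim)
    actions = torch.randn(candidates, horizon, action_dim)
    z = z_now.expand(candidates, -1)
    for t in range(horizon):
        z = dynamics(z, actions[:, t])       # roll the world model forward one step
    costs = (z - z_goal).pow(2).sum(dim=-1)  # distance to the goal in latent space
    return actions[costs.argmin(), 0]        # best sequence's first action; then replan

# toy stand-in for a learned dynamics model, for illustration only
W = torch.randn(4, 32)
def dynamics(z, a):
    return z + 0.1 * torch.tanh(a @ W)

first_action = plan(dynamics, z_now=torch.zeros(1, 32), z_goal=torch.ones(1, 32))

Because the rollout happens inside the model rather than on hardware, an agent can evaluate hundreds of candidate futures per decision, which is the data-efficiency argument Meta makes for world models.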
Meta claims that V-JEPA 2 is 30 times faster than Nvidia's Cosmos model, which also aims to enhance intelligence related to the physical world, though the two companies may be measuring against different benchmarks [1]. This efficiency is crucial for real-world applications because it simplifies training and reduces the need for extensive, costly training data [2][3].
Meta sees V-JEPA 2 as a stepping stone towards more advanced AI systems. Future developments may include:
Training models capable of learning, reasoning, and planning across different time and space scales [3].
Developing multimodal models that incorporate other senses like audio and touch, in addition to vision [3].
Advancing towards what Meta calls "advanced machine intelligence" (AMI), which aims to create AI systems that can adapt and learn like humans [5].
As the AI landscape continues to evolve, Meta's investment in this technology underscores its commitment to remaining competitive in the field. With plans to invest $14 billion into AI firm Scale AI and potentially hire its CEO Alexandr Wang, Meta is positioning itself at the forefront of AI innovation [4].
The release of V-JEPA 2 as an open-source model also reflects Meta's aim to accelerate research and progress in the field, potentially leading to AI systems that can significantly enhance people's lives [3].