Curated by THEOUTPOST
On Sun, 3 Nov, 12:01 AM UTC
5 Sources
[1]
Meta's Embodied AI Can Now Feel Your Touch
Meta's three new research artefacts bring touch, dexterity and interaction to robots.

Earlier this year, when Figure CEO Brett Adcock referred to 2024 as the year of embodied AI, few could have predicted the extraordinary advancements in robotics that have pushed the boundaries of what once seemed implausible. Last week, an unexpected contender released research updates that take robotics to a new level. Meta's Fundamental AI Research (FAIR) team released three research artefacts that advance touch perception, robot dexterity and human-robot interaction: Meta Sparsh, Meta Digit 360 and Meta Digit Plexus.

Meta Sparsh, derived from the Sanskrit word for 'touch', is the first general-purpose encoder for vision-based tactile sensing. The technology aims to give robots a sense of touch, addressing a crucial modality for interacting with the world. Sparsh works across various types of vision-based tactile sensors and tasks by using self-supervised learning, eliminating the need for labelled data. It consists of a family of models pre-trained on an extensive dataset of over 460,000 tactile images.

Meta has also released Meta Digit 360, a tactile fingertip featuring human-level multimodal sensing capabilities. Next up is Meta Digit Plexus, a platform paired with Digit 360 that integrates various tactile sensors into a single robotic hand.

Meta FAIR said in its blog that these new artefacts "advance robotics and support Meta's goal of reaching advanced machine intelligence (AMI)". AMI, also referred to as autonomous machine intelligence, is a concept championed by Meta's chief AI scientist Yann LeCun. It envisions machines that assist people in their daily lives; LeCun has proposed a future where systems can understand cause and effect and model the physical world. Interestingly, Meta's advancements in robotics are intended to help the whole ecosystem of builders develop machines that understand the world.

However, robotics is not new for the company. Meta's robotics work has largely revolved around the metaverse and AR/VR headsets that leverage AI. Two years ago, at the 'Meta AI: Inside the Lab' event, the company highlighted AI and robotics developments central to creating the metaverse, largely to enable immersive virtual experiences. Features such as Builder Bot and the Universal Speech Translator aim to enrich the metaverse experience. Notably, Caitlin Kalinowski, the former Meta hardware lead who headed development of the Orion augmented reality glasses, recently joined OpenAI to lead robotics and consumer hardware.

With the recent announcements on tactile sensing, Meta seems to be taking robotics to the next level. Furthermore, by open-sourcing these new models, Meta is continuing its run of enabling individuals and companies to build within the open-source community. In the process, it is also taking on NVIDIA.

The graphics processing unit (GPU) giant has also been making significant strides in robotics. NVIDIA Omniverse and digital twins have been powering several domains, including automotive, semiconductors and healthcare. NVIDIA's Project GR00T, released earlier this year, is a foundation model that aids the development of humanoid robots. Just last week, the company released two robotics updates.
NVIDIA, along with researchers from the University of California, Berkeley, Carnegie Mellon University and other universities, released HOVER (Humanoid Versatile Controller), a 1.5-million-parameter neural network for controlling the body of a humanoid robot. The model is said to improve efficiency and flexibility for humanoid applications.

To further accelerate robotics development, NVIDIA also released DexMimicGen, a large-scale synthetic data generator that allows humanoids to learn complex skills from very few human demonstrations. This effectively reduces the time required to train robots, given that real-world data collection is one of the biggest hurdles in humanoid development. "At NVIDIA, we believe the majority of high-quality tokens for robot foundation models will come from simulation," said Jim Fan, senior research manager and lead of embodied AI at NVIDIA.

With so many advancements, it is evident that more companies are taking an interest in robotics. OpenAI is seemingly positioning itself for the future with the recent appointment of a new leader from Meta. It won't be too surprising if, tomorrow, Meta or OpenAI releases a robot that embodies all the senses. Now that robots can hear, see, think, move and touch, the only thing left is smell. Considering that a company is already building technology to teleport smell, it wouldn't be a surprise if robots gained this capability in the future.
[2]
Losing Touch with the Metaverse, Meta Turns to Robotics
[3]
Meta unveils AI tools to give robots a human touch in physical world
Meta made several major announcements for robotics and embodied AI systems this week, including benchmarks and artifacts for better understanding and interacting with the physical world. Sparsh, Digit 360 and Digit Plexus, the three research artifacts released by Meta, focus on touch perception, robot dexterity and human-robot interaction. Meta is also releasing PARTNR, a new benchmark for evaluating planning and reasoning in human-robot collaboration.

The release comes as advances in foundation models have renewed interest in robotics, and AI companies are gradually expanding their race from the digital realm to the physical world. There is renewed hope in the industry that, with the help of foundation models such as large language models (LLMs) and vision-language models (VLMs), robots can accomplish more complex tasks that require reasoning and planning.

Tactile perception

Sparsh, which was created in collaboration with the University of Washington and Carnegie Mellon University, is a family of encoder models for vision-based tactile sensing. It is meant to provide robots with touch perception capabilities. Touch perception is crucial for robotics tasks, such as determining how much pressure can be applied to an object without damaging it.

The classic approach to incorporating vision-based tactile sensors into robot tasks is to use labeled data to train custom models that predict useful states. This approach does not generalize across different sensors and tasks. Meta describes Sparsh as a general-purpose model that can be applied to different types of vision-based tactile sensors and various tasks. To overcome the challenges faced by previous generations of touch perception models, the researchers trained Sparsh through self-supervised learning (SSL), which obviates the need for labeled data. The model has been trained on more than 460,000 tactile images consolidated from different datasets. According to the researchers' experiments, Sparsh achieves an average 95.1% improvement over task- and sensor-specific end-to-end models under a limited labeled-data budget. The researchers have created different versions of Sparsh based on various architectures, including Meta's I-JEPA and DINO models.

Touch sensors

In addition to leveraging existing data, Meta is also releasing hardware to collect rich tactile information from the physical world. Digit 360 is an artificial finger-shaped tactile sensor with more than 18 sensing features. The sensor has over 8 million taxels for capturing omnidirectional and granular deformations on the fingertip surface. Digit 360 captures various sensing modalities to provide a richer understanding of the environment and object interactions.

Digit 360 also has on-device AI models to reduce reliance on cloud-based servers. This enables it to process information locally and respond to touch with minimal latency, similar to the reflex arc in humans and animals. "Beyond advancing robot dexterity, this breakthrough sensor has significant potential applications from medicine and prosthetics to virtual reality and telepresence," Meta researchers write.

Meta is publicly releasing the code and designs for Digit 360 to stimulate community-driven research and innovation in touch perception. But as with its open-source model releases, Meta also has much to gain from the potential adoption of its hardware and models.
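To make the pretrain-then-adapt recipe above concrete, here is a minimal, hypothetical PyTorch sketch. The encoder below is only a stand-in for a Sparsh-style pretrained backbone (the released models are based on architectures such as DINO and I-JEPA); the class names, tensor shapes, checkpoint path and training loop are illustrative assumptions, not Meta's actual code or API.

```python
import torch
import torch.nn as nn

# Stand-in for a Sparsh-style tactile encoder (hypothetical).
# In practice this would be a large backbone pretrained with
# self-supervised learning on hundreds of thousands of unlabeled
# tactile images.
class TactileEncoder(nn.Module):
    def __init__(self, embed_dim: int = 256):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.proj = nn.Linear(64, embed_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.proj(self.features(x).flatten(1))

encoder = TactileEncoder()
# encoder.load_state_dict(torch.load("tactile_encoder.pt"))  # hypothetical pretrained weights
encoder.requires_grad_(False)   # freeze the pretrained backbone
encoder.eval()

# Small task head trained on a limited labeled budget, e.g. regressing
# applied normal force from a vision-based tactile frame.
probe = nn.Linear(256, 1)
optimizer = torch.optim.Adam(probe.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

images = torch.rand(16, 3, 64, 64)   # toy batch of tactile images
forces = torch.rand(16, 1)           # toy force labels (scarce in practice)

for step in range(100):
    with torch.no_grad():
        emb = encoder(images)        # reuse frozen tactile representations
    loss = loss_fn(probe(emb), forces)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

The point of the recipe is that the heavy encoder is pretrained once without labels and then frozen, so each downstream task only needs a small head and a modest amount of labeled data, which is the setting in which Meta reports the average 95.1% improvement.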
The researchers believe that the information captured by Digit 360 can help in the development of more realistic virtual environments, which could be important for Meta's metaverse projects in the future.

Meta is also releasing Digit Plexus, a hardware-software platform that aims to facilitate the development of robotic applications. Digit Plexus can integrate various fingertip and skin tactile sensors onto a single robot hand, encode the tactile data collected from the sensors, and transmit it to a host computer through a single cable. Meta is releasing the code and design of Digit Plexus to enable researchers to build on the platform and advance robot dexterity research.

Meta will manufacture Digit 360 in partnership with tactile sensor manufacturer GelSight Inc. It will also partner with South Korean robotics company Wonik Robotics to develop a fully integrated robotic hand with tactile sensors on the Digit Plexus platform.

Evaluating human-robot collaboration

Meta is also releasing Planning And Reasoning Tasks in humaN-Robot collaboration (PARTNR), a benchmark for evaluating the effectiveness of AI models when collaborating with humans on household tasks. PARTNR is built on top of Habitat, Meta's simulated environment. It includes 100,000 natural-language tasks across 60 houses and involves more than 5,800 unique objects. The benchmark is designed to evaluate the performance of LLMs and VLMs in following instructions from humans.

Meta's new benchmark joins a growing number of projects exploring the use of LLMs and VLMs in robotics and embodied AI settings. In the past year, these models have shown great promise as planning and reasoning modules for robots in complex tasks. Startups such as Figure and Covariant have developed prototypes that use foundation models for planning. At the same time, AI labs are working on better foundation models for robotics. An example is Google DeepMind's RT-X project, which brings together datasets from various robots to train a vision-language-action (VLA) model that generalizes to different robot morphologies and tasks.
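As a rough illustration of how a benchmark like PARTNR exercises a planner, here is a deliberately tiny, hypothetical Python harness: a planner maps an instruction and the current state to a sequence of actions, a simulator applies them, and success is checked against a goal predicate. The toy state, actions and scoring below are assumptions for illustration; the real benchmark runs in Meta's Habitat simulator with far richer scenes and tasks.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Task:
    instruction: str                              # e.g. "Put the mug in the sink"
    goal_predicate: Callable[[dict], bool]        # checks simulator state for success

def evaluate(planner: Callable[[str, dict], List[str]],
             step: Callable[[dict, str], dict],
             tasks: List[Task],
             max_steps: int = 50) -> float:
    """Return the fraction of tasks the planner completes."""
    successes = 0
    for task in tasks:
        state = {"holding": None, "mug_location": "table"}   # toy initial state
        actions = planner(task.instruction, state)[:max_steps]
        for action in actions:
            state = step(state, action)                      # simulator transition
        successes += task.goal_predicate(state)
    return successes / len(tasks)

# Toy simulator and scripted "planner" to show the control flow.
def toy_step(state: dict, action: str) -> dict:
    state = dict(state)
    if action == "pick mug":
        state["holding"] = "mug"
    elif action == "place sink" and state["holding"] == "mug":
        state["holding"], state["mug_location"] = None, "sink"
    return state

def toy_planner(instruction: str, state: dict) -> List[str]:
    # A real harness would query an LLM or VLM here to produce the plan.
    return ["pick mug", "place sink"]

tasks = [Task("Put the mug in the sink", lambda s: s["mug_location"] == "sink")]
print(f"success rate: {evaluate(toy_planner, toy_step, tasks):.2f}")
```

Swapping the scripted planner for a call to an LLM or VLM is exactly the step such a benchmark is designed to evaluate.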
[4]
Meta develops new technology that gives robots human-like touch
Meta's latest innovations -- Sparsh, Digit 360 and Digit Plexus -- focus on touch perception, tactile dexterity and human-robot interaction. Meta is betting big on the emerging field of embodied AI by incorporating the sense of touch into its robotic innovations. The technology giant is collaborating with US-based sensor company GelSight and South Korean robotics firm Wonik Robotics to commercialize tactile sensors for AI. The new devices are not designed for consumers; instead, they are aimed at scientists. Meta has released three research artifacts -- Sparsh, Digit 360 and Digit Plexus -- focusing on touch perception, robot dexterity and human-robot interaction. Additionally, the company is introducing PARTNR, a new benchmark for assessing planning and reasoning in human-robot collaboration.
[5]
What is embodied AI, and why is Meta betting on it?
In what many are calling the "Year of Embodied AI," Meta has taken a big step in advancing robotic capabilities through a suite of new technologies. Meta's Fundamental AI Research (FAIR) division recently introduced three research artifacts -- Meta Sparsh, Meta Digit 360 and Meta Digit Plexus -- each bringing advancements in touch perception, dexterity and human-robot collaboration.

Embodied AI refers to artificial intelligence systems that are designed to exist and operate within the physical world, understanding and interacting with their surroundings in ways that mimic human perception and action. Traditional AI systems excel at data analysis but fall short when applied to physical tasks, which require not only vision but also sensory feedback such as touch. By building embodied AI, researchers aim to create robots that can sense, respond and even adapt to their environment, bridging the gap between digital intelligence and real-world functionality.

Meta's innovations in embodied AI are aimed at achieving what its chief AI scientist Yann LeCun calls advanced machine intelligence (AMI). This concept envisions machines that are capable of reasoning about cause and effect, planning actions and adapting to changes in their environment, thereby moving from mere tools to collaborative assistants.

Meta's recent announcements underscore its commitment to tackling the limitations of current robotics technology. Let's explore the capabilities of each new tool.

Meta Sparsh, which means "touch" in Sanskrit, is a first-of-its-kind vision-based tactile sensing model that enables robots to "feel" surfaces and objects. Sparsh is a general-purpose encoder that relies on a dataset of over 460,000 tactile images to teach robots to recognize and interpret touch. Unlike traditional models that require task-specific training, Sparsh leverages self-supervised learning, allowing it to adapt to various tasks and sensors without needing extensive labeled data. This ability to generalize is key for robots that need to perform a wide range of tasks. Sparsh works across diverse tactile sensors, integrating seamlessly into different robotic configurations. By enabling robots to perceive touch, Sparsh opens up opportunities in areas where dexterous manipulation and tactile feedback are critical, such as medical applications, robotic surgery and precision manufacturing.

Digit 360 is Meta's new tactile fingertip sensor designed to replicate human touch. Equipped with more than 18 distinct sensing features, Digit 360 provides highly detailed tactile data that can capture minute changes in an object's surface, force and texture. Built with over 8 million "taxels" (tactile pixels), Digit 360 allows robots to detect forces as small as 1 millinewton, enhancing their ability to perform complex, nuanced tasks. This breakthrough in tactile sensing has practical applications across various fields. In healthcare, Digit 360 could be used in prosthetics to give patients a heightened sense of touch. In virtual reality, it could enhance immersive experiences by enabling users to "feel" objects in digital environments. Meta is partnering with GelSight Inc to commercialize Digit 360, aiming to make it accessible to the broader research community by next year.

Meta's third major release, Digit Plexus, is a standardized hardware-software platform designed to integrate various tactile sensors across a single robotic hand.
Digit Plexus combines fingertip and palm sensors, giving robots a more coordinated, human-like touch response system. This integration allows robots to process sensory feedback and make real-time adjustments during tasks, similar to how human hands operate. By standardizing touch feedback across the robotic hand, Digit Plexus enhances control and precision. Meta envisions applications for this platform in fields such as manufacturing and remote maintenance, where delicate handling of materials is essential. To help build an open-source robotics community, Meta is making the software and hardware designs for Digit Plexus freely available.

In addition to these technological advancements, Meta has entered partnerships to accelerate the adoption of tactile sensing in robotics. Collaborating with GelSight Inc and Wonik Robotics, Meta aims to bring its innovations to researchers and developers worldwide. GelSight Inc will handle the distribution of Digit 360, while Wonik Robotics will manufacture the Allegro Hand -- a robot hand integrated with Digit Plexus -- expected to launch next year. These partnerships are significant as they represent a shift towards democratizing robotic technology. By making these advanced tactile systems widely available, Meta is fostering a collaborative ecosystem that could yield new applications and improve the performance of robots across industries.

Meta is also introducing PARTNR (Planning And Reasoning Tasks in humaN-Robot collaboration), a benchmark designed to evaluate AI models on human-robot interactions in household settings. Built on the Habitat 3.0 simulator, PARTNR provides a realistic environment where robots can interact with humans through complex tasks, ranging from household chores to physical-world navigation. With over 100,000 language-based tasks, PARTNR offers a standardized way to test the effectiveness of AI systems in collaborative scenarios. This benchmark aims to drive research into robots that act as "partners" rather than mere tools, equipping them with the capacity to make decisions, anticipate human needs and provide assistance in everyday settings.
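The "many sensors, one link to the host" idea behind Digit Plexus can be sketched with a small, hypothetical framing scheme in Python: each sensor reading is wrapped in a length-prefixed packet, and the host demultiplexes the shared stream. The packet layout, field sizes and sensor IDs below are illustrative assumptions, not Meta's actual hardware protocol.

```python
import struct
import time

MAGIC = 0xD161                      # arbitrary frame marker for resynchronization
HEADER = struct.Struct("<HBdI")     # magic, sensor_id, timestamp, payload_len

def encode_frame(sensor_id: int, taxels: bytes) -> bytes:
    """Wrap one sensor reading in a length-prefixed frame."""
    return HEADER.pack(MAGIC, sensor_id, time.time(), len(taxels)) + taxels

def decode_stream(buffer: bytes):
    """Yield (sensor_id, timestamp, payload) tuples from a shared byte stream."""
    offset = 0
    while offset + HEADER.size <= len(buffer):
        magic, sensor_id, ts, length = HEADER.unpack_from(buffer, offset)
        if magic != MAGIC:
            raise ValueError("stream desynchronized")
        start = offset + HEADER.size
        yield sensor_id, ts, buffer[start:start + length]
        offset = start + length

# Simulate four fingertip sensors and one palm sensor sharing the same link.
stream = b"".join(
    encode_frame(sensor_id, bytes([sensor_id] * 8))   # fake 8-byte taxel payloads
    for sensor_id in range(5)
)
for sensor_id, ts, payload in decode_stream(stream):
    print(sensor_id, len(payload))
```

Length-prefixed frames with a sync word make it cheap for the host to recover sensor identity and timing even when many devices share one connection, which is the practical benefit of standardizing the interface across the whole hand.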
Meta unveils three new research artifacts - Sparsh, Digit 360, and Digit Plexus - advancing touch perception, robot dexterity, and human-robot interaction in the field of embodied AI.
Meta's Fundamental AI Research (FAIR) division has unveiled three groundbreaking research artifacts that promise to revolutionize the field of robotics and embodied AI. These innovations - Meta Sparsh, Meta Digit 360, and Meta Digit Plexus - focus on advancing touch perception, robot dexterity, and human-robot interaction [1][2].
Meta Sparsh, derived from the Sanskrit word for 'touch', is a pioneering general-purpose encoder for vision-based tactile sensing. This technology aims to integrate robots with a sense of touch, addressing a crucial modality for interacting with the world [1][2].
Key features of Sparsh include:
- A general-purpose encoder that works across different types of vision-based tactile sensors and tasks [1][3]
- Self-supervised learning, eliminating the need for labeled data [1][3]
- Pre-training on a dataset of more than 460,000 tactile images [1][3]
- An average 95.1% improvement over task- and sensor-specific end-to-end models under a limited labeled-data budget [3]
Meta Digit 360 is a tactile fingertip sensor that replicates human touch capabilities. Its advanced features include:
- More than 18 sensing features capturing multiple tactile modalities [3][5]
- Over 8 million taxels (tactile pixels) for detecting omnidirectional deformations on the fingertip surface [3][5]
- Sensitivity to forces as small as 1 millinewton [5]
- On-device AI models that process touch locally and respond with minimal latency, similar to a reflex arc (see the sketch after this list) [3]
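A minimal sketch of the on-device "reflex arc" idea, assuming a simplified taxel grid: incoming frames are compared against a baseline locally, and a reaction fires as soon as the deformation change crosses a threshold, with no round-trip to a host or cloud. The grid size, threshold and callback below are hypothetical; the real sensor has millions of taxels and dedicated on-device models.

```python
import numpy as np

TAXEL_SHAPE = (64, 64)        # toy grid; the real sensor has millions of taxels
CONTACT_THRESHOLD = 0.05      # normalized deformation change treated as contact

def on_contact(region: np.ndarray) -> None:
    # In a robot hand this might ease grip force or halt the motion.
    print(f"contact detected over {int(region.sum())} taxels")

def reflex_loop(frames, baseline: np.ndarray) -> None:
    """Compare each incoming frame to a baseline and react locally."""
    for frame in frames:
        delta = np.abs(frame - baseline)
        contact = delta > CONTACT_THRESHOLD
        if contact.any():
            on_contact(contact)
        baseline = 0.9 * baseline + 0.1 * frame   # slow drift compensation

# Simulate a few frames: a quiet sensor, then a light press in one corner.
rng = np.random.default_rng(0)
baseline = np.zeros(TAXEL_SHAPE)
quiet = [rng.normal(0, 0.005, TAXEL_SHAPE) for _ in range(3)]
press = rng.normal(0, 0.005, TAXEL_SHAPE)
press[:8, :8] += 0.2
reflex_loop(quiet + [press], baseline)
```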
Digit Plexus is a hardware-software platform that integrates various tactile sensors into a single robotic hand. This platform aims to facilitate the development of robotic applications by encoding and transmitting tactile data efficiently [1][3].
These innovations support Meta's goal of reaching advanced machine intelligence (AMI), a concept envisioned by Meta's chief AI scientist Yann LeCun. AMI aims to develop systems that can understand cause and effect, model the physical world, and assist people in their daily lives [1][2].
Meta is collaborating with GelSight Inc. to manufacture Digit 360 and with Wonik Robotics to develop a fully integrated robotic hand with tactile sensors on the Digit Plexus platform [3]. By open-sourcing these new models and designs, Meta continues its commitment to fostering growth in the open-source community [1][2].
In addition to hardware innovations, Meta is releasing PARTNR (Planning And Reasoning Tasks in humaN-Robot collaboration), a new benchmark for evaluating AI models in human-robot collaboration scenarios. Built on the Habitat simulator, PARTNR includes 100,000 natural language tasks in 60 houses, involving more than 5,800 unique objects [3][5].
These advancements signify a shift towards more sophisticated, touch-enabled robots capable of performing complex tasks. Potential applications span various fields, including:
- Healthcare, such as prosthetics and robotic surgery [3][5]
- Virtual reality and telepresence [3][5]
- Manufacturing and remote maintenance, where delicate handling of materials is essential [5]
- Household assistance through human-robot collaboration [3][5]
As embodied AI continues to evolve, Meta's innovations are poised to play a crucial role in bridging the gap between digital intelligence and real-world functionality, potentially reshaping how robots interact with and assist humans in the future.
References
[1] Meta's Embodied AI Can Now Feel Your Touch
[2] Losing Touch with the Metaverse, Meta Turns to Robotics
[3] Meta unveils AI tools to give robots a human touch in physical world
[4] Meta develops new technology that gives robots human-like touch
[5] What is embodied AI, and why is Meta betting on it?