2 Sources
[1]
From folding boxes to fixing vacuums, GEN-1 robotics model hits 99% reliability
Robotic machine learning company Generalist has announced GEN-1, a new physical AI system that it says "crosses into production-level success rates" on "a broad range of physical skills" that used to require the dexterity and muscle memory of human hands. Generalist is also touting the new model's ability to respond to disruptions by improvising new moves and "connect[ing] ideas from different places in order to solve new problems."

GEN-1 builds on Generalist's previous GEN-0 model, which the company touted in November as a proof of concept for the applicability of scaling laws in robotics training, showing how more pre-training data and compute time improve post-training performance (an illustrative sketch of that power-law relationship follows this article).

But while large language models have been able to effectively process trillions of words collectively written on the Internet as part of their training, robotic models don't have a similar, readily accessible source of quality data about how humans manipulate objects. To help solve this problem, Generalist has relied on "data hands," a set of wearable pincers that capture micro-movements and visual information as humans perform manual tasks. Generalist now claims it has collected over half a million hours and "petabytes of physical interaction data" to help train its physical model.

The result is an autonomous system that is precise enough to put money into a wallet and adaptable enough to fold laundry or sort auto parts. The model now reaches 99 percent success rates on repetitive but delicate mechanical tasks such as folding boxes, packing phones, and servicing robot vacuums, according to Generalist, and at roughly three times the speed of the previous GEN-0 model. GEN-1 can hit these marks after only about an hour spent adapting its pretraining to "robot data" that applies to its specific robotic embodiment, according to the company.

Recovering from mistakes

In the past, complex robotic systems have usually relied on carefully pre-programmed motions or been trained to focus exclusively on a single task with little variation. What sets GEN-1 apart, Generalist says, is a single model's ability to improvise based on its previous experience and respond to disruptions naturally, even when they are "well outside the training distribution."

In an interview with Forbes, for instance, Generalist engineers describe the model giving a plastic bag a little shake to get a plush toy to shimmy inside, even though such a move wasn't explicitly programmed in the training data. A video posted by Generalist also shows robot hands adjusting intelligently as flexible objects spring out of their expected positions, or refolding a shirt that gets moved in the middle of a folding task. Generalist also describes the model adjusting and regrasping small washers when they get nudged out of place, using both hands to insert them into their desired spot. "Nobody has programmed the robot to make mistakes, therefore nobody has programmed the robot to recover from mistakes," Generalist engineer Felix Wang says in that video. "And that just happens for free."

Generalist isn't the only company working to bring machine learning techniques into the physical realm. Last year, Google showed off the "vision-language-action" capabilities of its Gemini Robotics models, which can understand and respond to general action prompts from humans.
And Physical Intelligence has made waves with a pair of robotic hands on a wheeled platform, trained in specially designed simulated household environments to perform tasks from cleaning up spills to making beds. Then there's Tesla, which first rolled out its humanoid Optimus robots in late 2024 with staged demos that were actually teleoperated by remote human pilots. In January, Tesla CEO Elon Musk admitted that current Optimus robots are still not doing "useful work" at Tesla, despite previous claims to the contrary.

With GEN-1, though, Generalist says its physical models have reached a GPT-3-style inflection point, where some tasks are starting to "cross the level of performance needed to be deployed in economically useful settings" and where "we can expect each new generation of model to result in a new set of increasingly complex tasks that can be mastered." Color us hopeful that this means we're finally on the path to an affordable, at-home laundry-folding robot sometime in the near future.
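The scaling-law claim above echoes the power-law relationships reported for large language models: error rates fall roughly as a power of training data. As a purely illustrative sketch (the data points and fitted exponent below are invented for demonstration, not Generalist's figures), this is how such a curve is typically fit:

```python
import numpy as np

# Hypothetical illustration of a robotics scaling law:
# post-training failure rate ~ a * D**(-alpha), with D = hours of
# pre-training demonstration data. All numbers below are invented.
hours = np.array([1e3, 1e4, 1e5, 5e5])        # pre-training data, in hours
failure = np.array([0.30, 0.17, 0.09, 0.06])  # post-training failure rate

# A power law is a straight line in log-log space:
# log(failure) = log(a) - alpha * log(D)
slope, intercept = np.polyfit(np.log(hours), np.log(failure), 1)
print(f"fitted exponent alpha = {-slope:.2f}")  # positive alpha: more data, fewer failures
```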
[2]
Generalist releases GEN-1 highly capable robotic intelligence AI foundation model
Artificial intelligence startup Generalist AI Inc., which focuses on embodied robotics intelligence, has released GEN-1, a highly capable foundation model for robot learning and the mastery of physical tasks. The new model, which debuted Friday, arrives just five months after the company launched GEN-0, a new class of robotics foundation model trained directly on raw movement data.

According to the company, GEN-1 represents a tremendous leap forward in robotic intelligence. It allows machines to master tasks rapidly, learn from interactions, react quickly, and overcome challenges at rates never seen before. On multiple tasks, it has a success rate that exceeds 99%. It can also execute tasks almost three times faster than current state-of-the-art models and recover from interruptions faster.

The researchers said they worked on improving three core areas: reliability, speed, and improvisation. Although most models can already reliably repeat tasks in the real world, they are limited to task-specific, repetitive motions and sacrifice complexity by taking on simpler actions. GEN-1 is designed to manage longer step-by-step tasks, such as assembling items and folding multiple pieces of laundry, that require complex reasoning over time and space, without becoming confused.

Speed is also often an issue in robotics: robots tend to slow down when too many objects are in the field of vision or are moving too quickly. Part of the problem for most models is bringing what they see to the reasoning engine quickly enough, as vision is translated to language and then matched against training data. This often slows down motion, which can lead to gaps and stutters. As mentioned above, the team managed an almost three-times speedup, resulting in more fluid motion. For example, the model can assemble a box in around 12.1 seconds, which the company said is around 2.8 times faster than the closest state-of-the-art model in the industry. GEN-0 and pi-0, another well-known robotics intelligence model from Physical Intelligence, each took 34 seconds for an identical box.

These two innovations dovetail into the third result, which is the most important: the ability to recover from interruptions and learn from mistakes and changes in the environment. In human terms, this is improvisation. When something doesn't completely make sense, a part springs out of a hole, a box misses its mark, or a door doesn't latch, a human normally just goes back and completes the action. An AI could have numerous different reactions, including a forced reassessment, a pattern break, or failing to complete the task. It might not even remember how to react to the same event in the future. The human would.

The researchers say that GEN-1 can creatively react to these factors by rapidly adapting to "glitches" in the environment, such as objects slipping, latches failing, items deforming, or things not going as planned. It will approach things from different angles, adjust its thinking, and try different patterns until something works. A classic example is folding a shirt. Getting fabric to go exactly where you want it is not always an easy feat: it can flop around, curl, warp, and wrinkle. Sometimes the shirt will even flip inside-out. When these situations arise, the AI adapts quickly to fix the mess and handles it without creating a worse problem.
The researchers said the model plans and acts in ways that are less rigid than its training data. In more human terms: it thinks outside the box. Although the researchers at Generalist had glowing things to say about GEN-1, they added that not all tasks hit the 99% success rate. Some complex tasks couldn't quite reach that ambitious bar, especially at the speed and reliability needed to be useful in everyday settings (the sketch below illustrates why a per-step rate near 99% is the threshold that matters for long, multi-step tasks).
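A rough way to see why the 99% figure is the bar for those longer step-by-step tasks: with no recovery, per-step success compounds multiplicatively across a sequence. The 20-step horizon and rates below are assumptions for illustration, not Generalist's data:

```python
# Illustrative only: with independent steps and no recovery, a long task's
# overall success rate is the product of its per-step success rates.
def chain_success(per_step: float, steps: int) -> float:
    return per_step ** steps

for rate in (0.95, 0.99):
    overall = chain_success(rate, 20)  # hypothetical 20-step task
    print(f"per-step {rate:.0%} -> 20-step success {overall:.0%}")
# per-step 95% -> ~36%; per-step 99% -> ~82%
```

Under this simple model, 95% per step fails nearly two-thirds of the time on a 20-step task, while 99% keeps overall success above 80%. It also shows why GEN-1's mistake recovery compounds in value on long tasks: retries break the simple product model.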
Robotic AI startup Generalist unveiled GEN-1, a physical AI system reaching 99% success rates on delicate tasks like folding boxes and servicing vacuums. Trained on over 500,000 hours of human movement data, the model executes tasks three times faster than predecessors and improvises solutions when disrupted, marking what the company calls a GPT-3-style inflection point for embodied robotics intelligence.
Robotic machine learning company Generalist has released GEN-1, a physical AI system that achieves production-level success rates across a broad range of manual tasks requiring human-like dexterity [1]. The AI foundation model, announced Friday, reaches a 99% success rate on repetitive but delicate mechanical tasks such as folding boxes, packing phones, and servicing robot vacuums [1]. This marks a significant advance in embodied robotics intelligence, arriving just five months after the company launched its proof-of-concept GEN-0 model [2].
Source: Ars Technica
The robotics model executes tasks at roughly three times the speed of the previous GEN-0 model [1]. Generalist reports that GEN-1 can assemble a box in around 12.1 seconds, approximately 2.8 times faster than the closest state-of-the-art model in the industry [2]. Both GEN-0 and pi-0, another well-known robotics intelligence model from Physical Intelligence, took 34 seconds for an identical box [2]. The model achieves these marks after only about an hour spent adapting its pretraining to robot data that applies to its specific robotic embodiment [1].
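The two reported figures are mutually consistent, as a quick arithmetic check shows:

```python
# Consistency check of the reported speedup:
# 34 s per box (GEN-0 and pi-0) vs. 12.1 s per box (GEN-1).
print(f"{34 / 12.1:.2f}x faster")  # ~2.81x, matching "approximately 2.8 times faster"
```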
To rapidly master physical tasks, Generalist relied on "data hands," a set of wearable pincers that capture micro-movements and visual information as humans perform manual tasks [1]. The company has collected over 500,000 hours and petabytes of physical interaction data to train its physical model [1]. This addresses a fundamental challenge in robot learning: unlike large language models that process trillions of words from the Internet, robotic models lack a readily accessible source of quality training data about how humans manipulate objects [1].
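The hours and petabytes figures also hang together at the order-of-magnitude level. A back-of-envelope check, assuming roughly 1 MB/s of combined video and motion data per capture session (an assumption for illustration, not a figure from Generalist):

```python
# Back-of-envelope: 500,000 hours of capture at an assumed ~1 MB/s of
# combined video and motion data per recording session.
hours = 500_000
total_bytes = hours * 3600 * 1e6        # hours -> seconds -> bytes
print(f"~{total_bytes / 1e15:.1f} PB")  # ~1.8 PB, consistent with "petabytes"
```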
What distinguishes GEN-1 from traditional robotic systems is its ability to improvise based on previous experience and respond to disruptions naturally, even when they fall well outside the training distribution [1]. Generalist engineer Felix Wang explains that "nobody has programmed the robot to make mistakes, therefore nobody has programmed the robot to recover from mistakes. And that just happens for free" [1]. The model demonstrates thinking outside the box by giving a plastic bag a shake to get a plush toy to shimmy inside, even though such a move wasn't explicitly programmed [1]. Videos show robot hands adjusting intelligently as flexible objects spring out of expected positions or refolding a shirt that gets moved mid-task [1].
Source: SiliconANGLE
Generalist isn't alone in bringing machine learning techniques into the physical realm. Google showcased the vision-language-action capabilities of its Gemini Robotics models last year, which can understand and respond to general action prompts from humans [1]. Physical Intelligence has developed a pair of robotic hands on a wheeled platform, trained in specially designed simulated household environments to perform tasks from cleaning up spills to making beds [1]. Meanwhile, Tesla first rolled out its humanoid Optimus robots in late 2024 with staged demos that were actually teleoperated by remote human pilots [1]. In January, Tesla CEO Elon Musk admitted that current Optimus robots are still not doing useful work at Tesla, despite previous claims to the contrary [1].

Generalist claims GEN-1 has reached a GPT-3-style inflection point, where some tasks are starting to cross the level of performance needed to be deployed in economically useful settings [1]. The company expects each new generation of model to result in a new set of increasingly complex tasks that can be mastered [1]. However, researchers acknowledge that not all tasks hit the 99% success rate, with some complex tasks unable to reach that ambitious bar, especially at the speed and reliability needed for everyday settings [2]. For businesses and consumers watching the robotics space, the model's ability to handle longer step-by-step tasks like assembling items and folding multiple pieces of laundry without becoming confused suggests we may finally be approaching affordable, at-home automation for tedious manual tasks [1][2].
Summarized by Navi