Curated by THEOUTPOST
On Tue, 24 Sept, 4:04 PM UTC
3 Sources
[1]
I just tested Rabbit R1's next generation LAM -- is this what the company actually promised?
The Rabbit R1 has had quite the journey through highs and lows this year -- starting strong on a groundswell of hype at CES 2024, then launching to a wave of negative reviews (including ours) for not doing what it promised. On top of that, the key recurring issue in our testing was that many of the features we were told about didn't work as intended -- bugs aplenty. In fairness to the Rabbit team, they have worked hard in the background, delivering 16 over-the-air updates to add new features, fix bugs and speed up existing ones.

But the real heartbreaker at launch was that the promise of the Large Action Model (LAM) -- the agentic AI at the heart of the pitch -- never truly came to pass. As founder and CEO Jesse Lyu told me during a two-hour interview (I've got a lot to share this weekend), the original system was based on a small set of "recordings from people." That meant there were some things it could do, but the scope was far narrower than the big promise made at the beginning.

Well, Rabbit is back with a next-generation LAM -- launching in beta as LAM playground on October 1 -- and I got the chance to try it and see what it's like. I can't show you what it looks like, but I can talk about it. Rabbit calls this a Generic Website Agent: something capable of doing stuff for you, either through a text prompt within Rabbit Hole or (the one you were all waiting for) a natural-language request to your Rabbit R1.

It all works via the virtual machine-esque system Rabbit users are probably used to interacting with when logging into their respective accounts. From here, if you make a request for something on the internet -- Jesse's example was adding a 12-pack of Diet Coke to his Amazon shopping cart -- the LAM gets to work.
You can watch it happen in real time, as a vision model observes what's going on by taking screenshots, analyzing them, and directing a script to interact with the page based on its understanding of what's happening in the browser window. This includes closing cookie prompts, and it should make the agent resilient to UI changes in the future.

There is really only one big thing I liked about it, and that's that Rabbit has followed through on what it promised all those months ago. In the tasks I saw and threw at the LAM playground -- from visiting a property website and finding all the homes under £500,000 in South London, to playing a Wordle-esque game for three rounds -- you watch it fulfill your request in real time. Purchasing things, as claimed on stage at CES, wasn't demonstrated, but it is possible if you give the Rabbit R1 the right instructions and are logged into the retail site's account.

One big thing we did get an answer on was the agentic approach to travel -- giving you a plan and actually taking action to book the flights and experiences. "We didn't know that in the US, we require a travel agency license to be able to handle booking. That was our fault," Jesse quickly commented when I asked him about this feature. But now, with the LAM built so that you can see it working and directly interact with or interrupt what it's doing, that license is no longer needed.

Another big feature of this playground is that the LAM has been trained on a desktop OS too -- namely Linux. This is still very early days, but Lyu showed me a request to open the OS' equivalent of Microsoft Word and write a poem. With more development, it could take over and create the work you may not be bothered to do yourself, like building a presentation deck.
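Rabbit hasn't published implementation details, but the screenshot-driven loop described above -- observe the page, analyze it, issue one interaction -- can be sketched in miniature. Everything below is a hypothetical stand-in (the `PageState` class and the `decide_action` stub playing the vision model's role are invented for illustration); it shows only the control flow, not Rabbit's actual code.

```python
# Illustrative sketch of a screenshot-driven agent loop (hypothetical,
# not Rabbit's actual implementation). A stub "vision model" inspects
# the current page state and returns the next browser action.

from dataclasses import dataclass, field

@dataclass
class PageState:
    """Stand-in for a screenshot: what's visible in the browser."""
    url: str
    cookie_banner: bool = True
    cart: list = field(default_factory=list)

def decide_action(state: PageState, goal: str) -> dict:
    """Stub for the vision model: map observed state to one action."""
    if state.cookie_banner:
        return {"op": "click", "target": "accept_cookies"}
    if goal not in state.cart:
        return {"op": "add_to_cart", "item": goal}
    return {"op": "done"}

def execute(state: PageState, action: dict) -> None:
    """Stub for the script that drives the browser."""
    if action["op"] == "click" and action["target"] == "accept_cookies":
        state.cookie_banner = False          # banner dismissed
    elif action["op"] == "add_to_cart":
        state.cart.append(action["item"])    # item added to cart

def run_agent(goal: str, max_steps: int = 10) -> PageState:
    state = PageState(url="https://example-shop.test")
    for _ in range(max_steps):               # observe -> decide -> act
        action = decide_action(state, goal)  # "analyze the screenshot"
        if action["op"] == "done":
            break
        execute(state, action)
    return state

final = run_agent("12-pack Diet Coke")
print(final.cart)  # ['12-pack Diet Coke']
```

Because each step keys off what is currently observed rather than hard-coded selectors, a UI change only requires the perception step to re-recognize the elements -- which is the resilience property the article describes.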
Watching the LAM once again do this work in the background is a sign that Rabbit is moving in the right direction. The tool that has been built here, and shown working to me, is what was talked about all those months ago. It's been quite the catch-up effort, but it's the first sign of the AI agent I was looking for. As the old adage goes, better late than never.

There was clear transparency about the issues with this beta (emphasis on beta), because I stumbled across a bunch of problems while it ran. The first thing I noticed is that it is slow. On average, a new instruction is executed every 5-6 seconds, after a screenshot is taken and analyzed. And when I say instruction, I mean every single step -- opening a website, accepting the cookie prompt, clicking the text box, entering text, hitting enter. Each of these takes that long. Rabbit is aware there's a lot to work through here in terms of speed and latency.

Second, as you'd expect from any beta, there are bugs. For example, with that poem, the LAM opened a word processor document to type in, then the model hallucinated and gave us roughly four pages of garbled letters and numbers. Again, Lyu made it clear these things will happen, and the beta test exists specifically to find these bugs.

Finally, let's cast an eye on the future for a second. One obstacle every AI company is coming up against is whether the very companies its models interact with will actually agree to play ball. In Rabbit's case, the user-friendliness of a generic AI agent working across the entire web could be huge, but it's just as dependent on those websites allowing this bot to visit and navigate them. What happens if the big players say "no?" Lyu is aware of the task ahead of him in securing these permissions.
He talked about using Linux for the desktop-OS part of the LAM, and how it could work with Windows or macOS -- but that would require working out an extensive license agreement. As with OpenAI, I can imagine a scenario where money would have to change hands for permission to visit certain big-brand sites. It's an interesting mountain for Rabbit to climb -- not here just yet, but certainly on the horizon.

It's been a while since the launch in May, but from what I've heard and tested, I'm quietly confident that Rabbit is on the right track toward the LAM we all imagined after Jesse went big with the CES announcement. Will it wash the bad taste out of the mouths of those who felt jilted by the expectations-versus-reality of picking up the R1 without these claimed features? That remains to be seen. But taking an actions-over-words approach is definitely the smarter one, and I'm keen to see this speed up and grow into something cross-platform and really quite clever.

Rabbit is also bringing more natural-language interactions to the R1 (this went public most recently), and an alpha test of Teach Mode is currently running. From my time using it, Teach Mode feels like a slicker version of Apple's Shortcuts app -- relying on a natural-language prompt rather than manually entering each step of the process.

The biggest question Rabbit hasn't satisfactorily answered is whether a smarter version of its AI gadget makes sense in an age of powerful AI phones like the iPhone 16, Galaxy S24 and Pixel 9. Lyu told me his future vision expands beyond these phones by being cross-platform -- bringing agentic AI to fulfill any request regardless of the OS needed to do it, rather than being limited to Android or iOS. But does the regular user actually care about this?
That's the big one, and it's up for debate -- especially since, while Rabbit is first out of the gate with a beta, you will more likely than not see phones gain a similar feature set soon. There is a lot of work to do to reach the purported public release in roughly six months' time, according to Lyu. But given the team has shipped an OTA update every week to squash bugs and improve the experience, it may well be able to pull it off.
[2]
Rabbit's web-based 'large action model' agent arrives on r1 as early as this week | TechCrunch
The Rabbit r1 was the must-have gadget of early 2024, but the blush came off it pretty quickly when the company's expansive promises failed to materialize. CEO Jesse Lyu admits that "on day one, we set our expectations too high" -- but says an update coming to its devices this month will finally set the vaunted Large Action Model free on the web. While skeptics may (justifiably) see this as too little, too late, or another shifting of goalposts, Rabbit's aspiration of building a platform-agnostic agent for web and mobile apps still has fundamental -- if still largely theoretical -- value.

Speaking to TechCrunch, Lyu said the last six months have been a whirlwind of shipping, bug fixes, improved response times, and minor feature additions. But despite 16 over-the-air updates, the r1 remains fundamentally limited to interacting with an LLM or accessing one of seven specific services, like Uber and Spotify. "That was the first ever version of the LAM, trained on recordings collected from data laborers, but it isn't generic -- it only connects to those services," he said. Whether it really was what they call the LAM is pretty much academic at this point -- whatever the model was, it didn't provide the capabilities Rabbit detailed at its debut.

But Rabbit is now ready to release the first generic -- which is to say, not specific to any app or interface -- version of the LAM, which Lyu demonstrated for me. This version is a web-based agent that reasons out the steps needed to do any ordinary task, like buying tickets to a concert, registering a website, or even playing an online game. "Our goal is very clear: at the end of September, your r1 will suddenly do lots more things. It should support anything you can do on any website," said Lyu. Given a task, it first breaks that task down into steps, then starts executing them by analyzing what it sees on screen -- buttons, fields, images -- regardless of position or appearance.
Then it interacts with the appropriate element based on what it has learned in general about how websites work.

I asked it (through Lyu, who was operating it remotely) to register a new website for a film festival. Taking an action every few seconds, it searched for domain registrars on Google, picked one (a sponsored result, I think), put "film festival" in the domain box, and from the resulting list of options picked "filmfestival2023.com" for $14. Technically, I hadn't given it any constraints like "for 2025" or "horror festival."

Similarly, when Lyu asked it to search for and buy an r1, it quickly found its way to eBay, where dozens were on sale. Perhaps a good result for a user, but not for the founder of the company presenting to the press! He laughed it off and reran the prompt with the addition that it should buy only from the official website. The agent succeeded. Next, he had it play Dictionary.com's daily word game. It took a bit of prompt engineering (the model found an out: it could quickly finish by hitting "end game"), but it did it.

Whose browser does it use, though? A fresh, clean one in the cloud, Lyu said, but the team is working on local versions, like a Chrome extension, that would let it use your existing sessions so it wouldn't have to log into your services. To that end -- since users are understandably (and rightly) wary of giving any company full access to their credentials -- the agent isn't equipped with them. Lyu suggested that, in the future, a walled-off small language model holding your credentials could be privately invoked to perform logins. How this will work seems to be an open question, which is somewhat expected given the newness of the space.

The demo showed me a couple of things. First, if we give the company and its developers the benefit of the doubt that this isn't all some elaborate hoax (as some believe), it does appear to be a working, general-purpose web agent.
And that would be, if not a first in itself, certainly the first to be easily accessible to consumers. "There are companies doing verticals, for Excel or legal documents, but I believe this is one of the first general agents for consumers," said Lyu. "The idea is you can say anything that can be achieved through a website. We'll have the generic agent for websites first, then for apps."

Second, it showed that prompt engineering is still very much needed. How you phrase a request can easily be the difference between success and failure, and that's probably not something ordinary consumers will tolerate. Lyu cautioned that this is a "playground version," not final by any means, and that although it is a fully functioning general web agent, it can still be improved in many ways. For instance, he said, "the model is smart enough to do the planning, but isn't smart enough to skip steps." It won't "learn" that a user prefers not to buy their electronics on eBay, or that it should scroll down after searching to avoid the wall of sponsored results.

User data won't be harvested to improve the model... yet. Lyu attributed this to the fact that there's basically no evaluation method for a system like this, so it's difficult to say quantitatively whether improvements have been made. A "teach mode" is also coming, though, so you can show it how to do a specific type of task. Interestingly, the company is also working on a desktop agent that can interact with apps like word processors, music players and, of course, browsers. This is still in the early stages, but it's working. "You don't even need to input a destination, it just tries to use the computer. As long as there is an interface, it can control it."

Third, there is still no "killer app," or at least no obvious one. The agent is impressive, but I personally would have little use for it, since I unfortunately sit in front of a browser for eight hours a day anyway.
There are almost certainly some great applications, but none sprang to mind that makes the utility of a browser-based automaton as obvious as that of, say, a robot vacuum.

I raised the common objection to the entire Rabbit business model: essentially, that "this could be an app." Lyu has clearly heard this criticism many times and was confident in his answer. "If you do the math, it doesn't make sense," he said. "Yes, it's technically achievable, but you're going to piss off Apple and Google from day one. They will never let this be better than Siri or Gemini. Just like there's no way Apple Intelligence is going to control Google stuff better, or vice versa. And they take 30% of revenue! If at the beginning we'd just built an app, we'd never have this momentum."

The fundamental pitch Rabbit is making is that there can be a third-party AI or device that can access and operate all your other services from outside them, like you do. "A cross-platform, generic agent system," as Lyu called it. "We'll control every UI, and the website is a good start. Then we'll go to Windows, to macOS, to phones." Speaking of which: "We never said we'd never build a phone in the future." Isn't that antithetical to the original thesis of a smaller, simpler device? Maybe, maybe not.

In the meantime, the company is working on starting to fulfill the promises it made early this year. The new model should be available to any r1 owner sometime this week, when the OTA update goes out. Instructions on how to invoke it will arrive then as well. Lyu cautioned expectant users with his characteristic understatement. "We're setting the expectations right. It's not perfect," he said. "It's just the best the human race has achieved so far."
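The plan-then-execute behavior described in the demos above -- decompose the task into steps, then match each step to whatever elements appear on screen -- can be sketched as below. The planner, the keyword matcher, and the "screen" dictionary are all invented stand-ins for illustration, not anything Rabbit has published.

```python
# Hypothetical plan-then-execute sketch: break a task into steps,
# then resolve each step against on-screen elements. All stubs.

def plan(task: str) -> list[str]:
    """Stub planner: decompose a high-level task into ordered steps."""
    if "register" in task:
        return ["search for domain registrars",
                "open a registrar",
                "enter the desired name",
                "pick an available domain",
                "check out"]
    return [task]

def find_element(screen: dict, step: str) -> str:
    """Stub perception: pick the on-screen element matching a step."""
    for element, keywords in screen.items():
        if any(word in step for word in keywords):
            return element
    raise LookupError(f"no element matches step: {step!r}")

# A fake "screen": element name -> keywords the model associates with it
screen = {
    "search_box": ["search"],
    "registrar_link": ["open", "registrar"],
    "name_field": ["enter", "name"],
    "domain_option": ["pick", "domain"],
    "checkout_button": ["check out"],
}

task = "register a website for a film festival"
trace = [(step, find_element(screen, step)) for step in plan(task)]
for step, element in trace:
    print(f"{step:35s} -> {element}")
```

Notice that nothing in the stub plan encodes unstated preferences ("for 2025", "not eBay") -- the same gap the demo's missteps exposed: a literal planner does exactly what the prompt says and nothing more.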
[3]
Web-based Large Action Model (LAM) arrives on Rabbit R1: Will it deliver?
Rabbit, the company that gained recognition this year with its R1 gadget, is back with a vital upgrade. As of October 1, it is introducing a web-based Large Action Model (LAM) intended to increase the R1's capabilities by a noteworthy degree. With the R1's popularity diminishing because it failed to meet expectations, this news could either revitalize interest in the device or reinforce the skepticism. The major question is whether the new feature will actually deliver.

Earlier this year, the Rabbit R1 gained worldwide attention by promising to be a major AI helper, or even a "smartphone killer." Unfortunately, the enthusiasm quickly faded as users recognized the device's limited capabilities and below-average AI integration. Since launch, the R1 has received 16 updates, yet it still communicates with only a constrained set of services, leading to widespread dissatisfaction among early supporters. The R1's journey has been a challenging one, but Rabbit is not giving up on its mission to deliver a genuinely revolutionary device.

Now, Rabbit is making an ambitious move to deliver on its original promises with a Large Action Model (LAM) due for release on October 1. This AI-driven capability is designed to turn the R1 into a versatile tool that can handle a wide range of tasks across websites. Jesse Lyu of Rabbit has indicated that the web-based LAM will let the R1 buy tickets, register domains, and even play internet games by interacting with web interfaces directly. This development could be the key to revitalizing the R1's potential. But what, exactly, is a Large Action Model?
A Large Action Model (LAM) builds on the framework of a Large Language Model (LLM), such as OpenAI's GPT-4, by allowing it to take action based on user input. While LLMs are excellent at understanding and producing text, they cannot act. LAMs fill this gap by interpreting user instructions, reviewing multiple data sources, and running intricate tasks such as navigating websites, filling in forms, or working with software interfaces. This shift from passive language processing to active task execution is a notable advance in AI.

Rabbit's web-based LAM aims to make the R1 more engaging and practical. In a recent demonstration, Lyu showed the R1 securing a domain name for a film festival: searching for available options, choosing one, and finishing the purchase, all automatically, without human involvement. The demo emphasized the potential for LAMs to carry out procedures that would ordinarily demand several steps and repeated user input.

Yet the update faces various obstacles. Acknowledging current realities, Lyu noted that effective prompt engineering is still necessary for the R1 to produce correct outcomes. When asked to buy an R1 device online, the AI agent first went to eBay instead of the official website, showing that the model is still developing strategies to optimize its responses based on user preferences.

Despite the criticism, Rabbit remains steadfast in its mission to create a cross-platform AI agent that works without being tied to specific apps or devices. According to Lyu, the R1's unique selling point lies in its ability to engage with multiple digital environments, which could set it apart from other AI-driven devices on the market.
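The LLM-versus-LAM distinction drawn above -- language in, action out -- can be made concrete with a toy example. `fake_llm` and the form dictionary here are invented stand-ins; the point is only the extra interpret-and-execute layer a LAM adds on top of text generation.

```python
# Toy contrast between an LLM (text out) and a LAM (actions out).
# `fake_llm` is an invented stand-in for a real model's output.

import json

def fake_llm(prompt: str) -> str:
    """An LLM only produces text -- here, a JSON action plan."""
    return json.dumps({"action": "fill_form",
                       "fields": {"name": "Ada",
                                  "email": "ada@example.test"}})

def lam_execute(prompt: str, form: dict) -> dict:
    """A LAM interprets the instruction AND acts on an interface."""
    step = json.loads(fake_llm(prompt))   # interpret the model's text
    if step["action"] == "fill_form":     # act: mutate the "web form"
        form.update(step["fields"])
    return form

form = {"name": "", "email": ""}
result = lam_execute("fill in the signup form for Ada", form)
print(result)  # {'name': 'Ada', 'email': 'ada@example.test'}
```

The LLM's output alone changes nothing; it is the execution layer, applying the parsed plan to an interface, that turns text into the task completion the article describes.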
In the lead-up to the October 1 update, Rabbit hopes the web-based LAM will tackle many of the challenges that have plagued the R1 since its debut. Whether the update re-engages people with the device, or comes to be seen as a missed opportunity, remains uncertain. For now, the tech sector is keeping a close eye on Rabbit as it tries to realize its ambitious goals.
Rabbit's highly anticipated web-based Large Action Model (LAM) is set to debut on the R1 device this week. This update promises to advance AI assistants by enabling them to operate web interfaces directly, potentially changing how we interact with technology.
The tech world is abuzz with excitement as Rabbit prepares to roll out its web-based Large Action Model (LAM) for the R1 device as early as this week. This groundbreaking update promises to elevate the capabilities of AI assistants to unprecedented levels, potentially reshaping our interaction with technology [1].
LAM, or Large Action Model, is Rabbit's proprietary AI system designed to understand and interact with web interfaces. Unlike traditional language models, LAM is trained to comprehend the structure and functionality of websites and apps, enabling it to perform complex tasks across various platforms [2].
With the introduction of the web-based LAM, the R1 device is expected to handle a wide array of online tasks. From booking flights and making restaurant reservations to managing social media accounts, the AI assistant aims to simplify digital interactions for users. This advancement could reduce the need for multiple apps and streamline online activities [3].
Tom's Guide recently conducted tests on the R1's next-generation LAM, providing insights into its capabilities. The AI demonstrated proficiency in tasks such as searching for flights, finding nearby restaurants, and even assisting with social media management. However, the testing also revealed some limitations and areas for improvement [1].
The integration of the web-based LAM into the R1 device has the potential to significantly enhance the user experience. By eliminating the need to navigate multiple apps and websites, users could save time and reduce digital fatigue. This seamless interaction across various online platforms could mark a shift in how we approach daily digital tasks [2].
While the prospects of the web-based LAM are exciting, there are challenges to address. Privacy concerns, accuracy of task execution, and the ability to handle complex, multi-step processes are among the factors that will determine the success of this technology. Additionally, the adaptability of the LAM to evolving web interfaces and new online services will be crucial for its long-term viability [3].
The introduction of Rabbit's web-based LAM could have far-reaching implications for the tech industry. It may prompt other companies to develop similar technologies, potentially leading to a new generation of AI assistants capable of more sophisticated web interactions. This could, in turn, influence web design practices and app development strategies to better accommodate AI-driven interactions [2].
© 2025 TheOutpost.AI All rights reserved