5 Sources
[1]
Hugging Face releases a free Operator-like agentic AI tool | TechCrunch
A team at Hugging Face has released a freely available, cloud-hosted computer-using AI "agent." But be forewarned: it's quite sluggish and occasionally makes mistakes. Hugging Face's agent, called Open Computer Agent, is accessible via the web and can use a Linux virtual machine preloaded with several applications, including Firefox. Similar to OpenAI's Operator, you can prompt Open Computer Agent to complete a task -- say, "Use Google Maps to find the Hugging Face HQ in Paris" -- and sit back as the agent opens the necessary programs and figures out the required steps. Open Computer Agent can handle simple requests well enough. But more complicated ones, like searching for flights, tripped it up in TechCrunch's testing. Open Computer Agent also often runs into CAPTCHA tests that it's unable to solve. You'll also have to wait in a virtual queue to use Open Computer Agent -- a queue seconds to minutes long, depending on demand. Of course, the Hugging Face team's goal wasn't to build a state-of-the-art computer-using agent. Rather, they wanted to demonstrate that open AI models are becoming more capable -- and cheaper to run on cloud infrastructure. "As vision models become more capable, they become able to power complex agentic workflows," Aymeric Roucher, a member of the agents team at Hugging Face, wrote in a post on X. "[Some of these models] support built-in grounding, i.e. [the] ability to locate any element in an image by its coordinates, [and] thus [can] click any item [in a virtual machine]." While it's far from perfect, agentic technology is attracting increasing investment as enterprises look to adopt it to boost productivity. According to a recent KPMG survey, 65% of companies are experimenting with AI agents. Markets and Markets projects that the AI agent segment will grow from $7.84 billion in 2025 to $52.62 billion by 2030.
[2]
There's a new AI agent ready to browse the web and fill in forms without the need to touch your mouse
The agent and its open-source demo can see what's on screen, click buttons, fill out forms, and move step-by-step through tasks like a human. Hugging Face has introduced its own take on the growing number of semi-independent AI agents that can run online errands for people. The new and free (if limited) Open Computer Agent is like having a personal assistant living inside your web browser. Part of the company's ongoing "smolagents" initiative, the Open Computer Agent can engage with websites and apps like you would, handling an invisible mouse and keyboard to complete requests. The AI can open a browser, type things into forms, click buttons, and more. Ask it to find directions, and it'll go to Google Maps, enter the origin and destination, and show you the route like a dutiful digital chauffeur. You can try it yourself with the live demo. Fair warning, its popularity is causing some delays and errors due to a backlog. The Open Computer Agent is a different philosophy of an idea that has led to similar tools like OpenAI's Operator, Browser Use, Proxy 1.0, and Opera's Browser Operator. Like those tools, Hugging Face's AI agent is all about being an active participant instead of a passive source of information. Like Browser Use, Open Computer Agent is open-source, meaning anyone can see how it works and build on top of it, or at least tweak it for niche use cases. The agent is the start of something more flexible, not a finished product with a million legal disclaimers. That also means the demo is exactly that, a demonstration, not a polished package. It can get things wrong and require you to jump in for logins and CAPTCHA tests. Booking tickets, checking store hours, doing searches, looking up directions, and clicking through menus are all things a lot of people would like to be able to do with a single natural language prompt. It's one thing to ask ChatGPT how to find cheap flights. It's another to watch a tool go to a travel website, scroll through listings, and attempt to click "book now." It might be flawed and far from flashy, but Open Computer Agent represents an approach to AI that might become as common as the now ubiquitous AI image generators.
[3]
Hugging Face Launches Web-based AI Agent Similar to OpenAI's Operator
Hugging Face uses open technologies to build a computer-using agent. Hugging Face on Tuesday launched Open Computer Agent, a free, cloud-hosted AI agent that can operate a virtual computer using text instructions. The tool lets users remotely control a Linux-based machine through a browser. The agent makes use of smolagents, Qwen2-VL-72B (a vision language model), and E2B Desktop under the hood to power it up. As reported by TechCrunch, the agent is equipped with common applications like Firefox and responds to plain-English prompts, such as opening a website or searching for directions. While it manages basic commands effectively, tasks with more complexity often may not be its cup of tea yet. During early testing, slower response times, inconsistent performance, and problems with handling CAPTCHA were also observed, as per the report. Open Computer Agent is currently accessible to the public, though users may have to wait in a virtual queue to get a chance to see it in action. The report states that the experiment isn't aimed at delivering a flawless product. Instead, Hugging Face's goal is to show that open models are becoming increasingly competent and more affordable to run in the cloud. Aymeric Roucher, the project lead for building Agents at Hugging Face, took to X to announce the agent and shared an example to describe its capability, "I asked it how long the soldiers from Alexander had walked from their departure in Macedonia, all the way to India, when they decided they were too tired to continue. It turns out, they had walked quite a bit!" This can be described as an alternative to OpenAI Operator, even though it has some differences. Operator can interact with websites like a human, navigating them, filling out forms, and making purchases. Unlike API-driven automation tools, Operator uses visual processing, controlling a virtual mouse and typing within a browser.
[4]
Hugging Face unleashes a free AI to sort of run your computer
Hugging Face has released Open Computer Agent, a free, cloud-hosted AI tool that can perform tasks on a virtual Linux machine preloaded with applications like Firefox. The agent, accessible via the web, can be prompted to complete tasks such as navigating Google Maps or searching for information. Similar to OpenAI's Operator, Open Computer Agent works by opening necessary programs and figuring out the required steps to complete a task. However, in TechCrunch's testing, it struggled with more complicated requests like searching for flights and often encountered CAPTCHA tests it couldn't solve. Users may also experience wait times ranging from seconds to minutes due to a virtual queue, depending on demand. Aymeric Roucher, a member of Hugging Face's agents team, said the goal was to demonstrate that open AI models are becoming more capable and cheaper to run on cloud infrastructure. "As vision models become more capable, they become able to power complex agentic workflows," Roucher stated in a post on X. "[Some of these models] support built-in grounding, i.e. [the] ability to locate any element in an image by its coordinates, [and] thus [can] click any item [in a virtual machine]." Roucher noted that advancements in vision models, such as Qwen-VL, enable complex agentic workflows by supporting built-in grounding, allowing them to locate and interact with elements in an image. The release aims to showcase the potential of open AI models. The development of agentic technology is attracting significant investment as enterprises look to boost productivity. A KPMG survey found that 65% of companies are experimenting with AI agents, and Markets and Markets projects that the AI agent segment will grow from $7.84 billion in 2025 to $52.62 billion by 2030.
[5]
Hugging Face's Free-to-Use AI Agent Can Perform Computer-Based Tasks
It is a Computer Use agent similar to OpenAI's Operator. Hugging Face added an agentic workflow in vision models. Open Computer Agent is currently available as a demo. Hugging Face released a demo of an artificial intelligence (AI) agent on Tuesday that can complete various web-based tasks. Dubbed Open Computer Agent, the tool is available for free, and anyone can go to the website to access the agent. Since it can perform tasks on web browsers, it can access websites such as Google Search, Maps, and even ticket booking platforms to complete tasks autonomously. Notably, this AI agent is part of Hugging Face's smolagents library, which was introduced in January. In a post on X (formerly known as Twitter), Aymeric Roucher, Project Lead - Agents at Hugging Face, announced the release of Open Computer Agent. As the name suggests, it is an open-source computer-based agent that can autonomously complete a wide range of tasks. The agent has been provided with a Linux virtual machine and several apps, including the Mozilla Firefox web browser. Roucher explained that the AI agent is powered by Qwen2-VL-72B, a vision language AI model that can identify elements in an image by their coordinates. The model allows the agentic tool to analyse whatever is on the screen, take action, and then move to the next step. Notably, the agentic capability in the model is being added by Hugging Face's smolagents tool. It is a free-to-use AI agent, meaning anyone can go to the Open Computer Agent website and try out the tool. Users can ask the computer use agent to find directions to a location, and it will open the browser, go to Google Maps, type in the origin and destination location, and start navigation to show the result. Gadgets 360 staff members were able to test out the AI agent. While it functions as advertised, we found it to be quite slow in completing tasks. It also makes mistakes or fails to perform a task when the prompt is complex. Additionally, since it is a cloud-based free tool, there is a long queue, and visitors might have to wait several minutes before the agent starts working on the task.
Hugging Face has released Open Computer Agent, a free cloud-hosted AI tool that can perform various web-based tasks autonomously. This development showcases the growing capabilities of open AI models and their potential in agentic workflows.
Hugging Face, a prominent AI company, has launched a new tool called Open Computer Agent, a free cloud-hosted AI agent capable of performing various computer-based tasks [1]. This development marks a significant step in the evolution of AI agents and their ability to interact with computer systems autonomously.
Open Computer Agent operates on a Linux virtual machine preloaded with applications like Firefox. Users can prompt the agent to complete tasks such as navigating Google Maps or searching for information [2]. The agent can open necessary programs, type into forms, click buttons, and execute multi-step processes to accomplish given tasks.
The AI agent is powered by Qwen2-VL-72B, a vision language model that can identify elements in an image by their coordinates [5]. This capability allows the agent to analyze screen content, take appropriate actions, and proceed to the next step. The agentic functionality is implemented using Hugging Face's smolagents library, introduced in January 2025.
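The observe, decide, act cycle described above can be illustrated with a small Python sketch. This is not the smolagents API or Hugging Face's implementation: `Action`, `FakeDesktop`, `scripted_model`, and `run_agent` are hypothetical names invented for this example, the "screenshot" is a plain string rather than an image, and the coordinates are made up. A real agent would send an actual screenshot to a vision language model and act inside a sandboxed virtual machine.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class Action:
    """One step chosen by the model. All fields are illustrative."""
    kind: str                              # "click", "type", or "done"
    xy: Optional[Tuple[int, int]] = None   # screen coordinates for a click
    text: str = ""                         # text to type


@dataclass
class FakeDesktop:
    """Stand-in for the Linux virtual machine; it only records actions."""
    log: List[str] = field(default_factory=list)

    def screenshot(self) -> str:
        # A real agent would capture image bytes; this placeholder just
        # reports how many actions have been performed so far.
        return f"screen after {len(self.log)} actions"

    def click(self, x: int, y: int) -> None:
        self.log.append(f"click({x}, {y})")

    def type_text(self, text: str) -> None:
        self.log.append(f"type({text!r})")


def scripted_model(task: str, screen: str, step: int) -> Action:
    """Stand-in for the vision language model (assumed behavior).

    A grounded model would locate the target element in the screenshot
    and return its coordinates; here the steps are hard-coded.
    """
    script = [
        Action("click", xy=(640, 52)),   # pretend this is a search box
        Action("type", text=task),
        Action("done"),
    ]
    return script[min(step, len(script) - 1)]


def run_agent(
    task: str,
    desktop: FakeDesktop,
    model: Callable[[str, str, int], Action],
    max_steps: int = 10,
) -> List[str]:
    """Loop: look at the screen, ask the model for an action, perform it."""
    for step in range(max_steps):
        screen = desktop.screenshot()          # observe
        action = model(task, screen, step)     # decide
        if action.kind == "done":
            break
        if action.kind == "click" and action.xy is not None:
            desktop.click(*action.xy)          # act
        elif action.kind == "type":
            desktop.type_text(action.text)
    return desktop.log


if __name__ == "__main__":
    print(run_agent("Hugging Face HQ Paris", FakeDesktop(), scripted_model))
```

The `max_steps` cap is a design choice of this sketch: it keeps the loop bounded if the model never reports that the task is done.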
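While Open Computer Agent shows promise, it currently faces several limitations: it is slow to complete tasks, it makes mistakes on more complicated requests such as searching for flights, it often runs into CAPTCHA tests that it cannot solve, and users may have to wait in a virtual queue for seconds to minutes depending on demand [1][5].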
Open Computer Agent is similar to other AI agents like OpenAI's Operator, Browser Use, and Opera's Browser Operator [2]. However, it distinguishes itself by being open-source, allowing developers to examine its workings and potentially build upon or customize it for specific use cases.
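The release of Open Computer Agent reflects a growing trend in AI development: enterprises are investing in agentic technology to boost productivity. According to a KPMG survey, 65% of companies are experimenting with AI agents, and Markets and Markets projects that the AI agent segment will grow from $7.84 billion in 2025 to $52.62 billion by 2030 [1].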
As vision models continue to advance, they are expected to power increasingly complex agentic workflows, potentially revolutionizing how users interact with computers and online services.