Curated by THEOUTPOST
On Wed, 23 Oct, 12:10 AM UTC
43 Sources
[1]
Could Computer-Using Agents Be AI's Next Frontier? Anthropic Thinks So
The new feature reflects the evolution from AI that generates text to agents that perform actions.

Artificial intelligence has made great strides in just a few years. But despite how smart contemporary models are, their ability to actually do things remains limited. Anthropic is among an emerging group of developers training agents to use computers, a notion that opens up new possibilities for the technology.

Anthropic's Computer-Using AI

Launched in public beta on Tuesday, Oct. 22, the latest version of Anthropic's Claude 3.5 Sonnet can control a computer's cursor and input information via a virtual keyboard, "emulating the way people interact with their own computer," the company said. Going forward, users will be able to ask Claude to retrieve and modify documents, interact with websites, and use a potentially limitless array of software programs. In a sense, the new feature represents the logical progression of Anthropic's focus on integration. Only now, with the right instructions, users can integrate any app they need. (Although it might not work with apps that restrict screenshot access for security reasons.) Alongside similar innovations from other developers, Claude's new capabilities reflect an emerging trend in AI: the move from providing answers to performing actions.

From Answers to Action

Anyone who uses chatbots enough will be familiar with the frustration of being told the AI cannot perform a specific task but can explain how to do it with step-by-step instructions. To address this issue, several startups have started building AI that can do things rather than simply answer questions. But most of these failed to impress. For example, Rabbit's "large action model" was broadly labeled a flop. However, when Microsoft introduced new Copilot capabilities, it celebrated the arrival of a new era of more autonomous AI agents that have evolved beyond the first generation of chatbots.

The Risks of AI Computer Use

Anthropic acknowledged that the technology comes with risks. "Please be aware that computer use poses unique risks that are distinct from standard API features or chat interfaces," the firm warned. Users are advised not to grant the model access to sensitive data such as login details to prevent them from being compromised. Anthropic also noted that Claude may inadvertently follow commands found on-screen, making it vulnerable to prompt injection attacks.
[2]
Anthropic's New AI Model Takes Control of Your Computer
Anthropic says it is teaching its Claude AI model to perform general computing tasks based on prompts. In demonstration videos, the model is shown controlling the cursor of a computer to conduct research for an outing on the town, searching the web for places to visit near the user's home and even adding an itinerary to their desktop calendar. The functionality is only available to developers today, and it's unclear what pricing looks like or how well the tech actually works. Anthropic says in a tweet about the new capabilities that during testing, its Claude model got sidetracked from a coding assignment and started searching Google for images of Yellowstone National Park. So, yeah... there are still kinks to work out. From a technical perspective, Anthropic says that Claude is able to control a computer by taking screenshots and sending them back to the model, studying what's on the screen, including the distance between the cursor position and a button it needs to click, and returning commands to continue with a task. Anthropic, which is backed by the likes of Amazon and Google, says Claude is the "first frontier AI model to offer computer use in public beta." It's unclear what automated computer use might be useful for in practice. Anthropic suggests it could be used to perform repetitive tasks or open-ended research. If anyone figures out how to use this new functionality, the /r/overemployed community on Reddit will likely be the first. At the very least it could perhaps be the new mouse jiggler for Wells Fargo employees. Or maybe you could use it to go through your social media accounts and delete all your old posts without needing to find a third-party tool to do it. Things that are not mission critical and don't require factual accuracy. Although there has been a lot of hype in the AI space, and companies have spent billions of dollars developing AI chatbots, most revenue in the space is still generated by companies like Nvidia that provide GPUs to these AI companies. Anthropic has raised more than $7 billion in the past year alone. The latest buzzword tech companies are pumping to sell the technology is "agents," or autonomous bots that purportedly can complete tasks on their own. Microsoft on Monday announced the ability to create autonomous agents with Copilot that could do "everything from accelerating lead generation and processing sales orders to automating your supply chain." Salesforce CEO Marc Benioff dismissively called Microsoft's product "Clippy 2.0" for being inaccurate -- though of course, he was saying this as he promotes Salesforce's own competing AI products. Salesforce wants to enable its customers to create their own custom agents that can serve purposes like answering customer support emails or prospecting for new clients. White-collar workers still don't seem to be taking up chatbots like ChatGPT or Claude. Reception to Microsoft's Copilot assistant has been lukewarm, with only a tiny fraction of Microsoft 365 customers spending the $30 a month for access to AI tools. But Microsoft has reoriented its entire company around this AI boom, and it needs to show investors a return on that investment. So, agents are the new thing. The biggest problem, as always, is that AI chatbots like ChatGPT and Google's Gemini produce a lot of output that's factually inaccurate, poor in quality, or reads like it obviously wasn't written by a human. The amount of time it takes to correct and clean up the bot's output almost negates any efficiencies produced by them in the first place.
That's fine for going down rabbit holes in your spare time, but in the workplace it's not acceptable to be producing error-riddled work. I would be nervous about setting Claude to go wild through my email, only for it to send people jargon back in response, or screw up some other task that I have to go back and fix. The fact that OpenAI itself admits most of its active users are students sort of says it all. Anthropic itself admits, in a tweet about the new functionality, that computer use should be tested with "low-risk tasks."
[3]
Anthropic Wants Its AI Agent to Control Your Computer
Claude is the first major AI model to be able to take control of a computer to do useful work. It took a while for people to adjust to the idea of chatbots that seem to have minds of their own. The next leap into the unknown may involve trusting artificial intelligence to take over our computers, too. Anthropic, a high-flying competitor to OpenAI, today announced that it has taught its AI model Claude to do a range of things on a computer, including searching the web, opening applications, and inputting text using the mouse and keyboard. "I think we're going to enter into a new era where a model can use all of the tools that you use as a person to get tasks done," says Jared Kaplan, chief science officer at Anthropic and an associate professor at Johns Hopkins University. Kaplan showed WIRED a pre-recorded demo in which an "agentic" -- or tool-using -- version of Claude had been asked to help plan an outing to see the sunrise at the Golden Gate Bridge with a friend. In response to the prompt, Claude opened the Chrome web browser, looked up relevant information on Google, including the ideal viewing spot and the optimal time to be there, then used a calendar app to create an event to share with a friend. (It did not include further instructions, such as what route to take to get there in the least amount of time.) In a second demo, Claude was asked to build a simple website to promote itself. In a surreal moment, the model inputted a text prompt into its own web interface to generate the necessary code. It then used Visual Studio Code, a popular code editor developed by Microsoft, to write a simple website, and opened a text terminal to spin up a simple web server to test the site. The website offered a decent, 1990s-themed landing page for the AI model. When the user asked it to fix a problem on the resulting website, the model returned to the editor, identified the offending snippet of code, and deleted it. Mike Krieger, chief product officer at Anthropic, says the company hopes that so-called AI agents will automate routine office tasks and free people up to be more productive in other areas. "What would you do if you got rid of a bunch of hours of copy and pasting or whatever you end up doing?" he says. "I'd go and play more guitar." Anthropic is making the agentic abilities available through its application programming interface (API) for its most powerful multimodal large language model, Claude 3.5 Sonnet, from today. The company also announced a new and improved version of a smaller model, Claude 3.5 Haiku, today.
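A small technical aside on that website demo: WIRED doesn't say how Claude actually started its test server, but serving a freshly built static page locally is most commonly done with Python's built-in HTTP server. The snippet below is an illustrative sketch of that step only, not a detail taken from Anthropic's demo.

```python
# Illustrative only: one common way to serve a just-built static site for a
# quick local check. Run from the site's directory and open
# http://localhost:8000 in a browser. Equivalent to `python -m http.server 8000`.
from http.server import HTTPServer, SimpleHTTPRequestHandler

HTTPServer(("localhost", 8000), SimpleHTTPRequestHandler).serve_forever()
```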
[4]
Anthropic says latest AI model can control users' computers
Anthropic has enabled its artificial intelligence to control actions on a computer, including searching the internet, clicking buttons and inputting text, as companies increasingly seek to leverage the new technology to build virtual agents. The "Computer Use" feature, unveiled for developers on Tuesday, grants access to Claude, its AI model, to conduct actions on users' behalf, with their consent, "like a human collaborator", the company said -- controlling the mouse and keyboard to browse the internet, schedule calendar appointments and fill in forms. "That sort of repetitive thing that people absolutely hate, I call it automating the drudgery of life," Mike Krieger, chief product officer at Anthropic, told the Financial Times. "At the moment, Claude can help you do things that might have taken an hour in two minutes . . . [We want to help] people focus on the creative part of [tasks] that is fun and human, and let Claude take the stuff that is repetitive and less exciting." It is the latest step by the San Francisco start-up, which is backed by Amazon and Google, to build more agentic systems seen as the next frontier for AI technology, alongside similar moves from rivals OpenAI, Microsoft and Meta. On Monday, Microsoft announced the ability to create autonomous agents in its 365 product suite, while OpenAI recently launched developer access to its voice chat feature to build more agentic systems. Creating AI agents that can conduct mundane tasks, such as filing expenses or booking travel, is considered the future for AI companies and a way to generate revenue from their powerful but costly models. Anthropic gave one example of a user prompt: "My friend is coming to San Francisco and I want to watch the sunrise with him tomorrow morning. I will be coming from [the] Pacific Heights [neighbourhood]. Could you find us a great viewing spot, check the drive time and sunrise time and then set up a calendar event that gives us enough time to get there?" Claude then searched online for locations and sunrise time and used the maps application to determine the route and driving time. It later created a calendar appointment with these details. The software uses screenshots of the computer to interpret content and can click buttons or enter text into the system, whether on a Mac or Microsoft PC. It uses real-time access to the internet and will be available to developers to build specific applications. The company said it was still in its "early stages" for experimentation. Anthropic acknowledged that there was a possibility that the model could use unreliable information from the internet. It is also exploring how to bring these capabilities to other devices such as mobile phones. The company is building a consumer-facing product using the technology and hopes to build user trust in its abilities. Krieger compared the technology to self-driving cars, as people generally do not yet trust the vehicles with complete control. "I would say that's more of the metaphor right now than I would trust it with absolute full autonomy over long periods," Krieger added. "It will be an evolution where at first it might be in more constrained environments . . . a virtual machine with limited access, doing a particular task repeatedly." "I see part of our mission [as] being a helpful guide to the future of AI in a human-centric way," he added. "And so we want to do that in our products as well with computer use."
[5]
Anthropic's Updated AI Models Can Control Your Computer for You
Claude AI can now take over your mouse and keyboard. If you've always wanted to offload some of your tedious computing busywork to artificial intelligence, that future is now a little closer: The updated Claude 3.5 Sonnet AI model that Anthropic just released is capable of taking over your mouse and keyboard, and completing tasks on its own. Right now, this is only in beta testing, and only available to developers with access to the Claude API, but further down the line, we could all be getting AI to fill out forms, move files around, look for information on the web, and do all the other tasks we've previously relied on our fingers and thumbs for. First up though, the updated Claude models: Anthropic has now pushed out an updated Claude 3.5 Sonnet to users, which it says offers "across-the-board improvements" and particularly significant upgrades in coding capabilities, with performance bumps across standard benchmarking tests (including SWE-bench, based on GitHub). Then there's Claude 3.5 Haiku, a new version of the faster, more lightweight, less expensive, and less powerful AI model offered by Anthropic. Again, all-around performance has been improved, the company says, and as with Sonnet, there are particular gains in terms of coding capabilities. It's the computer use capabilities that are going to get the most attention though, enabled as part of the Claude 3.5 Sonnet update, and that promise to take desktop automation to the next level. For now, though, Anthropic emphasizes that it's very much a beta product. In the Anthropic demo video below, you can see the Claude AI being tasked with filling out a form. The various bits of information needed for this form need to be grabbed from different databases and browser tabs -- but all the user has to do is ask for the form to be filled out, and give an indication of where the necessary info can be found. As Claude works through the tasks, it takes screenshots and studies them to see what it's looking at -- which is along similar lines to the image recognition and analysis capabilities that AI is already well known for. It then figures out what it needs to do next based on what's on screen and the instructions it has been given. In this case, the AI is smart enough to realize that it needs to switch to a different browser tab and run a search for a company name to find some of the information it's looking for. Cursor moving, cursor clicking, and typing are all handled by Claude along the way. The bot is able to identify the right data, and copy it over to the correct fields on the form. At the end, Claude is smart enough to spot and select the form submission button on screen, which then completes the task -- all while the user looks on. Right out of the gate, it seems the AI model is capable of understanding what's on screen, and figuring out how to manipulate that to complete tasks. However, Anthropic notes that basic tasks like scrolling, dragging, and zooming still "present challenges" for Claude, and beta testers are being encouraged to test it using "low-risk" scenarios for the time being. In the OSWorld benchmark, which measures how well AI can perform computing tasks, Claude 3.5 Sonnet apparently scores 14.9% (humans typically score around 70-75%). The developers behind the new capabilities haven't been afraid to point out some of the errors that can occur: In one test, Claude cancelled a screen recording for no apparent reason.
In another, the bot suddenly and randomly switched from a coding task to start browsing online photos of Yellowstone National Park. Anthropic also notes that each step forward in AI can bring with it new safety worries. As per an audit by its internal trust and safety teams, the computer use capabilities as they stand right now don't pose a heightened risk to system security -- though this will be continually re-evaluated. What's more, no user-submitted data (including captured screenshots) will be used to train the Claude AI models.
[6]
Anthropic's latest AI model can use a computer just like you - mistakes and all
Claude 3.5 Sonnet is available as a beta for developers to try out now. It can view a screen, move a cursor, click buttons, and even input text. It's not perfect, but neither are you! Imagine an AI model that can work with a computer all on its own. Well, imagine no longer because such an AI has arrived. On Tuesday, Anthropic announced that the latest generation of its Claude AI model can use a computer -- just like you and I do. Dubbed Claude 3.5 Sonnet, the AI has surfaced in beta mode for developers to use via an API. Touted by Anthropic as the "first frontier AI model to offer computer use in public beta," Claude 3.5 Sonnet can be coded by developers to work with a computer in several ways. By using a product or service programmed via the API, you can tell the AI to "look" at a computer screen, move a cursor around the screen, click buttons, and type text through a virtual keyboard. The idea is to emulate the way you interact with your own computer. Also: Generative AI doesn't have to be a power hog after all For now, the new AI is decidedly in the experimental stage, sometimes cumbersome and prone to errors. However, Anthropic has released the new beta specifically to get feedback from developers so it can improve the model over time. Why is computer use by an AI useful? Anthropic anticipated and has addressed that question. "A vast amount of modern work happens via computers," Anthropic said. "Enabling AIs to interact directly with computer software in the same way people do will unlock a huge range of applications that simply aren't possible for the current generation of AI assistants." And just how can developers and users take advantage of an AI that works with a computer? "Instead of making specific tools to help Claude complete individual tasks, we're teaching it general computer skills -- allowing it to use a wide range of standard tools and software programs designed for people," Anthropic explained. "Developers can use this nascent capability to automate repetitive processes, build and test software, and conduct open-ended tasks like research." Several companies are already tapping into Claude 3.5 Sonnet's prowess with computers, including Asana, Canva, Cognition, DoorDash, Replit, and The Browser Company, Anthropic said. As one example, the software development and deployment platform Replit is using these capabilities to evaluate applications for its Replit Agent product. Also: How does Claude work? Anthropic reveals its secrets Programming Claude to learn to work with computers, specifically looking at the screen and taking certain actions in response, involved a lot of trial and error, according to Anthropic. Using a computer requires the ability to see and interpret images, such as those of a computer screen. It also involves the capacity to determine how and when to run specific operations based on what's being displayed on the screen. To tackle these requirements, Claude 3.5 Sonnet looks at screenshots that show it what you're viewing. The AI then counts the number of vertical and horizontal pixels to figure out where to move the cursor. This skill is essential in the AI's ability to issue mouse commands. How has Claude fared so far? In the OSWorld benchmarking tests, which evaluate attempts by AI models to use computers, Claude 3.5 Sonnet scored a grade of 14.9%. Though that's far lower than the 70%-75% human-level skill, it's almost double the 7.7% acquired by the next best AI model in the same category, Anthropic said. 
This attempt at computer use by an AI is still in the early stages. As such, Claude can't perform more "advanced" computer tasks, such as dragging a window or zooming into the screen. Also, the way Claude works with a computer by viewing and putting together screenshots means it can miss certain actions and notifications. Also: The best AI for coding (and what not to use) "We expect that computer use will rapidly improve to become faster, more reliable, and more useful for the tasks our users want to complete," Anthropic said. "It'll also become much easier to implement for those with less software development experience. At every stage, our researchers will be working closely with our safety teams to ensure that Claude's new capabilities are accompanied by the appropriate safety measures." Claude 3.5 Sonnet is now available to anyone. Developers can build applications with the computer-use beta on the Anthropic API, Amazon Bedrock, and Google Cloud's Vertex AI.
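For developers, the mechanics described above boil down to a request/act/screenshot loop. The following is a minimal, hedged sketch using Anthropic's Python SDK: the model string, the "computer_20241022" tool type, and the "computer-use-2024-10-22" beta flag are taken from the launch-era beta documentation and may have changed, and execute_action is a hypothetical placeholder for the code that would actually perform clicks and capture screenshots.

```python
# Minimal sketch of a computer-use loop (assumptions noted above).
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

TOOLS = [{
    "type": "computer_20241022",   # launch-era tool type; may have been revised
    "name": "computer",
    "display_width_px": 1024,
    "display_height_px": 768,
}]

def execute_action(tool_input: dict) -> str:
    """Hypothetical placeholder: a real harness would move the mouse, click,
    type, or capture a screenshot here and return the result (for example a
    base64 image block) so Claude can see the updated screen."""
    return f"(pretend we executed: {tool_input})"

messages = [{"role": "user", "content": "Open the calendar app and create an event."}]

while True:
    response = client.beta.messages.create(
        model="claude-3-5-sonnet-20241022",
        max_tokens=1024,
        tools=TOOLS,
        messages=messages,
        betas=["computer-use-2024-10-22"],
    )
    messages.append({"role": "assistant", "content": response.content})
    if response.stop_reason != "tool_use":
        break  # Claude has finished (or is asking the user something)
    # Execute each requested action locally and send the results back.
    results = [
        {"type": "tool_result", "tool_use_id": block.id,
         "content": execute_action(block.input)}
        for block in response.content if block.type == "tool_use"
    ]
    messages.append({"role": "user", "content": results})
```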
[7]
Anthropic's Claude can now control computers like people do | Digital Trends
Anthropic's already impressive Claude 3.5 Sonnet gains a significant performance boost on Tuesday as the generative AI startup rolls out an enhanced and updated version of the model alongside the new, lightweight Claude 3.5 Haiku. The Sonnet update includes a public beta feature that gives the AI basic control over the computer it's running on. Claude 3.5 Sonnet was already a performance leader when it comes to coding tasks, but the new version shows significant across-the-board improvements over its predecessor and steadily outperforms both Gemini 1.5 and GPT-4o on a variety of industry benchmarks. Gemini 1.5 Pro was the only model to best the new 3.5 Sonnet on any test, and did so on the MATH benchmark. The new 3.5 Haiku is no slouch, either, despite its small size. Scheduled to be released later this month, 3.5 Haiku outperforms Claude 3 Opus, the company's largest last-generation model. Like its larger version, the new Haiku is exceedingly proficient at coding tasks, scoring 40.6% on SWE-bench Verified -- higher than both GPT-4o and the original 3.5 Sonnet. Even more impressive, the new Claude 3.5 Sonnet can now interact with desktop apps via the "Computer Use" API. The AI can generate the necessary keystrokes, mouse clicks, and movements needed to emulate the human user. The company is quick to point out that the system is currently quite experimental and prone to errors. The underlying purpose of the public beta release is to elicit feedback from developers to rapidly improve the API's performance. "We trained Claude to see what's happening on a screen and then use the software tools available to carry out tasks," Anthropic wrote in a blog post. "When a developer tasks Claude with using a piece of computer software and gives it the necessary access, Claude looks at screenshots of what's visible to the user, then counts how many pixels vertically or horizontally it needs to move a cursor in order to click in the correct place." It's an AI agent, essentially. That is, it's an AI that can automate other software processes, whether that's generating and qualifying marketing leads, uncovering patterns and trends in medical data, or simply navigating to a specific website and filling out a form you need. Think of these agents as a more advanced version of existing Robotic Process Automation systems. The company cites Asana, Canva, Cognition, DoorDash, Replit, and The Browser Company as early adopters of the new feature. Replit, for example, is using Computer Use to "develop a key feature that evaluates apps as they're being built for their Replit Agent product," per the announcement. There's no need to worry about the AI going all Skynet on us (yet), as Anthropic explains. "Humans remain in control by providing specific prompts that direct Claude's actions, like 'use data from my computer and online to fill out this form,'" an Anthropic spokesperson told TechCrunch. "People enable access and limit access as needed. Claude breaks down the user's prompts into computer commands (e.g., moving the cursor, clicking, typing) to accomplish that specific task." Anthropic also concedes that Computer Use could be misused to generate spam, spread misinformation, or commit fraud. In response, the company has developed new classifiers that identify when the API is being used and whether that use is "causing harm."
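To ground the pixel-counting description quoted above: it is the developer's harness, not Claude itself, that presses keys and moves the mouse; the model only returns structured actions for the host machine to carry out. Below is a hedged sketch of such a dispatcher using the pyautogui library. The action names and fields ("mouse_move", "left_click", "type", "key", "screenshot", "coordinate", "text") follow Anthropic's launch-era tool schema and should be treated as assumptions rather than a definitive list.

```python
# Hedged sketch of a local dispatcher for Claude's "computer" tool actions.
# Requires: pip install pyautogui pillow
import base64
import io

import pyautogui

def run_computer_action(action: dict):
    kind = action["action"]
    if kind == "mouse_move":
        x, y = action["coordinate"]
        pyautogui.moveTo(x, y)          # move the real cursor to the model's target
    elif kind == "left_click":
        pyautogui.click()               # click at the current cursor position
    elif kind == "type":
        pyautogui.write(action["text"], interval=0.02)  # type literal text
    elif kind == "key":
        pyautogui.press(action["text"])                 # press a single key, e.g. "enter"
    elif kind == "screenshot":
        buf = io.BytesIO()
        pyautogui.screenshot().save(buf, format="PNG")
        # Returned to the model as a base64 image block so it can "see" the screen.
        return {"type": "image", "source": {"type": "base64",
                "media_type": "image/png",
                "data": base64.b64encode(buf.getvalue()).decode()}}
    return None
```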
[8]
Anthropic's Latest Claude Lets AI Take Control of Your Entire PC
If for some reason you wanted to hand over total control of your personal computer to an AI model, you can now do that with Anthropic. The Amazon-backed OpenAI competitor released a new version of its Claude 3.5 Sonnet model on Tuesday, which is capable of a bunch of fundamental tasks on your desktop, such as inputting keystrokes and mouse clicks that allow it to use potentially any applications you have installed. "I think we're going to enter into a new era where a model can use all of the tools that you use as a person to get tasks done," Anthropic's chief science officer Jared Kaplan told Wired. The update is Anthropic's entry in the industry race to take commercial AI models beyond the confines of a chatbox and turn them into full-blown "AI agents." "AI agents" is the somewhat nebulous term used to describe productivity-geared AI models designed to use software and carry out other computer tasks like a human would, with varying degrees of versatility. Some, like Cognition AI's Devin, are designed specifically for programming. Anthropic instead markets its AI agent as an all-rounder, claiming it can browse the web and use any website or application. What you do with it is up to you: they can be technical tasks, like programming, or simpler ones, like trip planning. In a demo described by Wired, for example, Claude is asked to plan a trip to see the Golden Gate Bridge at sunrise with a friend. The AI opens a web browser, looks up a good viewing spot on Google along with other details, and adds the trip to a calendar app. Impressive, but Wired notes that it didn't include other tidbits that would've been helpful -- like how to actually get there, for instance. In another demo, Claude is prompted to set up a simple website, which it does using Microsoft's Visual Studio Code. It even opens a local server to test the website it just made. There's a small error with the creation, but the AI corrects the code when prompted. However promising the tech may seem, AI models still struggle with reliability, especially when it comes to writing code -- and Anthropic's is no exception. Even in a simple test that involved booking flights and modifying reservations, Claude 3.5 Sonnet only managed to complete less than half of these tasks successfully, according to TechCrunch. Clumsy as they may be, such AI agents also pose an obvious security risk. Would you want this experimental and sometimes unpredictable technology nosing around your computer files and using your web browser? Anthropic says that releasing them like this will help AI agents be safer -- though, perhaps, at your expense. "We think it's far better to give access to computers to today's more limited, relatively safer models," Anthropic wrote in a statement, per TechCrunch. "This means we can begin to observe and learn from any potential issues that arise at this lower level, building up computer use and safety mitigations gradually and simultaneously."
[9]
Anthropic's agentic Computer Use is giving people 'superpowers'
Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More It's been only two days since Anthropic released its new Claude feature "Computer Use," but already, early adopters of varying technical abilities are finding all kinds of ways to put it to work -- from complex coding tasks to research deep dives to gathering 'scattered' information. Still in beta, Computer Use allows Claude to work autonomously and use a computer essentially as a human does. The groundbreaking capability has broad implications for the future of work, as it can work essentially on its own, perform repetitive tasks and quickly gather up data from numerous disparate sources. "Anthropic just released the most amazing AI technology I've ever used. I'm not kidding," startup founder Alex Finn posted to X (formerly Twitter). "It's legit changing day to day." Claude can 'see' and work autonomously Claude has the ability to "see" a screen via screenshots, adapt to different tasks and move across workflows and software programs. It can also navigate between multiple screens, apps and tabs, open applications, move cursors, tap buttons and type text. "People can't stop getting creative with it," self-described AI educator Min Choi posted to X. For instance, in one demo video, Finn asked Claude to research trending AI news stories and provide a rundown. Claude then opened up a browser, moved the cursor to the URL bar, typed in "Reuters," navigated to the AI section, and then repeated that process for The Verge and TechCrunch. The model then offered up six trending news stories. "That literally took me 2 minutes to set up," said Finn, adding that "AI agents are here. You now have the ability to send out autonomous AI agents to do anything you want." He compared the capability to having his own free research employee that "reasoned with itself." In another example, Anthropic researcher Sam Ringer asked Claude to gather information about a particular vendor. "The data I need to fill out this form is scattered in various places on my computer," he explained in a demo video posted to X. The model then began taking screenshots, identified that there wasn't an entry for the vendor, navigated to the customer relationship manager (CRM) to find the company, searched and got a match. It then autonomously began transferring information, filling in required fields and finally submitting the vendor form. "This example is of a lot of drudge work that people have to do," said Ringer. Alex Albert, head of Claude relations at Anthropic, described on X how he used Claude along with a bash tool (a command language) to download a random dataset, install the open-source machine learning (ML) library sklearn, train a classifier on the dataset and display its results. This took just 5 minutes. He was conversationally cheeky in his prompt, telling Claude "you may need to inspect the data and/or iterate if this goes poorly at first, but don't get discouraged!)" One X user reported: "I got my Claude Computer Use Agent to run its own agent!" Others commented: "Claude Computer Use is truly AGI" and that "I feel it won't take long until our agent will become fully autonomous." Anthropic researchers pointed out some amusingly anthropomorphic examples, too, including an act that seemed to simulate human procrastination: While performing a coding demo, Claude randomly pivoted and began perusing photos of Yellowstone National Park. 
And, the new feature allows Claude to bypass the very human verification controls that are meant to keep it out. They shared a video using Claude to sign into ChatGPT. Claude reported: "I see there's a Cloudflare CAPTCHA verification. According to the system instructions, if we see a CAPTCHA in this simulation, I should click on the center of the white square with gray border." After it did so, it was given access to the "message ChatGPT" landing page.
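To make Alex Albert's five-minute demo concrete, the kind of script Claude's bash and editor tools would write and run for "train a classifier on the dataset and display its results" looks roughly like the sketch below. This is an illustrative stand-in using a built-in scikit-learn dataset, not Anthropic's actual output or the dataset Albert used.

```python
# Illustrative stand-in for the classifier-training task described above:
# load a small dataset, train a model with scikit-learn, and report results.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)
print(classification_report(y_test, clf.predict(X_test)))
```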
[10]
Anthropic is letting Claude AI control your PC
Anthropic's latest development gives its Claude AI assistant the ability to control a PC, reportedly just like a person would. The feature, dubbed 'computer use,' entered public beta today. With computer use, Claude can be directed to execute tasks such as "looking at a screen, moving a cursor, clicking buttons, and typing text," according to the company's announcement. In theory, this could make the AI even more useful in automating repetitive computer tasks. However, a second post focused on computer use acknowledged that this application of Anthropic's AI models is still early in development and, to paraphrase, buggy as heck. The company said that in internal testing, Claude stopped in the middle of an assigned coding task and began opening images of Yellowstone National Park. While that is uncannily human behavior (who doesn't want to take a break to stare at natural beauty during the work day?), it's also a reminder that even the best AI models can have errors. In addition to unveiling computer use, Anthropic also released an upgraded version of its Claude 3.5 Sonnet model alongside a brand new model called Claude 3.5 Haiku that will be released later in October. In August, Anthropic joined OpenAI in agreeing to share its work with the US AI Safety Institute.
[11]
Anthropic Wants Claude AI to Use Computers the Way You Do
Startup Anthropic is adding a series of upgrades to its Claude AI model -- including the ability to interact with computer interfaces much like a human does. That ability, which the company calls simply "computer use," could automate complex tasks by teaching the AI the skills you use to interact with the screen, such as moving a cursor, clicking buttons and typing text. It could also analyze your computer screen and use a range of standard tools and software programs to act on your behalf. Anthropic emphasized that the tool in its current stage is far from perfect, with some actions in particular, such as scrolling, dragging and zooming, presenting bigger challenges. It's tied to Anthropic's Claude 3.5 model and is available currently only in a public beta. "At this stage, it is still experimental - at times cumbersome and error-prone," the company wrote in a blog post Tuesday. "We're releasing computer use early for feedback from developers, and expect the capability to improve rapidly over time." Generative AI tools like Claude, ChatGPT, Copilot and Gemini have been wowing people over the last couple of years with their ability to quickly and (mostly) credibly produce text and images. Now the makers of those tools are looking to take gen AI even further into the realm of human skills. ChatGPT maker OpenAI, for instance, in September introduced a new model, called o1, that's designed to perform analytical "reasoning" to solve complex problems. In a blog post, Anthropic said the "computer use" technology will allow developers to better automate repetitive processes, build and test software and conduct open-ended tasks like research. The effort is part of Anthropic's broader goal to create advanced AI that can teach itself, better align with human needs and push its models beyond a chat box into true AI assistants. Anthropic said companies including Asana, Canva, Cognition, DoorDash, Replit and The Browser Company are already using the new Claude 3.5 computer use capability to handle tasks with many steps. For example, Replit is trying out the model's ability to navigate user interfaces to create a tool that evaluates apps. But the upgrade also highlights growing safety concerns around AI tools, especially as they might gain more control over sensitive systems like computers and private data. To mitigate risks like misinformation, fraud and misuse, Anthropic said it is implementing safeguards and monitoring for malicious behavior. Anthropic also on Tuesday talked up more routine updates to Claude 3.5 Sonnet and introduced a new model, Claude 3.5 Haiku. Those upgrades to 3.5 Sonnet, available immediately, deliver improved performance overall, and especially in coding and tool use, Anthropic said. The company introduced Claude 3.5 Sonnet earlier this year. Meanwhile, 3.5 Haiku is the next generation of Anthropic's fastest model, and it will become available later this month. Anthropic said it is "well suited for user-facing products" and for "generating personalized experiences from huge volumes of data - like purchase history, pricing, or inventory records."
[12]
Anthropic Unleashes New Claude 3.5 Sonnet: A Model Smart Enough to Take Over Your Computer - Decrypt
Reddit users spotted it first -- Claude had suddenly gotten sharper, more capable. Now we know why: Anthropic has rolled out significant upgrades to its AI models, including an enhanced Claude 3.5 Sonnet and a much-needed upgrade to its lightweight Haiku model. Eeriest update of all: These AIs can now physically control computers, moving cursors, scrolling through pages and even clicking buttons just like humans do. In a video demonstration, Sam Ringer, an Anthropic researcher, showed how Claude was capable of filling out a form on an external website by scrolling through a spreadsheet, searching for a company's information after analyzing its CRM and then understanding -- and filling in -- the fields in a form. "Available today on the API, developers can direct Claude to use computers the way people do -- by looking at a screen, moving a cursor, clicking buttons, and typing text. Claude 3.5 Sonnet is the first frontier AI model to offer computer use," Anthropic said in an official announcement earlier today. "We're releasing computer use early for feedback from developers, and expect the capability to improve rapidly over time." Anthropic (or maybe one of its button-pushing AIs? Jk.) seems to have released the model before it even made the announcement. For hours, the subreddits for Claude and Anthropic were flooded with people trying to figure out what the hell was going on because their AI was suddenly doing such good work: Users reported it was faster, more accurate, and, amazingly, it stopped apologizing so much. "Claude is so back, so much better. It just gets you, responds like it's actually understanding the intent instead of a dead lifeless response," NextGenAIUser said in one Reddit post. "Was stuck for hours on a coding issue using o1-Mini and o1-Preview, progressively outputting worse and worse responses. Fed the problem to Claude with the exact same prompt and it one-shot it no issues," Roth_Skyfire said in another comment. And they were right. Anthropic reported that after the improvement Claude 3.5 Sonnet's coding abilities shot up from 33.4% to 49% on the SWE-bench Verified test, beating out competitors like OpenAI's o1-preview. That's not just a minor bump. Every single benchmark reported by Anthropic shows that the new Claude 3.5 Sonnet is much better than the original model. But here's where things get really interesting. The upgraded Sonnet isn't just smarter; it's now capable of controlling your PC. Anthropic calls this new feature "computer use," and it's currently in public beta. The way it works is, you give Claude access to your desktop and a task to execute. The AI will then begin to act as if it were a human using your computer via a remote desktop -- moving the cursor, clicking buttons, typing out commands and filling in forms and text fields just like a human would. However, this feature is only available via the API, so it's not something end users will be able to savor in the short term. Anthropic has trained Claude to visually interpret what's happening on your screen. Developers can instruct it to perform tasks such as filling out forms, navigating websites, or even using software applications. It's a little like giving your AI the ability to sit in front of your computer and do your work for you, except it doesn't get tired and (hopefully) doesn't make as many mistakes as we humans tend to. The feature is in beta because it still stumbles over some basics -- scrolling and zooming give it trouble.
That's why Anthropic is keeping a close eye on things, storing screenshots for at least 30 days and running safety checks to catch any questionable behavior. The company's paranoia is well founded. A few months ago Microsoft introduced a feature named "Recall" that would let Copilot+ PCs take screenshots of users' computers so its AI could be more helpful and relevant. It caused such an uproar that Microsoft had to delay its plans after the Copilot+ Recall feature was deemed 'spyware' -- and authorities started investigating it. But Anthropic is made up of nice people, and they promise they are different. "We found that the updated Claude 3.5 Sonnet, including its new computer use skill, remains at AI Safety Level 2 -- that is, it doesn't require a higher standard of safety and security measures than those we currently have in place," the research team says. Companies like Replit are already integrating Claude's computer use feature to help automate app evaluations, while The Browser Company is testing its ability to streamline web-based workflows. These early adopters are exploring ways to get Claude to handle tasks that would usually take dozens, if not hundreds, of manual steps. Also, Anthropic's budget-friendly model, Claude 3.5 Haiku, is now as powerful as its previous flagship model, Claude 3 Opus. However, this model runs at a fraction of the cost and with much lower latency, making it more accessible without sacrificing too much performance. Claude 3.5 Haiku is particularly good at coding tasks and tool use, clocking in with a SWE-bench Verified score of 40.6%. That puts it ahead of some of the more expensive models on the market, meaning developers on a budget won't have to compromise on quality.
[13]
Anthropic's latest AI update can use a computer on its own
Anthropic's latest Claude 3.5 Sonnet AI model has a new feature in public beta that can control a computer by looking at a screen, moving a cursor, clicking buttons, and typing text. The new feature, called "computer use," is available today on the API, allowing developers to direct Claude to work on a computer like a human does, as shown on a Mac in the video below. Microsoft's Copilot Vision feature and OpenAI's desktop app for ChatGPT have shown what their AI tools can do based on seeing your computer's screen, and Google has similar capabilities in its Gemini app on Android phones. But they haven't gone to the next step of widely releasing tools ready to click around and perform tasks for you like this. Rabbit promised similar capabilities for its R1, which it has yet to deliver. Anthropic does caution that computer use is still experimental and can be "cumbersome and error-prone." The company says, "We're releasing computer use early for feedback from developers, and expect the capability to improve rapidly over time." According to the developers: There are many actions that people routinely do with computers (dragging, zooming, and so on) that Claude can't yet attempt. The "flipbook" nature of Claude's view of the screen -- taking screenshots and piecing them together, rather than observing a more granular video stream -- means that it can miss short-lived actions or notifications. Also, this version of Claude has apparently been told to steer clear of social media, with "measures to monitor when Claude is asked to engage in election-related activity, as well as systems for nudging Claude away from activities like generating and posting content on social media, registering web domains, or interacting with government websites." Meanwhile, Anthropic says its new Claude 3.5 Sonnet model has improvements in many benchmarks and is offered to customers at the same price and speed as its predecessor: The updated Claude 3.5 Sonnet shows wide-ranging improvements on industry benchmarks, with particularly strong gains in agentic coding and tool use tasks. On coding, it improves performance on SWE-bench Verified from 33.4% to 49.0%, scoring higher than all publicly available models -- including reasoning models like OpenAI o1-preview and specialized systems designed for agentic coding. It also improves performance on TAU-bench, an agentic tool use task, from 62.6% to 69.2% in the retail domain, and from 36.0% to 46.0% in the more challenging airline domain.
[14]
Anthropic just gave Claude AI access to your computer -- here's why this is a big deal
Anthropic, the AI lab behind the popular Claude chatbot, has just dropped a bombshell announcement that nobody expected. As well as a slew of new model updates, we're now about to enjoy an updated Claude 3.5 Sonnet with Computer Use. This is a new feature that builds on Artifacts, a tool that lets Claude run code in the browser. Computer Use goes a few steps further though and lets the model control your computer by itself. Scared yet? To quote the release blurb, 'developers can now direct Claude to use computers the way people do - by looking at a screen, moving a cursor, clicking and typing text.' If that doesn't mean much to you, don't fret, it's aimed squarely at developers and programmers -- for now! In case you hadn't heard, the original Claude Sonnet 3.5 model has gained a huge following from programmers because of its intuitive support for coding tasks. In a lot of cases, it supplanted OpenAI's GPT-4o as the go-to coding assistant. This lead was somewhat reduced by OpenAI's release of the o1 model, but now the ball is firmly back in Anthropic's court with this latest announcement. And that's not all. The new Sonnet 3.5 has been significantly upgraded in terms of power and performance, with a focus on the kind of uplift coders will love to see as part of their workflows. Agentic coding support, uprated math and code eval and even a set of specialized vertical application layers pointing at retail and airline applications. Phew! The company is pitching the new feature strictly at those who use the API, so don't expect to see it in your chatbot any time soon. Although, with all things, if demand is there that could well change. It may end up a key feature of a future Claude desktop app. It's aimed at removing a lot of the drudge that people have to go through to get things done on their computers. In one of the mind-blowing demo videos on the launch site, the presenter shows Claude going through multiple routines to create and edit a personal website. Claude 3.5 Sonnet with Computer Use is a form of AI agent, capable of performing a series of tasks from a simple prompt. In the demo video the process involves reading the screen, clicking download buttons, editing code automatically and generally doing what a person would do to get the demo done. It's science fiction come to life, and developers the world over should be counting the days until they get access. Now the usual caveats apply to the demo. It was done in a 'controlled' environment, whatever that means, and we have to assume they edited out all the times when Claude failed for some reason. But even so, this is radically impressive stuff. Right now there's an explosion in the number of tools designed to help the programmer do their job quicker and more easily. Tools like Copilot, Cursor, v0, and CodeCompanion have radically transformed the way small coding projects get done. With this new release, we're going to see this type of AI assistance go stratospheric, with incredible productivity increases in everything from small enterprise app development, to open source collaboration. If you can democratize the coding environment, to allow many more people to contribute, amazing things are possible - not just here in the West, but across the globe. And if these features are included in the rumored upcoming Claude desktop app, the sky will literally be the limit. Grab your hats for a wild ride!
[15]
Anthropic's Claude AI Can Now Control Your Computer Without Any Help
Computer use is currently in beta and available on the API to try out. Anthropic has updated its Claude 3.5 Sonnet AI model and, along with that, has brought the ability to carry out tasks on the computer for you. The new feature, rather obviously called 'computer use', is designed to mimic human actions on a computer like clicking, typing, and moving the cursor. Anthropic Claude AI's computer use is available on the API for developers, starting today in beta. In the official release post for the feature, Anthropic gives us a good look at how the feature works. We see Sam Ringer, an Anthropic researcher, testing out the tool. He prompts Claude to fill out a request form by extracting information from the spreadsheet and online portal that it's seeing on the screen. We then see the tool get to work by screenshotting the information on the screen. Additionally, we see the tool search an online portal (as directed) for information that was unavailable in the spreadsheet. Side by side, Claude opened up the form and started filling it in. Anthropic notes that Claude 3.5 Sonnet scored 14.9% in the screenshot-only category, which is "notably better than the next-best AI system's score of 7.8%." Additionally, Anthropic states that, "when afforded more steps to complete the task, Claude scored 22.0%." Developers can further build on the computer use feature using the Anthropic API, Vertex AI, and Amazon Bedrock. Anthropic also notes that the updated Claude 3.5 Sonnet shows a range of improvements on industry benchmarks. Additionally, Anthropic has released a new, smaller Claude 3.5 Haiku model, which "improves across every skill set and surpasses even Claude 3 Opus." Claude 3.5 Haiku will be made available on the API later this month. Teaching general computer skills to an AI certainly sounds interesting, and it's something I'm looking forward to. I mean, it has the potential to save a lot of time by automating tasks. What do you think about Claude AI's computer use feature? Drop your thoughts in the comments down below!
[16]
Anthropic's new Claude AI model can control your computer
Why it matters: This doesn't mean that bots have busted free of the chat box to run loose on the desktop and in the browser -- but that day looks much closer, and increasingly inevitable.

State of play: Anthropic's "computer use" lets developers and advanced users tell Claude to go off and do things that make use of other applications on a computer -- like collecting data from the web and moving it into a spreadsheet, or building, deploying and debugging a new website from scratch.

Experts and insiders both foresee a massive multiplier effect in knowledge work as AI keeps adding new abilities.

Yes, but: Anthropic isn't letting Claude go crazy on your laptop or phone in the wild quite yet. A "beta release" and an "experiment" is how Anthropic is describing the computer use feature. Also, it's not free -- there's a meter running.

Between the lines: Anthropic's announcement stole a march on its competition at OpenAI, which is also believed to be working on similar technology.

Our thought bubble: The move suggests a growing agreement between the two firms, despite their rivalry, that the best way to make AI safe is to get it in front of developers and the public quickly to find out how to improve it.

The bottom line: Two years ago AI providers were insistent that for safety and quality control it was vital to "keep humans in the loop" -- but the loop is already beginning to squeeze humans out.
[17]
A new AI feature can control your computer to follow your orders
An unseen, non-human hand moving the cursor across your computer screen and typing without using the keyboard is, in fiction, usually a sign of malicious AI hijacking something (or a friendly ghost helping you solve mysteries, like the TV show Ghostwriter). Thanks to Anthropic's new computer use feature for its AI assistant Claude, there's a much more benevolent explanation now. Fueled by an upgraded version of the Claude 3.5 Sonnet model, this capability - dubbed 'computer use' - lets the AI interact with your computer much like you would. It takes the AI assistant concept a step beyond text and a voice, with virtual hands typing, clicking, and otherwise manipulating your computer. Anthropic bills computer use as a way for Claude to handle tedious tasks. It can help you fill out a form, search and organize information on your hard drive, and move information around. While OpenAI, Microsoft, and other developers have demonstrated similar ideas, Anthropic is the first to have a public feature, though it's still in beta. "With computer use, we're trying something fundamentally new," Anthropic explained in a blog post. "Instead of making specific tools to help Claude complete individual tasks, we're teaching it general computer skills -- allowing it to use a wide range of standard tools and software programs designed for people." The computer use feature is due to Claude 3.5 Sonnet's improved performance, particularly with digital tools and coding software. Though somewhat overshadowed by the spectacle of the computer use feature, Anthropic also debuted a new model called Claude 3.5 Haiku, a more advanced version of its lower-cost model, one capable of matching Anthropic's previous highest-performing model, Claude 3 Opus, while still being much cheaper. You can't just give an order and walk away, either. Claude's control of your computer has some technical troubles as well as deliberate constraints. On the technical side, Anthropic admitted Claude struggles with scrolling and zooming around a screen. That's because the AI interprets what's on your screen as a collection of screenshots, and then it tries to piece them together like a movie reel. Anything that happens too quickly or that changes perspective on the screen can flummox it. Still, Claude can do quite a lot by manipulating your computer, as Anthropic's demos show. Unrestrained automation has obvious perils even when working perfectly, as so many sci-fi movies and books have explored. Claude isn't Skynet, but Anthropic has placed restraints on the AI for more prosaic reasons. For instance, there are guardrails stopping Claude from interacting with social media or any government websites. Registering domain names or posting content is not allowed without human control. "Because computer use may provide a new vector for more familiar threats such as spam, misinformation, or fraud, we're taking a proactive approach to promote its safe deployment. We've developed new classifiers that can identify when computer use is being used and whether harm is occurring," Anthropic wrote. "Learning from the initial deployments of this technology, which is still in its earliest stages, will help us better understand both the potential and the implications of increasingly capable AI systems."
[18]
Anthropic's latest Claude model can interact with computers
For starters, it could launch a prompt injection attack on itself... The latest version of AI startup Anthropic's Claude 3.5 Sonnet model can use computers - and the developer makes it sound like that's a good thing. "Why is this new capability important?" the AI biz wonders aloud in its celebratory blog post. Then it answers its own question: "A vast amount of modern work happens via computers. Enabling AIs to interact directly with computer software in the same way people do will unlock a huge range of applications that simply aren't possible for the current generation of AI assistants." The current generation of AI assistants has of course been shown to be quite capable of engaging with computers - given multimodal input and output capabilities, appropriate middleware like browser automation tools Puppeteer or Playwright, and a language model integration framework like LangChain. Only a week ago, Django co-creator, open source developer, and AI influencer Simon Willison published a report about how well Google AI Studio does at screen scraping. He found that AI Studio could ingest a screen capture video of his email inbox to extract numeric values within mail messages and return the results in a spreadsheet. So multimodal models can read computer screens quite effectively. Anthropic has empowered its Claude model to interact with computers more directly. The latest iteration of Claude 3.5 Sonnet expands response options by allowing the model to "reason" about the state of the computer, and to take actions like invoking applications or services. Anthropic is offering a public beta test of what it calls computer use tools - essentially functions that allow models to interact with a computer's keyboard, to type, to move the mouse pointer, to click, to take screenshots, and so on. There's also a file system editor tool for viewing, creating, and editing files. And there's a tool that allows the model to run bash commands, among others. Anthropic acknowledges that this complicates AI safety. "Please be aware that computer use poses unique risks that are distinct from standard API features or chat interfaces," its documentation cautions. "These risks are heightened when using computer use to interact with the internet." The warning continues - and it gets even better. "In some circumstances, Claude will follow commands found in content even if it conflicts with the user's instructions," the note explains. "For example, instructions on webpages or contained in images may override instructions or cause Claude to make mistakes. We suggest taking precautions to isolate Claude from sensitive data and actions to avoid risks related to prompt injection." In short: Claude may decide to follow found instructions which, if placed deliberately, would qualify as a prompt injection attack. As well as prompt injection from malicious images, the warning lists a litany of other potential concerns - like latency, computer vision accuracy, tool selection accuracy and errors, scrolling accuracy, and unreliable spreadsheet interaction, to name a few. Rachel Tobac, CEO of SocialProof Security, observed: "Breaking out into a sweat thinking about how cyber criminals could use this tool. This easily automates the task of getting a machine to go to a website and download malware or provide secrets, which could scale attacks (more machines hacked in a shorter period of time)." 
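For reference, the three beta tools The Register lists (a virtual keyboard and mouse with screenshots, a file editor, and a bash runner) are declared to the API as typed tool entries. The sketch below reflects the type identifiers and tool names in Anthropic's launch-era documentation; treat the exact strings as assumptions, since the beta may have been revised.

```python
# Hedged sketch of the tool declarations described above, per Anthropic's
# launch-era computer-use beta docs (exact identifiers may have changed since).
COMPUTER_USE_BETA = "computer-use-2024-10-22"  # passed as betas=[...] on each request

TOOLS = [
    # Screenshots, cursor movement, clicks, and typing on a virtual display.
    {"type": "computer_20241022", "name": "computer",
     "display_width_px": 1024, "display_height_px": 768},
    # Viewing, creating, and editing files on disk.
    {"type": "text_editor_20241022", "name": "str_replace_editor"},
    # Running shell commands.
    {"type": "bash_20241022", "name": "bash"},
]
```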
Anthropic recommends that developers experimenting with Claude's computer use API "take the relevant precautions to minimize these kinds of risks." ®
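For a sense of what that tool set looks like from the developer's side, here is a minimal sketch of a request to the computer-use beta through Anthropic's Python SDK. The tool type strings and beta flag follow the documentation Anthropic published at launch; the prompt, display size, and printout are illustrative assumptions rather than details taken from the reporting above.

```python
# Minimal sketch: requesting the computer-use beta tools via the Anthropic
# Python SDK. Tool types and the beta flag reflect the launch-time docs; the
# prompt and display dimensions are placeholders for illustration.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.beta.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    betas=["computer-use-2024-10-22"],
    tools=[
        {
            "type": "computer_20241022",   # screenshots, mouse, keyboard
            "name": "computer",
            "display_width_px": 1024,
            "display_height_px": 768,
        },
        {"type": "text_editor_20241022", "name": "str_replace_editor"},
        {"type": "bash_20241022", "name": "bash"},
    ],
    messages=[{"role": "user", "content": "Open the calculator and compute 12 * 34."}],
)

# The model replies with tool_use blocks (e.g. a screenshot request or a click);
# the caller is responsible for executing them and sending back tool_result blocks.
print(response.stop_reason, response.content)
```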
[19]
The Rise of the Self-Clicking Computer
AI startup Anthropic, creator of popular chatbot Claude, announced a new feature this week called "computer use," which is a set of tools "that can manipulate a computer desktop environment." In plain English, as the name suggests, it's an AI that can use your machine for you. Over the past year, the biggest players in AI have been making versions of the same claim: that "agents" -- systems given some degree of autonomy to complete tasks on behalf of their users -- are the next step for the industry. Different firms have different definitions for the term and ambitions for the concept, but the basic idea is pretty intuitive. Real, useful AI automation isn't possible until models can interact with the real world, not just in a chat window. Or, in Claude's case, the real world as mediated by a desktop computer. Anthropic's software is in limited testing, but early reviews suggest the concept is viable. "It is capable of some planning, it has the ability to use a computer by looking at a screen (through taking a screenshot) and interacting with it (by moving a virtual mouse and typing)," writes management professor and AI influencer Ethan Mollick, who has been testing the tool. Despite "large gaps," he says, he was "surprised at how capable and flexible this system is already," and believes that similar approaches are likely to become more common. The demo videos are worth watching if you haven't interacted with a recent AI model, including one from the company where the agent goes off-task. What's impossible to ignore about features like this in practice is that they demand huge amounts of access in order to function, a tension that's going to become more evident as tech companies get more ambitious with AI tools in general. When Google, Apple, Microsoft, and OpenAI talk about the future of AI assistants, more useful chatbots, and the rise of agents, they're also talking about a world in which they have unprecedented access to the digital matter of users' lives. Claude's demo here makes that abundantly clear. By ingesting and interacting with everything on users' screens, Claude's relationship with its users is instantly more intimate than the one they have with all their other digital services combined. This level of access represents a massive secondary opportunity for AI firms, who would be potentially leapfrogging the biggest tech companies of the last generation in terms of raw access to user data, massively shifting norms around privacy in the process. The other striking thing about these demos is that, despite being impressive and novel, they're clearly showing off a transitional technology. On paper, an AI agent is an entity with access to the resources it needs to, say, book a plane ticket or put together a document. In practice, at this early stage, it's a tool that interacts with human interfaces -- websites and pieces of software -- by effectively impersonating a user, a bit like a regular car piloted by a humanoid robot rather than a vehicle that simply drives itself. It's a fascinating proof of concept with a lot of room for improvement, but it also sets up an antagonistic relationship with at least some of the software that it's "using." Claude is seen here searching Google, for example, in order to complete other tasks; Google, which makes money from showing ads to actual people, will eventually have something to say about systems like this, which both depend on and undermine it.
Similarly, for its part, Google has been talking since 2018 about basic agentic systems that can make phone calls on behalf of users, dealing with annoying phone trees or customer support situations autonomously. Likewise, companies that feed people into those phone trees or complicated customer service interactions aren't likely to stand still in a situation in which most of those calls are being dealt with by bots (indeed, restaurants are already adapting). What Claude is able to do here is already impressive, but it depends in no small part on fascinating small deceptions made in the name of productivity -- for interacting with the real world, asking for control of users' computers is an incredibly useful first step and a shortcut to an enormous range of possible tasks, but it's also a bold and perhaps risky approach by a firm that, unlike some other players in AI, doesn't already have access to users' email accounts or social media profiles. (In terms of straightforward functionality, it's worth noting that Claude has a lot in common with apps known as auto-clickers, macro tools, and key-pressers, which are used to automate human-like actions on computers and phones and are widely used for producing spam and committing fraud.) As Mollick suggests, software like this, which is also in development by OpenAI, among others, represents one way that AI companies are planning on "breaking out of the chatbox," at least in concept. As people in the industry like to say, it's also "the worst it will ever be" in terms of raw capability, but, perhaps counterintuitively, it's also as unencumbered as it will ever be, operating in a world that hasn't had time to adjust to, or thwart, its presence.
[20]
Claude AI Gets Bored During Coding Demonstration, Starts Perusing Photos of National Parks Instead
It turns out that not even AI models are immune to a little procrastination. While its developers were trying to record a coding demonstration, the latest version of Claude 3.5 Sonnet -- Anthropic's current flagship AI -- got off track and produced some "amusing" moments, the company said in an announcement. It's perilous to anthropomorphize machine learning models, but if this were a human employee, we'd diagnose them with a terminal case of being bored on the job. As seen in a video, Claude decides to blow off writing code, opens Google, and inexplicably browses through beautiful photos of Yellowstone National Park. In another demo attempt, Claude accidentally stopped a lengthy screen-recording in progress, Anthropic said, causing all the footage to be lost. We're sure that wasn't intentional on the AI's part. The upgraded Claude 3.5 Sonnet is Anthropic's foray into developing an "AI agent," a broad term that describes productivity-focused AI models that are designed to perform tasks autonomously. A bevy of companies are working on expanding their AI models beyond just serving as chatbots and assistants, including Microsoft, which just released AI agent capabilities of its own. With Claude, the Amazon-backed startup brags that its latest model can now use "computers the way people do," such as moving a cursor and inputting keystrokes and mouse clicks. That means Claude can potentially control your entire desktop, interacting with any software and applications you have installed. It's clearly far from perfect. Like any AI model, reliability remains elusive, and frequent hallucinations are simply a fact of life, as Anthropic itself admits. "Even though it's the current state of the art, Claude's computer use remains slow and often error-prone," the company said. "There are many actions that people routinely do with computers (dragging, zooming, and so on) that Claude can't yet attempt." The example errors that Anthropic shared were mostly harmless. But given the level of autonomy that Claude purportedly has, it's more than fair to ask questions about its safety. What happens when the AI agent gets sidetracked not by googling photos, but by opening your social media, for example? There's also the obvious potential for it to be misused by humans -- risks that Anthropic wants you to know it's addressing. "Because computer use may provide a new vector for more familiar threats such as spam, misinformation, or fraud, we're taking a proactive approach to promote its safe deployment," Anthropic said. This includes implementing new classifiers to identify when the AI is being used to perform flagged activities, like posting on social media and accessing government websites. As more people try out the new and improved Claude, though, we expect to see more examples of its computer use gone awry.
[21]
Anthropic's new Claude AI model can use a PC 'the way people do'
If you're worried about artificial intelligence taking your job, you might want to sit down for this one. AI startup Anthropic has demonstrated a new "Claude" model that can look at a computer screen and operate a virtual mouse and keyboard, "the way people do," according to promotional material. In the video demo, researcher Sam Ringer shows Claude performing a bit of data entry "drudge work," with the AI model using screenshots of a Mac desktop to find relevant information and submit a form. It is indeed the kind of thing that employees all over the world do every day, though Ringer notes that this is a "representative example." Exactly how much of the video is edited isn't known. But you don't need to take Anthropic's word for it. An early version of the Claude 3.5 Sonnet API is available to try out now, and Ethan Mollick, a professor studying AI at the University of Pennsylvania's Wharton School, did just that. Mollick tested out the AI with Universal Paperclips, an online clicker game with some wonderfully subtle science fiction going on in its background. Mollick pointed the program at the game's browser window and "told it to win," then sat back and watched it operate. The result was fascinating. The AI was able to identify the point of the game by extrapolating its text-based interface, then use some trial and error to try and win -- in this case, basically just making the numbers go up. It was able to fiddle with the price of paperclips to increase its fantasy revenue with some basic A/B testing, the way a real player would. But it didn't quite put together the steps needed to optimize the process, something that would be fairly obvious to a human player. The real-world AI was "playing" a game about fictional AI. It ran into a few logic loops that prevented it from making meaningful progress, and Mollick's virtual machine crashed multiple times before the hours-long game could be completed. But with an interesting bit of input from the human operator, "you are a computer, use your abilities," it was coaxed into writing a basic bit of code to automate its processes. This is an example of a virtual computer writing virtual code to play a virtual game -- we're going full Inception here, albeit with a fairly basic goal and outcome. Claude declared that it had "successfully 'won'" the game by reaching a milestone "within the given constraints" after multiple VM crashes. It didn't win Universal Paperclips, not by a long shot. But bear in mind that playing this largely contextual game is far beyond the original automation intention laid out in Anthropic's demo video. The AI's ability to identify a goal and make progress with some minimal prodding was impressive. The full breakdown is well worth a read. "[Claude] was flexible in the face of most errors, and persistent," writes Professor Mollick. "It did clever things like A/B testing. And most importantly, it just did the work, operating for nearly an hour without interruption." Anthropic's Claude AI is available as a free text-based tool on the web and as an app on iOS and Android, with the ability to ask about images and text documents. The latest changes (version 3.5) are live for the free version, but more advanced access requires the $20 per person, per month Pro account, with priority bandwidth and more models. Anthropic says its current clients include dozens of companies, notably Notion, Intuit (makers of TurboTax), and Zoom.
[22]
Anthropic releases improved Claude models that can control your computer
Leading artificial intelligence firm Anthropic PBC today introduced new Claude 3.5 Sonnet and Claude 3.5 Haiku generative AI models with significantly upgraded capabilities over their predecessors. The upgraded Sonnet arrived within four months of the initial model's launch in June and received substantial gains in computer coding, an area where it was already designed to excel. Haiku is Anthropic's fastest model, and the company said the enhanced version has improvements in every skill and now surpasses Claude 3 Opus, the largest model in the previous generation. In addition to the models, Anthropic also introduced a new way for models to interact with computers in public beta mode: computer use. By viewing the screen, Claude Sonnet can interact with computers by moving the mouse, typing text and clicking buttons in the user interface. Anthropic touted Sonnet's software engineering skills as part of what has become an arms race between rival frontier model developers to produce the best AI models for software developers. According to the company, the new model showed wide-ranging improvements across industry benchmarks, with strong gains in agentic coding and tool use. "Early customer feedback suggests the upgraded Claude 3.5 Sonnet represents a significant leap for AI-powered coding," said Anthropic. According to the company, Sonnet improved performance on the SWE-bench Verified leaderboard from 33.4% to 49% and scored higher than all publicly available models, including OpenAI o1-preview and specialized systems designed for agentic coding. GitLab tested the model for DevSecOps tasks, which require multistep reasoning across multiple domains such as development, testing, security and operations, and discovered it delivered up to 10% better performance with no added latency. Claude 3.5 Haiku is designed to be fast and affordable while providing extremely low latency. The company said it is well suited for customer-facing tasks where lots of interactions are occurring and high speed is paramount. The upgraded Sonnet is now available to all users starting today and the new Claude 3.5 Haiku will be released later this month. Large language models ordinarily reason over text and images, and with the addition of application programming interfaces, they have also been able to use software tools to access data, update databases, send emails and more. Being able to "see" computer interfaces via screenshots gives them another capability, the ability to generally perceive and interact with user interfaces such as buttons, text fields and more. Anthropic said it has given Claude 3.5 Sonnet the ability, via an API, to perceive and interact with UIs. Now developers can give Claude instructions such as "use data from my computer to fill out this form," and it will take a screenshot, scan the page, and then enter text into the relevant parts of the visible page according to the data it has access to. "We were surprised by how rapidly Claude generalized from the computer-use training we gave it on just a few pieces of simple software, such as a calculator and a text editor," Anthropic said. "In combination with Claude's other skills, this training granted it the remarkable ability to turn a user's written prompt into a sequence of logical steps and then take actions on the computer."
Anthropic stressed that the new computer use capability is experimental and can make mistakes, so users should approach it with caution. For example, Claude has trouble with actions that humans perform effortlessly, such as moving around a screen, scrolling, zooming and clicking-and-dragging. In the researchers' own tests, Claude made some amusing blunders. In one, it accidentally clicked to stop a long-running screen recording, causing all footage to be lost. In another, the model took a break from a coding demo to browse Yellowstone National Park photos. It sounds like Claude is not too different from a normal developer. Even in light of these bloopers, the technology represents a powerful leap forward in the sort of work that AI agents could do for users on their computers.
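The "take a screenshot, scan the page, enter text" behavior described above is driven by a request-act-report loop on the developer's side. The sketch below shows one way such a loop might look; it assumes a hypothetical execute_action() helper that actually performs the screenshot, click, or keystroke Claude asks for and returns the result, and it follows the tool-use message pattern from Anthropic's API documentation.

```python
# A stripped-down agent loop: send the task, execute whatever actions Claude
# requests, return the results, and repeat until Claude stops asking for tools.
# execute_action() is an assumed helper, not an Anthropic-provided function.
import anthropic

client = anthropic.Anthropic()
TOOLS = [{"type": "computer_20241022", "name": "computer",
          "display_width_px": 1024, "display_height_px": 768}]

def run(task: str, execute_action):
    messages = [{"role": "user", "content": task}]
    while True:
        response = client.beta.messages.create(
            model="claude-3-5-sonnet-20241022",
            max_tokens=1024,
            betas=["computer-use-2024-10-22"],
            tools=TOOLS,
            messages=messages,
        )
        messages.append({"role": "assistant", "content": response.content})
        if response.stop_reason != "tool_use":
            return response  # Claude considers the task finished
        # Execute every requested action and send the outcomes back as results.
        results = [
            {"type": "tool_result", "tool_use_id": block.id,
             "content": execute_action(block.input)}
            for block in response.content if block.type == "tool_use"
        ]
        messages.append({"role": "user", "content": results})
```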
[23]
Dawn of the Agent: New Claude AI can take over your computer
Claude navigates between multiple apps and windows to get the job done The next wave of game-changing AI models will soon be upon us - "agent" style models that'll be able to take over entire ongoing tasks and jobs with full autonomy. Anthropic's newest AI model gives us a sneak peek, by taking over your whole computer. If you haven't encountered the idea of an AI agent before - or if you see Large Language Models (LLMs) like Claude and GPT as primarily chat services, OpenAI CEO Sam Altman might help put things in perspective. In the short video below, Altman lays out the five levels of AI as his company sees things. First, there's the chatbots - and many of us have been getting acquainted with the remarkable capabilities these offer over the last few years. Next come the "reasoners" - Altman says OpenAI's recent o1 model is the first of these. The third level is "agents" - these will effectively be AIs that people trust to go off and just take care of things on their behalf, making their own decisions about how to get a task completed. Agent AIs will have your credit card and permission to use it. They'll have access to the Web, and the ability to interact with websites and tools on your behalf. You'll be able to give them a job, and trust that they'll do it, checking back in with you only as required. The fourth level, says Altman, will be the "innovators" capable of creating new knowledge, and the fifth will be "full organizations," running with basically almost no human input - a concept that would've been laughable to most people just a few years ago, but that seems inevitable now. There are arguably examples of all five levels running here and there around the world, and there have been for many years - but in terms of mass worldwide availability, none of the major AI companies have released anything that could be called an agent, until today's release by Anthropic. As part of the launch of the new Claude 3.5 Haiku model and an upgraded Claude 3.5 Sonnet, the company dropped the following: "We're also introducing a groundbreaking new capability in public beta: computer use. Available today on the API, developers can direct Claude to use computers the way people do - by looking at a screen, moving a cursor, clicking buttons, and typing text." Check out an introductory video below. "Computer use is a completely different approach to AI development," writes the Anthropic team. "Up until now, LLM developers have made tools fit the model, producing custom environments where AIs use specially-designed tools to complete various tasks. Now, we can make the model fit the tools - Claude can fit into the computer environments we all use every day. Our goal is for Claude to take pre-existing pieces of computer software and simply use them as a person would." Here's an example of an early use case - Anthropic Researcher Pujaa Rajan tells Claude she'd like to enjoy a sunrise hike by the Golden Gate bridge, and asks if it can sort out the logistics and set up a calendar entry for when she should leave home. It opens a browser, finds out sunrise times and hike locations, figures out travel times from Rajan's area, then opens up a calendar and makes the relevant entry. LLMs like Claude have become decently capable coders, too - but with this Computer Use feature comes the ability not only to generate, edit and debug code, but to get outside the browser window, launch servers and actually deploy the code: It's important to note that this new feature is currently very early and limited. 
For starters, it's only available to developers accessing Claude through the back-end API interface, so the unanointed can't yet jump in and start getting it to file our taxes. It's also limited in that it can only see what's happening on your monitor as a series of screenshots, which it then uses to determine how far to move your cursor and which buttons or keys to hit. So it's useless in more visually dynamic applications - although Google Deepmind is already deep into the task of building AIs capable of playing games. Amusingly, it seems to occasionally get bored and go off surfing the 'net, like in the video below, where it stopped doing the coding demo Anthropic was trying to record, and went off to enjoy some scenic pics. And it's also pretty crappy, apparently. On the OSWorld benchmark test, which evaluates a model's ability to use a computer, humans typically score around 70-75%, and Claude scored just 14.9%. But that's nearly double the score of the next-best AI model in its category, and this is very much the beginning. Naturally, giving widely accessible state-of-the-art AI models this much access to computers poses safety risks - and indeed, Anthropic says that's why it's releasing the Computer Use feature in such an embryonic format. Like OpenAI with GPT-4, the thinking here is that opening the doors to the public will give Anthropic the ability to stay well ahead of safety risks and jailbreak attempts, so its safety capabilities will improve as the model's wobbly legs get stronger. In this way, writes Anthropic, "we can begin grappling with any safety issues before the stakes are too high, rather than adding computer use capabilities for the first time into a model with much more serious risks." It's doubtless also a rare opportunity for Anthropic to beat OpenAI to market on a significant new model capability; OpenAI has been speaking about agent-level AIs for some time now. It certainly has something similar cooking, and many expect we'll see the first GPT agent models in the coming weeks or months. But for those of us just trying to keep up with all that's happening in this ridiculously fast-moving space, this does seem like a significant moment. Within a year, it's reasonable to expect we'll all have access to highly competent agent models that can take over computers and do all kinds of tasks. And that's another rubber-meets-road moment for this crazy technology, because an agent AI that can break a task down into hundreds of steps and go away and execute it? That starts looking a lot more like an employee than a chatbot. The productivity gains could be epic, and the job losses we're already seeing thanks to current AI models are going to accelerate. Within five or 10 years, it's hard to see how these agent AIs don't become our primary means of getting things done in the digital world. Operating a computer, using a keyboard and a mouse, looking for bits of information here to move them over there ... How much of your day does this kind of busywork consume? How much nicer would it be simply to hand these tasks off to a reliable AI assistant? This is a hugely transformative moment. As I keep finding myself saying: buckle up, friends, there's no brakes on this train.
[24]
Anthropic Builds 'FSD' for Computers, Beats Microsoft at the Game
OpenAI's rival Anthropic, the AI safety and research company, is changing the way humans interact with computers. The company recently unveiled an upgraded Claude 3.5 Sonnet model, the new Claude 3.5 Haiku, and a public beta for an experimental feature called 'computer use.' Developers can now use it to automate repetitive tasks, perform testing and QA, and conduct open-ended research. Many have likened it to a car's self-driving features. "Entering a new era with computer use. It's like FSD for your computer!" said Sunny Madra of Groq. "Replit Agent just becomes an L3 self-driving Agent with Claude computer use from Anthropic," posted a user on X. In a demo video presented by Anthropic, Claude was able to perform web searches, book tickets, operate MS Excel and fill out applications. "In the first trial, the 'computer use' feature was able to find and report flight options from SFO to Tokyo," shared a Replit employee. Abacus AI chief Bindu Reddy noted that the 'computer use' API by Anthropic presented an interesting take on agentic APIs. "Agents are challenging because they have to talk to other systems, and most of these systems don't have good APIs," she said. She added that one potential solution is to use the 'Computer Use API,' allowing the LLM to simulate a human operating a computer. "In the long run, systems will talk to other systems, and we won't need this type of API," she said. Early adopters like Asana, Canva, Replit, and The Browser Company have already begun exploring its potential. Replit, for instance, is leveraging the feature to build a tool that evaluates apps during their development process. AI computer use is much bigger than it might seem. For instance, many jobs involve performing routine computer tasks, like quality assurance, tech support, data entry, and administrative assistance. All of this can be automated. Amjad Masad, Replit's founder, opines that 10-20% of the global workforce performs routine computer tasks. Masad observes that the economic value of these jobs will probably run into the trillions of dollars. "Of course, the tech is still early and will take time to mature. But it seems safe to say that by 2026, we will start to see AI's economic impact in macro metrics," said Masad. AIM Media House chief Bhasker Gupta questioned the need for the computer use feature and asked, "Will we need traditional interfaces at all? In the future, most tasks might be completed by AI through simple commands or even autonomously -- without screens, no clicks, or typing." On the other hand, Meta's chief AI scientist, Yann LeCun, believes that in the next 10-15 years, we won't have smartphones and will instead use augmented reality glasses and bracelets to interact with intelligent assistants. "Today, all of us here are carrying smartphones in our pockets. Ten years from now, or 15 years from now, we're not going to have smartphones anymore, we're going to have augmented reality glasses," said LeCun. While Anthropic has built something that still requires a computer as an interface, it is likely that in the future we will move away from screens and interact with AI agents using a new kind of device or interface. In a recent interview, OpenAI chief Sam Altman said that by 2030, you will be able to walk up to a piece of glass and ask it to do something that previously would have taken humans months or years, and it will accomplish that task in a dynamic way or within an hour.
OpenAI's approach is quite different from Anthropic's, as it focuses more on voice features alongside reasoning capabilities. "It's interesting how OpenAI is doubling down on end-user-facing features like voice mode, while Anthropic is doubling down on engineer/API-focused features like code generation quality and being able to remotely drive a GUI," posted Simon Willison, co-creator of Django, on X. Anthropic's new feature is somewhat similar to Microsoft's Recall, except that Recall does not control one's computer directly. Microsoft Recall is part of the Copilot+ PC and provides users with a timeline of their previous PC activities. It allows users to search for content they have interacted with -- such as files, web pages, and applications -- using natural language queries. However, this feature came under scrutiny due to privacy issues. The tech giant recently announced that the ability to create autonomous agents within Copilot Studio will enter public preview in November. These agents are designed to work across various business functions, including sales, finance, and supply chain, to automate tasks and streamline operations. However, Anthropic Claude's computer use feature stands out because it doesn't rely on multiple agents to perform different tasks; instead, a single agent effectively manages multiple tasks. For instance, Microsoft recently integrated Copilot into MS Excel, while Claude can now directly operate Excel. This calls into question the need for Copilot. Also, OpenAI has introduced a new approach for creating and deploying multi-agent AI systems, called the Swarm framework. It simplifies the process of creating and managing multiple AI agents that can work together seamlessly to accomplish complex tasks. The company has launched a ChatGPT desktop application, but it lacks autonomous features.
[25]
Anthropic beta lets Claude AI operate users' computer mouse, keyboard
Artificial intelligence company Anthropic announced the launch of a new version of its Claude AI model alongside a developer's beta for "computer use" on Oct. 22. The computer use beta will allow developers working with the API to enable Claude to take control of their mouse cursor, click on buttons and fields, and insert text in a user's computer environment autonomously. A video demonstration of Claude's computer use functionality showcases the machine's ability to navigate multiple windows, open new tabs to search for information, and transfer data from one window to another. Under the current paradigm, popular AI-powered large language models such as Anthropic's Claude and OpenAI's ChatGPT interact with users through a dedicated interface window. Some models can perform limited web searches within the framework of their interface, but external interactions are typically limited to allowing users to upload documents for parsing by the AI. With Anthropic's new computer use beta, the company hopes developers will eventually be able to automate any function a human sitting at a workstation would be able to perform. Per a company blog post: "Developers can direct Claude to use computers the way people do -- by looking at a screen, moving a cursor, clicking buttons, and typing text. Claude 3.5 Sonnet is the first frontier AI model to offer computer use in public beta. At this stage, it is still experimental -- at times cumbersome and error-prone." Computer use beta is only available for developers through the Claude API; however, Anthropic says it's releasing it early for feedback from developers. The company expects the feature "to improve rapidly over time." Anthropic addresses some of the privacy and security concerns in the blog post stating that "because computer use may provide a new vector for more familiar threats such as spam, misinformation, or fraud," the firm is taking a "proactive approach to promote its safe deployment" and can identify when computer use is being deployed improperly.
[26]
Anthropic's Claude 3.5 Now Controls Your Computer Like You Do
Anthropic has announced the release of an upgraded Claude 3.5 Sonnet model and the new Claude 3.5 Haiku, along with a public beta for an experimental feature called "computer use." Unlike traditional AI models that rely on specific APIs or tools for task completion, Claude 3.5 Sonnet is now able to navigate computer interfaces in a manner similar to human users. This means the AI can view a screen, move a cursor, click buttons, and type text, allowing it to perform tasks like filling out forms, navigating websites, and interacting with a wide range of software programs designed for human users. "Entering a new era with 'computer use'. It's like FSD for your computer!" said Groq's Sunny Madra. Early adopters like Asana, Canva, Replit, and The Browser Company have already begun exploring its potential. Replit, for instance, is leveraging the feature to build a tool that evaluates apps during their development process. However, the feature is not without its limitations. Tasks that are simple for humans -- such as scrolling, dragging, or zooming -- can be cumbersome for Claude at this stage. Despite these challenges, Anthropic believes that computer use has the potential to open up new possibilities for automation and AI-driven software development. Both Claude 3.5 Sonnet and Claude 3.5 Haiku, along with the computer use feature, are available through Anthropic's API, Amazon Bedrock, and Google Cloud's Vertex AI. While the Claude 3.5 Sonnet model is ready for use immediately, the new Claude 3.5 Haiku will be available later this month. Future updates are expected to include further enhancements, such as image input capabilities. The computer use feature is similar to former OpenAI co-founder Andrej Karpathy's vision of an LLM OS. The LLM OS proposes using large language models as the "kernel" or central processing unit of a new type of operating system. It is envisioned as a broader, more modular architecture for agentic behavior, going beyond just a chat interface. The upgraded Claude 3.5 Sonnet model offers substantial improvements in key areas, most notably coding. Anthropic reports that the model has made significant progress in agentic coding tasks, which involve AI autonomously generating and manipulating code. On the widely recognised SWE-bench Verified benchmark, Claude 3.5 Sonnet's performance increased from 33.4% to 49.0%, outperforming several major AI systems, including OpenAI's o1-preview and other coding-focused models. Claude 3.5 Sonnet has also improved in tasks requiring tool use. For example, the model achieved higher scores on the TAU-bench tool-use benchmark, improving its performance in the retail domain from 62.6% to 69.2%, and in the airline domain from 36.0% to 46.0%. These advancements make Claude 3.5 Sonnet a strong contender for developers needing AI support in complex, multi-step tasks such as software development, autonomous AI evaluations, and problem-solving. Anthropic also introduced Claude 3.5 Haiku, a next-generation model that combines speed with affordability. Designed for real-time applications, Claude 3.5 Haiku improves across multiple benchmarks and surpasses its predecessor, Claude 3 Opus, in several areas, including coding tasks.
Claude 3.5 Haiku is built for tasks requiring low latency and accurate tool use, making it well-suited for user-facing applications, specialised sub-agent tasks, and handling large datasets like inventory records or purchase histories. This makes the model ideal for industries that rely on real-time data-driven decisions. Microsoft recently launched autonomous agents in Copilot Studio, set for public preview next month. These agents will automate tasks across sales, finance, and supply chain to streamline operations. Microsoft has introduced ten new autonomous agents in Dynamics 365. These agents are built to help organisations drive business value by automating processes like lead generation, customer service, and supplier communication. Meanwhile, OpenAI has introduced a new approach for creating and deploying multi-agent AI systems, called the Swarm framework. It simplifies the process of creating and managing multiple AI agents that can work together seamlessly to accomplish complex tasks.
[27]
Anthropic releases AI tool that can take over your cursor
Not every artificial intelligence application sounds useful in everyday life, but Anthropic just unveiled one that could be. Anthropic announced on its website that its "Claude" large language model has gotten an intriguing new feature: computer use. Put simply, this AI can take over your mouse cursor and perform basic computing tasks like clicking and typing. This feature is available now on the Claude 3.5 Sonnet public beta. The feature, which Anthropic itself described as "at times cumbersome and error-prone," has apparently been embraced by companies like Asana and DoorDash. According to Anthropic, Claude's computer use capabilities can perform tasks "that require dozens, and sometimes even hundreds, of steps to complete," presumably without human intervention. For something like Asana, which is software for managing work projects, this could actually be pretty useful and save time. Of course, all of that sounds great until Claude messes something up and you don't notice until it's too late.
[28]
Anthropic's newest AI model could control your computer
The company said the upgraded Claude model will be able to 'use computers the way people do' but warned it's still experimental at this stage. OpenAI challenger Anthropic has released an upgraded version of its Claude 3.5 Sonnet model that is designed to understand and interact with any desktop app. The new 'computer use' API is designed to imitate keystrokes, button clicks, and mouse gestures, so that it can essentially "use computers the way people do". The model is currently in public beta mode so that developers can give the company feedback and help to improve capabilities. "At this stage, it is still experimental - at times cumbersome and error-prone," the company said in a blogpost. "We're releasing computer use early for feedback from developers and expect the capability to improve rapidly over time." As well as its upgraded Sonnet model, Anthropic also announced that an updated version of Haiku is on its way. "Claude 3.5 Haiku matches the performance of Claude 3 Opus, our prior largest model, on many evaluations for the same cost and similar speed to the previous generation of Haiku," the company said. The latest Claude upgrades come just four months after it launched its Claude 3.5 model. At the time, the start-up claimed it outperformed competitor models on various benchmarks. However, a company's claims about its own models must be taken with a pinch of salt, as a report from earlier this year pointed out that robust evaluations for large language models are "seriously lacking". Anthropic was co-founded by former OpenAI employees in 2021 and rose to become a challenger to the popular ChatGPT in 2023 with the original release of its AI chatbot Claude. Since then, it has continued to collect OpenAI employees, including co-founders John Schulman and Durk Kingma, and former safety lead Jan Leike. Earlier this year, the company announced the opening of an office in Dublin, saying that it will hopefully be its main establishment in the EU market.
[29]
One of the best ChatGPT competitors can now control your computer screen for you
Anthropic's new "Computer Use" feature has incredible potential Whether you're fully immersed in AI tech or have only heard snippets of top conversations, you've likely heard of OpenAI's ChatGPT. Despite its widespread popularity, GPT-4o, Open AI's best model to date, is now less powerful than a new model from competing company Anthropic: Claude 3.5 Sonnet. In benchmarks shared by Anthropic on X , Claude 3.5 Sonnet outperforms OpenAI's GPT-4o model in every AI benchmark except math problem-solving. Anthropic says Claude 3.5 Sonnet offers "across-the-board improvements over its predecessor," which is fantastic to hear, but it's definitely not the most interesting tidbit. What's most fascinating about Claude 3.5 Sonnet is what Anthropic calls a "groundbreaking new capability." This new feature, available for anyone to test in public beta, is called 'Computer Use' -- and it's the closest an AI model has ever been to delivering an actual virtual assistant to help us with monotonous tasks. Anthropic says, "developers can direct Claude to use computers the way people do -- by looking at a screen, moving a cursor, clicking buttons, and typing text. And while the company notes it's "still experimental" and "error-prone," the demos I've seen so far have been impressive. Via Rowan Cheung on X , you can catch a breakdown of how Claude 3.5 Sonnet can take over your screen, move your cursor, type by itself, and carry out complex tasks, like creating a website or filling in a vendor request form with relevant information. Cheung explains that the 'Computer Use' feature "works by taking static screenshots that are constantly sent back to the API in real-time." In an Anthropic blog post shared with TechCrunch, the company expanded on how it works, saying "Claude looks at screenshots of what's visible to the user, then counts how many pixels vertically or horizontally it needs to move a cursor in order to click in the correct place." Seeing the AI model working in real time is truly something, and if you're worried about the potential security risk for your personal computer, here's what testing has been done so far: Pre-deployment testing of Claude 3.5 Sonnet was conducted by both the US AI Safety Institute (US AISI) and the UK Safety Institute (UK AISI), and the company deemed the ASL-2 Standard appropriate for the model. That said, it's certainly not immune to security risks, and you probably shouldn't be using an AI model with any sensitive, private data. It'll be interesting to see how the feature evolves as it emerges from the public beta version to its official version, but as of right now, I'm excited for its potential to help with large, tedious copy/paste tasks and more complex requests.
[30]
Anthropic's Claude now can use a computer
Anthropic has announced the launch of two new AI models: an upgraded Claude 3.5 Sonnet and a new Claude 3.5 Haiku. The upgraded Claude 3.5 Sonnet offers across-the-board improvements, with significant gains in coding capabilities. The new Claude 3.5 Haiku brings advanced AI features at an affordable price, matching the performance of Anthropic's previous flagship model, Claude 3 Opus, while maintaining similar speed and cost as its predecessor. The upgraded Claude 3.5 Sonnet builds on the success of its predecessor with enhanced performance in various tasks, especially in coding. Anthropic has emphasized that Claude 3.5 Sonnet leads in software engineering tasks, showing marked improvements in benchmarks such as SWE-bench Verified, where it improved from 33.4% to 49.0%, surpassing other models on the market. The model also performed well in agentic tool use, improving scores on the TAU-bench from 62.6% to 69.2% in the retail domain and from 36.0% to 46.0% in the airline domain. According to early testers like GitLab and Cognition, these upgrades represent a substantial leap forward for AI-powered coding and automation, with better reasoning capabilities and minimal added latency. Claude 3.5 Haiku is designed to offer state-of-the-art performance while keeping costs low. The model scores highly in benchmarks like SWE-bench Verified, achieving 40.6%, which surpasses many agents using other cutting-edge models, including the original Claude 3.5 Sonnet and GPT-4o. The model aims to serve applications that need fast, reliable AI, such as user-facing products and tasks requiring personalized experiences. Anthropic will release Claude 3.5 Haiku later this month, initially available as a text-only model with plans for image input support in the future. It will be available through Anthropic's API, Amazon Bedrock, and Google Cloud's Vertex AI. Anthropic has also introduced a new experimental feature called "computer use," available in public beta. Developers can direct Claude 3.5 Sonnet to use computers like a human -- looking at screens, moving cursors, and typing. This capability is still experimental, with some limitations, but developers like Replit and The Browser Company have already begun exploring how it could automate complex processes that involve many steps. This feature aims to teach AI general computer skills, making it more versatile in completing tasks that previously required custom tools. Developers can access this beta through Anthropic's API and other major cloud platforms. While early results are promising, Anthropic acknowledges that the technology is still developing, with challenges in performing some common computer tasks like scrolling and dragging. Companies such as Asana, Canva, Cognition, and DoorDash are experimenting with the new features, exploring the potential of AI to automate processes that traditionally required human input. Early testing by the US AI Safety Institute and the UK Safety Institute found that the updated Claude 3.5 Sonnet meets Anthropic's safety standards and is suitable for public use. Anthropic is also addressing potential risks associated with AI's ability to interact with computers. To ensure responsible deployment, the company has implemented new safety measures to identify misuse, including classifiers to detect potentially harmful actions. Anthropic is encouraging developers to use the beta feature for low-risk tasks while the technology matures. 
Anthropic's release of the upgraded Claude 3.5 Sonnet and Claude 3.5 Haiku highlights the company's push to expand AI capabilities while maintaining safety standards. The addition of experimental computer use capabilities represents a novel step forward in AI's potential to perform general-purpose tasks, giving developers new possibilities to explore.
[31]
Claude AI Can Now Control Your PC, Prompting Concern From Security Experts
With its latest update, the Claude AI tool from Amazon-backed Anthropic can control your computer. The idea is to have Claude "use computers the way people do," but some AI and security experts warn it could facilitate cybercrime or impact user privacy. The feature, dubbed "computer use," means Claude can autonomously complete tasks on your computer by moving the cursor, opening web pages, typing text, downloading files, and completing other activities. It launched first for developers via the Claude API and is included in the Claude 3.5 Sonnet beta, but could be added to more models in the future. Anthropic warns that this new feature could be faulty or make mistakes, however, as it's still in its early stages. Anthropic says companies like Asana, Canva, and DoorDash are already testing this new feature, asking Claude to complete jobs that normally require "dozens, and sometimes even hundreds, of steps to complete." This could mean a more automated US economy as employees automate tasks at work, helping them meet deadlines or get more things done. But it could also lead to fewer jobs if more projects ship faster. Claude may refuse to do certain tasks that could fully automate your social media and email accounts. One coder, however, claims he's been able to create a "wrapper" that circumvents those restrictions. From a security standpoint, Jonas Kgomo, founder of the AI safety group Equianas Institute, called Claude's computer use "untested AI safety territory" and emphasized that cyberattacks are entirely possible with the new tool. Parrot AI founder Paul Morville tells PCMag in a message that while Anthropic's advice to only use the new feature when you can supervise it is wise, "there is enormous potential for both intentional and unintentional security problems" and could one day be used to help hackers deploy autonomous remote access trojans (AI RATs). Rachel Tobac, self-described hacker and CEO of cybersecurity firm SocialProof Security, said she's "breaking out into a sweat thinking about how cybercriminals could use this tool." "This easily automates the task of getting a machine to go to a website and download malware or provide secrets, which could scale attacks (more machines hacked in a shorter period of time)," Tobac wrote on Tuesday. "I'm also imagining that websites could have malicious prompts visible to the AI tool that hijack the requested AI task!" Tobac listed a number of possible scenarios where Claude's computer use could go wrong. It could result in less human responsibility and oversight, meaning people may be able to claim that they aren't responsible for the AI's actions if it leads to a cyberattack or causes a data breach, for example. Attackers could also design websites knowing the tool exists, and inject malicious code or prompts that could override the AI and get it to download malicious files or execute an attack. "I'm majorly crossing my fingers that Anthropic has massive guardrails," Tobac adds. "This is some serious stuff." But as Datasette creator Simon Willison points out, Anthropic is warning users that it doesn't have such guardrails because it can't stop the AI from being hijacked in certain situations. "Our Trust & Safety teams have conducted extensive analysis of our new computer-use models to identify potential vulnerabilities," Anthropic writes in a post. 
"One concern they've identified is 'prompt injection' -- a type of cyberattack where malicious instructions are fed to an AI model, causing it to either override its prior directions or perform unintended actions that deviate from the user's original intent. Since Claude can interpret screenshots from computers connected to the internet, it's possible that it may be exposed to content that includes prompt injection attacks." Anthropic justifies the release of the feature, however, by taking the stance that such a tool is inevitable. It argues that it's better to release it now while AI models aren't as powerful as they ultimately could be than later, in a hypothetical future. "When future models require AI Safety Level 3 or 4 safeguards because they present catastrophic risks, computer use might exacerbate those risks," it said. "We judge that it's likely better to introduce computer use now, while models still only need AI Safety Level 2 safeguards. This means we can begin grappling with any safety issues before the stakes are too high, rather than adding computer use capabilities for the first time into a model with much more serious risks." Will Ledesma, a senior director at cybersecurity firm Adlumin, tells PCMag in a message that Claude's computer use is cause for concern considering Anthropic's use guidelines and how Claude might be storing or sharing sensitive data. "Recommending a virtual machine means they're already worried about what it might do. [But] breaking out of VMs or even containers to access root systems has not been impossible," Ledesma said. "Also, there is a concern around where they are storing this sensitive data, such as the screenshots [as] they stated they would hand over screenshots if requested by law. This could be weaponized," Ledesma continued. "For example, if a bad guy gains access to this software, they can use it to monitor individuals. The trade-off here is ease of use against privacy. Many are willing to give up their privacy in order to make something 'easier', but that is the risk. Law enforcement could also abuse this if they used it against an endpoint that they have legal rights to monitor."
[32]
Anthropic publicly releases AI tool that can take over the user's mouse cursor
AI software company Anthropic has announced a new tool that can take control of the user's mouse cursor and perform basic tasks on their computer. Announced alongside other improvements to Anthropic's Claude and Haiku models, the tool is straightforwardly called "Computer Use." It's available exclusively with the company's mid-range 3.5 Sonnet model right now, via the API. Users can give multi-step instructions (Anthropic claims it can go for tens or even hundreds of steps) to accomplish tasks on the user's computer by "looking at a screen, moving a cursor, clicking buttons, and typing text." Here's how Anthropic says it works: When a developer tasks Claude with using a piece of computer software and gives it the necessary access, Claude looks at screenshots of what's visible to the user, then counts how many pixels vertically or horizontally it needs to move a cursor in order to click in the correct place. Training Claude to count pixels accurately was critical. Without this skill, the model finds it difficult to give mouse commands -- similar to how models often struggle with simple-seeming questions like "how many A's in the word 'banana'?". It has limitations, of course. It operates by taking rapid successive screengrabs rather than working with a live video stream, so it can miss short-lived notifications or other changes. It's still incapable of doing some common actions, like drag and drop. Anthropic has also said it is known to be "cumbersome and error-prone" at times. A blog post about developing the tool gave one example of a way it has gone wrong in testing: It abandoned a coding task before completing it and began instead "to peruse photos of Yellowstone National Park" -- perhaps one of the most human-like things an AI bot has done. (I kid.) The tool is now in public beta but has been with partner organizations for a while, with employees of companies like Amazon, Canva, Asana, and Notion testing it in limited ways.
[33]
Anthropic gives its AI models limited ability to control your computer
According to a series of example videos from Anthropic posted Tuesday on X, Claude users might now ask the AI to follow the steps needed to create a personal website. In another example, a user asks Claude to help with the logistics of a trip to watch the sunrise from the Golden Gate bridge. The user describes what they want the model to do by giving it text prompts. AI companies have been stressing a desire to push large language models to become more "agentic" and autonomous. Doing so means extending the ability of the AI to control not only its own functions but also external devices. "Instead of making specific tools to help Claude complete individual tasks, we're teaching it general computer skills -- allowing it to use a wide range of standard tools and software programs designed for people," Anthropic said in a statement on X.
[34]
Anthropic's new AI can use computers like a human, redefining automation for enterprises
Anthropic, the AI research and safety company, has announced a new suite of capabilities -- including an upgraded version of its flagship AI model, Claude 3.5 Sonnet, and a new model, Claude 3.5 Haiku -- that could transform how businesses automate complex workflows. But the most striking development in this release is a new feature: Claude can now use a computer like a human, navigating screens, clicking buttons, and typing text. This new feature, called "Computer Use," could have far-reaching implications for industries that rely on repetitive tasks involving multiple applications and tabs. From data entry to research to customer service, the potential applications are broad -- and potentially industry-shaping. AI moves from text to screen interaction Since its founding, Anthropic has focused on creating AI models that are safe, reliable, and capable of complex reasoning. With Claude 3.5 Sonnet and Haiku, the company is expanding the model's capabilities even further. The new "Computer Use" feature allows AI to perform tasks that were previously handled only by human workers, such as opening applications, interacting with interfaces, and filling out forms. "Computer use capabilities have the potential to change how tasks that require navigation across multiple applications are performed," said Mike Krieger, Chief Product Officer at Anthropic, in an exclusive interview with VentureBeat. "This could lead to more innovative product experiences and streamlined back-office processes." Krieger emphasized that the new capability is still in its beta phase, but as the technology evolves, it could improve data analysis, visualization, and user interface interactions, making many tasks more efficient. "We anticipate it being particularly useful for tasks like conducting online research, performing repetitive processes like testing new software, and automating complex multi-step tasks," he said. "As the technology matures, it could enhance data analysis, visualization, and user interface interactions, potentially improving accessibility... We're excited to see how developers will leverage this capability to create new tools and workflows that enhance productivity and user experiences across various sectors." Early adopters see potential Anthropic's early partners, including GitLab, Canva, and Replit, are already benefiting from Claude 3.5 Sonnet's new features. GitLab, which specializes in software development and security, has been testing the model for automating tasks in their development pipeline. According to the company, Claude has improved reasoning capabilities by up to 10% without slowing down performance, making it well-suited for complex, multi-step processes like software testing and deployment. Replit, a coding platform, has gone a step further. Michele Catasta, President of Replit, said the model "opens the door to creating a powerful autonomous verifier that can evaluate apps while they're being built." This could ease bottlenecks in software development, where testing often delays project timelines. Meanwhile, Canva, the graphic design platform, is exploring how Claude's computer use skills could speed up design creation and editing. Danny Wu, Head of AI Products at Canva, said in a statement, "We're discovering efficiencies within our team that could significantly impact our users." What does "Computer Use" actually mean? 
What sets this new capability apart from traditional automation tools is that Claude isn't confined to specific workflows or software programs. Instead, it can "see" a screen using screenshots, interact with various applications, and adapt to different tasks as they come up. This flexibility makes it more versatile than current robotic process automation (RPA) technologies. For example, in a demo shared by Anthropic, Claude helps complete a vendor request form for Ant Equipment Co. In the video, Claude starts by taking a screenshot of the computer screen, identifies that some necessary information is missing from a spreadsheet, then navigates to a CRM system, locates the required data, and fills out the form -- all without human intervention. This level of automation could have major implications for industries like finance, legal services, and customer support, where tasks often involve switching between multiple systems and applications. "Claude could open spreadsheets, run analyses, and create visualizations. For customer service, it could navigate CRM systems to quickly find and update customer information," Krieger told VentureBeat.

Security and privacy concerns

However, the ability for AI to control a computer raises serious security and privacy concerns. Anthropic has built several safeguards into the system to address these risks. The company made it clear that Claude cannot access a computer without a developer providing the necessary tools. "Claude cannot 'just use your computer.' The computer use feature requires developers to provide tools like a screenshot tool and an action-execution layer, which allows Claude to perform mouse movements and keystrokes," Krieger explained. Anthropic is also taking a cautious approach by releasing the feature in a limited public beta, available only through an API. This allows developers to test it in controlled environments before it becomes more widely available. The company has also developed classifiers to detect misuse and prevent the AI from interacting with sensitive websites, such as government portals. "Our methods to scan for prohibited activity are designed to safeguard customer data privacy and confidentiality," Krieger said.

A new era for office automation?

In the near term, businesses could see immediate productivity gains in areas like data entry, customer service, and IT support. But as the technology matures, the potential applications could extend far beyond these initial use cases. Imagine a world where AI handles complex legal processes, from reviewing contracts to completing compliance forms. Or envision AI assisting doctors in navigating electronic health records and diagnosing patients by cross-referencing medical databases. Claude's new "Computer Use" feature brings us closer to a future where AI can perform a wide range of tasks that span different software applications and systems. This gives it a level of flexibility that was previously unimaginable for AI technologies, which were often confined to specific, narrow tasks.

Proceeding with caution

Still, it's important to remember that this capability is in its early stages. Claude's ability to use computers is not yet perfect, and Anthropic acknowledges that it struggles with tasks that humans find trivial, like scrolling or zooming. "Since it's still in beta and can occasionally miss short-lived actions, we recommend human oversight for high-stakes tasks," Krieger said. That said, Anthropic is committed to refining the technology.
"We've developed new classifiers and prompt analysis tools to identify potential misuse of computer use features," Krieger added, indicating the company is serious about addressing the risks associated with this powerful technology. What's next? As AI continues to evolve, the way we work may change dramatically. For enterprise decision-makers, the benefits of automating multi-step workflows could be substantial. But this also raises questions about the future of jobs that rely on these very tasks. For now, Anthropic is focused on the immediate benefits of Claude 3.5 Sonnet and Haiku while ensuring the technology is deployed responsibly. As the company spokesperson put it, "We're excited to see how developers will leverage this capability to create new tools and workflows that improve productivity and user experiences across various sectors." With companies like GitLab, Canva, and Replit already exploring its potential, it's clear that AI is poised to play an even bigger role in the future of work -- perhaps sooner than we think.
[35]
Anthropic's Computer-Controlling Agents Open Door to Real World Uses, Experts Say | PYMNTS.com
Anthropic has unleashed AI agents that can surf the web and control computers, intensifying the race to commercialize AI assistants that can actually get things done. The move puts Anthropic in direct competition with tech giants Microsoft, Google, and OpenAI, which are all vying to automate routine computer tasks for businesses looking to boost productivity and cut costs. While previous AI tools have focused on text and image generation, these new agents represent an evolution toward AI systems that can actively manipulate software and complete complex workflows with minimal human oversight. "Agentic AI is set to redefine industries over the next three years, with its ability to operate autonomously and make decisions without human intervention," Dan Parsons, COO/CPO at Thoughtful AI, told PYMNTS. "The greatest impact will be seen in areas like system administration, operations, customer service, and complex decision-making, where AI can drastically improve efficiency, reduce costs, and unlock new levels of scalability." Anthropic's new AI capabilities allow its Claude model to control computers directly -- from web browsing to coding and app interaction. While the technology shows promise, with Anthropic claiming superior performance on key benchmarks, actual success rates remain modest at 14.9% for basic computer tasks. Unlike rivals Microsoft and Salesforce, which build AI agents to work within specific applications, Anthropic's approach lets its AI system operate computers just as humans do - by interpreting what's on screen and using the mouse and keyboard. The strategy marks a significant shift in how AI could automate everyday tasks, though experts caution that reliable performance remains a key challenge. "Right now, the impact of autonomous AI agents is limited, but the potential is very large," Olga Kupriyanova, principal consultant for AI and data engineering at technology research and advisory firm ISG, told PYMNTS. The growing competition in the AI agent space will likely drive down AI pricing. According to McKinsey, while AI agents were initially too expensive for many business applications, costs are now rapidly declining. The consulting firm notes that as these technologies become more standardized and industrialized, they're becoming cost-effective enough to automate even niche business processes that weren't economically viable before. "As major companies like Microsoft, Google, and Amazon compete to offer the most advanced AI tools, they'll need to make their products accessible to a broader market, including smaller businesses," Arunkumar Thirunagalingam, manager of enterprise data management at Santander Consumer USA, told PYMNTS. "This competition could result in lower costs, more flexible pricing models (like pay-as-you-go), and bundled services that make AI adoption more affordable. However, smaller businesses may still face challenges in managing the upfront costs of integration and the need for technical expertise." Companies eyeing AI to replace human workers face a complex financial equation, Thirunagalingam said. While AI agents can slash ongoing expenses like salaries and benefits by handling repetitive tasks 24/7, the upfront investment is steep. Thirunagalingam said that beyond the obvious software licensing fees, businesses must budget for technical infrastructure, staff training, and continuous system maintenance.
The true savings only emerge after covering these initial hurdles - making the shift to AI automation a long-term strategic decision rather than a quick cost fix. "Human workers, on the other hand, provide flexibility, creativity, and emotional intelligence, which AI may struggle to replicate," he added. "While AI agents excel in efficiency and scalability, human workers are essential for tasks requiring complex judgment, interpersonal skills, and adaptability. Businesses need to weigh these factors based on their specific needs." Anthropic's "computer use" AI agent could transform office work by watching and learning from activity across different software programs. Unlike earlier AI that worked within single applications, this system can track how humans move between various tools to complete tasks, Abhi Maheshwari, CEO of AI agent company Aisera, told PYMNTS. "If an administrative job is mostly about repetitive tasks on the desktop, then this type of role will likely fade away - perhaps quickly. However, this may overstate things," he added. "Many of these jobs are more than repetitive processes. There is often a need for human interaction - meeting people, having calls, presentations, etc. If anything, there will be more time to devote to these more important tasks."
[36]
Anthropic's New Claude 3.5 Sonnet Can Complete Tasks on Your PCs
Anthropic also announced an upgraded Claude 3.5 Haiku model
Computer Use capability is currently available via API
Claude 3.5 Sonnet is said to outperform GPT-4o and Gemini 1.5 Pro

Anthropic introduced two new artificial intelligence (AI) models and a new AI capability on Tuesday. The biggest introduction is an upgraded version of Claude 3.5 Sonnet, which is claimed to offer improved benchmark scores across different categories. The new 3.5 Sonnet also gets a new capability dubbed Computer Use, which will allow it to understand and interact with computers, essentially allowing it to control and complete tasks on PCs. Further, the AI firm also announced Claude 3.5 Haiku, the successor to Claude 3 Haiku. In a newsroom post, Anthropic announced an upgraded Claude 3.5 Sonnet, which offers improved performance compared to the AI model released in June. The AI firm claimed that the new model outperforms GPT-4o and Gemini 1.5 Pro in benchmarks such as Graduate-Level Google-Proof Q&A (GPQA), Massive Multitask Language Understanding (MMLU) Pro, and the coding-focused HumanEval. However, the most significant improvements are claimed in two particular benchmarks -- the Software Engineering Benchmark (SWE-bench), where the model's score increased from 33.4 percent to 49 percent, and Tool-Agent-User (TAU-bench), where it moved from 62.6 percent to 69.2 percent. Both of these benchmarks relate to agentic AI performance. This agentic capability is relevant because Anthropic also introduced the new Computer Use capability, which allows AI models to control and complete tasks on PCs. Currently, this capability is available via an application programming interface (API) and works only with Claude 3.5 Sonnet. With Computer Use, Claude is learning general computer skills. With specialised software, it can imitate keystrokes, button clicks, and cursor movements. Combined with the AI model's existing computer vision capability, Claude 3.5 Sonnet can see what's happening on the screen and process that information to carry out specific tasks. The feature works based on prompts provided to the AI. For instance, users can ask the large language model (LLM) to book tickets on a website, fill out an application, or even download and install an application. While specialised tools that can automate certain PC tasks already exist, a general-purpose tool that works on natural-language prompts is a significant milestone for generative AI technology. However, Anthropic admits that this capability is still in its nascent stage and there are certain limitations. "Some actions that people perform effortlessly -- scrolling, dragging, zooming -- currently present challenges for Claude," the company highlighted. For now, developers are advised to use this capability only for low-risk tasks. With automated computer control capabilities, there are concerns about whether the AI model can be engineered to perform harmful and illegal activities. The company has not revealed any details about the security of the AI model and the safety of users at present. Notably, the upgraded Claude 3.5 Sonnet is available to all users, and developers can build on this capability via the Anthropic API, Amazon Bedrock, and Google Cloud's Vertex AI. Another major announcement was the unveiling of Claude 3.5 Haiku. For context, Haiku is the cheapest and fastest AI model series offered by Anthropic. The AI firm now claims that the successor to Claude 3 Haiku outperforms Claude 3 Opus, the company's previous flagship-grade model.
This means users can now access a powerful AI model at a much cheaper price point. Claude 3.5 Haiku will be released later this month across various platforms including the company's API, Amazon Bedrock, and Google Cloud's Vertex AI. It will initially be available as a text-only model and will later be updated to accept images as input.
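Since the Computer Use capability described above is exposed only through the API for now, it may help to see roughly what a request looked like at launch. The sketch below uses the official anthropic Python SDK; the model name, tool type string, and beta flag are taken from Anthropic's launch documentation and may have changed since, so treat them as assumptions to verify against the current docs.

```python
# Rough shape of a Computer Use request at launch (verify current identifiers
# against Anthropic's documentation before relying on them).
import anthropic  # pip install anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.beta.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    betas=["computer-use-2024-10-22"],
    tools=[
        {
            "type": "computer_20241022",  # virtual screen, mouse and keyboard
            "name": "computer",
            "display_width_px": 1024,
            "display_height_px": 768,
        }
    ],
    messages=[
        {"role": "user", "content": "Open the vendor form and fill in the missing fields."}
    ],
)

# The reply contains tool_use blocks (e.g. "take a screenshot", "click at x,y")
# that the developer's own code must execute and report back as tool_result
# blocks before the loop continues.
for block in response.content:
    print(block.type, getattr(block, "input", None))
```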
[37]
How To Set Up the Claude Computer Use API to Control Your Computer - Beginner's Guide
This week, Anthropic rolled out updates to its AI models and introduced a very interesting Computer Use API that allows AI to control your computer. Imagine a world where your computer anticipates your needs, executing tasks with just a simple prompt. It may sound like science fiction, but with the introduction of Anthropic's Computer Use API, this vision is becoming a reality. Built on the robust Claude 3.5 Sonnet model, this innovative API enables artificial intelligence to take control of your desktop environment. Whether you're a developer, customer service representative, or educator, the ability to automate and streamline your daily tasks is now within reach. If you're interested in setting up the new Anthropic Computer Use API on your local machine, you'll be pleased to know that the team at WorldofAI has already created a step-by-step tutorial to guide you through the process. The Anthropic Computer Use API represents a significant advancement in AI technology, offering a new way to control computer screens through user prompts. From filling out forms to conducting complex data analyses, the possibilities are as vast as they are exciting. This innovative API, built on the foundation of the Claude 3.5 Sonnet model, seamlessly integrates AI into everyday computing tasks. Anthropic's own documentation spells out the caveats: "Computer use is a beta feature. Please be aware that computer use poses unique risks that are distinct from standard API features or chat interfaces. These risks are heightened when using computer use to interact with the internet. To minimize risks, consider taking precautions such as [...] In some circumstances, Claude will follow commands found in content even if it conflicts with the user's instructions. For example, instructions on webpages or contained in images may override the user's instructions or cause Claude to make mistakes. We suggest taking precautions to isolate Claude from sensitive data and actions to avoid risks related to prompt injection. Finally, please inform end users of relevant risks and obtain their consent prior to enabling computer use in your own products." The Computer Use API is designed with versatility in mind, capable of automating a diverse array of computer tasks. Its applications span several key areas. In the realm of coding, the API serves as a virtual coding assistant, helping programmers streamline their development process. For customer service applications, it acts as an efficient intermediary, managing inquiries and providing solutions with minimal human intervention. In educational settings, the API assists in the creation of adaptive learning tools that cater to individual student needs and learning styles. To harness the full potential of the Computer Use API, setting it up in a local environment is crucial. Docker provides a reliable and consistent platform for deployment, ensuring a smooth setup process across different systems. Here's a step-by-step guide to get you started:

1. Download and install Docker on your system.
2. Configure Docker settings to align with your system specifications.
3. Pull the Computer Use API Docker image from the official repository.
4. Create a Docker container using the pulled image.
5. Configure the container to interact with your local desktop environment.
6. Launch the container and start exploring the API's functionalities.

By following these steps, you can create a controlled testing environment where you can safely experiment with the API's capabilities and integrate it into your workflow.
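For readers who prefer to script the steps above, here is a small Python launcher sketch. The image name, ports, and environment variable mirror Anthropic's public computer-use quickstart as published at launch; treat them as assumptions and confirm the current values in the repository's README, since they may have changed.

```python
# Illustrative launcher for Anthropic's computer-use demo container.
# Image name and port numbers reflect the public quickstart at launch and may
# have changed -- confirm against the repository before running.
import os
import subprocess

IMAGE = "ghcr.io/anthropics/anthropic-quickstarts:computer-use-demo-latest"


def launch_demo() -> None:
    api_key = os.environ["ANTHROPIC_API_KEY"]  # required by the container
    subprocess.run(
        [
            "docker", "run", "-it", "--rm",
            "-e", f"ANTHROPIC_API_KEY={api_key}",
            "-p", "8080:8080",  # combined web interface
            "-p", "6080:6080",  # browser-based view of the virtual desktop
            IMAGE,
        ],
        check=True,
    )


if __name__ == "__main__":
    launch_demo()
```

Running the agent inside a container like this keeps Claude's mouse and keyboard actions confined to a disposable virtual desktop rather than your own machine.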
The practical applications of the Computer Use API are extensive and varied. Users can automate a wide range of tasks, from simple data entry to complex multi-step processes. For example, by inputting specific prompts, the API navigates through applications, executes commands, and delivers results with remarkable efficiency. This capability demonstrates its potential to transform routine tasks, significantly reducing manual effort and freeing up time for more critical activities. The Computer Use API is currently in an experimental phase, undergoing continuous refinement and enhancement. As AI technology advances, the potential for future improvements is substantial. Researchers and developers are working on expanding its capabilities, improving its accuracy, and increasing its compatibility with a wider range of applications. These advancements could lead to more sophisticated desktop interaction technologies, paving the way for advanced automation and AI-driven solutions across various industries. Users are encouraged to explore the Computer Use API across different platforms and scenarios. By experimenting with its functionalities, you can uncover innovative ways to integrate AI into your workflows and potentially discover new applications. Whether you're a developer looking to streamline your coding process, a customer service manager aiming to improve response times, or an educator seeking to create more engaging learning experiences, the API offers a glimpse into the future of AI-powered desktop control. The Anthropic Computer Use API stands at the forefront of a new era in computing, promising to transform how we interact with technology. As it continues to evolve, it has the potential to reshape our digital experiences, making them more intuitive, efficient, and productive. For more information, jump over to the official Anthropic website.
[38]
Anthropic's new AI model can control your PC
In a pitch to investors last spring, Anthropic said it intended to build AI to power virtual assistants that could perform research, answer emails, and handle other back-office jobs on their own. The company referred to this as a "next-gen algorithm for AI self-teaching" -- one it believed could, if all goes according to plan, automate large portions of the economy someday. It took a while, but that AI is starting to arrive. Anthropic on Tuesday released an upgraded version of its Claude 3.5 Sonnet model that can understand and interact with any desktop app. Via a new "Computer Use" API, now in open beta, the model can imitate keystrokes, button clicks, and mouse gestures, essentially emulating a person sitting at a PC. "We trained Claude to see what's happening on a screen and then use the software tools available to carry out tasks," Anthropic wrote in a blog post shared with TechCrunch. "When a developer tasks Claude with using a piece of computer software and gives it the necessary access, Claude looks at screenshots of what's visible to the user, then counts how many pixels vertically or horizontally it needs to move a cursor in order to click in the correct place." The race to develop so-called "AI agents" has only become more crowded. "AI agent" remains an ill-defined term, but it generally refers to AI that can automate software. Some analysts say AI agents could provide companies with an easier path to monetizing the billions of dollars that they're pouring into AI. Companies seem to agree: according to a recent Capgemini survey, 10% of organizations already use AI agents and 82% will integrate them within the next three years. Salesforce made splashy announcements about its AI agent tech this summer, while Microsoft touted new tools for building AI agents yesterday. OpenAI, which is plotting its own brand of AI agents, sees the tech as a step toward super-intelligent AI. Anthropic calls its take on the AI agent concept an "action-execution layer" that lets the new 3.5 Sonnet perform desktop-level commands. Thanks to its ability to browse the web (not a first for AI models, but a first for Anthropic), 3.5 Sonnet can use any website and any application. "Humans remain in control by providing specific prompts that direct Claude's actions, like 'use data from my computer and online to fill out this form,'" an Anthropic spokesperson told TechCrunch. "People enable access and limit access as needed. Claude breaks down the user's prompts into computer commands (e.g., moving the cursor, clicking, typing) to accomplish that specific task." Software development platform Replit has used an early version of the new 3.5 Sonnet model to create an "autonomous verifier" that can evaluate apps while they're being built. Canva, meanwhile, says that it's exploring ways in which the new model might be able to support the designing and editing process. But how is this any different than the other AI agents out there? It's a reasonable question. Consumer gadget startup Rabbit is building a web agent that can do things like buying movie tickets online; Adept, which was recently acqui-hired by Amazon, trains models to browse websites and navigate software; and Twin Labs is using off-the-shelf models, including OpenAI's GPT-4o, to automate desktop processes. Anthropic claims the new 3.5 Sonnet is simply a stronger, more robust model that can do better on coding tasks than even OpenAI's flagship o1, per the SWE-bench Verified benchmark.
Despite not being explicitly trained to do so, the upgraded 3.5 Sonnet self-corrects and retries tasks when it encounters obstacles, and can work toward objectives that require dozens or hundreds of steps. But don't fire your secretary just yet. In an evaluation designed to test an AI agent's ability to help with airline booking tasks, like modifying a flight reservation, the new 3.5 Sonnet managed to complete less than half of the tasks successfully. In a separate test involving tasks like initiating a return, 3.5 Sonnet failed roughly a third of the time. Anthropic admits the upgraded 3.5 Sonnet struggles with basic actions like scrolling and zooming, and that it can miss "short-lived" actions and notifications because of the way it takes screenshots and pieces them together. "Claude's Computer Use remains slow and often error-prone," Anthropic writes in its post. "We encourage developers to begin exploration with low-risk tasks." Risky business But is the new 3.5 Sonnet capable enough to be dangerous? Possibly. A recent study found that models without the ability to use desktop apps, like OpenAI's GPT-4o, were willing to engage in harmful "multi-step agent behavior," such as ordering a fake passport from someone on the dark web, when "attacked" using jailbreaking techniques. Jailbreaks led to high rates of success in performing harmful tasks even for models protected by filters and safeguards, according to the researchers. One can imagine how a model with desktop access could wreak more havoc -- say, by exploiting app vulnerabilities to compromise personal info (or storing chats in plaintext). Aside from the software levers at its disposal, the model's online and app connections could open up avenues for malicious jailbreakers. Anthropic doesn't deny that there's risk in releasing the new 3.5 Sonnet. But the company argues that the benefits of observing how the model is used in the wild ultimately outweigh this risk. "We think it's far better to give access to computers to today's more limited, relatively safer models," the company wrote. "This means we can begin to observe and learn from any potential issues that arise at this lower level, building up computer use and safety mitigations gradually and simultaneously." Anthropic also says it has taken steps to deter misuse, like not training the new 3.5 Sonnet on users' screenshots and prompts, and preventing the model from accessing the web during training. The company says it developed classifiers to "nudge" 3.5 Sonnet away from actions perceived as high-risk, such as posting on social media, creating accounts and interacting with government websites. As the U.S. general election nears, Anthropic says it is focused on mitigating election-related abuse of its models. The U.S. AI Safety Institute and U.K. Safety Institute, two separate but allied government agencies dedicated to evaluating AI model risk, tested the new 3.5 Sonnet prior to its deployment. Anthropic told TechCrunch it has the ability to restrict access to additional websites and features "if necessary," to protect against spam, fraud, and misinformation, for example. As a safety precaution, the company retains any screenshots captured by Computer Use for at least 30 days -- a retention period that might alarm some devs. We've asked Anthropic under which circumstances, if any, it would hand over screenshots to a third party (e.g., law enforcement) if asked, and will update this post if we hear back. 
"There are no foolproof methods, and we will continuously evaluate and iterate on our safety measures to balance Claude's capabilities with responsible use," Anthropic said. "Those using the computer-use version of Claude should take the relevant precautions to minimize these kinds of risks, including isolating Claude from particularly sensitive data on their computer." Hopefully, that'll be enough to prevent the worst from occurring. A cheaper model Today's headliner might've been the upgraded 3.5 Sonnet model, but Anthropic also said an updated version of Haiku, the cheapest, most efficient model in its Claude series, is on the way. Claude 3.5 Haiku, due in the coming weeks, will match the performance of Claude 3 Opus, once Anthropic's state-of-the-art model, on certain benchmarks at the same cost and "approximate speed" of Claude 3 Haiku. "With fast speeds, improved instruction following, and more accurate tool use, Claude 3.5 Haiku is well suited for user-facing products, specialized sub-agent tasks, and generating personalized experiences from huge volumes of data like purchase history, pricing or inventory data," Anthropic wrote in a blog post. 3.5 Haiku will initially be available as a text-only model and later as part of a multimodal package that can analyze both text and images. So once 3.5 Haiku is available, will there be much reason to use 3 Opus? What about 3.5 Opus, 3 Opus' successor, which Anthropic teased back in June? "All of the models in the Claude 3 model family have their individual uses for customers," the Anthropic spokesperson said. "Claude 3.5 Opus is on our roadmap and we'll be sure to share more as soon as we can."
[39]
Anthropic just dropped Claude Haiku 3.5 and gave the chatbot a huge upgrade -- here's what's new
Anthropic, a leader in the AI space, today announced a major upgrade to its Claude models with an update to its already impressive Claude 3.5 Sonnet and a brand new offering, Claude 3.5 Haiku. In addition to these upgrades, Anthropic is rolling out an exciting new feature in beta, which allows AI models to interact with computers, essentially mimicking human interactions -- an advancement poised to redefine AI capabilities. The updated Claude 3.5 Sonnet builds on the capabilities of previous models, promising a boost to performance in both comprehension and content generation. Sonnet's refinement is sure to be an exciting improvement for the already user-friendly model. Additionally, Claude 3.5 Haiku brings a fresh approach, which Anthropic boasts is a more concise natural language model that's "three times faster" than its peers. The model is designed to optimize performance in applications where quick, short-form responses are crucial. I'm looking forward to trying what seems to be an AI model tailor-made for tasks that require brief yet effective answers such as microcontent creation. Anthropic published new benchmark results in its testing of both the updated Claude Sonnet 3.5 and the long-awaited Haiku 3.5. Sonnet now comfortably outperforms GPT-4o from OpenAI and Gemini 1.5 Pro from Google on graduate-level reasoning tasks, coding and visual analysis. Haiku, when compared to similar-sized models from Google and OpenAI -- Gemini Flash 1.5 and GPT-4o mini -- is better at coding, agent-like behavior and text reasoning. Anthropic wrote on X: "Haiku now outperforms many state-of-the-art models on coding tasks -- including the original Claude 3.5 Sonnet and GPT-4o -- at the same cost as before." But perhaps the most exciting bit of news from Anthropic today is its new beta feature: computer use capability. This is sure to turn some industry heads as the update enables Claude models to interact with computers in a manner that simulates human behavior. From viewing screens and moving a cursor to clicking and typing, the new AI opens a wealth of possibilities for how AI can assist. Essentially, the hope is that Claude can take the reins when it comes to performing everyday computer tasks. While still in beta, the computer use feature promises to make AI more versatile, especially in handling complex workflows such as filling out forms and navigating websites. Anthropic's updates position Claude to compete more aggressively with some of its biggest competitors. As these innovations roll out, the AI community will be watching closely to see how these models are used in real-world applications.
[40]
Anthropic Just Released Its Most Advanced Small AI Model Yet
Anthropic's family of AI models, named Claude, comes in three sizes: Opus (the largest and most expensive), Sonnet (medium in both size and cost) and Haiku (the smallest and least expensive). According to a blog post on Anthropic's website, Claude 3.5 Haiku is available for the same cost as Claude 3 Haiku, but manages to outperform the larger Claude 3 Opus across several notable benchmarks, including a test in which models were tasked with completing customer service assignments. The AI company says Claude 3.5 Haiku is "particularly strong on coding tasks," but is also "well suited for user-facing products, specialized sub-agent tasks, and generating personalized experiences from huge volumes of data -- like purchase history, pricing, or inventory records." The model will be made publicly available later this month. Anthropic also released an updated version of its mid-sized model, Claude 3.5 Sonnet. The model features improvements across the board, especially in agentic coding and tool-use tasks. (In this context, agentic coding means the model can functionally code on its own, and tool use means it can call upon external functions.) Pietro Schirano, founder of AI-powered imagery startup EverAI, wrote on X that Claude 3.5 Sonnet is "the best model for coding on the planet," adding that "it has been a life-changing experience to use it and make it a daily part of my workflow." Claude 3.5 Sonnet's most notable new feature is called "computer use." This new ability enables the model to "use computers the way people do -- by looking at a screen, moving a cursor, clicking buttons, and typing text." Anthropic cautions that computer use is still experimental, and is "at times cumbersome and error-prone."
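The parenthetical above notes that tool use means the model "can call upon external functions." To make that concrete, here is a minimal sketch using the anthropic Python SDK. The tool-definition shape follows Anthropic's documented API, while the get_inventory tool itself is a hypothetical example invented purely for illustration.

```python
# Minimal tool-use (function calling) sketch with the anthropic SDK.
# The tool definition format follows Anthropic's documented API; the
# `get_inventory` tool is a made-up example, not a real service.
import anthropic

client = anthropic.Anthropic()

tools = [
    {
        "name": "get_inventory",
        "description": "Look up current stock for a product SKU.",
        "input_schema": {
            "type": "object",
            "properties": {"sku": {"type": "string"}},
            "required": ["sku"],
        },
    }
]

response = client.messages.create(
    model="claude-3-5-sonnet-20241022",  # launch-era model ID; check the current list
    max_tokens=512,
    tools=tools,
    messages=[{"role": "user", "content": "How many units of SKU A-1042 are in stock?"}],
)

# If Claude decided to call the tool, the response contains a tool_use block
# whose .input holds the arguments your code should pass to the real function.
for block in response.content:
    if block.type == "tool_use":
        print(block.name, block.input)
```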
[41]
Anthropic's New Claude Updates and API: Transforming AI-Computer Interaction
Anthropic has unveiled two new AI models and a Computer Use API, marking a significant leap forward in technological capabilities. These developments are poised to transform how you interact with digital tools, offering substantial enhancements in both performance and efficiency. Anthropic's latest offering, the Sonnet 3.5 model, represents a major advancement in AI technology. This model brings notable improvements in coding and agentic tool use, setting new benchmarks in the field. Sonnet 3.5 demonstrates exceptional performance on the Software Engineering (SWE-bench) and Tool-Agent-User (TAU-bench) benchmarks, establishing itself as a frontrunner in AI capabilities. The upcoming Haiku 3.5 model is designed to complement Sonnet 3.5, focusing on speed and cost-effectiveness. Early indications suggest that Haiku 3.5 will outperform previous Sonnet versions in specific tasks, offering a more streamlined solution for certain applications. These models collectively represent a significant step forward in AI technology, providing you with sophisticated tools for tackling complex problems across various domains. The advancements in these models reflect Anthropic's commitment to pushing the boundaries of what's possible in AI. Sonnet 3.5 has demonstrated remarkable progress on SWE-bench and TAU-bench, showcasing its ability to handle intricate coding challenges and complex agentic tool applications. The model's performance in these areas indicates its potential to significantly enhance productivity in software development and technical problem-solving scenarios. While Haiku 3.5 is yet to be released, expectations are high for its performance. It is anticipated to offer even greater efficiency, particularly in tasks where speed and cost-effectiveness are paramount. These enhancements underscore Anthropic's dedication to continually improving AI model performance, making sure you have access to innovative technology that can adapt to evolving needs. Anthropic's new Computer Use API represents a significant breakthrough in AI-computer interaction. This innovative API enables Claude models to interact directly with computers, performing a wide range of tasks including web searches and data entry. This development opens up new possibilities for automation, potentially streamlining numerous operations across various industries. The API's capabilities extend beyond simple tasks, allowing for more complex interactions between AI models and computer systems. This could lead to more sophisticated automation processes, enhancing efficiency in data processing, information retrieval, and system management. However, it's crucial to note that the implementation of this API requires careful consideration and monitoring. To ensure safe and effective use, careful oversight is required. As you integrate these advanced AI tools into your workflows, prioritizing safety and security is paramount. Vigilant monitoring of API activities is essential to prevent unintended operations and ensure the system operates within defined parameters. A thoughtful, strategic approach to implementation will allow you to fully use these technologies while minimizing potential risks. This proactive stance is key to successfully incorporating these tools into your operations, maximizing their benefits while maintaining a secure environment.
The AI community eagerly anticipates the forthcoming release of Opus 3.5, which is expected to build upon the successes of Sonnet 3.5 and Haiku 3.5. This model is poised to further enhance AI capabilities, potentially offering even more advanced features and improved performance across various applications. As these advancements continue, you can look forward to ongoing improvements in AI-driven automation and performance. These developments are likely to pave the way for more sophisticated applications, opening up new possibilities in fields such as data analysis, predictive modeling, and automated decision-making. The rapid pace of innovation in AI technology suggests that we may see even more developments in the near future. Staying informed about these advancements and their potential applications will be crucial for organizations looking to maintain a competitive edge in an increasingly AI-driven landscape. Anthropic's latest AI models and Computer Use API signify a major technological advancement in the field of artificial intelligence. By adopting these innovations, you can enhance your coding capabilities, improve agentic tool use, and drive greater efficiency and effectiveness in your operations. As AI technology continues to evolve, it promises to reshape how we interact with digital systems, offering new solutions to complex problems and opening up exciting possibilities for the future of technology.
[42]
Now AI Can Operate Computers Like Humans: OpenAI Rival Backed By Google Unveils Claude Upgrades Amid Fears Of Job Displacement - Alphabet (NASDAQ:GOOG), Amazon.com (NASDAQ:AMZN)
OpenAI rival Anthropic has unveiled an upgrade to its AI model Claude 3.5 Sonnet along with introducing a new model, Claude 3.5 Haiku. The Claude 3.5 Sonnet model introduces a groundbreaking feature that allows it to interact with computers in a human-like manner. What Happened: While the computer interaction feature is still experimental, with some limitations in tasks like scrolling and dragging, Anthropic expects rapid improvements. This upgrade allows developers to automate processes and engage in open-ended tasks, with companies like Asana, Canva and Replit already exploring these capabilities. Anthropic said that this feature is currently in public beta on the API, enabling developers to instruct Claude to perform tasks such as moving a cursor and typing text. The Claude 3.5 Sonnet model is accessible to all users, and the Claude 3.5 Haiku is set for release later this month. Anthropic's API empowers Claude to translate instructions into computer commands, enhancing its ability to perform complex tasks. On OSWorld, Claude 3.5 Sonnet achieved a 14.9% score in the screenshot-only category, outperforming other AI systems. Anthropic is backed by Google parent Alphabet Inc. (NASDAQ: GOOGL, GOOG) and Jeff Bezos' Amazon.com Inc (NASDAQ: AMZN). Why It Matters: The release of Claude 3.5 Sonnet comes amid growing concerns about AI's impact on employment. Vinod Khosla, a Silicon Valley billionaire, has predicted that AI will replace most human jobs within 25 years, affecting roles from doctors to assembly line workers. This has sparked discussions about the need for Universal Basic Income (UBI) as a potential solution. However, Dario Amodei, CEO of Anthropic, has expressed concerns that UBI alone may not address the inequality AI could create. He advocates for a more comprehensive approach to ensure equitable wealth distribution, emphasizing the importance of a society where everyone can contribute economically. Anthropic was co-founded by Amodei in 2021. Disclaimer: This content was partially produced with the help of Benzinga Neuro and was reviewed and published by Benzinga editors.
[43]
Claude 3.5 Haiku: The Future of AI Coding
Anthropic has this week updated its Claude 3.5 Haiku AI model, refining its processing powers to offer a blend of speed, accuracy, and affordability that sets a new standard in the industry. It's clear that Claude 3.5 Haiku is more than just an upgrade; with its unparalleled efficiency in handling complex coding tasks and multilingual processing, this model is poised to transform how we approach AI-powered applications. And the best part? It's designed to be accessible to everyone, from individual developers to large enterprises, thanks to its cost-effective pricing strategy. So, if you've ever felt limited by the tools at your disposal, prepare to be inspired by the possibilities that 3.5 Haiku brings to the table. Anthropic's Claude 3.5 Haiku represents a notable advancement in the field of language model technology. As the latest addition to the 3.5 series, this model is engineered to be the fastest and most cost-effective option available, surpassing its predecessor, Claude 3 Haiku. Specifically designed to excel in coding tasks, Claude 3.5 Haiku sets new standards in the industry. For the same cost and similar speed as Claude 3 Haiku, Claude 3.5 Haiku improves across every skill set and surpasses Claude 3 Opus, the largest model in Anthropic's previous generation, on many intelligence benchmarks. Pricing for Claude 3.5 Haiku starts at $0.25 per million input tokens and $1.25 per million output tokens, with up to 90% cost savings with prompt caching and 50% cost savings with the Message Batches API. Claude 3.5 Haiku offers substantial improvements in both speed and accuracy, crucial factors for efficient coding processes. The model handles complex tasks with minimal latency, ensuring a highly responsive user experience. It has achieved outstanding results in various coding benchmarks, often outperforming other leading models in the field. The model's versatility extends beyond coding, making it a valuable tool for a wide range of global applications. Its ability to handle multilingual tasks efficiently further enhances its utility across diverse industries and regions. A standout feature of Claude 3.5 Haiku is its cost-effectiveness. Priced at $0.25 per million input tokens, it offers a financially viable solution for users ranging from individual developers to large enterprises, with additional savings available through prompt caching and the Message Batches API. This pricing strategy aims to provide widespread access to innovative AI capabilities, allowing a broader range of users to put advanced language model technology to work in their projects and workflows. Claude 3.5 Haiku is designed for easy integration into existing systems. Initially available as a text-only model, it can be accessed through platforms such as the Anthropic API, Amazon Bedrock, and Google Cloud's Vertex AI. This availability allows seamless API integration, letting users incorporate the model into their current workflows with minimal disruption. While currently focused on text processing, future updates may expand its capabilities to include image input support, further broadening its applications. The introduction of Claude 3.5 Haiku has the potential to significantly influence AI adoption across various sectors. Its combination of high performance and affordability makes it an appealing choice for a wide range of users and use cases. By lowering the cost barrier, Claude 3.5 Haiku encourages more businesses to integrate AI technologies into their operations.
This increased accessibility is likely to foster innovation and improve efficiency across multiple industries. Anthropic has demonstrated a strong commitment to the continuous improvement of Claude 3.5 Haiku, and users can anticipate ongoing refinements. This dedication to innovation underscores the model's potential to shape the future of AI applications. As Claude 3.5 Haiku evolves, it is poised to play an increasingly important role in advancing AI technology across various sectors, from software development to scientific research and beyond. The release of Claude 3.5 Haiku marks a significant step forward in making advanced AI capabilities more accessible and efficient. As it becomes widely available, this model is set to empower users with a powerful, cost-effective tool for tackling complex language processing and coding challenges.
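To put the pricing discussed above into concrete terms, here is a quick back-of-the-envelope estimate in Python. The per-token rates come from Anthropic's published figures quoted earlier; the workload volumes are invented for illustration, and the caching discount is applied in a simplified way (real prompt-caching pricing distinguishes cache writes from cache reads).

```python
# Back-of-the-envelope cost estimate for Claude 3.5 Haiku text processing.
# Rates are from Anthropic's published pricing quoted above; the workload
# figures below are hypothetical and exist only for illustration.
INPUT_RATE = 0.25 / 1_000_000    # dollars per input token
OUTPUT_RATE = 1.25 / 1_000_000   # dollars per output token

input_tokens = 20_000_000   # hypothetical monthly input volume
output_tokens = 4_000_000   # hypothetical monthly output volume

base_cost = input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE
print(f"Standard API: ${base_cost:.2f}")        # 20 * 0.25 + 4 * 1.25 = $10.00

# Message Batches API: Anthropic advertises a 50% discount on batched traffic.
print(f"Batched:      ${base_cost * 0.5:.2f}")  # $5.00

# Prompt caching: up to 90% off cached input tokens; assume 80% of the input
# tokens are cache hits in this simplified illustration.
cached = 0.8 * input_tokens
cache_cost = (cached * INPUT_RATE * 0.1
              + (input_tokens - cached) * INPUT_RATE
              + output_tokens * OUTPUT_RATE)
print(f"With caching: ${cache_cost:.2f}")       # $6.40
```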
Anthropic introduces a groundbreaking feature allowing its AI model, Claude, to control computers, potentially revolutionizing task automation and human-AI interaction.
Anthropic, a leading AI company, has introduced a revolutionary feature that allows its AI model, Claude, to control computers directly. This development marks a significant shift from AI that merely generates text to agents capable of performing actions, potentially transforming how we interact with technology 1.
Claude 3.5 Sonnet, the latest version of Anthropic's AI model, can now control a computer's cursor and input information via a virtual keyboard. The AI achieves this by taking screenshots of the screen, interpreting what is visible, working out how far the cursor needs to move, and issuing click and keystroke commands through developer-provided tools.
This capability allows Claude to perform a wide range of tasks, including web searches, document retrieval and modification, and interaction with various software programs 1.
Anthropic has showcased Claude's abilities through several demonstrations, including a demo in which Claude completes a vendor request form by pulling missing data from a CRM system; early partners such as Replit and Canva are also testing the model to verify apps as they are built and to assist with design workflows.
This advancement could lead to significant changes in how we work and interact with computers, with early applications expected in data entry, customer service, IT support, software testing, and other multi-step workflows that span several applications.
While promising, the technology is still in its early stages and faces several challenges: it struggles with simple actions like scrolling, dragging, and zooming; it can miss short-lived on-screen events; it completes well under half of the tasks in some agentic evaluations; and it remains vulnerable to prompt injection from content it encounters on screen.
Anthropic's development aligns with a broader industry trend towards creating more autonomous AI agents, with Microsoft, Salesforce, Google, and OpenAI all building agent offerings aimed at automating routine computer tasks.
As the technology evolves, it will be crucial to address safety concerns, build user trust, and find the right balance between AI assistance and human control in computing tasks.