The Outpost is a comprehensive collection of curated artificial intelligence software tools that cater to the needs of small business owners, bloggers, artists, musicians, entrepreneurs, marketers, writers, and researchers.
© 2025 TheOutpost.AI All rights reserved
Curated by THEOUTPOST
On Fri, 14 Feb, 12:08 AM UTC
2 Sources
[1]
AI can use your computer now. Should it?
Adam Clark Estes is a senior technology correspondent at Vox. He's spent 15 years covering the intersection of technology, culture, and politics at places like Gizmodo, Vice, and the Atlantic. The first time I heard about AI agents, I thought they could monitor your computer use, anticipate your needs, and manipulate your behavior accordingly. This wasn't entirely off base. There is a dystopic future about what AI technology could enable that experts issue regular warnings about. There's also the present reality of agentic AI, which is here and clumsier than you would have guessed. Last month, OpenAI released something called Operator. It's what experts would call an AI agent, meaning a version of AI technology that can not only recall information and generate content, like ChatGPT, but can also actually do things. In the case of Operator, the AI can use a web browser to do anything from buying your groceries to updating your LinkedIn profile. At least in theory. Operator is also currently a "research preview" that's only available to ChatGPT Pro users, who pay $200 a month for the privilege. The reality is that, in its current form, Operator is not great at doing things. I've spent a week using it and, if I'm being honest, am happy to report that Operator is slow, makes mistakes, and constantly asks for help. Far from the frightening digital Ãœbermensch I once feared, what appears to be the state-of-the-art for a consumer-grade AI agent is impressive yet unintimidating. If you ask it to find you a road bike in your size that's on sale and nearby, it can do it. Give it the right amount of context and constraints, and Operator truly works. But if I put in the time myself, I could still find a better bike. "I'm very optimistic about using AI as sort of a dumb assistant, in that I don't want it to make decisions for me," Aditi Raghunathan, an assistant professor of computer science at Carnegie Mellon University. "I don't trust it to do things better than me." The basic concept of an AI agent is simultaneously alluring and horrific. Who wouldn't want an AI to handle mundane computer chores? But if the AI can use a computer to do boring things, you have to imagine it can do scary things, too. For now, for people like you and me, scary things include buying expensive eggs or briefly screwing up your presence on the world's largest network for professionals. For the economy as a whole, well, it depends on how much we trust AI and how much freedom we give it to operate unchecked. Global leaders gathered for the Paris AI Action Summit this week to discuss the future of the technology. Past summits in Bletchley Park, famous for its code-breaking computer used in World War II, and Seoul focused on AI safety, including the kinds of regulations governments should adopt in order to keep AI in check. But this meeting seemed to highlight a growing sense of competition between global powers, namely the US and China, to win the AI arms race. JD Vance was in attendance and said, "The AI future is not going to be won by hand-wringing about safety." So now I'm feeling a little nervous. While OpenAI's entry into the AI agent space currently feels like a parlor trick, I have to wonder what the industry's endgame is here. AI could usher in a friendly future of digital assistants who make our lives easier without any negative consequences. Or it could finally realize the paperclip scenario, in which we give AI free rein to solve one problem, like making paperclips, and it diverts all global resources toward that problem, destroying humanity in the process. The future will almost certainly be something in between the best- and worst-case scenarios. In any case, plenty of experts say fully autonomous agents should never be invented. I have to say, if the AI agents of the future are as clumsy as Operator is right now, I'm not too worried. Whether you like it or not, the next wave of AI technology will involve computers using computers. It's already happening. In the big agriculture industry, for example, farmers are already handing over the keys to their John Deere tractors to AI-powered software that can work through the night. Others, like the global development nonprofit Digital Green, are giving farmers in developing countries access to Operator so that it can lower costs and improve crop yields. "A farmer can take a picture of a crop, and they can determine the crop is not doing well because of a bug, or it can check the weather to see if it's weather-related," said Kevin Barenblat, co-founder and president of Fast Forward, a tech nonprofit accelerator that supports Digital Green. "Giving the agent more flexibility to figure out what the problem is really helpful for people when they're trying to solve problems." Another arresting example of AI agents in action is also a pretty boring one, which tells you something about how this technology can be most useful. Rekki, a startup in London, recently told Bloomberg that it sells access to AI agents that are trained to help restaurants and their suppliers streamline inventory management. A restaurant, for instance, could give the chatbot a long list of ingredients it uses and make sure everything is ordered on time. It works well enough that some companies are cutting staff and paying for the software instead. Enter AI-curious consumers, like me, with problems to solve. If you pay the $200 a month for access, you can gain access to a user-friendly version of Operator that looks and acts a lot like ChatGPT. While it currently works as a separate app on ChatGPT's website, OpenAI ultimately plans to integrate Operator into ChatGPT for a seamless experience. Interacting with Operator is already a lot like using ChatGPT: You get Operator to do tasks by typing prompts into a familiar-looking empty box. Then things get interesting. Operator opens up a tiny browser window and starts doing the task. You can watch it try and fail in real-time. A couple of things Operator successfully did for me: It bought me a new vacuum, and it initiated an exchange for a mattress I bought online. In both cases, however, I essentially did the heavy lifting. Operator can't currently log into websites on your behalf, solve CAPTCHAs, or enter credit card information. So when I was purchasing the vacuum, Operator got as far as finding the product listing, but I pretty much did everything after that. In the customer service example, Operator found the right form, but I filled it out and then the whole transaction moved over to email, where Operator had no jurisdiction. These seemingly innocuous tasks are exactly the kind of thing that OpenAI wants Operator to do right now. It actually serves up suggestions under that prompt box for things like making restaurant reservations, booking plane tickets, and ordering an Uber. If you consider you're not actually handing over your credit card to the AI, getting Operator to do your shopping sounds like a good idea. It will compare prices for you and that part requires little supervision. In one instance, Operator even flagged a potentially fraudulent website selling a Dyson vacuum for $50. But you can also imagine a future in which fraudsters know the AI's weaknesses and exploit them. In its current form, Operator amounts to a painfully slow way to use Google -- or rather Bing, thanks to OpenAI's partnership with Microsoft. It can do tasks for you while you're doing something else, but like ChatGPT before it, you always have to check Operator's work. I asked it to find me the cheapest flights for a weekend visit to my mom's house in Tennessee, and it returned a two-week-long itinerary that cost double what I'd expect to pay. When I explained the error, Operator did it again but worse. Operator is, in many ways, a mirage. It looks like a proof-of-concept that AI can not just generate text and images but actually perform tasks autonomously, making your life effortless in the process. But the more you ask the agent to do, the more agency it requires. This is a big conundrum for the future of AI development. When you put guardrails on tools -- not letting Operator go wild with your credit card, for instance -- you constrain its utility. If you give it more power to make decisions and operate independently, it may be more useful but also more dangerous. Which brings us back to the paperclip problem. First popularized by philosopher Nick Bostrom in 2003, the paperclip scenario imagines giving a superintelligent AI the task of manufacturing paperclips, and the freedom to do so unchecked. It doesn't end well for humans, which is a stark reminder that responsible AI development is not just about preventing an AI from using your credit card without permission. The stakes are much higher. "One of the most high-risk scenarios would be AI agents deployed to accelerate biological weapons development," said Sarah Kreps, director of the Tech Policy Institute at Cornell University. "A committed, nefarious actor could already develop bioweapons, but AI lowers the barriers and removes the need for technical expertise." This sort of thing is what global leaders were discussing in Paris this week. The consensus from the AI Summit, however, was not encouraging, if you care about the future of the human race. Vice President Vance called for "unparalleled R&D investments" into AI and called for "international regulatory regimes that fosters the creation of AI technology rather than strangles it." This reflects the same anti-guardrail principles that were in the executive order President Trump signed in January revoking President Joe Biden's plan for safe and responsible AI development. For the Trump administration, at least, the goal for AI development seems to be growth and dominance at all costs. But it's not clear that the companies developing this technology, including OpenAI, feel the same way. Many of the limitations I found in Operator, for instance, were imposed by its creators. The AI agent's slow-moving, second-guessing nature made it less useful -- but also more approachable and safe. Operator is very clearly an experiment. It's telling that OpenAI rolled it out for ChatGPT Pro subscribers, who are clearly enthusiastic enough and bullish enough about AI that they're willing to spend a four-figure sum annually to access the latest features. Based on their feedback, OpenAI will undoubtedly release a tweaked and improved version and then iterate again. In a couple of years, when the kinks are worked out, maybe we'll know how scared we should be about a future powered by AI agents. A version of this story was also published in the Vox Technology newsletter. Sign up here so you don't miss the next one!
[2]
Are You Ready to Let an AI Agent Use Your Computer?
An OpenAI engineer demoed the company's new computer-use agent, Operator, in a video on launch day. Two years after the generative AI boom really began with the launch of ChatGPT, it no longer seems that exciting to have a phenomenally helpful AI assistant hanging around in your web browser or phone, just waiting for you to ask it questions. The next big push in AI is for AI agents that can take action on your behalf. But while agentic AI has already arrived for power users like coders, everyday consumers don't yet have these kinds of AI assistants. That will soon change. Anthropic, Google DeepMind, and OpenAI have all recently unveiled experimental models that can use computers the way people do -- searching the web for information, filling out forms, and clicking buttons. With a little guidance from the human user, they can do thinks like order groceries, call an Uber, hunt for the best price for a product, or find a flight for your next vacation. And while these early models have limited abilities and aren't yet widely available, they show the direction that AI is going. "This is just the AI clicking around," said OpenAI CEO Sam Altman in a demo video as he watched the OpenAI agent, called Operator, navigate to OpenTable, look up a San Francisco restaurant, and check for a table for two at 7pm. Zachary Lipton, an associate professor of machine learning at Carnegie Mellon University, notes that AI agents are already being embedded in specialized software for different types of enterprise customers such as salespeople, doctors, and lawyers. But until now, we haven't seen AI agents that can "do routine stuff on your laptop," he says. "What's intriguing here is the possibility of people starting to hand over the keys." Anthropic was the first to unveil this new functionality, with an announcement in October that its Claude chatbot can now "use computers the way humans do." The company stressed that it was giving the models this capability as a public beta test, and that it's only available to developers who are building tools and products on top of Anthropic's large language models. Claude navigates by viewing screenshots of what the user sees and counting the pixels required to move the cursor to a certain spot for a click. A spokesperson for Anthropic says that Claude can do this work on any computer and within any desktop application. Next out of the gate was Google DeepMind with its Project Mariner, built on top of Google's Gemini 2 language model. The company showed Mariner off in December but called it an "early research prototype" and said it's only making the tool available to "trusted testers" for now. As another precaution, Mariner currently only operates within the Chrome browser, and only within an active tab, meaning that it won't run in the background while you work on other tasks. While this requirement seems to somewhat defeat the purpose of having a time-saving AI helper, it's likely just a temporary condition for this early stage of development. Finally, in January OpenAI launched its computer-use agent (CUA), called Operator. OpenAI called it a "research preview" and made it available only to users who pay US $200 per month for OpenAI's premium service, though the company said it's working toward broader release. Yash Kumar, an engineer on the Operator team, says the tool can work with essentially any website. "We're starting with the browser because this is where the majority of work happens," Kumar says. But he notes that "the CUA model is also trained to use a computer, so it's possible we could expand it" to work with other desktop apps. Like the others, Operator relies on chain-of-thought reasoning to take instructions and break them down into a series of tasks that it can complete. If it needs more information to complete a task -- like, for example, if you prefer to buy red or yellow onions -- it will pause and ask for input. It also asks for confirmation before taking a final step, like booking the restaurant table or putting in the grocery order. Here are some things that computer-use agents can't yet do: log in to sites, agree to terms of service, solve captchas, and enter credit card or other payment details. If an agent comes up against one of these roadblocks, it hands the steering wheel back to the human user. OpenAI notes that Operator doesn't take screenshots of the browser while the user is entering login or payment information. The three companies have all noted that putting an AI in charge of your computer could pose safety risks. Anthropic has specifically raised the concern of prompt injection attacks, or ways in which malicious actors can add something to the user's prompt to make the model take an unexpected action. "Since Claude can interpret screenshots from computers connected to the internet, it's possible that it may be exposed to content that includes prompt injection attacks," Anthropic wrote in a blog post. CMU's Lipton says that the companies haven't revealed much information about the computer-use agents and how they work, so it's hard to assess the risks. "If someone is getting your computer operator to do something nefarious, does that mean they already have access to your computer?" he wonders, and if so, why wouldn't the miscreant just take action directly? Still, Lipton says, with all the actions we take and purchases we make online, "It doesn't require a wild leap of imagination to imagine actions that would leave the user in a pickle." For example, he says, "Who will be the first person who wakes up and says, 'My [agent] bought me a fleet of cars?'" While none of the companies have revealed a timeline for making their computer-use agents broadly available, it seems likely that consumers will begin to get access to them this year -- either through the big AI companies or through startups creating cheaper knockoffs. OpenAI's Kumar says it's an exciting time, and that Operator marks a step toward a more collaborative future for humans and AI. "It's a stepping stone on our path to AGI," he says, referring to the long-promised dream/nightmare of artificial general intelligence. "The ability to use the same interfaces and tools that humans interact with on a daily basis broadens the utility of AI, helping people save time on everyday tasks." If you remember the prescient 2013 movie Her, it seems like we're edging toward the world that existed at the beginning of the film, before the sultry-voiced Samantha began speaking into the protagonist's ear. It's a world in which everyone has a boring and neutral AI to help them read and respond to messages and take care of other mundane tasks. Once the AI companies solidly achieve that goal, they'll no doubt start working on Samantha.
Share
Share
Copy Link
AI agents capable of using computers like humans are emerging, promising to revolutionize how we interact with technology. While still in early stages, these tools raise questions about efficiency, safety, and the future of human-computer interaction.
AI agents, a new frontier in artificial intelligence, are emerging as tools capable of using computers the way humans do. Unlike traditional AI models that simply recall information and generate content, these agents can perform actions on a computer, from browsing the web to completing tasks 1. Major players in the AI industry, including OpenAI, Anthropic, and Google DeepMind, have recently unveiled experimental models demonstrating this capability 2.
OpenAI's entry into this space is Operator, currently available as a "research preview" to ChatGPT Pro users. Operator can use a web browser to perform tasks such as online shopping or updating social media profiles. While impressive in concept, early experiences with Operator reveal limitations: it's slow, prone to mistakes, and frequently requires human intervention 1.
These AI agents can perform a range of tasks, from searching the web for information to filling out forms and clicking buttons. With guidance, they can order groceries, book rides, compare product prices, or find flights. However, they currently have limitations, including an inability to log in to sites, agree to terms of service, solve CAPTCHAs, or enter payment details 2.
Anthropic's Claude and Google DeepMind's Project Mariner are other notable entries in this field. Claude navigates by viewing screenshots and counting pixels to move the cursor, while Project Mariner operates within the Chrome browser 2. These developments indicate a growing trend towards more interactive and capable AI assistants.
The potential applications of AI agents are vast. In agriculture, AI-powered software is already being used to operate tractors and assist farmers in developing countries 1. In the restaurant industry, AI agents are helping streamline inventory management, potentially reducing staff requirements 1.
As AI agents gain the ability to interact with computers, concerns about safety and ethical use arise. Experts warn about the potential for misuse, including prompt injection attacks and unauthorized actions. The companies developing these technologies acknowledge these risks and are implementing safeguards 2.
The development of AI agents is taking place against a backdrop of global competition in AI technology. The recent Paris AI Action Summit highlighted the growing sense of rivalry between global powers, particularly the US and China, in the race to dominate AI development 1.
While current AI agents like Operator may seem clumsy and limited, they represent a significant step towards more autonomous AI systems. The future of this technology lies somewhere between the extremes of friendly digital assistants and potentially harmful autonomous systems 1. As these tools become more refined and widely available, they are likely to reshape how we interact with computers and perform everyday tasks.
Experts like Aditi Raghunathan from Carnegie Mellon University express cautious optimism, viewing AI as a "dumb assistant" rather than a decision-maker 1. Zachary Lipton, also from Carnegie Mellon, notes the intriguing possibility of people "handing over the keys" to AI for routine computer tasks 2.
As AI agents continue to evolve, they promise to bring both exciting possibilities and new challenges to the world of human-computer interaction. The coming years will likely see rapid advancements in this technology, potentially transforming how we work, shop, and interact with digital systems.
Reference
[2]
IEEE Spectrum: Technology, Engineering, and Science News
|Are You Ready to Let an AI Agent Use Your Computer?OpenAI's new AI agent, Operator, shows potential in automating online tasks but faces challenges in reliability and user experience.
7 Sources
7 Sources
OpenAI launches Operator, an AI agent capable of performing web-based tasks autonomously, sparking discussions about its implications for AGI and potential risks.
70 Sources
70 Sources
A comprehensive look at the AI landscape in 2024, highlighting key developments, challenges, and future trends in the rapidly evolving field.
8 Sources
8 Sources
As ChatGPT turns two, the AI landscape is rapidly evolving with new models, business strategies, and ethical considerations shaping the future of artificial intelligence.
6 Sources
6 Sources
OpenAI's new AI agent, Operator, shows potential in automating web-based tasks but falls short of full autonomy, requiring significant user intervention and facing challenges in reliability and efficiency.
3 Sources
3 Sources