Curated by THEOUTPOST
On Mon, 31 Mar, 4:05 PM UTC
18 Sources
[1]
Amazon unveils Nova Act, an AI agent that can control a web browser | TechCrunch
Amazon on Monday unveiled Nova Act, a general-purpose AI agent that can take control of a web browser and independently perform some simple actions. Alongside the new agentic AI model, Amazon is releasing the Nova Act SDK, a toolkit that allows developers to build agent prototypes with Nova Act. Nova Act, developed by Amazon's recently opened San Francisco-based AGI lab, will also power key features of the company's upcoming Alexa+ upgrade, a generative AI-enhanced version of Amazon's popular voice assistant. The version of Nova Act available starting today is a little less polished, however. Amazon is calling it a research preview. Developers can access the Nova Act toolkit on a new website, nova.amazon.com, which also serves as a showcase for Amazon's various Nova foundation models. Nova Act is Amazon's attempt to take on OpenAI's Operator and Anthropic's Computer Use with general-purpose AI agent technology of its own. Several leading tech companies believe AI agents that can navigate the web for users will make today's AI chatbots significantly more useful. Amazon may not be the first to develop this sort of agentic technology, but via Alexa+, it may have the widest reach. Amazon says developers building with the Nova Act SDK should be able to automate basic actions on behalf of users, such as ordering salads from Sweetgreen or making dinner reservations. With the Nova Act toolkit, developers can pull together tools that allow an AI agent to navigate web pages, fill out forms, or pick dates on a calendar. Amazon claims that Nova Act outperforms agents from OpenAI and Anthropic on several of the company's internal tests. For example, on ScreenSpot Web Text, which measures how an AI agent interacts with text on a screen, Nova Act scored 94%, outperforming OpenAI's CUA (which scored 88%) and Anthropic's Claude 3.7 Sonnet (90%). However, Amazon didn't benchmark Nova Act using more common agent evaluations, such as WebVoyager. 
Nova Act is the first public product to emerge from Amazon's aforementioned AGI lab, an initiative co-led by former OpenAI researchers David Luan and Pieter Abbeel. Both previously founded startups of their own -- Luan started Adept, while Abbeel cofounded Covariant -- before Amazon hired them away last year to spearhead its AI agent efforts. While it may seem strange for an AGI lab to be building AI agents that can order SweetGreen, Luan told TechCrunch that he sees agents as a key step toward creating superintelligent AI systems. Luan defines AGI as "an AI system that can help you do anything a human does on a computer." Luan says his team designed the Nova Act SDK to reliably automate short, simple tasks, and give developers tools to precisely define when they want a human to intervene in an agentic workflow. He hopes it will allow developers to create more reliable agentic applications, albeit not necessarily fully autonomous ones. Amazon is releasing its first generalist AI agent in a crowded space, but it's a crucial technology that the company has a lot riding on. Early tests of Nova Act could provide a glimpse into some of the capabilities of the long-delayed Alexa+, a make-or-break moment for Amazon's AI efforts. A major problem with early AI agents from OpenAI, Google, and Anthropic is their reliability across different domains. In TechCrunch's tests, the systems are slow, struggle to operate independently for very long, and are prone to mistakes a human wouldn't make. It won't be long until we see whether Amazon has cracked the code -- or whether its agents suffer from the same flaws plaguing competitors.
[2]
Amazon's AGI Lab Reveals Its First Work: Advanced AI Agents
Led by a former OpenAI executive, Amazon's AI lab focuses on the decision-making capabilities of the next generation of software agents -- and borrows insights from physical robots. Amazon is still seen as a bit of a laggard in the race to develop advanced artificial intelligence, but it has quietly created a lab that is now setting records when it comes to AI performance. Amazon's AGI SF Lab, which is located in San Francisco and dedicated to building artificial general intelligence, or AI that surpasses the capabilities of humans, revealed the first fruits of its work today: A new AI model capable of powering some of the most advanced AI agents available anywhere. The new model, called Amazon Nova Act, outperforms ones from OpenAI and Anthropic on several benchmarks designed to gauge the intelligence and aptitude of AI agents, Amazon says. On the benchmarks GroundUI Web and ScreenSpot, Amazon Nova Act performs better than Claude 3.7 Sonnet and OpenAI Computer Use Agent. A major part of Amazon's plan to compete in the AI market is to focus on building agents, and the new model's abilities reflect its efforts to build a generation of tools that can measure up to the very best available. "I believe that the basic atomic unit of computing in the future is going to be a call to a giant [AI] agent," says David Luan, who leads Amazon's AGI SF Lab. He was previously a vice president of engineering at OpenAI and later cofounded Adept, a startup that pioneered work on AI agents, before joining Amazon in 2024 when the ecommerce giant took a stake in the company. Most of the leading AI labs are now focused on building increasingly capable AI agents. Getting AI to master independent actions, as well as conversation, promises to make the technology more useful and valuable. The shift from chat to action is still very much a work in progress, however.
In the past six months, OpenAI, Anthropic, Google, and others have demonstrated web-browsing agents that take actions in response to a prompt. But for the most part, these agents are still unreliable, and they can easily be tripped up by open-ended requests. Luan says that Amazon's goal is building AI agents that are dependable rather than flashy. The thing holding agents back is not the need for "more cool demos of interesting capabilities that work 60 percent of the time, it's the Waymo problem," he says, referring to how self-driving cars needed to be trained to deal with unusual edge cases before they could take to the streets unsupervised. Many so-called agents are built by combining large language models with multiple human-written rules that are designed to prevent them from veering off course but that also make their behavior brittle. Amazon Nova Act is a version of the company's most powerful homegrown model Amazon Nova that has received additional training to help it make decisions about what actions to take and at what time. In general, Luan says, AI models struggle to decide when they should intervene in a task. To improve Nova's agential abilities, Amazon is using reinforcement learning, a method that has helped other AI models better simulate reasoning.
[3]
Amazon's new AI agent is designed to do your shopping
Amazon has launched a new AI model, Nova Act, which is designed to perform tasks -- like online shopping -- in your browser. For now it's only available to developers in a "research preview," but Amazon is also expanding access to its other Nova AI models through a web portal, making them easier to find and use. Nova Act is a new model whose function sounds similar to OpenAI's Operator agent. Amazon says Act can carry out web searches, make purchases, or answer questions about what's on the screen, including the option to perform tasks on a schedule. It can also recognize more detailed instructions -- such as telling it "don't accept the insurance upsell" while making a purchase. Act is currently only accessible to developers, but Amazon says it's already being put to use carrying out some online tasks in the upgraded Alexa Plus assistant.
[4]
Amazon's Nova AI agent launch puts it up against rivals OpenAI, Anthropic
Amazon on Monday released a new AI model that can take actions in a web browser on a user's behalf, a move that puts it in more direct competition with OpenAI, Anthropic and other companies that have developed so-called "agents." The new model, called Nova Act, is designed to help developers build agents, or AI software that can complete multi-step tasks for users without supervision. Amazon showed Nova Act searching for "apartments by biking distance to the train station" as one example of a task it can complete. A growing number of companies are building AI agents as they look beyond text and image generators. Anthropic, the Amazon-backed AI startup founded by ex-OpenAI research executives, released its Computer Use tool in October. The startup said the tool can interpret what's on a computer screen, select buttons, enter text, navigate websites and execute tasks through any software and real-time internet browsing.
[5]
'Nova Act' moves Amazon further into the AI agent race
Amazon on Monday announced Nova Act, a new AI model and developer toolkit for creating agents that can perform tasks autonomously in web browsers -- a move that positions the tech giant more squarely in the emerging competition to commercialize AI agents. The company is also releasing a new website to give developers and everyday users the ability to experiment with its Nova foundation models, introduced in December. The Nova Act research preview -- the first release from Amazon's AGI Lab in San Francisco -- arrives as rivals such as OpenAI, Salesforce, Microsoft, Google, and Amazon partner Anthropic, along with a growing wave of startups, race to develop autonomous systems that move beyond AI chat and can complete real tasks on users' behalf. "We really think agents are the last missing piece on the path to general intelligence," said David Luan, Amazon's vice president of AGI Autonomy, in an interview with GeekWire. A former OpenAI VP and Google Research director, Luan joined Amazon last year along with others from Adept, a San Francisco startup that built agents to automate enterprise workflows. Luan leads the Amazon lab with Pieter Abbeel, an Amazon scholar in robotics. Nova Act is designed to help developers build agents that can complete step-by-step tasks in the browser -- such as submitting time-off requests or placing recurring online orders -- without relying on APIs. Amazon says the model is also engineered to handle interface elements that often trip up other systems, such as drop-down menus, date pickers, and pop-up dialogs. The toolkit, available as a Python software package, lets developers create agents that can follow natural-language instructions to complete tasks in a web browser. These agents can work quietly in the background, and developers can run many of them at once to handle larger workloads. They can also operate without showing anything on screen -- a behind-the-scenes mode designed for more advanced or automated business use. 
The company says its early internal tests show improved reliability over existing systems, but the AGI Lab will be watching closely to see how the system performs, and how it's used. "This is really meant to be a way to engage developers," Luan said, describing Nova Act as an early-stage research preview. "We want to see what developers are going to do with it." It's an extension of the company's Nova AI initiative, which includes foundation models of varying sizes, focusing on different media and forms of input, including text, images and video. Amazon's Nova models are part of a broader effort by the company to close the gap with rivals in the new era of generative AI. The Nova models function somewhat like a store brand -- developed in-house to integrate tightly with its own services, at a lower price. At the same time, Amazon offers a wide variety of third-party models through its Bedrock service. The company has invested heavily in external options, committing up to $8 billion to Anthropic, the AI startup behind the Claude chatbot.
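The article above notes that developers can run many agents at once to handle larger workloads. The following is a minimal sketch of that pattern only, not the Nova Act SDK: `run_agent` is a hypothetical stand-in that echoes its instruction instead of driving a browser, and `run_many` is an illustrative helper name.

```python
from concurrent.futures import ThreadPoolExecutor


def run_agent(instruction: str) -> str:
    """Hypothetical stand-in for one browser-agent session.

    A real agent would drive a (possibly headless) browser here; this stub
    only echoes the instruction so the example runs without any SDK.
    """
    return f"done: {instruction}"


def run_many(instructions: list[str], max_workers: int = 4) -> list[str]:
    """Run one stub agent per instruction concurrently.

    ThreadPoolExecutor.map returns results in the order of the inputs,
    regardless of which worker finishes first.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_agent, instructions))


# Two example tasks taken from the article's description.
results = run_many(["submit a time-off request", "place a recurring online order"])
for line in results:
    print(line)
```

Threads suit this kind of workload because each session would spend most of its time waiting on page loads rather than computing.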
[6]
Amazon launches AI agent which can take control of a web browser
Amazon has launched a new AI agent designed to control a web browser and perform simple browsing tasks autonomously. Currently available in research preview, Amazon says Nova Act is part of its efforts to deliver "real-world value" for all types of customers, including shoppers, sellers and advertisers - not just enterprises. US-based customers with an Amazon account can start using the platform immediately, even though it's not a full release, by visiting nova.amazon.com, where it can automate simple tasks like ordering food, making reservations, filling out forms or selecting calendar dates. Nova Act is accompanied by Nova Act SDK, which enables developers to build their own browser-based AI agents using the same technology. "nova.amazon.com puts the power of Amazon's frontier intelligence into the hands of every developer and tech enthusiast, making it easier than ever to explore the capabilities of Amazon Nova," said Rohit Prasad, SVP of Amazon Artificial General Intelligence. Three separate models have been launched to support the tool and SDK with text generation capabilities - Nova Micro, Lite, and Pro - as well as a fourth for image generation (Nova Canvas) and a fifth for video generation (Nova Reel). The company noted agents are still in their "early stage[s]" and that their usefulness will only grow as developments continue to roll out. Moreover, Amazon's internal tests demonstrate how Nova Act outperformed key rivals, including Anthropic's Claude 3.7 Sonnet (90%) and OpenAI's CUA (88%) on the ScreenSpot Web Text benchmark, designed to test how the models interact with text on screen. Nova Act scored an impressive 94%. It's also possible that Nova Act could play a key role in the company's upcoming Alexa+ upgrade, which improves the utility of the company's popular voice assistant with AI. We tried to access the site from the UK, but were greeted with the following message: "Thank you for your interest in Amazon Nova. 
At this time, we are only accepting customers in the US." It's unclear when Amazon will roll the online service out more broadly.
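For readers who want the reported ScreenSpot Web Text gap as numbers, here is a small sketch that computes Nova Act's lead in percentage points from the three scores quoted above. The scores come from Amazon's internal tests as reported; the helper name `margins_over` is illustrative.

```python
# ScreenSpot Web Text scores (percent) as reported from Amazon's internal tests.
scores = {"Nova Act": 94, "Claude 3.7 Sonnet": 90, "OpenAI CUA": 88}


def margins_over(baseline_name: str, all_scores: dict[str, int]) -> dict[str, int]:
    """Return the percentage-point lead of `baseline_name` over every other entry."""
    base = all_scores[baseline_name]
    return {
        name: base - score
        for name, score in all_scores.items()
        if name != baseline_name
    }


leads = margins_over("Nova Act", scores)
for rival, lead in leads.items():
    print(f"Nova Act leads {rival} by {lead} percentage points")
```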
[7]
What you need to know about Amazon Nova Act: the new AI agent SDK challenging OpenAI, Microsoft, Salesforce
For a while, it seemed like Amazon was playing catchup in the race to offer its users -- particularly the millions of developers building atop Amazon Web Services (AWS)'s cloud infrastructure -- compelling first-party AI models and tools. But in late 2024, it debuted its own internal foundation model family, Amazon Nova, with text, image and even video generation capabilities, and last month saw a new Amazon Alexa voice assistant powered in part by Anthropic's Claude family of models. Then, on Monday, the e-commerce and cloud giant's artificial general intelligence division, Amazon AGI, announced the release of Amazon Nova Act, an experimental developer kit for building AI agents that can navigate the web and complete tasks autonomously, powered by a custom, proprietary version of Amazon's Nova large language model (LLM). Oh, and the software development kit (SDK) is open source under a permissive Apache 2.0 license, though the SDK is designed to work only with Amazon's in-house custom Nova model, not any third-party ones. The goal is to enable third-party developers to build AI agents capable of reliably performing tasks within web browsers. But how does Amazon's Nova Act stack up to other agent building platforms out there on the market, such as Microsoft's AutoGen, Salesforce's Agentforce, and of course, OpenAI's recently released open source Agents SDK?
A different, more thoughtful approach to AI agents
Since the public rise of large language models (LLMs), most "agent" systems have been limited to responding in natural language or providing information by querying knowledge bases. Nova Act is part of the larger industry shift toward action-based agents -- systems that can complete actual tasks across digital environments on behalf of the user.
OpenAI's new Responses API, which gives users access to its autonomous browser navigator, is one leading example of this, which developers can integrate into AI agents through the OpenAI Agents SDK. Amazon AGI emphasizes that current agent systems, while promising, struggle with reliability and often require human supervision, especially when handling multi-step or complex workflows. Nova Act is specifically designed to address these limitations by providing a set of atomic, prescriptive commands that can be chained together into reliable workflows. Deniz Birlikci, a Member of Technical Staff at Amazon, described the broader vision in a video introducing Nova Act: soon, there will be more AI agents than people browsing the web, carrying out tasks on behalf of users. David Luan, VP of Amazon's Autonomy Team and Head of AGI SF Lab, framed the mission more directly in a recent video call interview with VentureBeat: "We've created this new experimental AI model that is trained to perform actions in a web browser. Fundamentally, we think that agents are the building block of computing," he said. Luan, formerly a co-founder and CEO of Adept AI, joined Amazon in 2024 as part of an acqui-hire. Luan said he has long been a proponent of AI agents. "With Adept, we were the first company to really start working on AI agents. At this point, everybody knows how important agents are. It was pretty cool to be a bit ahead of our time," he added.
What Nova Act offers devs
The Nova Act SDK provides developers with a framework for constructing web-based automation agents using natural language prompts broken down into clear, manageable steps. Unlike typical LLM-powered agents that attempt entire workflows from a single prompt -- often resulting in unreliable behavior -- Nova Act is designed to incrementally execute smaller, verifiable tasks. Some of the key features of Nova Act include:
Luan emphasized that Nova Act is a tool for developers rather than a general-purpose chatbot. "Nova Act is built for developers. It's not a chatbot you talk to for fun. It's designed to let developers start building useful products," he said. For example, one of the sample workflows demonstrated in Amazon's documentation shows how Nova Act can automate apartment searches by scraping rental listings and calculating biking distance to train stations, then sorting the results in a structured table. Another showcased example uses Nova Act to order a specific salad from Sweetgreen every Tuesday, entirely hands-free and on a schedule, illustrating how developers can automate repeatable digital tasks in a way that feels reliable and customizable.
Benchmark performance and a focus on reliability
A central message in Amazon's announcement is that reliability, not just intelligence, is the key barrier to widespread agent adoption. Current state-of-the-art models are actually quite brittle at powering AI agents, with agents typically achieving 30% to 60% success rates on browser-based multi-step tasks, according to Amazon. Nova Act, however, emphasizes a building-block approach, scoring over 90% on internal evaluations of tasks that challenge other models -- such as interacting with dropdowns, date pickers, or pop-ups. Luan underscored why that reliability focus matters. "What we've really focused on is how do you actually make agents reliable? If you ask it to update a record in Salesforce and it deletes your database one out of ten times, you're probably never going to use it again," he said. Amazon AGI benchmarked Nova Act against competing models including Anthropic's Claude 3.7 Sonnet and OpenAI's CUA model. On the ScreenSpot Web Text benchmark, which tests instruction-following on textual screen elements, Nova Act achieved a score of 0.939, outperforming Claude 3.7 Sonnet (0.900) and OpenAI CUA (0.883). On the ScreenSpot Web Icon benchmark, which focuses on visual UI elements, Nova Act scored 0.879, again ahead of the other models.
However, on the GroundUI Web benchmark, which tests general UI interaction, Nova Act scored 0.805, slightly behind its competitors. These scores were measured internally by Amazon using consistent prompts and evaluation criteria. Amazon also highlighted early results in Nova Act's ability to generalize beyond standard environments. For instance, team member Rick Liu demonstrated how the agent, without explicit training, successfully interacted with a pigeon-themed web game -- assigning stats, battling opponents, and progressing in the game. According to Luan, that ability to generalize is central to the long-term vision. "Our goal with Nova Act is to be a universal browser-use solution. We want an agent that can do anything you want to do on a computer for you," he said.
Flexible for use in different clouds, but locked to Amazon's Nova model
While Nova Act is accessible to developers globally through nova.amazon.com, Luan clarified that the system is tightly coupled to Amazon's in-house Nova foundation models. Developers cannot plug in external LLMs such as OpenAI's GPT-4o or Anthropic's Claude 3.7 Sonnet, unlike with OpenAI's Agents SDK, and to a lesser extent, Microsoft's AutoGen and Salesforce's Agentforce platforms (which allow switching to a few different provider companies and model families). "Nova Act is a custom trained version of the Nova model," he said. "It's not just a scaffolding over a generic LLM. It's natively trained to act on the internet on your behalf." However, Nova Act is not restricted to AWS environments. Developers can download the SDK and run it locally, in the cloud, or wherever they choose. "You don't need to be on AWS to use it," Luan stated. Thus, for businesses looking for maximum underlying model flexibility for their agents, Nova Act is probably not the best choice.
However, for those looking for a purpose-built model specifically designed to navigate the web and perform actions across a wide variety of websites with very different user interfaces (UIs), it's probably worth a look -- especially if you're already in the Amazon or AWS developer ecosystem.
Security, licensing and pricing
The Nova Act SDK is released under the Apache License, Version 2.0 (January 2004), an open source license. However, this applies only to the SDK software. The Nova Act model itself, along with its weights and training data, is proprietary and remains closed-source. The approach is intentional, according to Luan, who explained that the model is tightly integrated and co-trained with the SDK to achieve reliability. At launch, Nova Act is offered as a free research preview. There is no announced pricing for production use yet. Luan described this phase as an opportunity for developers to experiment and build with the technology. "Our belief is that the majority of the most useful agent products have not yet been built. We want to enable anybody to build a really useful agent, whether for themselves or as a product," he said. Longer term, Amazon plans to introduce production-grade terms, including usage-based billing and scaling guarantees, but those are not yet available.
What's next for Nova Act?
The release of Nova Act reflects Amazon's broader ambition to make action-oriented AI agents a foundational component of computing. Luan summed up the opportunity ahead: "My personal dream is that agents become the building block of computing, and the coolest new startups and products get built on top of what our team is developing."
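The "atomic commands chained into workflows" idea described in this article can be illustrated with a short sketch. This is an assumption-laden illustration of the pattern, not the Nova Act SDK: `Step`, `run_workflow` and `fake_act` are hypothetical names, and `fake_act` stands in for a model call by returning canned results.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Step:
    instruction: str               # one small natural-language command
    verify: Callable[[str], bool]  # check applied to that step's result


def run_workflow(
    steps: list[Step], act: Callable[[str], str]
) -> tuple[list[str], Optional[str]]:
    """Run steps in order and stop at the first one whose check fails.

    Returns the instructions that completed, plus the instruction that
    failed (or None if every step passed), so a caller can hand the
    failed step to a human instead of letting the agent continue.
    """
    completed: list[str] = []
    for step in steps:
        result = act(step.instruction)
        if not step.verify(result):
            return completed, step.instruction
        completed.append(step.instruction)
    return completed, None


def fake_act(instruction: str) -> str:
    """Hypothetical stand-in for an agent call; it fails on 'checkout'."""
    return "error" if "checkout" in instruction else "ok"


def is_ok(result: str) -> bool:
    return result == "ok"


steps = [
    Step("search for a salad", is_ok),
    Step("add it to the cart", is_ok),
    Step("checkout", is_ok),
]
done, failed = run_workflow(steps, fake_act)
print("completed:", done)
print("needs human review:", failed)
```

Keeping each step small makes the point of failure explicit, which is the property the article attributes to the building-block approach.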
[8]
Amazon's AI agent will make it even easier for you to part with your money
The next big thing in the field of artificial intelligence is Agentic AI, which is essentially an AI tool that can automate certain multi-step processes for users. For example, interacting with a web browser for tasks like booking tickets or ordering groceries. Amazon certainly sees a future in there. After giving a massive overhaul to Alexa and introducing a new Alexa+ assistant, the company has today announced a new AI agent called Nova Act. Amazon says Nova Act is designed to "complete tasks in a web browser." Amazon won't be the first to reach this milestone, as a few other AI companies have already attempted this vision. OpenAI's Operator is designed to handle the repetitive and mundane parts of web browsing on users' behalf, such as filling out forms or placing online orders. The folks behind Opera browser are also building an Operator that can pull off something similar using natural language commands given by users. In Amazon's case, the Nova Act AI agent has been built atop Nova, the company's own portfolio of foundation AI models that were announced a few months ago. It can break down multi-step workflows into smaller inputs, such as performing a search, adding items, checking out, or answering questions based on the on-screen activity. Based on internal tests, Amazon's agentic AI tool is touted to perform better than rival products from OpenAI and Anthropic. The company says it is eying an accuracy level of over 90%, especially at browser-based tasks where AI tools often fail, such as pop-ups, drop-down menu interactions, and dates. Nova Act is currently in the research preview phase and has been offered to developers. "Our dream is for agents to perform wide-ranging, complex, multi-step tasks like organizing a wedding or handling complex IT tasks to increase business productivity," says the company. Interestingly, the foundations of Nova Act have already been integrated within the Alexa+ framework.
It allows the virtual assistant to browse the web on behalf of users and get chores done. That's a massive bonus for Amazon when it comes to reaching an audience for the Nova Act tool. Access to Alexa+ is currently limited to Prime subscribers or those willing to pay a monthly fee for the next-gen assistant. Unlike the pricey agentic Operator from OpenAI, Amazon has a much bigger audience sitting with the right hardware at home to experiment with Alexa+ and the underlying Nova Act capabilities.
[9]
Amazon wants its new AI agent to do stuff on the web for you
In a demo video, Amazon showed Nova Act searching for apartments on a website and sorting them by biking distance to a train station. The company released the tool as a research preview, and said U.S.-based developers can start using it to build so-called AI agents -- or AI software that attempts to complete tasks autonomously -- starting today. Amazon called its Nova Act software development kit "a crucial step forward toward building reliable agents by enabling developers to break down complex workflows into atomic commands (e.g., search, checkout, answer questions about the screen)." The company first debuted its Nova foundation models in December during its re:Invent web services event for developers. Amazon is the latest tech giant to launch early-stage agentic -- or independent, task-performing -- AI tools; however, this kind of software isn't quite ready to do much without supervision today. In January, OpenAI launched a research preview of its AI agent called Operator, which it also said can perform tasks on the web on behalf of users. Operator uses its own browser, and can interact with a webpage by typing, clicking, and scrolling, OpenAI said. Users can have Operator do tasks such as completing online forms and grocery shopping, according to the company. OpenAI launched another AI agent in February called Deep Research, which can "synthesize large amounts of online information" with its reasoning capabilities, as well as complete multi-step research. Google also offers an identically named research tool. On Amazon's fourth-quarter earnings call in February, Amazon chief executive Andy Jassy said the company spent $26.3 billion in capital expenditures in the fourth quarter -- a "vast majority" of which was on AI for Amazon Web Services. Jassy added that the spending "is reasonably representative of what you could expect in annualized capex rate in 2025," which would be around $100 billion.
[10]
Amazon's Nova AI agent launch puts it up against rivals OpenAI, Anthropic
Amazon on Monday released a new AI model that can take actions in a web browser on a user's behalf, a move that puts it in more direct competition with OpenAI, Anthropic and other companies that have developed so-called "agents." The new model, called Nova Act, is designed to help developers build agents, or AI software that can complete multi-step tasks for users without supervision. Amazon showed Nova Act searching for "apartments by biking distance to the train station" as one example of a task it can complete. A growing number of companies are building AI agents as they look beyond text and image generators. Anthropic, the Amazon-backed AI startup founded by ex-OpenAI research executives, released its Computer Use tool in October. The startup said the tool can interpret what's on a computer screen, select buttons, enter text, navigate websites and execute tasks through any software and real-time internet browsing. In January, OpenAI released a similar feature called Operator that will automate tasks such as planning vacations, filling out forms, making restaurant reservations and ordering groceries. The Microsoft-backed startup described Operator as "an agent that can go to the web to perform tasks for you." OpenAI followed up that release in February with another tool called Deep Research, which allows an AI agent to compile complex research reports and analyze questions and topics of the user's choice. Google launched a similar tool of the same name last December, which acts as a "research assistant, exploring complex topics and compiling reports on your behalf." Nova Act is initially launching in research preview for developers, Amazon said. The company is also launching a website that lets users experiment with its Nova AI models. The release is part of a broader strategy within Amazon to invest heavily in generative AI software.
Amazon has introduced a flurry of AI products, including its own set of Nova models, Trainium chips, shopping and health assistants, as well as a marketplace for third-party models called Bedrock. It's also overhauling Alexa, the digital assistant it launched more than a decade ago, with AI capabilities. Earlier this month, Amazon's cloud unit said it's forming a group dedicated to developing agentic AI that's being led by longtime Amazon Web Services executive Swami Sivasubramanian. It's also created an internal team focused on building artificial general intelligence, or AGI, which broadly refers to AI that is as smart or smarter than humans. The team reports directly to Amazon CEO Andy Jassy.
[11]
Amazon Unveils Nova Act, an AI Agent to Control Web Browsers
Amazon launched the Nova Act SDK on Monday, an AI agent trained to perform tasks within a web browser. The Nova Act SDK is built to automate workflows by breaking down complex tasks into smaller commands, such as searching, completing checkouts, and answering questions based on on-screen content. It also allows developers to provide detailed instructions and integrate API calls to enhance reliability. US-based customers with an Amazon account can access nova.amazon.com to explore the Nova models, generate text and images, and test the Nova Act SDK for building browser-based agents. "Nova.amazon.com puts the power of Amazon's frontier intelligence into the hands of every developer and tech enthusiast, making it easier than ever to explore the capabilities of Amazon Nova," said Rohit Prasad, SVP of Amazon Artificial General Intelligence. "We've created this experience to inspire builders, so that they can quickly test their ideas with Nova models, and then implement them at scale in Amazon Bedrock." Nova Act is the first product to come out of Amazon's AGI lab, which was formed in December last year. Moreover, Amazon first introduced its Nova foundation models at re:Invent in December 2024. These models, including Nova Micro, Lite, and Pro, generate text from different modalities, while Nova Canvas and Nova Reel create high-quality images and videos. Nova models are integrated with Amazon Bedrock, offering customers scalable AI applications. Amazon describes agents as systems capable of performing tasks in digital and physical environments on behalf of users. The Nova Act SDK marks a step toward improving agent reliability by allowing developers to refine workflow commands. "It is an exciting step forward for rapid exploration with AI, including bleeding-edge capabilities such as the Nova Act SDK for building agents that take actions on the web. We're excited to see what they build and to hear their useful feedback," Prasad added. 
Amazon's Nova Act will directly compete with OpenAI's agent, Operator, and Anthropic's Computer Use. OpenAI's Operator can perform web-based tasks independently. It can fill out forms, order products, book flights, and make reservations by interacting with a browser as a human would. Similarly, Anthropic's Computer Use feature allows its Claude model to control software on a PC. This agent can perform on-screen tasks such as moving the cursor, clicking buttons, and typing text, mimicking human-computer interactions.
[12]
Amazon introduces Nova Act, an AI agent that can use a web browser - SiliconANGLE
The new AI agent is a research preview built by Amazon's newly opened Amazon AGI San Francisco Lab, which was behind the release of the Amazon Nova foundation models in December. Amazon Nova launched with three text-generating models (Micro, Lite and Pro) capable of summarizing text, answering questions and understanding context. The company also released two models capable of producing images and generating videos from text and image inputs named Canvas and Reel respectively. The company said it was also expanding access to Amazon Nova by rolling out a new website, nova.amazon.com, where developers and enthusiasts can explore the foundation models. "[We've put] the power of Amazon's frontier intelligence into the hands of every developer and tech enthusiast, making it easier than ever to explore the capabilities of Amazon Nova," said Rohit Prasad, senior vice president of Amazon artificial general intelligence. Nova Act is capable of completing rudimentary tasks in a web browser on behalf of a user, such as clicking buttons and entering text into fields. Accompanying the release of the AI agent, Amazon also expanded access to a Nova Act software development kit, or SDK, that will allow developers to build agents that can break down complex commands into a series of actions that can be completed to reach a goal using a mapping such as "Find me the easiest way from my house to visit these three stores and then take in a movie at around 6 pm." Amazon said that it is looking to teach its AI agents to "have the same intuitions about UI elements" that humans do. That means interacting with web pages the same way that people do and being able to understand icons, forms, web elements and everything to participate similarly to another person when asking a question or proposing a task, such as the one above.
Amazon's move comes at a time when other large companies, such as Google LLC, OpenAI and Anthropic PBC, have been building their own increasingly powerful agentic AI solutions. Anthropic unveiled an experimental version of its AI model Claude in October that could use computer interfaces, including web browsers, and Google revealed it was testing a browser control capability for its Gemini flagship AI model in December. "We've created this experience to inspire builders so they can quickly test their ideas with Nova models, and then implement them at scale in Amazon Bedrock," added Prasad.
[13]
Amazon unveils Nova Act, an AI agent that can shop for you
Amazon on Monday launched its latest AI model, designed to take over a user's web browser and perform simple tasks. The move places the e-commerce giant in more direct competition with artificial intelligence companies like OpenAI and Anthropic, which are also developing AI "agents." The model, called Nova Act, is currently available as a "research preview" for developers, meaning it's not yet open to the general public. It can complete tasks such as browsing the web and making purchases without supervision. For instance, the company demonstrated Nova Act searching for apartments within biking distance of a specific train station. It can also handle more nuanced instructions like "don't accept the insurance upsell." "We think of agents as systems that can complete tasks and act in a range of digital and physical environments on behalf of the user," Amazon wrote in a blog post on Monday. These types of agents are still in their early stages, but tech companies are placing big bets that agentic AI represents the next major frontier. OpenAI recently released "Operator," a tool that automates web-based tasks, along with Deep Research, which it says can gather information from across the web and summarize it into digestible reports. Anthropic, the creator of Claude, and Google have also introduced AI agents.
[14]
Amazon Now Has an AI Agent That Can Perform Tasks in a Web Browser
Amazon Nova foundation models were first unveiled in December 2024
Amazon introduced an artificial intelligence (AI) agent on Monday that can perform various browser-based tasks. Dubbed the Amazon Nova Act, the AI agent is powered by the tech giant's foundation Nova family of AI models. These are first-party models the company claims will offer "frontier intelligence and industry leading price performance." Nova Act is currently available as a software development kit (SDK) and, alongside the AI models, is available to users in the US. Notably, the AI agent is currently available as a research preview. In a newsroom post, the Seattle-based e-commerce giant announced the expansion of the Nova AI models and the release of its AI agent. Amazon's Nova family of AI models was first unveiled in December 2024 in three variants: Micro, Lite, and Pro. These are the company's native large language models (LLMs) that are claimed to feature advanced intelligence and competitive pricing. Earlier, these were only available on an invite basis to businesses. Developers and AI enthusiasts can explore these models on a dedicated website. Currently, the Amazon Nova models are only available in the US. The new website also features Amazon's agentic model, dubbed the Nova Act. Aimed at developers, it is available as an SDK and can be used to build agentic capabilities in applications and software. The agent is powered by an early version of a customised Nova model. Notably, the Amazon Nova Act is currently available to users in the US as a research preview. Amazon says the AI agent can autonomously perform a wide range of tasks on a web browser. These tasks could be filling out a form, shopping online, or booking a reservation in a restaurant. The company did not share the full range of tasks Nova Act can perform. The company defined AI agents as "systems that can complete tasks and act in a range of digital and physical environments on behalf of the user."
"Nova.amazon.com [makes] it easier than ever to explore the capabilities of Amazon Nova. We've created this experience to inspire builders, so that they can quickly test their ideas with Nova models, and then implement them at scale in Amazon Bedrock," said Rohit Prasad, SVP of Amazon Artificial General Intelligence.
[15]
Amazon launches AI agent that can shop online
Now available as a research preview, the Nova Act SDK lets developers build agents that can do everything from submitting out-of-office requests, to playing video games, to apartment hunting. Amazon says that, while some use cases are well-suited for today's technology, multi-step agents prompted with high-level goals still require constant human hovering and supervision. Nova Act SDK enables developers to break down complex workflows into reliable atomic commands. In one example, Amazon technical staff member Vishal Vora runs through how an agent is set up to visit the Sweetgreen restaurant site every Tuesday night, add the salad he likes to a cart, choose to get it delivered to his address, add a tip and check out and pay. Vora says that if you have to "babysit" an AI, it's not really automation. Once the Nova Act agent's workflow is set up, users can switch on "headless mode," meaning that they never have to take any actions. "Our dream is for agents to perform wide-ranging, complex, multi-step tasks like organizing a wedding or handling complex IT tasks to increase business productivity," says Amazon.
[16]
Amazon introduces Nova Act AI agent for automating web tasks
Designed to autonomously handle online shopping, reservations, and form-filling within web browsers. In a move to bolster its position in the artificial intelligence (AI) sector, Amazon has unveiled Nova Act, a new AI model designed to perform complex tasks with advanced reasoning and problem-solving skills. The announcement marks Amazon's entry into the competitive field of agentic AI, aiming to challenge existing players like OpenAI and Google. According to CNBC, Nova Act is designed to carry out tasks within web browsers, including online shopping, filling out forms, and making reservations, effectively acting on behalf of users in digital environments. This model is part of Amazon's Nova family of generative AI models, which the company plans to launch by June 2025. Nova Act is the first AI model developed at Amazon's newly established Artificial General Intelligence (AGI) research lab. The upgraded Amazon voice assistant, Alexa Plus, will incorporate the Nova Act model, bringing generative AI capabilities and the ability to browse the web on behalf of users. This integration aims to enhance user interaction by allowing the assistant to perform tasks autonomously. Amazon AGI Labs, assembled in San Francisco, is working on models that aim to compete with OpenAI and Anthropic in the field of generative AI, according to The Independent. The lab is dedicated to developing artificial general intelligence, aiming to create a theoretical AI system with capabilities surpassing humans. "We think of agents as systems that can complete tasks and act in a range of digital and physical environments on behalf of the user. Today, such agents are still in an early stage," Amazon wrote in a blog post, as reported by The Washington Times. Nova Act builds on Amazon's Nova foundation models introduced in December, which include text-based models Micro, Lite, and Pro. 
Previously, Amazon released image and video generation AI models -- Nova Canvas and Nova Reel -- as part of the Nova product line. Amazon claimed that the Nova Act model surpassed the agents of OpenAI and Anthropic in a test measuring the ability to recognize and interact with text displayed on a screen, scoring 94% on the ScreenSpot Web Text benchmark. The Nova Act Software Development Kit (SDK) allows developers to create AI agents capable of automating tasks such as filling out forms, navigating web pages, and managing workflows, similar to OpenAI's Operator, according to testingcatalog.com. Access to the Nova Act SDK is currently limited to US-based developers with an Amazon account. "The idea is that people can quickly try ideas on the Nova website and then scale them in Bedrock," said Rohit Prasad, Senior Vice President of Amazon Artificial General Intelligence. Amazon is trying to differentiate its model from others through lower cost -- up to 75% cheaper than rivals -- and better performance, prioritizing external benchmark performance as well as cost efficiency. The article was written with the assistance of a news analysis system.
[17]
Amazon Unveils AI Agent That Can Shop and Place Orders | PYMNTS.com
Amazon launched an AI agent on Monday (March 31) that can use a web browser to do things like shop for products and services on behalf of users. It can even place the order for you. Called Nova Act, the new artificial intelligence (AI) model can navigate, interact with and act on web content without constant human oversight. Basically, it can mimic what human users do when they use a web browser. For example, it can search for products, add items to a cart, check out and pay for the order. It can also track specific products across eCommerce sites to compare prices and monitor for price discounts. It can do things like update a consumer's billing information across retail websites as well. Non-retail uses include submitting an out-of-office request, putting on a calendar hold and filing expense reports, among other tasks. "We think of agents as systems that can complete tasks and act in a range of digital and physical environments on behalf of the user," the company said in a blog post. Amazon's competitors have also introduced browser-using agents in recent months: Anthropic, OpenAI and Google. Anthropic was the first to introduce such an agent, dubbed "computer use." Introduced last October, the agent can look at a screen, move a cursor, click buttons and type text. Google followed with "Project Mariner" last December and OpenAI came next with "Operator" in January. But Amazon said Nova Act differs from web-using agents of its competitors. It said AI agents that can perform multistep tasks for a complex goal -- like planning a wedding -- still need "constant" human supervision. Amazon said Nova Act can be more autonomous. The answer lies in how it's structured. Nova Act lets developers break down complicated workflows into a series of single acts -- such as do a search or check out.
Developers can also add conditions to singular tasks, such as telling the agent not to accept the retailer's upsell when it searches for a product. By structuring Nova Act this way, Amazon said it is creating reliable building blocks for AI agents. "If you have to babysit an automation, it's not really an automation," Vishal Vora, technical staff member at Amazon, said in a video. "That's why reliability is the core of everything that we have built." For example, Vora demonstrated how he used Nova Act to order the same salad for dinner every Tuesday night. The AI agent will do this every Tuesday to get Vora's salad delivered. It did not have to be trained on the steps beforehand. Nova Act is part of Amazon's family of foundation models called Nova. Its software development kit (SDK) is accessible through nova.amazon.com for developers to use for experimentation and building. "It is an exciting step forward for rapid exploration with AI, including bleeding-edge capabilities ... for building agents that take actions on the web," Rohit Prasad, senior vice president of Amazon artificial general intelligence (AGI), said in a blog post. Amazon is doubling down on AGI. In February, it created a new artificial general intelligence research unit called Amazon AGI SF Lab. The San Francisco-based group aims to build "practical AI that can actually do things for us and make our customers more productive, empowered and fulfilled."
[18]
Amazon's Nova Act AI Agent Debuts in US: Here's What It Can Do
Amazon has launched Nova Act, an AI agent that can execute browser-based tasks independently. Built on Amazon's own Nova foundation models, the AI agent is now available as a research preview in the United States. Amazon has also opened up access to its Nova AI models, which were earlier only available on an invite basis. Offered as a software development kit (SDK), Nova Act allows developers to incorporate agentic AI capabilities into applications. Amazon defines AI agents as systems that can act on their own to accomplish tasks in digital and physical environments. The AI agent is capable of undertaking a range of browser-based tasks such as completing forms, purchasing products, and booking reservations. However, Amazon still hasn't revealed its complete capabilities.
Amazon introduces Nova Act, an AI agent capable of controlling web browsers and performing autonomous tasks, positioning the company in direct competition with OpenAI and Anthropic in the AI agent race.
Amazon has unveiled Nova Act, a general-purpose AI agent designed to control web browsers and perform autonomous tasks. This development marks Amazon's entry into the competitive field of AI agents, positioning the company alongside industry leaders like OpenAI and Anthropic 1.
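Nova Act is capable of navigating web pages, filling out forms, and executing simple actions on behalf of users. Potential applications cited in the coverage include ordering salads from Sweetgreen, making dinner reservations, and picking dates on a calendar.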
The AI agent can also recognize and follow detailed instructions, such as avoiding upsells during purchases 3.
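Amazon claims that Nova Act outperforms competitors on several internal tests. On ScreenSpot Web Text, which measures how an AI agent interacts with text on a screen, Nova Act scored 94%, compared with 88% for OpenAI's CUA and 90% for Anthropic's Claude 3.7 Sonnet 1.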
However, it's worth noting that Amazon did not benchmark Nova Act using more common agent evaluations like WebVoyager 1.
Nova Act is the first public product from Amazon's San Francisco-based AGI lab, led by former OpenAI researchers David Luan and Pieter Abbeel 1. The team's focus is on building dependable AI agents rather than flashy demonstrations 2.
Luan sees agents as a crucial step towards creating superintelligent AI systems, defining AGI as "an AI system that can help you do anything a human does on a computer" 1.
Amazon is releasing the Nova Act SDK, allowing developers to build agent prototypes and integrate the technology into their applications 1. The toolkit is available through a new website, nova.amazon.com, which also showcases Amazon's various Nova foundation models 3.
Nova Act is set to power key features in Amazon's upcoming Alexa+ upgrade, a generative AI-enhanced version of the popular voice assistant 1. This integration could potentially give Amazon a significant advantage in terms of reach and user adoption.
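While Amazon's entry into the AI agent space is promising, the technology still faces challenges: by Amazon's own account, agents remain at an early stage, and multi-step agents prompted with high-level goals still require constant human supervision.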
As the AI agent race intensifies, Amazon will need to demonstrate that Nova Act can overcome these hurdles and compete effectively with offerings from OpenAI, Anthropic, Google, and other players in the market 4.
Amazon launches its Nova family of AI models, offering text, image, and video generation capabilities. The move positions Amazon as a strong competitor in the enterprise AI market, challenging Microsoft, Google, and OpenAI.
29 Sources
Amazon introduces Nova, a family of AI foundation models, aiming to compete with OpenAI and Google in generative AI capabilities while emphasizing responsible AI practices and cost-efficiency.
2 Sources
Amazon introduces Alexa+, an advanced AI assistant with agentic capabilities, aiming to revolutionize consumer interaction with AI and solidify its position in the competitive AI market.
2 Sources
Amazon establishes a new research and development laboratory in San Francisco, focusing on creating AI agents capable of performing actions in both digital and physical environments. The lab, led by former Adept co-founder David Luan, aims to advance artificial general intelligence (AGI) capabilities.
4 Sources
Amazon unveils plans for AI-powered shopping guides and explores the development of autonomous AI shopping agents, signaling a significant shift in e-commerce and raising questions about the future of online shopping.
3 Sources
The Outpost is a comprehensive collection of curated artificial intelligence software tools that cater to the needs of small business owners, bloggers, artists, musicians, entrepreneurs, marketers, writers, and researchers.
© 2025 TheOutpost.AI All rights reserved