Curated by THEOUTPOST
On Wed, 4 Dec, 12:07 AM UTC
7 Sources
[1]
AWS' new service tackles AI hallucinations | TechCrunch
Amazon Web Services (AWS), Amazon's cloud computing division, is launching a new tool to combat hallucinations -- that is, scenarios where an AI model behaves unreliably. Announced at AWS' re:Invent 2024 conference in Las Vegas, the service, Automated Reasoning checks, validates a model's responses by cross-referencing customer-supplied info for accuracy. AWS claims in a press release that Automated Reasoning checks is the "first" and "only" safeguard for hallucinations. But that's, well... putting it generously.

Automated Reasoning checks is nearly identical to the Correction feature Microsoft rolled out this summer, which also flags AI-generated text that might be factually wrong. Google also offers a tool in Vertex AI, its AI development platform, to let customers "ground" models using data from third-party providers, their own data sets, or Google Search.

In any case, Automated Reasoning checks, which is available through AWS' Bedrock model hosting service (specifically the Guardrails tool), attempts to figure out how a model arrived at an answer -- and discern whether the answer is correct. Customers upload info to establish a ground truth of sorts, and Automated Reasoning checks creates rules that can then be refined and applied to a model.

As a model generates responses, Automated Reasoning checks verifies them and, in the event of a probable hallucination, draws on the ground truth for the right answer. It presents this answer alongside the likely mistruth so customers can see how far off-base the model might have been.

AWS says PwC is already using Automated Reasoning checks to design AI assistants for its clients. And Swami Sivasubramanian, VP of AI and Data at AWS, suggested that this type of tooling is exactly what's attracting customers to Bedrock.
"With the launch of these new capabilities," he said in a statement, "we are innovating on behalf of customers to solve some of the top challenges that the entire industry is facing when moving generative AI applications to production." Bedrock's customer base grew 4.7x in the last year to tens of thousands of customers, Sivasubramanian added.

But as one expert told me this summer, trying to eliminate hallucinations from generative AI is like trying to eliminate hydrogen from water. AI models hallucinate because they don't actually "know" anything. They're statistical systems that identify patterns in a series of data and predict which data comes next based on previously seen examples. It follows that a model's responses aren't answers, then, but predictions of how questions should be answered -- within a margin of error.

AWS claims that Automated Reasoning checks uses "logically accurate" and "verifiable reasoning" to arrive at its conclusions. But the company volunteered no data showing that the tool is itself reliable.

In other Bedrock news, AWS this morning announced Model Distillation, a tool to transfer the capabilities of a large model (e.g. Llama 405B) to a small model (e.g. Llama 8B) that's cheaper and faster to run. An answer to Microsoft's Distillation in Azure AI Foundry, Model Distillation provides a way to experiment with various models without breaking the bank, AWS says.

"After the customer provides sample prompts, Amazon Bedrock will do all the work to generate responses and fine-tune the smaller model," AWS explained in a blog post, "and it can even create more sample data, if needed, to complete the distillation process."

But there are a few caveats. Model Distillation only works with Bedrock-hosted models from Anthropic and Meta at present. Customers have to select a large and small model from the same model "family" -- the models can't be from different providers. And distilled models will lose some accuracy -- "less than 2%," AWS claims.
If none of that deters you, Model Distillation is now available in preview, along with Automated Reasoning checks.

Also available in preview is "multi-agent collaboration," a new Bedrock feature that lets customers assign AI to subtasks in a larger project. Part of Bedrock Agents, AWS' contribution to the AI agent craze, multi-agent collaboration provides tools to create and tune AI for tasks like reviewing financial records and assessing global trends. Customers can even designate a "supervisor agent" to break up and route tasks to the AIs automatically. The supervisor can "[give] specific agents access to the information they need to complete their work," AWS says, and "[determine] what actions can be processed in parallel and which need details from other tasks before [an] agent can move forward."

"Once all of the specialized [AIs] complete their inputs, the supervisor agent [can pull] the information together [and] synthesize the results," AWS wrote in the post.

Sounds nifty. But as with all these features, we'll have to see how well it works when deployed in the real world.
[2]
AWS Launches Automated Reasoning Checks to Combat AI Hallucinations
Users can add information and the service creates rules for the AI model

Amazon Web Services (AWS) launched a new service at its ongoing re:Invent conference that will help enterprises reduce instances of artificial intelligence (AI) hallucination. Launched on Monday, the Automated Reasoning checks tool is available in preview and can be found within Amazon Bedrock Guardrails. The company claimed that the tool mathematically validates the accuracy of responses generated by large language models (LLMs) and prevents factual errors arising from hallucinations. It is similar to the Grounding with Google Search feature, which is available in both the Gemini API and Google AI Studio.

AI models can often generate responses that are incorrect, misleading, or fictional. This is known as AI hallucination, and the issue undermines the credibility of AI models, especially in the enterprise space. While companies can somewhat mitigate the issue by training the AI system on high-quality organisational data, flaws in the pre-training data and architecture can still make the AI hallucinate.

AWS detailed its solution to AI hallucination in a blog post. The Automated Reasoning checks tool has been introduced as a new safeguard, added in preview within Amazon Bedrock Guardrails. Amazon explained that it uses "mathematical, logic-based algorithmic verification and reasoning processes" to verify the information generated by LLMs.

The process is straightforward. Users upload relevant documents that describe the rules of the organisation to the Amazon Bedrock console. Bedrock automatically analyses these documents and creates an initial Automated Reasoning policy, which converts the natural language text into a mathematical format. Once done, users can move to the Automated Reasoning menu under the Safeguards section. There, a new policy can be created, and users can add existing documents that contain the information the AI should learn.
Users can also manually set processing parameters and the policy's intent. Additionally, sample questions and answers can be added to help the AI understand a typical interaction. Once all of this is done, the AI is ready to be deployed, and the Automated Reasoning checks tool will automatically verify the chatbot's responses and flag any that are incorrect. Currently, the tool is available in preview only in the US West (Oregon) AWS region. The company plans to roll it out to other regions soon.
[3]
AWS has a new tool that wants to stop AI hallucinations for good
Customers will be presented with AI responses to see how accurate they are

With businesses increasingly using AI tools for their key processes and tasks, hallucinations are proving to be a growing challenge. To address this, Amazon Web Services (AWS) has announced a new tool to tackle hallucinations. Revealed at its AWS re:Invent 2024 event, the new Automated Reasoning checks system looks to cut down on potentially damaging errors caused by hallucinations, which could see businesses face security risks or financial losses.

At its simplest level, a hallucination is when an AI system or service behaves incorrectly or becomes unreliable, often due to errors in the data it has been trained on. Described by the company as "the first and only generative AI safeguard that helps prevent factual errors due to model hallucinations", AWS' Automated Reasoning checks looks to solve this by cross-checking the responses generated by a model against information provided by the customer. If it can't determine that the answer matches up exactly, the response gets sent back to the model for checking.

Available as part of Amazon Bedrock Guardrails, the company's system for keeping AI models accurate and reliable, the new checks will also attempt to see how the model came up with its answer and, if it deems it erroneous, will compare it to the customer's information. It will then present its answer alongside the initial response from the model, meaning customers can see the possible gap between the truth and the response and tweak their model accordingly. AWS gave the example of a healthcare provider using the tool to make sure customer enquiries about specific policies are given accurate answers.

"Over time, as generative AI transforms more companies and customer experiences, inference will become a core part of every application," said Dr. Swami Sivasubramanian, vice president of AI and Data at AWS.
"With the launch of these new capabilities, we are innovating on behalf of customers to solve some of the top challenges, like hallucinations and cost, that the entire industry is facing when moving generative AI applications to production."
[4]
AWS Bedrock upgrades to add model teaching, hallucination detector
AWS announced more updates for Bedrock aimed at spotting hallucinations and building smaller models faster, as enterprises want more customization and accuracy from models. During re:Invent 2024, AWS announced Amazon Bedrock Model Distillation and Automated Reasoning Checks in preview for enterprise customers interested in training smaller models and catching hallucinations.

Amazon Bedrock Model Distillation will let users use a larger AI model to train a smaller model, offering enterprises access to a model they feel would work best with their workload. Larger models, such as Llama 3.1 405B, have more knowledge but are slow and unwieldy. A smaller model responds faster but most often has limited knowledge. AWS said Bedrock Model Distillation would streamline the process of transferring a bigger model's knowledge to a smaller one without sacrificing response time.

Users select the heavier-weight model they want, find a small model within the same family (like Llama or Claude, which offer a range of model sizes), and write out sample prompts. Bedrock will generate responses, fine-tune the smaller model, and continue to generate more sample data to finish distilling the larger model's knowledge. Right now, model distillation works with Anthropic, Amazon and Meta models. Bedrock Model Distillation is currently in preview.

Why enterprises are interested in model distillation

For enterprises that want a faster-responding model -- such as one that can quickly answer customer questions -- there must be a balance between knowing a lot and responding quickly. While they can choose to use a smaller version of a large model, AWS is banking that more enterprises want more customization in the kinds of models -- both the larger and smaller ones -- that they use.
AWS, which offers a choice of models in Bedrock's model garden, hopes enterprises will want to choose any model family and train a smaller model for their needs. Many organizations, mostly model providers, use model distillation to train smaller models. However, AWS said the process usually entails a lot of machine learning expertise and manual fine-tuning. Model providers such as Meta have used model distillation to bring a broader knowledge base to a smaller model. Nvidia leveraged distillation and pruning techniques to make Llama 3.1-Minitron 4B, a small language model it said performs better than similar-sized models.

Hallucinations remain an issue for AI models, even though enterprises have created workarounds like fine-tuning and limiting what models will respond to. However, even the most fine-tuned model that only performs retrieval-augmented generation (RAG) tasks over a data set can still make mistakes. AWS' solution is Automated Reasoning checks on Bedrock, which uses mathematical validation to prove that a response is correct.

"Automated Reasoning checks is the first and only generative AI safeguard that helps prevent factual errors due to hallucinations using logically accurate and verifiable reasoning," AWS said. "By increasing the trust that customers can place in model responses, Automated Reasoning checks opens generative AI up to new use cases where accuracy is paramount."

Customers can access Automated Reasoning checks from Amazon Bedrock Guardrails, the product that brings responsible AI and fine-tuning to models. Researchers and developers often use automated reasoning, which applies mathematical logic, to problems that demand precise answers. Users upload their data, and Bedrock develops the rules for the model to follow, guiding customers to ensure the model is tuned to them. Once that's done, Automated Reasoning checks on Bedrock will verify the responses from the model. If the model returns an incorrect answer, Bedrock will suggest a new one.
AWS CEO Matt Garman said during his keynote that automated checks ensure an enterprise's data remains its differentiator, with their AI models reflecting that accurately.
[5]
Amazon Bedrock gets better safeguards and ability to orchestrate multiple AI agents - SiliconANGLE
Cloud computing giant Amazon Web Services Inc. is looking to cement Amazon Bedrock's status as one of the most popular platforms for artificial intelligence developers, enhancing its capabilities in a number of new ways. At AWS re:Invent today, the company announced multiple updates to the platform that it says will help to prevent AI applications from "hallucinating" and generating false answers. Developers will also be able to orchestrate groups of so-called "AI agents" to perform more complex tasks than before, and create much smaller, task-specific AI models that can almost match the capabilities of powerful large language models, at lower costs.

The company said it's introducing "Automated Reasoning" checks in preview as a comprehensive new safeguard for AI applications built with Amazon Bedrock, in an effort to combat the increasing prevalence of hallucinations. Inaccurate responses and other problems, such as bias, simply cannot be tolerated in an age when AI is being given more responsibility than ever before, fielding customer queries and completing work-related tasks to free up employees to focus on higher-level work. But AI hallucinations remain a big problem even now, causing a serious lack of trust among consumers and enterprises alike.

AWS thinks it may finally be able to resolve this problem with Automated Reasoning, a type of AI that relies on math to prove that its responses are correct. The company says it excels when dealing with complex problems that require precise answers, paving the way for AI to be adopted in more situations where reliability is of paramount importance.

These automated checks are the secret sauce within the new and improved guardrails for Amazon Bedrock. With Bedrock's guardrails, developers can exert more control over their AI models and force them to only talk about topics relevant to their purpose.
Now, the guardrails are gaining the ability to validate factual responses for accuracy and show exactly how a model arrived at a particular response. In addition, the models will be able to produce auditable outputs for full transparency, ensuring that everything they say is in line with the customer's rules and policies.

AWS reckons the new guardrails will be helpful in all sorts of scenarios. For instance, a healthcare insurance organization will be able to check that its customer service bots always respond accurately to customer queries, providing the correct answers to any questions they might have about their insurance policies. One company already doing this is PricewaterhouseCoopers International Ltd., the professional consultancy firm, which uses Automated Reasoning checks to ensure its various AI assistants and agents are always providing accurate responses.

Besides making AI safer, Amazon Bedrock is also making AI more capable with the introduction of "multi-agent collaboration" tools that allow developers to orchestrate dozens of AI agents at once, so they can work together to achieve outcomes. AI agents are AI applications that are programmed to perform complex tasks on behalf of users. For instance, a customer service chatbot that can process a refund is an AI agent, and so is an AI assistant that can perform data entry tasks when told to do so.

AWS wants to make agents more useful, and it thinks the best way to do that is by making them work together. It supports AI agent development through the Amazon Bedrock Agents module, which is now getting specialized tools for agents to share context and dynamically route different tasks to other agents. The company said multi-agent collaboration in Amazon Bedrock is in preview now and makes it possible to assign different specialized agents to the specific steps involved in a more complex task or project.
For instance, a financial services firm looking to carry out due diligence might use one agent to analyze global economic indicators, another to assess industry trends, and a third to review its historic financial records. With the multi-agent tools, it can now create what AWS calls a "supervisor agent" that coordinates these agents to work together on much larger projects, routing each step to the most appropriate one. Each agent will be restricted to accessing only the information it needs to complete the specific task assigned, the company said. The supervisor will then work out which tasks can be processed in parallel and which must wait until other tasks have been completed, before coordinating everything to ensure it's done in the correct order.

The credit rating agency Moody's Corp. has already been exploring the potential of this, AWS said, using a series of coordinated AI agents to improve its risk analysis workflows, assigning one agent to analyze macroeconomic trends and another to evaluate company-specific risks.

The third major new capability being added to Bedrock today is model distillation, which makes it possible to transfer specific knowledge from powerful LLMs to much smaller, more energy-efficient models that are focused on a single task. The idea is that a smaller model will focus exclusively on that task, enabling it to match or even surpass the performance of the LLM while using a fraction of the energy. It's an intriguing idea because AI models require a serious amount of computing power, which makes them extremely expensive. LLMs are, in some cases, too powerful for their own good, and their extensive knowledge base can also hinder their performance, as it takes longer for them to respond to some types of queries.

Model distillation changes that. In preview now, it's essentially a technique that allows for knowledge transfer from large LLMs to small language models, or SLMs.
To support this process, Bedrock also provides tools for the SLMs to manipulate the underlying training data, as well as capabilities for fine-tuning and adjusting the model weights to optimize performance. AWS reckons that it has made it possible to distill any LLM into an SLM that's up to 500% faster and 75% cheaper to run. By providing the right sample prompts, the SLM will be almost as capable as the LLM, with the average performance impact rated at just a 2% accuracy loss. Robin AI Ltd., the creator of a copilot for writing and reviewing legal contracts, said it has used model distillation to create AI assistants that can respond to questions about millions of contractual clauses. It does this at a small fraction of the cost of Robin's original LLM, responding much faster and without making any mistakes.
[6]
AWS had a Hard Time Fitting in All of Bedrock's Innovations at re:Invent 2024
"Bedrock gives you everything you need to integrate generative AI into production applications, not just proof of concepts," says AWS CEO Matt Garman.

At AWS re:Invent in Las Vegas, Amazon Web Services (AWS) has announced exciting updates to Amazon Bedrock, its platform for creating and running AI applications. "One of the hardest parts was figuring out how much we could fit in," remarked AWS chief Matt Garman, reflecting on the sheer scale of advancements in Bedrock. "Fortunately, Swami will dive deeper into a ton more during his keynote tomorrow." Garman said that Bedrock is by far the easiest way to build and scale generative AI applications.

One big addition to Bedrock is Automated Reasoning Checks, a tool designed to stop AI from making factual mistakes, aka hallucinations. This is especially useful for industries like healthcare and finance, where accuracy is critical. "Automated reasoning checks prevent factual errors due to model hallucinations," said Garman. AWS further claimed that it helps ensure AI provides correct and trustworthy answers without needing advanced AI expertise. For example, PwC is using Automated Reasoning checks to build accurate, trustworthy AI assistants and agents to drive its clients' businesses to the leading edge.

In addition, AWS announced the launch of Model Distillation. This lets users shrink large AI models into smaller ones without losing much accuracy. Smaller models are faster and cheaper to run. "Model Distillation in Bedrock delivers models that are 500% faster and 75% cheaper," shared Garman. For instance, Robin AI is already using this to save money while providing quick, accurate answers for legal questions. With Amazon Bedrock Model Distillation, customers can choose the optimal model for their use case and a smaller model from the same family, balancing application latency with cost efficiency.
The company claimed that it works best with models from Anthropic and Meta, alongside its latest in-house Nova series of models. "With a broad selection of models, leading capabilities that make it easier for developers to incorporate generative AI into their applications, and a commitment to security and privacy, Amazon Bedrock has become essential for customers who want to make generative AI a core part of their applications and businesses," said Dr. Swami Sivasubramanian, vice president of AI and Data at AWS.

AWS also showcased Bedrock's ability to manage and coordinate multiple AI agents for large-scale, complex workflows. "Bedrock agents can now support complex workflows with agent collaboration, enabling seamless coordination for sophisticated tasks," shared Garman. Moody's uses Amazon Bedrock's multi-agent system to improve risk analysis, with each agent handling specific tasks. This makes its assessments faster and more accurate, strengthening its position as a financial leader.

Garman said that while generative AI is still in its early stages, Bedrock is positioning itself as a leader by offering innovative tools and models that address real-world challenges. "This is just a sampling of the new capabilities that we're announcing this week. Bedrock gives you the best models, the right tools, and capabilities you cannot get anywhere else," added Garman in his keynote address on the future vision of Bedrock.
Amazon Web Services introduces Automated Reasoning checks to tackle AI hallucinations and Model Distillation for creating smaller, efficient AI models, along with multi-agent collaboration features in Amazon Bedrock.
Amazon Web Services (AWS) has unveiled a new tool called Automated Reasoning checks to combat AI hallucinations, a persistent challenge in the field of artificial intelligence. Announced at AWS re:Invent 2024, this service aims to validate model responses by cross-referencing customer-supplied information for accuracy [1].
Automated Reasoning checks, available through AWS' Bedrock model hosting service, attempts to discern how a model arrived at an answer and determine its correctness. Customers can upload information to establish a ground truth, and the tool creates rules that can be refined and applied to a model [1].
The process involves uploading relevant documents to the Amazon Bedrock console, which then automatically analyzes these documents and creates an initial Automated Reasoning policy. This policy converts natural language text into a mathematical format [2].
As models generate responses, Automated Reasoning checks verifies them against the established ground truth. In case of a probable hallucination, it presents the correct answer alongside the likely mistruth, allowing customers to see potential discrepancies [1].
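The check loop described above can be sketched in miniature. The snippet below is a hedged illustration, not Bedrock's actual API: it encodes a "ground truth" document as simple rules, validates a model's claims against them, and, when a rule fails, surfaces the correct value next to the suspect one, mirroring how the tool presents the right answer alongside the likely mistruth. All names here are invented for the sketch.

```python
# Toy illustration of rule-based response checking (not the Bedrock API).
# A "policy" is a set of rules derived from customer documents; each rule
# can test a claim and supply the ground-truth value when the test fails.

from dataclasses import dataclass

@dataclass
class Rule:
    field: str
    expected: str

def build_policy(documents: dict) -> list[Rule]:
    """Turn a ground-truth document (here, a flat dict) into checkable rules."""
    return [Rule(field=k, expected=v) for k, v in documents.items()]

def check_response(policy: list[Rule], claims: dict) -> list[dict]:
    """Validate each claim; report mismatches with the correct answer alongside."""
    findings = []
    for rule in policy:
        claimed = claims.get(rule.field)
        if claimed is not None and claimed != rule.expected:
            findings.append({
                "field": rule.field,
                "model_said": claimed,         # the likely mistruth
                "ground_truth": rule.expected, # drawn from customer-supplied info
            })
    return findings

# Usage: an HR policy document vs. a model's (partly hallucinated) answer.
policy = build_policy({"pto_days": "25", "probation_months": "6"})
claims = {"pto_days": "30", "probation_months": "6"}
print(check_response(policy, claims))  # flags pto_days only
```

The real service converts documents into a mathematical policy rather than flat key-value rules, but the shape of the workflow -- build a policy once, check every response against it -- is the same.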
AWS also introduced Model Distillation, a tool designed to transfer capabilities from large models to smaller, more cost-effective ones. This feature allows customers to experiment with various models without incurring excessive costs [1].
Model Distillation works by using a larger AI model to train a smaller one, offering enterprises access to models that best suit their workload requirements. Currently, it supports models from Anthropic and Meta, with some limitations on model compatibility and a slight trade-off in accuracy [4].
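The teacher-student idea behind distillation can be shown with a deliberately tiny sketch. This is a conceptual toy, not Bedrock's interface: a "teacher" stands in for the large model, the "student" is fitted to reproduce the teacher's answers on sample prompts, and extra samples are generated when the prompt set is too small, echoing the source's note that Bedrock "can even create more sample data, if needed." Every function here is invented for illustration.

```python
# Toy teacher-student distillation loop (conceptual sketch, not Bedrock's API).
# The teacher is an expensive oracle; the student memorizes its outputs on
# sample prompts, analogous to fine-tuning a small model on teacher responses.

def teacher(prompt: str) -> str:
    # Stand-in for a large model (e.g. a Llama 405B-class teacher).
    return prompt.upper()

def augment(prompts: list[str], minimum: int) -> list[str]:
    # Derive trivial prompt variants when the sample set is too small.
    extra = [p + "!" for p in prompts]
    return (prompts + extra)[:max(minimum, len(prompts))]

def distill(prompts: list[str], minimum: int = 4) -> dict:
    """Build a student 'model' (a lookup table) from teacher responses."""
    data = augment(prompts, minimum)
    return {p: teacher(p) for p in data}

def student(model: dict, prompt: str) -> str:
    # Fast but narrower: only knows what it was distilled on.
    return model.get(prompt, "unknown")

model = distill(["hello", "refund policy"])
print(student(model, "hello"))    # matches the teacher on seen prompts
print(student(model, "tax law"))  # outside the distilled knowledge
```

The sketch also makes the reported caveat concrete: the student is cheap and fast precisely because it covers a narrower slice of the teacher's knowledge, which is where the small accuracy loss comes from.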
AWS has also introduced multi-agent collaboration tools in Amazon Bedrock, allowing developers to orchestrate multiple AI agents for complex tasks. This feature enables the assignment of specialized agents to specific steps in larger projects, with a "supervisor agent" coordinating their efforts [5].
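The supervisor pattern can be sketched with plain functions. This is an illustrative toy, not the Bedrock Agents interface: specialist "agents" each handle one narrow subtask, and a supervisor routes each step to the right specialist before pulling the results together, as in the due-diligence example described above. The agent names and routing table are assumptions of the sketch.

```python
# Toy supervisor-agent orchestration (illustrative; not the Bedrock Agents API).
# Specialist agents handle narrow subtasks; the supervisor routes each step
# to the right agent, runs the independent steps, then synthesizes the output.

def macro_agent(subject: str) -> str:
    return f"macro trends for {subject}: stable"

def industry_agent(subject: str) -> str:
    return f"industry outlook for {subject}: positive"

def records_agent(subject: str) -> str:
    return f"financial records for {subject}: reviewed"

# Routing table: step name -> specialist that owns it.
ROUTES = {
    "macro": macro_agent,
    "industry": industry_agent,
    "records": records_agent,
}

def supervisor(subject: str, steps: list[str]) -> str:
    """Route each step to its specialist, then pull the results together."""
    results = [ROUTES[step](subject) for step in steps]  # independent steps
    return " | ".join(results)  # synthesis of the specialists' outputs

print(supervisor("Acme Corp", ["macro", "industry", "records"]))
```

A production supervisor would also track which steps depend on others' outputs and run independent ones in parallel; here every step is independent, so simple sequential routing suffices.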
Several companies are already leveraging these new AWS tools. PwC is using Automated Reasoning checks to design AI assistants for its clients, while Moody's is exploring multi-agent collaboration for improving risk analysis workflows [1][5].
Despite these advancements, experts caution that completely eliminating hallucinations from generative AI is challenging. AI models fundamentally operate as statistical systems, making predictions based on patterns in data rather than possessing actual knowledge [1].
Additionally, while AWS claims Automated Reasoning checks uses "logically accurate" and "verifiable reasoning," the company has not provided data demonstrating the tool's reliability [1].
© 2025 TheOutpost.AI All rights reserved