7 Sources
[1]
Anthropic launches code review tool to check flood of AI-generated code | TechCrunch
When it comes to coding, peer feedback is crucial for catching bugs early, maintaining consistency across a codebase, and improving overall software quality. The rise of "vibe coding" -- using AI tools that take instructions given in plain language and quickly generate large amounts of code -- has changed how developers work. While these tools have sped up development, they have also introduced new bugs, security risks, and poorly understood code.

Anthropic's solution is an AI reviewer designed to catch bugs before they make it into the software's codebase. The new product, called Code Review, launched Monday in Claude Code.

"We've seen a lot of growth in Claude Code, especially within the enterprise, and one of the questions that we keep getting from enterprise leaders is: Now that Claude Code is putting up a bunch of pull requests, how do I make sure that those get reviewed in an efficient manner?" Cat Wu, Anthropic's head of product, told TechCrunch.

Pull requests are a mechanism that developers use to submit code changes for review before those changes make it into the software. Wu said Claude Code has dramatically increased code output, swelling the queue of pull requests awaiting review and creating a bottleneck to shipping code. "Code Review is our answer to that," Wu said.

Anthropic's launch of Code Review -- arriving first to Claude for Teams and Claude for Enterprise customers in research preview -- comes at a pivotal moment for the company. On Monday, Anthropic filed two lawsuits against the Department of Defense in response to the agency's designation of Anthropic as a supply chain risk. The dispute will likely see Anthropic leaning more heavily on its booming enterprise business, which has seen subscriptions quadruple since the start of the year. Claude Code's run-rate revenue has surpassed $2.5 billion since launch, according to the company.
"This product is very much targeted towards our larger scale enterprise users, so companies like Uber, Salesforce, Accenture, who already use Claude Code and now want help with the sheer amount of [pull requests] that it's helping produce," Wu said.

She added that developer leads can turn on Code Review to run by default for every engineer on the team. Once enabled, it integrates with GitHub and automatically analyzes pull requests, leaving comments directly on the code explaining potential issues and suggested fixes. The focus is on fixing logical errors over style, Wu said.

"This is really important because a lot of developers have seen AI automated feedback before, and they get annoyed when it's not immediately actionable," Wu said. "We decided we're going to focus purely on logic errors. This way we're catching the highest priority things to fix."

The AI explains its reasoning step by step, outlining what it thinks the issue is, why it might be problematic, and how it can potentially be fixed. The system labels the severity of issues using colors: red for highest severity, yellow for potential problems worth reviewing, and purple for issues tied to pre-existing code or historical bugs.

Wu said it does this all quickly and efficiently by relying on multiple agents working in parallel, with each agent examining the codebase from a different perspective or dimension. A final agent aggregates and ranks the findings, removing duplicates and prioritizing what's most important.

The tool provides a light security analysis, and engineering leads can customize additional checks based on internal best practices. Wu said Anthropic's more recently launched Claude Code Security provides a deeper security analysis.

The multi-agent architecture does mean this can be a resource-intensive product, Wu said. Similar to other AI services, pricing is token-based, and the cost varies depending on code complexity -- though Wu estimated each review would cost $15 to $25 on average.
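Wu's description of the final stage -- a last agent that aggregates the parallel agents' output, removes duplicates, and ranks color-coded findings by priority -- can be sketched in a few lines of Python. Everything here (the `Finding` shape, the `aggregate` helper, the color-to-rank mapping) is a hypothetical illustration of the idea, not Anthropic's actual implementation:

```python
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    RED = 1     # highest severity: should block the merge
    YELLOW = 2  # potential problem worth reviewing
    PURPLE = 3  # tied to pre-existing code or historical bugs


@dataclass(frozen=True)
class Finding:
    file: str
    line: int
    severity: Severity
    summary: str


def aggregate(per_agent_findings):
    """Merge findings from parallel agents: dedupe by code location,
    keep the most severe report for each location, then rank."""
    merged = {}
    for finding in per_agent_findings:
        key = (finding.file, finding.line)
        prev = merged.get(key)
        # Lower enum value means higher severity; keep the worst report.
        if prev is None or finding.severity.value < prev.severity.value:
            merged[key] = finding
    return sorted(merged.values(), key=lambda f: f.severity.value)
```

If two agents flag the same line at different severities, the aggregator keeps the red-level report and drops the duplicate, so the engineer sees one prioritized list rather than overlapping comments.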
She added that it's a premium experience, and a necessary one as AI tools generate more and more code. "[Code Review] is something that's coming from an insane amount of market pull," Wu said. "As engineers develop with Claude Code, they're seeing the friction to creating a new feature [decrease], and they're seeing a much higher demand for code review. So we're hopeful that with this, we'll enable enterprises to build faster than they ever could before, and with much fewer bugs than they ever had before."
[2]
This new Claude Code Review tool uses AI agents to check your pull requests for bugs - here's how
Anthropic today announced a new Code Review beta feature built into Claude Code for Teams and Enterprise plan users. It's a new software tool that uses agents working in teams to analyze completed blocks of new code for bugs and other potentially problematic issues.

To understand this new Anthropic offering, you need to understand the concept of a pull request. And that leads me to a story about a man named Linus.

Long ago, Linux creator Linus Torvalds had a problem. He was managing lots of contributions to the open source Linux operating system, and all the changes were getting out of control. Source code control systems (a method for managing source code changes) had been around for quite a while before then, but they had a major problem: those old SCCSs were not meant to manage distributed development by coders all across the world.

So, Linus invented Git. If you're a coder, you know Git. It's the underlying coordinating mechanism for code changes. And if you thought Linus was a coding god just for Linux, the creation of Git and its offspring, particularly GitHub, should put him up there at the top of Mount Olympus. Dude created not just one, but two world-changing technologies.

Today, almost every large project uses GitHub or one of its competitors. GitHub (as differentiated from Git) is the centralized cloud service that holds code repositories managed by Git. A few years back, GitHub was purchased by Microsoft, fostering all sorts of doom-and-gloom conspiracy theories. But Microsoft has proven to be a good steward of this precious resource, and GitHub keeps chugging along, managing the world's code.

All that brings us back to pull requests, known as PRs in coder-speak. A pull request is initiated when a programmer wants to check in some new or changed code to a code repository.
Rather than just merging it into the main track, a PR tells repo supervisors that there's something new, ready to be reviewed.

Quick note: to coders, PR is an acronym for pull request. For marketers, PR means public relations. When you read about tech, you'll see both acronyms, so pay attention to the context to distinguish between the two.

Sometimes, the code is very carefully checked over before being merged into the main codebase. But other times, it just gets rubber-stamped and merged. Code reviews, while necessary, are also tedious and time-consuming. Of course, the cost of rubber-stamping a PR can be catastrophic. You might ship code that is buggy, loses data, or damages user systems. At best, buggy code is just annoying. At worst, it can cause catastrophic damage.

That's where Anthropic's new Claude Code Review comes in. In my article, 7 AI coding techniques I use to ship real, reliable products - fast, my bonus technique was using AI for code review. As a lone developer, I don't use a formalized code review process like the one Anthropic is introducing. I just tell a new session of the AI to look at my code and let me know what's not right. Sometimes I use the same AI (i.e., Claude Code to look at Claude's code), and other times I use a different AI (like when I use OpenAI's Codex to review Claude Code generated code). It's far from a comprehensive review, but almost every time I ask for a review, one AI or the other finds something that needs fixing.

The new Claude Code Review capability is modeled on the process used by Anthropic. The company has essentially productized its own internal methodology. According to Anthropic, customers "tell us developers are stretched thin, and many PRs get skims rather than deep reads."
This new agentic Code Review AI is able to provide deeper automated review coverage before needing human decisions. Anthropic says that code output per Anthropic engineer has increased 200% in the past year, intensifying pressure on human reviewers. You think? The company has been using its own AI to write code, which speeds up code production, so the changes and new code blocks are coming faster than ever before. Anthropic reports that the new Code Review system is run on nearly every pull request internally.

When a PR is reviewed, human reviewers often make comments about the issues they see, which the coder needs to go back and fix. Before running Code Review, Anthropic coders got back "substantive" review comments about 16% of the time. With Code Review, coders are getting back substantive comments 54% of the time. While that seems to mean more work for coders, what it really means is that nearly three times the number of coding oopsies have been caught before they cause damage.

According to Anthropic, the size of the internal PR impacts the level of review findings. Large pull requests with more than 1,000 changed lines show findings 84% of the time. Small pull requests of under 50 lines produce findings 31% of the time. Anthropic engineers "largely agree with what it surfaces: less than 1% of findings are marked incorrect."

Heck, when I code, even if I add just one line of code, there's a chance I'll introduce a bug. Testing and code reviews are essential if you don't want thousands of users coming at you brandishing virtual pitchforks and torches. Don't ask me how I know. I'm always fascinated by what others experience while doing their jobs. Anthropic provided some examples of problems Code Review identified during its early testing.
In one case, a single line change appeared to be routine, the kind that would normally be quickly approved. But Code Review flagged it as critical: it turns out this tiny little change would have broken authentication for the service. Because Code Review caught it, it was fixed before the merge. The original coder said that they wouldn't have caught that error on their own.

Another example occurred when filesystem encryption code was being reorganized in an open source product. According to the report, "Code Review surfaced a pre-existing bug in adjacent code: a type mismatch that was silently wiping the encryption key cache on every sync." This is what we call a silent killer in coding. It could have resulted in data loss, performance degradation, and security risks. Anthropic described it as "a latent issue in code the PR happened to touch, the kind of thing a human reviewer scanning the changeset wouldn't immediately go looking for." If that hadn't been caught and fixed, it would have made for a very bad day for someone (or a whole bunch of someones).

Code Review runs fairly quickly, turning around fairly complex reviews in about 20 minutes. When a pull request is opened, Code Review kicks off a bunch of agents that analyze code in parallel. Various agents detect potential bugs, verify findings to filter false positives, and rank issues by severity. The results are consolidated so that all the findings from all the agents appear as a single summary comment on the pull request, alongside inline comments for specific problems.

In a demo, Anthropic showed that the summary comment can also include a fix directive. So if Code Review finds a bug, it can be fed to Claude Code to fix. The company says that reviews scale with complexity: larger pull requests receive deeper analysis and more agents.
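The pipeline described above -- parallel bug-hunting agents, a verification pass to filter false positives, and severity ranking into one consolidated report -- can be sketched as a short Python program. The agent functions, the `verify` stub, and the `review` orchestrator are invented stand-ins for illustration; the real system prompts language models rather than running stub functions:

```python
from concurrent.futures import ThreadPoolExecutor


# Hypothetical agent stubs. Real agents would each prompt a model
# to examine the diff from a different perspective.
def logic_agent(diff):
    return [{"line": 42, "severity": 1, "note": "off-by-one in loop bound"}]


def security_agent(diff):
    return [{"line": 42, "severity": 1, "note": "off-by-one in loop bound"},
            {"line": 7, "severity": 2, "note": "unvalidated input"}]


def verify(finding, diff):
    """Verification pass: a second check on each candidate finding
    to filter out false positives. Stubbed to accept everything."""
    return True


def review(diff, agents):
    # 1. Fan out: every agent scans the diff concurrently.
    with ThreadPoolExecutor() as pool:
        batches = pool.map(lambda agent: agent(diff), agents)
    # 2. Dedupe findings reported by multiple agents, then drop any
    #    that fail the verification pass.
    unique = {(f["line"], f["note"]): f for batch in batches for f in batch}
    verified = [f for f in unique.values() if verify(f, diff)]
    # 3. Rank by severity. In the real product, this ranked list would
    #    become one summary comment plus inline comments on the PR.
    return sorted(verified, key=lambda f: f["severity"])
```

In this toy run, both agents flag the same off-by-one bug, so the consolidated report contains two findings rather than three, with the more severe one listed first.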
Anthropic really seems to like spawning multiple agents. In the past, I've had some fairly serious difficulty wrangling them after they're launched. In fact, the first technique I shared in my 7 coding techniques article was to specifically tell Claude Code to avoid launching agents in parallel. There are some internal task management features in Claude (the /tasks command, for example), but I'd prefer to see a more comprehensive task management dashboard before I rely on the results of dozens of spawned agents.

Reviews are billed based on token usage. Pricing scales with the size and complexity of the pull request being analyzed, but the company says that a code review typically costs between $15 and $25. In some ways, this could get very expensive very quickly.

One of the most popular engineering-related Substacks is The Pragmatic Engineer. In an article, Gergely Orosz says that Anthropic engineers each typically produce about five PRs per day. In practice, typical developers not using AI coding support produce at most one or two a week.

As a quick calculation, let's say a company has a hundred developers, each producing one PR a day, five days a week. In our fantasy example, software engineers get weekends off. That would lead to 500 PRs a week, or 2,000 per month. At an average of $20 per PR, that volume of Code Review runs could cost this sample company about $40,000 a month, or $480,000 per year.

That might seem like a lot. But then factor in the cost of a catastrophic bug leaking out to customers, and how much that might cost in real dollars and brand reputation to fix, and it starts to seem affordable. It's clear Anthropic has found a new profit center. Even at that expense level, it's probably worth it for companies to actively employ Code Review.
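The back-of-the-envelope math above is easy to check. Here's a quick Python version of the same fantasy scenario (100 developers, one PR per developer per day, five-day weeks, $20 per review, and an assumed four-week month), with the function name and parameters invented for illustration:

```python
def monthly_review_cost(developers, prs_per_dev_per_day,
                        workdays_per_week, cost_per_review,
                        weeks_per_month=4):
    """Estimated monthly Code Review spend under the article's rough assumptions."""
    prs_per_month = (developers * prs_per_dev_per_day
                     * workdays_per_week * weeks_per_month)
    return prs_per_month * cost_per_review


# 100 devs x 1 PR/day x 5 days x 4 weeks = 2,000 PRs/month at $20 each
monthly = monthly_review_cost(100, 1, 5, 20)  # $40,000
annual = monthly * 12                         # $480,000
```

Swap in Orosz's figure of five PRs per engineer per day for an AI-heavy shop and the same formula multiplies the bill fivefold, which is why the usage caps discussed below matter.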
The company does say that there are ways to control spending and usage. Administrators on Team and Enterprise plans can enable Code Review through Claude Code settings and a GitHub app install. Once activated, reviews automatically run on new pull requests without additional developer configuration. That's part of why usage caps and repository-level controls become so important for cost management.

Are you using AI tools to review your code or pull requests yet? Would you trust an automated multi-agent system to flag bugs and security problems before humans see the code? Do you think paying $15 to $25 per pull request for automated review makes sense, or would the costs add up too quickly? If you're a developer, have AI code reviewers already caught issues you might have missed? Like I said, I'm just using basic prompting to generate code reviews, but that has certainly helped me produce better code.
[3]
Anthropic debuts Code Review for teams, enterprises
First vibe coding, now vibe reviewing ... but the buzz is good as it finds worthy issues

Anthropic has introduced a more extensive - and expensive - way to review source code in hosted repositories, many of which already contain large swaths of AI-generated code. Code Review is a new service for teams and enterprise customers that drives multiple agents to scour code repos in a concerted effort to catch unidentified bugs.

The company's Claude models are already capable of conducting code reviews on demand - you can learn a lot about the quality of AI-generated code by having Claude review its own work. The AI biz also offers a Claude Code GitHub Action that can launch a code review automatically as part of the CI/CD pipeline. Code Review will do a lot more of that, at greater expense.

"Code Review analyzes your GitHub pull requests and posts findings as inline comments on the lines of code where it found issues," the company explains in its documentation. "A fleet of specialized agents examine the code changes in the context of your full codebase, looking for logic errors, security vulnerabilities, broken edge cases, and subtle regressions."

A fleet of specialized agents, you say? That sounds like it might burn a lot of tokens during the inference process. And indeed that's the case. As Anthropic observes, Code Review focuses on depth, more so than the existing approaches. "Reviews are billed on token usage and generally average $15-25, scaling with PR [pull request] size and complexity," the company says. That's per pull request. As a point of comparison, CodeRabbit, which offers AI-based code reviews, charges $24 per month.

Code Review is also not very quick. While the amount of time required varies with the size of the pull request, reviews on average take about 20 minutes to complete, according to Anthropic.
Given the time required and the billing rate, the question becomes whether paying a person $60 an hour to conduct a code review would produce comparable or better results. Still, the AI biz insists its engineers have seen positive results using Code Review, a finding supported in some research but not in all cases.

Anthropic reports that it has used Code Review internally for several months with considerable success. The company claims that for large pull requests consisting of more than 1,000 changed lines, 84 percent of automated reviews find something of note - 7.5 issues on average. For small pull requests of less than 50 lines, 31 percent get comments, averaging 0.5 issues. Human developers reject fewer than one percent of issues found by Claude.

Customers that have been testing Code Review have seen some benefits. When TrueNAS embarked on a ZFS encryption refactoring for its open-source middleware, the AI review service spotted a bug in adjacent code: a type mismatch that risked erasing the encryption key cache during sync operations. Anthropic claims that in one instance involving internal code, Code Review caught an innocuous-looking one-line change to a production service that would have broken the service's authentication mechanism. "It was fixed before merge, and the engineer shared afterwards that they wouldn't have caught it on their own," the AI biz said.

In organizations large enough to afford AI tools, it's doubtful that software developers will ever work alone again. ®
[4]
Anthropic rolls out Code Review for Claude Code as it sues over Pentagon blacklist and partners with Microsoft
Anthropic on Monday released Code Review, a multi-agent code review system built into Claude Code that dispatches teams of AI agents to scrutinize every pull request for bugs that human reviewers routinely miss. The feature, now available in research preview for Team and Enterprise customers, arrives on what may be the most consequential day in the company's history: Anthropic simultaneously filed lawsuits against the Trump administration over a Pentagon blacklisting, while Microsoft announced a new partnership embedding Claude into its Microsoft 365 Copilot platform.

The convergence of a major product launch, a federal legal battle, and a landmark distribution deal with the world's largest software company captures the extraordinary tension defining Anthropic's current moment. The San Francisco-based AI lab is simultaneously trying to grow a developer tools business approaching $2.5 billion in annualized revenue, defend itself against an unprecedented government designation as a national security threat, and expand its commercial footprint through the very cloud platforms now navigating the fallout.

Code Review is Anthropic's most aggressive bet yet that engineering organizations will pay significantly more -- $15 to $25 per review -- for AI-assisted code quality assurance that prioritizes thoroughness over speed. It also signals a broader strategic pivot: the company isn't just building models, it's building opinionated developer workflows around them.

Code Review works differently from the lightweight code review tools most developers are accustomed to. When a developer opens a pull request, the system dispatches multiple AI agents that operate in parallel. These agents independently search for bugs, then cross-verify each other's findings to filter out false positives, and finally rank the remaining issues by severity. The output appears as a single overview comment on the PR along with inline annotations for specific bugs.
Anthropic designed the system to scale dynamically with the complexity of the change. Large or intricate pull requests receive more agents and deeper analysis; trivial changes get a lighter pass. The company says the average review takes approximately 20 minutes -- far slower than the near-instant feedback of tools like GitHub Copilot's built-in review, but deliberately so.

"We built Code Review based on customer and internal feedback," an Anthropic spokesperson told VentureBeat. "In our testing, we've found it provides high-value feedback and has helped catch bugs that we may have missed otherwise. Developers and engineering teams use a range of tools, and we build for that reality. The goal is to give teams a capable option at every stage of the development process."

The system emerged from Anthropic's own engineering practices, where the company says code output per engineer has grown 200% over the past year. That surge in AI-assisted code generation created a review bottleneck that the company says it now hears about from customers on a weekly basis. Before Code Review, only 16% of Anthropic's internal PRs received substantive review comments. That figure has jumped to 54%.

Crucially, Code Review does not approve pull requests. That decision remains with human reviewers. Instead, the system functions as a force multiplier, surfacing issues so that human reviewers can focus on architectural decisions and higher-order concerns rather than line-by-line bug hunting.

The pricing will draw immediate scrutiny. At $15 to $25 per review, billed on token usage and scaling with PR size, Code Review is substantially more expensive than alternatives. GitHub Copilot offers code review natively as part of its existing subscription, and startups like CodeRabbit operate at significantly lower price points. Anthropic's more basic code review GitHub Action -- which remains open source -- is itself a lighter-weight and cheaper option.
Anthropic frames the cost not as a productivity expense but as an insurance product. "For teams shipping to production, the cost of a shipped bug dwarfs $20/review," the company's spokesperson told VentureBeat. "A single production incident -- a rollback, a hotfix, an on-call page -- can cost more in engineer hours than a month of Code Review. Code Review is an insurance product for code quality, not a productivity tool for churning through PRs faster."

That framing is deliberate and revealing. Rather than competing on speed or price -- the dimensions where lightweight tools have an advantage -- Anthropic is positioning Code Review as a depth-first tool aimed at engineering leaders who manage production risk. The implicit argument is that the real cost comparison isn't Code Review versus CodeRabbit, but Code Review versus the fully loaded cost of a production outage, including engineer time, customer impact, and reputational damage.

Whether that argument holds up will depend on the data. Anthropic has not yet published external benchmarks comparing Code Review's bug-detection rates against competitors, and the spokesperson did not provide specific figures on bugs caught per dollar or developer hours saved when asked directly. For engineering leaders evaluating the tool, that gap in publicly available comparative data may slow adoption, even if the theoretical ROI case is compelling.

Anthropic's internal usage data provides an early window into the system's performance characteristics. On large pull requests exceeding 1,000 lines changed, 84% receive findings, averaging 7.5 issues per review. On small PRs under 50 lines, that drops to 31% with an average of 0.5 issues. The company reports that less than 1% of findings are marked incorrect by engineers. That sub-1% figure is the kind of stat that demands careful unpacking.
When asked how "marked incorrect" is defined, the Anthropic spokesperson explained that it means "an engineer actively resolving the comment without fixing it. We'll continue to monitor feedback and engagement while Code Review is in research preview."

The methodology matters. This is an opt-in disagreement metric -- an engineer has to take the affirmative step of dismissing a finding. In practice, developers under time pressure may simply ignore irrelevant findings rather than actively marking them as wrong, which would cause false positives to go uncounted. Anthropic acknowledged the limitation implicitly by noting the system is in research preview and that it will continue monitoring engagement data. The company has not yet conducted or published a controlled evaluation comparing agent findings against a ground-truth baseline established by expert human reviewers.

The anecdotal evidence is nonetheless striking. Anthropic described a case where a one-line change to a production service -- the kind of diff that typically receives a cursory approval -- was flagged as critical by Code Review because it would have broken authentication for the service. In another example involving TrueNAS's open-source middleware, Code Review surfaced a pre-existing bug in adjacent code during a ZFS encryption refactor: a type mismatch that was silently wiping the encryption key cache on every sync. These are precisely the categories of bugs -- latent issues in touched-but-unchanged code, and subtle behavioral changes hiding in small diffs -- that human reviewers are statistically most likely to miss.

The Code Review launch does not exist in a vacuum. On the same day, Anthropic filed two lawsuits -- one in the U.S. District Court for the Northern District of California and another in the D.C.
Circuit Court of Appeals -- challenging the Trump administration's decision to label the company a supply chain risk to national security, a designation historically reserved for foreign adversaries.

The legal confrontation stems from a breakdown in contract negotiations between Anthropic and the Pentagon. As CNN reported, the Defense Department wanted unrestricted access to Claude for "all lawful purposes," while Anthropic insisted on two redlines: that its AI would not be used for fully autonomous weapons or mass domestic surveillance. When talks collapsed by a Pentagon-set deadline on February 27, President Trump directed all federal agencies to cease using Anthropic's technology, and Defense Secretary Pete Hegseth formally designated the company a supply chain risk. According to CNBC, the complaint alleges that these actions are "unprecedented and unlawful" and are "harming Anthropic irreparably," with the company stating that contracts are already being cancelled and "hundreds of millions of dollars" in near-term revenue are in jeopardy.

"Seeking judicial review does not change our longstanding commitment to harnessing AI to protect our national security," the Anthropic spokesperson told VentureBeat, "but this is a necessary step to protect our business, our customers, and our partners. We will continue to pursue every path toward resolution, including dialogue with the government."

For enterprise buyers evaluating Code Review and other Claude-based tools, the lawsuit introduces a novel category of vendor risk. The supply chain risk designation doesn't just affect Anthropic's government contracts -- as CNBC reported, it requires defense contractors to certify they don't use Claude in their Pentagon-related work. That creates a chilling effect that could extend well beyond the defense sector, even as the company's commercial momentum accelerates. The market's response to the Pentagon crisis has been notably bifurcated.
While the government moved to isolate Anthropic, the company's three largest cloud distribution partners moved in the opposite direction. Microsoft on Monday announced it is integrating Claude into Microsoft 365 Copilot through a new product called Copilot Cowork, developed in close collaboration with Anthropic. As Yahoo Finance reported, the service enables enterprise users to perform tasks like building presentations, pulling data into Excel spreadsheets, and coordinating meetings -- the kind of agentic productivity capabilities that sent shares of SaaS companies like Salesforce, ServiceNow, and Intuit tumbling when Anthropic first debuted its Cowork product on January 30.

The timing is not coincidental. As TechCrunch reported last week, Microsoft, Google, and Amazon Web Services all confirmed that Claude remains available to their customers for non-defense workloads. Microsoft's legal team specifically concluded that "Anthropic products, including Claude, can remain available to our customers -- other than the Department of War -- through platforms such as M365, GitHub, and Microsoft's AI Foundry."

That three of the world's most powerful technology companies publicly reaffirmed their commitment to distributing Anthropic's models -- on the same day the company sued the federal government -- tells enterprise customers something important about the market's assessment of both Claude's technical value and the legal durability of the supply chain risk designation.

For organizations considering Code Review, the data handling question looms especially large. The system necessarily ingests proprietary source code to perform its analysis. Anthropic's spokesperson addressed this directly: "Anthropic does not train models on our customers' data. This is part of why customers in highly regulated industries, from Novo Nordisk to Intuit, trust us to deploy AI safely and effectively."
The spokesperson did not detail specific retention policies or compliance certifications when asked, though the company's reference to pharmaceutical and financial services clients suggests it has undergone the kind of security review those industries require.

Administrators get several controls for managing costs and scope, including monthly organization-wide spending caps, repository-level enablement, and an analytics dashboard tracking PRs reviewed, acceptance rates, and total costs. Once enabled, reviews run automatically on new pull requests with no per-developer configuration required.

The revenue figure Anthropic confirmed -- a $2.5 billion run rate as of February 12 for Claude Code -- underscores just how quickly developer tooling has become a material revenue line for the company. The spokesperson pointed to Anthropic's recent Series G fundraise for additional context but did not break out what share of total company revenue Claude Code now represents.

Code Review is available now in research preview for Claude Code Team and Enterprise plans. Whether it can justify its premium in a market already crowded with cheaper alternatives will depend on whether Anthropic can convert anecdotal bug catches and internal usage stats into the kind of rigorous, externally validated evidence that engineering leaders with production budgets require -- all while navigating a legal and political environment unlike anything the AI industry has previously faced.
[5]
Anthropic adds Code Review to Claude Code to streamline bug hunting
The feature uses multiple AI agents to analyze pull requests, flag potential issues, and provide feedback.

Anthropic's AI coding assistant, Claude Code, is getting a new feature designed to help developers identify and resolve bugs faster and more efficiently. Aptly named Code Review, the feature automatically analyzes code changes, flags potential issues, and provides actionable feedback before the code is merged.

Anthropic explains that when a pull request (PR) is opened, Code Review "dispatches a team of agents that look for bugs in parallel, verify bugs to filter out false positives, and rank bugs by severity. The result lands on the PR as a single high-signal overview comment, plus in-line comments for specific bugs." The company adds that the multi-agent review system scales dynamically based on the PR. It assigns more agents and deeper analysis to larger or more complex changes, while smaller updates get a lighter review. Based on Anthropic's testing, the system typically completes an average PR review in about 20 minutes.

The feature was developed to streamline internal operations after Anthropic saw the amount of code generated per engineer grow by 200% over the last year. The company now uses the system on nearly every PR and reports a significant increase in substantive review comments. Following successful internal testing, Code Review is now rolling out to Claude for Teams and Enterprise subscribers in research preview. While powerful, the tool is considerably more expensive than lightweight alternatives like the Claude Code GitHub Action, and it is billed on token usage.

Premium pricing for in-depth reviews

Anthropic reveals that code reviews with the new tool average somewhere between $15 and $25, scaling with PR size and complexity. To help admins manage costs, the company is offering monthly organization caps, repository-level restrictions, and an analytics dashboard to track PRs reviewed, acceptance rates, and total spend.
Code Review arrives as Claude Code continues to gain traction commercially. The tool's run-rate revenue has reportedly surpassed $2.5 billion since launch, more than doubling since early 2026. Furthermore, business subscriptions have quadrupled since the start of the year, with enterprise customers now accounting for over half of Claude Code's total revenue.
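The fan-out/fan-in pattern these articles describe, several reviewer agents running in parallel over a diff, with unverified findings filtered and the rest ranked by severity, can be sketched in miniature. Everything below (the reviewer functions, the `Finding` fields, the verification flag) is illustrative, not Anthropic's actual implementation:

```python
# Illustrative sketch of a parallel review pipeline: fan out to several
# reviewer "agents", filter unverified findings, rank the rest by severity.
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass

@dataclass
class Finding:
    line: int        # diff line the finding points at
    message: str
    severity: int    # higher = more severe
    verified: bool   # stand-in for the false-positive filtering pass

def bug_reviewer(diff):
    # Hypothetical reviewer: flags lines that look like a missing error check.
    return [Finding(i, "error check removed", 3, True)
            for i, line in enumerate(diff) if "unwrap" in line]

def logic_reviewer(diff):
    # Hypothetical reviewer focused on logic rather than formatting.
    return [Finding(i, "condition may be inverted", 2, True)
            for i, line in enumerate(diff) if "if not" in line]

def review(diff, reviewers):
    # Run every reviewer concurrently, then merge, filter, and rank.
    with ThreadPoolExecutor() as pool:
        results = pool.map(lambda r: r(diff), reviewers)
    findings = [f for batch in results for f in batch if f.verified]
    return sorted(findings, key=lambda f: f.severity, reverse=True)

diff = ["if not ready: return", "value = result.unwrap()"]
report = review(diff, [bug_reviewer, logic_reviewer])
```

The key property, mirrored from the articles, is that reviewers run independently and a single merge step produces one ranked report rather than several overlapping ones.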
[6]
Anthropic debuts extremely efficient yet pricey code checking tool for developers - SiliconANGLE
When it comes to writing software, getting feedback is a critical part of the process, ensuring that bugs in newly written code can be caught early, before a pull request is submitted. But with the rise of artificial intelligence coding bots, developers are shipping more code than ever before, overwhelming human reviewers. Fortunately, Anthropic PBC has come up with a solution, announcing the availability of Code Review in Claude Code, a new multi-agent system designed to spot bugs in AI-generated code before a human reviewer ever sees it.

The new product is meant to review pull requests, which are a mechanism used by developers to submit code changes for review before they're implemented in the software. With most developers using tools like Claude Code to accelerate their output, there has been a dramatic increase in the number of pull requests at many organizations, creating a new bottleneck in software development.

The launch of Code Review comes at a critical juncture in Anthropic's story. Earlier today, the company filed two lawsuits against the U.S. Department of Defense after it was designated as a supply chain risk. The designation threatens to derail Anthropic's booming government business, so it makes sense that the company wants to double down on its enterprise customer base, where subscriptions have quadrupled since the beginning of the year. Claude Code is the company's most popular enterprise product, and its annual revenue run-rate recently surpassed $2.5 billion. It's hoped that Code Review will help make the coding tool even more attractive.

Anthropic has previously integrated code checking capabilities within Claude Code, giving it the ability to review its own work. In addition, there's a Claude Code GitHub Action tool that can be set up to automatically review code as part of a company's continuous integration/continuous delivery (CI/CD) pipeline.
But Code Review is meant to go further and conduct more comprehensive reviews, although companies will have to pay a high price for the privilege.

Code Review notably takes much more time to review each pull request, and as it works its way through the code it explains its reasoning step by step, Anthropic said. For each potential bug it finds, it outlines what the issue is, explains why it's likely to be problematic, and offers a recommended fix. It labels each problem it surfaces according to its severity, with red used for the most severe issues, yellow for possible problems that need review, and purple for issues tied to pre-existing code and historical bugs.

According to Anthropic, Code Review does this by relying on multiple AI agents that work in parallel, with each one examining the codebase from a different perspective. Once that's done, another agent aggregates and ranks their findings, removes any duplicate issues, and prioritizes them in order of importance.

It's extremely comprehensive, but that kind of attention to detail doesn't come cheap. With so many agents involved in the process, customers are going to burn through a lot of tokens. "Reviews are billed on token usage and generally average $15-$25, scaling with pull request size and complexity," the company said. That's definitely not cheap. By contrast, a service such as Code Rabbit, which also uses AI to review pull requests, charges $24 per month.

Code Review is also on the slow side. Anthropic said the time it takes to review each pull request will vary, but averages around 20 minutes. Despite this, Anthropic promised that customers will be delighted with the results. It said it has been using the tool internally for several months, and claims that for large pull requests of more than 1,000 changed lines, 84% of its reviews found something of note, surfacing around 7.5 issues on average.
For smaller pull requests of fewer than 50 lines, 31% of reviews flagged issues, with an average of 0.5 issues found. The company said it has caught some significant bugs, too. In one instance involving internal code, the tool flagged a single, innocuous-looking change to a production service that would have disrupted its authentication mechanism. Anthropic said Code Review is available now in research preview for Claude Code Team and Enterprise subscribers.
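The aggregation step described above, a final agent merging findings, dropping duplicates, and ranking by severity with the red/yellow/purple labels, can be sketched as follows. The color mapping and the keep-the-most-severe-per-line rule are assumptions for illustration, not Anthropic's code:

```python
# Sketch of the aggregation/ranking agent: merge findings from all reviewer
# agents, keep one finding per line (de-duplication), rank by severity, and
# attach the article's color labels (mapping is assumed, not documented).
SEVERITY_COLOR = {3: "red", 2: "yellow", 1: "purple"}

def aggregate(findings):
    # findings: list of (line, severity, message) tuples from all agents
    best = {}
    for line, severity, message in findings:
        # On duplicate lines, keep only the most severe finding.
        if line not in best or severity > best[line][0]:
            best[line] = (severity, message)
    ranked = sorted(best.items(), key=lambda kv: kv[1][0], reverse=True)
    return [(line, SEVERITY_COLOR[sev], msg) for line, (sev, msg) in ranked]

raw = [(10, 2, "possible off-by-one"),
       (10, 3, "auth check removed"),      # same line: higher severity wins
       (42, 1, "pre-existing bug nearby")]
report = aggregate(raw)
# report: [(10, "red", "auth check removed"),
#          (42, "purple", "pre-existing bug nearby")]
```

De-duplicating before ranking matters because, as the articles note, overlapping agents would otherwise file the same issue several times.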
[7]
Anthropic Launches Multi-Agent Code Review System in Claude Code
On average, a review takes 20 minutes to complete. It's available to Team and Enterprise customers only.

Anthropic just launched a new Code Review feature in its popular Claude Code AI tool. It's a multi-agent review system that dispatches a team of agents to catch bugs on every pull request. Code Review is modeled on the system Anthropic uses internally for nearly every PR, and it's a more comprehensive alternative to the open-source Claude Code GitHub Action tool. The Code Review system is currently in research preview and available to Team and Enterprise customers only.

As for how the feature works: when a PR opens or updates, multiple AI agents analyze the diff from different angles. Five independent reviewers check the changes along different dimensions: CLAUDE.md compliance, bug detection, git history context, previous PR comment review, and code comment verification. The issues are then ranked by severity and posted as inline comments on the specific lines. To reduce false positives, only high-confidence issues (above an 80 threshold) are posted. The average review takes about 20 minutes to complete.

Note that Code Review will not approve PRs, so human reviewers still make that decision. It focuses on code correctness and looks for bugs that might break production, rather than formatting preferences or missing test coverage. You can modify the CLAUDE.md file to tune what kinds of issues Claude flags.

To enable it, team admins can open the Claude Code admin settings, install the Claude GitHub app, and select the repository. Individual developers don't need to do anything; once enabled, it runs automatically on new PRs.
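The confidence gate described above can be sketched simply. The five review dimensions and the 80-point cutoff come from the article; the issue records and the filtering function are hypothetical:

```python
# Sketch of the high-confidence filter: of everything the five reviewers
# flag, only issues at or above the confidence cutoff get posted to the PR.
DIMENSIONS = [
    "CLAUDE.md compliance",
    "bug detection",
    "git history context",
    "previous PR comment review",
    "code comment verification",
]
CONFIDENCE_THRESHOLD = 80  # cutoff reported by the article

def post_worthy(issues):
    """Keep only high-confidence issues to reduce false positives."""
    return [i for i in issues if i["confidence"] >= CONFIDENCE_THRESHOLD]

issues = [
    {"dimension": "bug detection", "confidence": 92, "note": "null deref"},
    {"dimension": "git history context", "confidence": 55, "note": "maybe stale"},
]
posted = post_worthy(issues)  # only the 92-confidence issue survives
```

Trading recall for precision this way is exactly the design choice Wu describes: developers ignore automated feedback that isn't immediately actionable.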
Anthropic unveiled Code Review, an AI-powered code review tool that deploys multiple agents to analyze pull requests for bugs and security vulnerabilities. Available for Claude Code Teams and Enterprise customers, the system costs $15-25 per review and takes about 20 minutes to complete. The launch comes as Claude Code's run-rate revenue surpasses $2.5 billion, with enterprise customers now accounting for over half of total revenue.
Anthropic introduced Code Review on Monday, an AI-powered code review tool designed to address a growing challenge in software development: the flood of AI-generated code overwhelming human reviewers. Available in research preview for Claude Code Teams and Enterprise customers, the new feature uses a multi-agent system to analyze pull requests and identify logical errors, security vulnerabilities, and other critical issues before code merges into production [1].
The launch arrives at a pivotal moment for the San Francisco-based AI lab. On the same day, Anthropic filed lawsuits against the Department of Defense over a Pentagon blacklist designation, while Microsoft announced a partnership embedding Claude into its Microsoft 365 Copilot platform [4]. The convergence of a major product launch, federal legal battle, and landmark distribution deal captures the extraordinary tension defining Anthropic's current trajectory.

Code Review operates differently from lightweight alternatives. When developers open a pull request on GitHub, the system dispatches multiple AI agents that work in parallel, each examining the codebase from different perspectives. These agents independently search for bugs, cross-verify findings to filter out false positives, then rank remaining issues by severity. A final agent aggregates the results, removing duplicates and prioritizing what matters most.

The output appears as inline comments directly on the code, explaining potential issues and suggested fixes [1]. The system labels severity using colors: red for highest priority, yellow for potential problems worth reviewing, and purple for issues tied to pre-existing code or historical bugs. Each review explains its reasoning step by step, outlining what the issue is, why it might be problematic, and how to fix it.
Code Review targets large-scale enterprise customers including Uber, Salesforce, and Accenture who already use Claude Code and need help managing the sheer volume of pull requests [1]. Cat Wu, Anthropic's head of product, said enterprise leaders consistently ask how to efficiently review the pull requests Claude Code generates. "Claude Code has dramatically increased code output, which has increased pull request reviews that have caused a bottleneck to shipping code," Wu explained.

The pricing reflects this premium positioning. Reviews cost $15 to $25 on average, billed on token usage and scaling with pull request size and complexity [3]. Reviews take approximately 20 minutes to complete, prioritizing depth over speed [5]. Anthropic frames this not as a productivity expense but as insurance against production risk, arguing that a single production incident can cost more in engineer hours than a month of Code Review [4].

The system emerged from Anthropic's own engineering practices, where code output per engineer has grown 200% over the past year [4]. Before implementing Code Review, only 16% of internal pull requests received substantive review comments. That figure jumped to 54% after deployment [2]. For large pull requests exceeding 1,000 changed lines, 84% of automated reviews find something noteworthy, averaging 7.5 issues [3]. Human developers reject fewer than 1% of issues identified by Claude [3].

In one internal case, Code Review caught an innocuous-looking one-line change that would have broken a production service's authentication mechanism [3]. External testing customer TrueNAS saw the system spot a bug during ZFS encryption refactoring that risked a type mismatch erasing the encryption key cache during sync operations [3].
Code Review integrates directly with GitHub and can be enabled by default for entire engineering teams through the CI/CD pipeline [1]. Engineering leads can customize additional checks based on internal best practices. The tool provides light security analysis, while Anthropic's Claude Code Security offers deeper security-focused reviews [1].

The focus remains strictly on fixing logic errors rather than style preferences. "This is really important because a lot of developers have seen AI automated feedback before, and they get annoyed when it's not immediately actionable," Wu said [1]. "We decided we're going to focus purely on logic errors. This way we're catching the highest priority things to fix."
Claude Code's run-rate revenue has surpassed $2.5 billion since launch, with business subscriptions quadrupling since the start of the year [1]. Enterprise customers now account for over half of total revenue [5]. Wu characterized demand as "insane market pull" from engineers experiencing decreased friction in creating new features but higher demand for code quality assurance [1].

To help administrators manage costs, Anthropic offers monthly organization caps, repository-level restrictions, and an analytics dashboard tracking PRs reviewed, acceptance rates, and total spend [5]. The company maintains an open-source GitHub Action for lighter-weight reviews as a cheaper alternative [4].
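Because each review is billed on token usage, the monthly organization cap amounts to a simple budget check before a review is dispatched. The function and figures below are hypothetical, based only on the controls the article lists, not on Anthropic's actual API:

```python
# Hypothetical cost-guard sketch: with token-billed reviews averaging
# $15-$25 each, an admin-set monthly cap decides whether another review
# can run. Numbers and function names are illustrative.
def can_run_review(spent_this_month, est_review_cost, monthly_cap):
    """Return True if another review fits under the organization cap."""
    return spent_this_month + est_review_cost <= monthly_cap

MONTHLY_CAP = 1000.00          # assumed admin-configured cap, in dollars
spent = 987.50                 # assumed spend tracked by the dashboard

ok = can_run_review(spent, 20.00, MONTHLY_CAP)  # a ~$20 review would exceed the cap
```

In practice the analytics dashboard the article mentions would supply `spent`; the point of the sketch is just that at $15-$25 per review, a cap of a few hundred dollars covers only a few dozen reviews per month.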
[2]
[3]
[4]
[5]