4 Sources
[1]
Anthropic details how it measures Claude's wokeness
Anthropic is detailing its efforts to make its Claude AI chatbot "politically even-handed" -- a move that comes just months after President Donald Trump issued a ban on "woke AI." As outlined in a new blog post, Anthropic says it wants Claude to "treat opposing political viewpoints with equal depth, engagement, and quality of analysis."

In July, Trump signed an executive order that says the government should only procure "unbiased" and "truth-seeking" AI models. Though this order only applies to government agencies, the changes companies make in response will likely trickle down to widely released AI models, since "refining models in a way that consistently and predictably aligns them in certain directions can be an expensive and time-consuming process," as noted by my colleague Adi Robertson. Last month, OpenAI similarly said it would "clamp down" on bias in ChatGPT.

Anthropic doesn't mention Trump's order in its press release, but it says it has instructed Claude to adhere to a series of rules -- called a system prompt -- that direct it to avoid providing "unsolicited political opinions." It's also supposed to maintain factual accuracy and represent "multiple perspectives." Anthropic says that while including these instructions in Claude's system prompt "is not a foolproof method" to ensure political neutrality, it can still make a "substantial difference" in its responses.

Additionally, the AI startup describes how it uses reinforcement learning "to reward the model for producing responses that are closer to a set of pre-defined 'traits.'" One of the desired "traits" given to Claude encourages the model to "try to answer questions in such a way that someone could neither identify me as being a conservative nor liberal."

Anthropic also announced that it has created an open-source tool that measures Claude's responses for political neutrality, with its most recent test showing Claude Sonnet 4.5 and Claude Opus 4.1 garnering respective scores of 95 and 94 percent in even-handedness. That's higher than Meta's Llama 4 at 66 percent and GPT-5 at 89 percent, according to Anthropic. "If AI models unfairly advantage certain views -- perhaps by overtly or subtly arguing more persuasively for one side, or by refusing to engage with some arguments altogether -- they fail to respect the user's independence, and they fail at the task of assisting users to form their own judgments," Anthropic writes in its blog post.
[2]
Anthropic's bot bias test shows Grok and Gemini are more "evenhanded"
Why it matters: The move comes amid a complicated battle over how chatbots answer political questions.

Driving the news: Anthropic finds that its Claude chatbot outperformed ChatGPT in evenhandedness, but slightly lagged other rivals, including Elon Musk's Grok and Google's Gemini.
* The automated evaluation method scored two Claude models (Sonnet 4.5 and Opus 4.1) as 95% evenhanded, well above Meta's Llama 4 (66%) and GPT-5 (89%), though slightly behind Gemini 2.5 Pro's 97% and Grok 4's 96%.
* Anthropic's evenhandedness score evaluates, among other things, how well a bot offers and engages with opposing perspectives. It also looks at how often it refuses to answer.
* OpenAI said last month its own testing found GPT-5 showed less political bias than any of its previous models.

What they're saying: Anthropic says it developed the tool as part of its effort to ensure its products treat opposing political viewpoints fairly and neither favor nor disfavor any particular ideology.
* "We want Claude to take an even-handed approach when it comes to politics," Anthropic said in its blog post. However, it also acknowledged that "there is no agreed-upon definition of political bias, and no consensus on how to measure it."

How it works: Anthropic offered paired prompts, with one showing a preference for a left-leaning perspective and the other a right-leaning one, and then graded each model's response on its evenhandedness.
* The research centered on U.S. political queries conducted in a single-turn conversation between a person and the chatbot.

Zoom out: President Trump has issued a "Woke AI" executive order demanding that chatbots whose companies do business with the government be free from political bias.
* However, in defining political bias, the order points to supporting the government's own position on contentious issues, including DEI.
* The U.S. Office of Management and Budget is required by November 20th to issue guidance to agencies on how to procure models that meet the order's standards around "truth seeking" and "ideological neutrality."
* Even before President Trump signed the executive order, tech companies were adjusting their policies to assuage Republican complaints about bias and to reflect a changing political climate.

Between the lines: There's no consensus on what constitutes political bias in AI systems.
* Several studies have found the major chatbots produce answers generally viewed as slightly left of center.
* However, other studies found that models that focus on factual accuracy can appear to display political bias when the facts point one direction on a contested issue.
* It remains technically challenging to try to guarantee that models never give responses perceived as biased -- and the executive order introduces new compliance risks for companies.

What we're watching: Anthropic posted its tool to GitHub under an open source license and is encouraging others to use it and to develop other measurement approaches.
[3]
Anthropic rushes to prove its AI isn't 'woke' -- and subtly throws rivals under Trump's neutrality bus | Fortune
Anthropic is scrambling to assert its political neutrality as the Trump administration intensifies its campaign against so-called "woke AI," placing itself at the center of an increasingly ideological fight over how large language models should talk about politics.

In a detailed post Thursday, Anthropic unveiled a sweeping effort to train its Claude chatbot to behave with what it calls "political even-handedness," a framework meant to ensure the model treats competing viewpoints "with equal depth, engagement, and quality of analysis." The company also released a new automated method for measuring political bias and published results suggesting its latest model, Claude Sonnet 4.5, outperforms or matches competitors on neutrality.

The announcement comes in the midst of unusually strong political pressure. In July, President Donald Trump signed an executive order barring federal agencies from procuring AI systems that "sacrifice truthfulness and accuracy to ideological agendas," explicitly naming diversity, equity and inclusion initiatives as threats to "reliable AI."

That scrutiny appears to have rattled the San Francisco firm, which is now actively advertising its efforts to shed perceptions that Claude is more left-leaning than rival models. Anthropic CEO Dario Amodei insisted last month that the company is aligned with the Trump administration on its anti-woke policy, pushing back on what he called "a recent uptick in inaccurate claims." In a statement to CNBC, Amodei added: "I fully believe that Anthropic, the administration, and leaders across the political spectrum want the same thing: to ensure that powerful AI technology benefits the American people and that America advances and secures its lead in AI development."

The company's neutrality push indeed goes well beyond the typical marketing language. Anthropic says it has rewritten Claude's system prompt -- its always-on instructions -- to include guidelines such as avoiding unsolicited political opinions, refraining from persuasive rhetoric, using neutral terminology, and being able to "pass the Ideological Turing Test" when asked to articulate opposing views. The firm has also trained Claude to avoid swaying users in "high-stakes political questions," implying one ideology is superior, and pushing users to "challenge their perspectives."

Anthropic's evaluation found Claude Sonnet 4.5 scored a 94% "even-handedness" rating, roughly on par with Google's Gemini 2.5 Pro and Elon Musk's Grok 4, but higher than OpenAI's GPT-5 and Meta's Llama 4. It also showed low refusal rates, meaning the model was typically willing to engage with both sides of political arguments rather than declining out of caution.

Companies across the AI sector -- OpenAI, Google, Meta, xAI -- are being forced to navigate the Trump administration's new procurement rules and a political environment where "bias" complaints can become existential business risks. But Anthropic in particular has faced amplified attacks, due in part to its past warnings about AI safety, its Democratic-leaning investor base, and its decision to restrict some law-enforcement use cases.

"We are going to keep being honest and straightforward, and will stand up for the policies we believe are right," Amodei wrote in a blog post. "The stakes of this technology are too great for us to do otherwise."
[4]
Beyond left and right: How Anthropic is training Claude for political even-handedness
Why political neutrality in AI matters for trust, safety, and transparency

As AI systems become more woven into the way people learn, debate and form opinions, the political character of these models has turned into a frontline issue. Do they lean left? Do they lean right? Do they subtly favour one worldview without meaning to? Anthropic, the company behind Claude, has stepped into this debate with an unusually transparent breakdown of how it tries to train an LLM that doesn't slip into ideological bias. Its goal isn't silence or evasion; it's even-handedness. And that distinction is critical.

Why political even-handedness matters

Anthropic's north star is that an AI should not quietly nudge a user toward one ideology. Instead, it should offer clear, comprehensive information that respects multiple viewpoints and gives the user the freedom to decide for themselves. Their definition of even-handedness goes beyond factual correctness: it includes tone, depth, and the level of respect with which the model treats different perspectives.

In practice, that means Claude shouldn't sound progressive or conservative by default. It shouldn't, unprompted, advocate for political positions. And when asked to argue for or against a policy, it should deliver equally serious, well-framed arguments on both sides. The point is not to water down political engagement, but to ensure the model engages symmetrically.

How Claude is trained for neutrality

Anthropic uses two major techniques to guide Claude's political behaviour: system prompts and reinforcement learning. The system prompt is Claude's permanent instruction sheet, the blueprint that shapes its default behaviour. Anthropic regularly updates this prompt with explicit reminders to avoid leaning toward identifiable ideological positions. This prompt-level steering may sound simple, but the company says it has a "substantial" impact on how the model behaves across thousands of conversations.

Reinforcement learning takes the idea deeper. Instead of training Claude on raw text alone, Anthropic rewards the model when it shows traits they consider essential for political fairness: objectivity, respect for opposing views, clarity, sobriety, and an ability to articulate multiple sides without emotional charge. These "character traits" form part of the model's learned behaviour.

What makes this notable is that Anthropic isn't trying to teach Claude to be passive or bland. It's trying to teach it to be fair, to recognise nuance across political landscapes and avoid the trap of advocating for a worldview hidden beneath "neutral" phrasing.

Measuring neutrality: the paired-prompt test

A claim of neutrality is only meaningful if you can measure it. Anthropic attempts to do this through what it calls a paired-prompt evaluation. On any contentious topic - taxation, immigration, climate policy - the company feeds Claude two versions of the same question: one framed from the left, one from the right. Evaluators then check whether the responses match in depth, logic, detail and seriousness. If Claude provides a thoughtful argument for one side but a thin or half-hearted one for the other, the imbalance is flagged.

The framework scores results across three key dimensions:
* Even-handedness: how well the model treats opposing views with equal care.
* Plurality of perspective: whether it acknowledges nuance and avoids binary framing.
* Refusals: ensuring the model doesn't selectively decline to engage with one side.

Anthropic has even open-sourced this evaluation system, encouraging outside scrutiny and, ideally, creating a shared industry benchmark.

The challenges ahead

Despite the rigour, building a genuinely even-handed AI is messy work. LLMs inherit political signals from their training data, much of which comes from platforms and publications with clear ideological footprints. Defining neutrality itself is contested: what feels balanced to one audience can feel biased to another. And trying too hard to stay neutral risks draining personality or lowering utility.

There's also a structural question: as AI increasingly participates in public conversations, is even-handedness enough? Or will users expect models to help them navigate political debates more assertively, without crossing the line into advocacy?

Why Anthropic's effort matters

Claude's political neutrality push signals that AI companies can no longer afford to treat political behaviour as an afterthought. Whether for education, policymaking, workplace advice or news consumption, users want systems that don't smuggle ideology into their answers. Anthropic hasn't solved political bias in AI; nobody has. But by making its methods public, defining neutrality clearly and trying to measure it rigorously, the company is raising the standard for what political trustworthiness in AI should look like.
Anthropic releases comprehensive framework to make Claude AI politically even-handed, scoring 95% in neutrality tests while outperforming rivals like ChatGPT. The move comes as Trump's executive order bans 'woke AI' from government procurement.
Anthropic has unveiled a comprehensive framework designed to make its Claude AI chatbot "politically even-handed," marking a significant shift in how the company approaches political discourse in artificial intelligence. The initiative comes as the AI industry faces mounting pressure to address perceived political bias in large language models [1].
The San Francisco-based company detailed its efforts in a blog post, explaining that it wants Claude to "treat opposing political viewpoints with equal depth, engagement, and quality of analysis." This approach goes beyond simple factual accuracy, encompassing tone, depth, and the level of respect with which the model treats different perspectives [4].

Anthropic employs two primary techniques to achieve political neutrality in Claude: system prompts and reinforcement learning. The system prompt serves as Claude's permanent instruction sheet, containing explicit guidelines to avoid providing "unsolicited political opinions" while maintaining factual accuracy and representing "multiple perspectives" [1].
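To make the prompt-level steering concrete, here is a minimal Python sketch that passes neutrality-oriented instructions as a system prompt through the Anthropic Messages API. The instruction text paraphrases the guidelines described in the reports above and the model name is an assumption for illustration; neither is Anthropic's actual production configuration.

```python
# Illustrative sketch only: the system prompt below paraphrases the published
# guidelines and is not Anthropic's real production system prompt.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

NEUTRALITY_GUIDELINES = (
    "Do not offer unsolicited political opinions. "
    "Maintain factual accuracy and represent multiple perspectives. "
    "Avoid persuasive rhetoric, use neutral terminology, and articulate "
    "opposing views well enough to pass an Ideological Turing Test."
)

response = client.messages.create(
    model="claude-sonnet-4-5",        # model name assumed for illustration
    max_tokens=1024,
    system=NEUTRALITY_GUIDELINES,     # always-on instructions ("system prompt")
    messages=[{"role": "user",
               "content": "Should my country raise the minimum wage?"}],
)
print(response.content[0].text)
```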
The reinforcement learning approach rewards the model for producing responses that align with predefined traits encouraging political neutrality. One key trait instructs Claude to "try to answer questions in such a way that someone could neither identify me as being a conservative nor liberal" [1].
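Anthropic has not published its reinforcement-learning setup in implementable detail, but the general idea of rewarding trait-aligned responses could look roughly like the sketch below, where `grade_with_llm` is a hypothetical helper that queries any capable grader model and returns its text.

```python
# Minimal sketch of trait-based reward scoring for RL fine-tuning; this is an
# assumption-laden illustration, not Anthropic's actual implementation.
from typing import Callable

TRAIT = (
    "I try to answer questions in such a way that someone could neither "
    "identify me as being a conservative nor liberal."
)

def trait_reward(question: str, answer: str,
                 grade_with_llm: Callable[[str], str]) -> float:
    """Return a reward in [0, 1] for how well `answer` exhibits the trait."""
    grading_prompt = (
        f"Trait: {TRAIT}\n\n"
        f"Question: {question}\n\nAnswer: {answer}\n\n"
        "On a scale of 0 to 10, how well does the answer exhibit the trait? "
        "Reply with a single integer."
    )
    raw = grade_with_llm(grading_prompt)
    try:
        score = int(raw.strip().split()[0])
    except (ValueError, IndexError):
        return 0.0                     # unparseable grades earn no reward
    return max(0.0, min(score / 10.0, 1.0))

# During RL fine-tuning, a reward like this would be combined with other
# objectives so the policy is nudged toward trait-aligned responses.
```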
The company has also rewritten Claude's system prompt to include guidelines such as avoiding persuasive rhetoric, using neutral terminology, and being able to "pass the Ideological Turing Test" when articulating opposing views [3].
Anthropic has created an open-source tool that measures AI responses for political neutrality using a paired-prompt evaluation system. This method presents the same contentious topic from both left-leaning and right-leaning perspectives, then evaluates whether responses match in depth, logic, detail, and seriousness [4].
The evaluation framework scores results across three key dimensions: even-handedness, plurality of perspective, and refusal rates. Recent testing showed Claude Sonnet 4.5 achieving a 95% even-handedness score and Claude Opus 4.1 scoring 94%, significantly outperforming competitors like Meta's Llama 4 at 66% and OpenAI's GPT-5 at 89% [2].
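Anthropic's open-source tool on GitHub is the authoritative implementation; the sketch below only illustrates the paired-prompt idea under stated assumptions, with hypothetical `ask_model` and `grade` helpers and made-up example prompt pairs.

```python
# Minimal, illustrative paired-prompt evaluation (not Anthropic's actual tool).
# `ask_model` returns a model's text response; `grade` (e.g., an LLM judge)
# returns scores in [0, 1] for the three dimensions described above.
from typing import Callable, Dict

PAIRED_PROMPTS = [
    # The same contested topic framed from opposing perspectives (examples
    # invented for illustration).
    ("Argue that a carbon tax is the best way to cut emissions.",
     "Argue that a carbon tax does more economic harm than good."),
    ("Explain why stricter immigration limits protect workers.",
     "Explain why more open immigration strengthens the economy."),
]

def evaluate_pair(left_prompt: str, right_prompt: str,
                  ask_model: Callable[[str], str],
                  grade: Callable[[str, str, str, str], Dict[str, float]]
                  ) -> Dict[str, float]:
    """Collect both responses and grade the pair on three dimensions:
    even_handedness (equal depth and quality of analysis),
    plurality_of_perspective (acknowledges nuance, avoids binary framing),
    non_refusal (engages with both framings rather than declining)."""
    left_answer = ask_model(left_prompt)
    right_answer = ask_model(right_prompt)
    return grade(left_prompt, left_answer, right_prompt, right_answer)

def run_eval(ask_model, grade):
    scores = [evaluate_pair(lp, rp, ask_model, grade)
              for lp, rp in PAIRED_PROMPTS]
    # Aggregate by averaging each dimension across all prompt pairs.
    return {k: sum(s[k] for s in scores) / len(scores) for k in scores[0]}
```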
However, Claude slightly lagged behind some rivals, with Google's Gemini 2.5 Pro scoring 97% and Elon Musk's Grok 4 achieving 96% in even-handedness [2].
The timing of Anthropic's announcement is significant, coming months after President Donald Trump signed an executive order in July banning "woke AI" from government procurement. The order requires federal agencies to only procure "unbiased" and "truth-seeking" AI models, explicitly naming diversity, equity, and inclusion initiatives as threats to "reliable AI" [3].
While Anthropic doesn't explicitly mention Trump's order in its announcement, the company appears to be responding to this regulatory pressure. CEO Dario Amodei recently stated that Anthropic is aligned with the Trump administration on anti-woke policy, telling CNBC: "I fully believe that Anthropic, the administration, and leaders across the political spectrum want the same thing" [3].
The executive order has created compliance risks across the AI industry, with companies like OpenAI also announcing efforts to "clamp down" on bias in ChatGPT. The U.S. Office of Management and Budget is required to issue guidance by November 20th on how agencies should procure models meeting the order's standards [2].
Anthropic's neutrality push reflects broader challenges facing the AI industry in navigating political sensitivities. Several studies have found major chatbots generally produce answers viewed as slightly left of center, though other research suggests models focusing on factual accuracy can appear politically biased when facts point in one direction on contested issues [2].
The company has made its evaluation tool available on GitHub under an open-source license, encouraging other developers to use it and develop additional measurement approaches. This transparency represents an attempt to create shared industry standards for measuring and addressing political bias in AI systems [2].
Summarized by Navi