2 Sources
[1]
White Circle raises $11m Seed for production AI control platform
Senior figures from OpenAI, Anthropic, DeepMind, Hugging Face, Mistral, Datadog and Sentry have backed the Seed. The platform claims more than one billion API requests served and lists Lovable and two of the world's largest digital banks among its customers. White Circle, a platform that helps companies monitor, secure and control AI models in production, has raised $11m in Seed funding, the company announced today. The capital will fund product development and hiring across the US, UK and Europe. The round features personal investments from a notable list of AI-industry figures: Romain Huet (OpenAI); Dirk Kingma (formerly OpenAI, now at Anthropic); Guillaume Lample (Mistral); Thomas Wolf (Hugging Face); Olivier Pomel (Datadog); François Chollet (creator of Keras); Mehdi Ghissassi (formerly DeepMind); Paige Bailey (DeepMind); and David Cramer (Sentry). The participation list reads as senior practitioners from the labs that build the models White Circle is designed to police. White Circle was founded by engineer Denis Shilov, who went viral in 2024 after a single prompt bypassed the safety filters of every major AI model. The post reached 1.4 million views, prompted contact from Anthropic, OpenAI and Hugging Face, and led to Shilov joining Anthropic's bug-bounty programme. The White Circle platform was built to address the gap his demonstration exposed. The product is a single-API control layer that scans AI inputs and outputs in real time against customer-defined policies. It detects harmful content, catches hallucinations, blocks prompt-injection attacks, flags model drift and identifies abusive users. Customers can set custom enforcement actions, including rate-limiting and bans, and feed labelled user feedback back into White Circle's models to improve accuracy over time. The platform supports 150 languages and is SOC 2 Type I and Type II certified and HIPAA-compliant. Head of design is Elena Iumagulova. Shilov said in the announcement that the platform addresses an accountability gap that has grown alongside the rapid expansion of AI deployments. "AI is moving faster than our ability to guide it," Shilov said. "We already trust it with decisions that touch millions of people, from hiring to healthcare, finance and security. With White Circle, we're finally giving companies everything they need to hold their AI accountable and optimise their models in a single place, without sacrificing security, compliance or risk." The company has released two pieces of research alongside the platform. CircleGuardBench, published in May 2025, is a benchmark that tests how AI moderation models perform under real-world conditions. KillBench ran more than one million experiments across 15 AI models from OpenAI, Google, Anthropic and xAI; the study found preferences linked to nationality, religion, body type and even phone brand when models were asked to make decisions about human lives. KillBench also documented that structured-output integrations, the standard for production AI deployments, caused refusal rates to collapse and biases to amplify. Ophelia Cai, partner at Tiny VC, said in the announcement that the team had built infrastructure the industry needs. "Denis and the White Circle team have an unusual combination of deep technical credibility and a clear commercial instinct," Cai said. The platform has served more than one billion API requests to date. Customers named in the announcement include Lovable and two of the world's largest digital banks (not specifically identified). White Circle did not disclose its post-money valuation or revenue.
[2]
Exclusive: White Circle raises $11 million to stop AI models from going rogue | Fortune
One evening in late 2024, Denis Shilov was watching a crime thriller when he had an idea for a prompt that would break through the safety filters of every leading AI model. The prompt was what researchers call a universal jailbreak, meaning it could be reused to get any model to bypass their own guardrails and produce dangerous or prohibited outputs, like instructions on how to make drugs or build weapons. To do so, Shilov simply told the AI models to stop acting like a chatbot with safety rules and instead behave like an API endpoint, a software tool that automatically takes in a request and sends back a response. The prompt reframed the model's job as simply answering, rather than deciding whether a request should be rejected, and made every leading AI model comply with dangerous questions it was supposed to refuse. Shilov posted about it on X and, by the next morning, it had gone viral. The social media success brought with it an invitation from companies Anthropic to test their models privately, something that convinced Shilov that the issue was bigger than just finding these problematic prompts. Companies were beginning to integrate AI models into their workflows, Shilov told Fortune, but they had few ways to control what those systems did once users started interacting with them. "Jailbreaks are just one part of the problem," Shilov said. "In as many ways people can misbehave, models can misbehave too. Because these models are very smart, they can do a lot more harm." White Circle, a Paris-based AI control platform that has now raised $11 million, is Shilov's answer to the new wave of risks posed by AI models in company workflows. The startup builds software that sits between a company's users and its AI models, checking inputs and outputs in real time against company-specific policies. The new seed funding comes from a group of backers that includes Romain Huet, head of developer experience at OpenAI; Durk Kingma, an OpenAI cofounder now at Anthropic; Guillaume Lample, cofounder and chief scientist at Mistral; and Thomas Wolf, cofounder and chief science officer at Hugging Face. White Circle said the funding will be used to expand its team, accelerate product development, and grow its customer base across the U.S., U.K., and Europe. The startup currently has a team of 20, distributed across London, France, Amsterdam, and elsewhere in Europe. Shilov said almost all of them are engineers. White Circle's main product is a real-time enforcement layer for AI applications. If a user tries to generate malware, scams, or other prohibited content, the system can flag or block the request. If a model starts hallucinating, leaking sensitive data, promising refunds it cannot issue, or taking destructive actions inside a software environment, White Circle says its platform can catch that too. "We're actually enforcing behavior." Shilov said. "Model labs do some safety tuning, but it's very general and typically about the model refraining from answering questions about drugs and bioweapons. But in production, you end up having a lot more potential issues." White Circle is betting that AI safety will not be solved entirely at the model-training stage. As businesses embed models into more products, Shilov said the relevant question is no longer just whether OpenAI, Anthropic, Google, or Mistral can make their models safer in the abstract; it is whether a healthcare company, bank, legal app, or coding platform can control what an AI system is allowed to do in its own environment. As companies transition from using chatbots to autonomous AI agents that can write code, browse the web, access files, and take actions on a user's behalf, Shilov said the risks become much more widespread. For example, a customer service bot might promise a refund that it is not authorized to give, a coding agent might install something dangerous on a virtual machine, or a model embedded in a fintech app might mishandle sensitive customer information. To avoid these issues, Shilov says companies relying on foundational models need to define and enforce what good AI behavior looks like inside their own products, instead of relying on the AI labs' safety testing. White Circle says its platform has processed more than one billion API requests and is already used by Lovable, the vibe-coding startup, as well as several fintech and legal companies. Shilov said that model providers have mixed incentives to build the kind of real-time control layer White Circle provides. AI companies still charge for input and output tokens even when a model refuses a harmful request, he said, which reduces the financial incentive to block abuse before it reaches the model. He also pointed to what researchers call the alignment tax, the idea that training models to be safer can sometimes make them less performant on tasks such as coding. "They have a very interesting choice of training safer and more secure models versus more performant models," Shilov said. "And then there is always a problem with trust. Why would you trust Anthropic to judge Anthropic's model outputs?" White Circle's research arm has also tried to illustrate the new risks. In May, the company published KillBench, a study that ran more than one million experiments across 15 AI models, including models from OpenAI, Google, Anthropic, and xAI, to test how systems behaved when forced to make decisions about human lives. In the experiments, models were asked to choose between two fictional people in scenarios where one had to die, with details such as nationality, religion, body type, or phone brand changed between prompts. White Circle said the results showed models making different choices depending on those attributes, suggesting hidden biases can surface in high-stakes settings even when models appear neutral in ordinary use. The company also said the effect became worse when models were asked to give their answers in a format that software can easily read, such as choosing from a fixed set of options or filling out a form, which is a common way companies plug AI systems into real products. This kind of research has also helped White Circle pitch itself as an outside check on how models behave once they leave the lab. "Denis and the White Circle team have an unusual combination of deep technical credibility and a clear commercial instinct," said Ophelia Cai, partner at Tiny VC. "The KillBench research alone shows what's possible when you approach AI safety empirically."
Share
Copy Link
Paris-based White Circle has raised $11 million in seed funding for its AI control platform that monitors and secures AI models in production. Founded by Denis Shilov, who went viral after exposing safety vulnerabilities in major AI models, the platform has already processed over one billion API requests. Backers include senior figures from OpenAI, Anthropic, Mistral, and Hugging Face.
White Circle, an AI control platform designed to monitor and secure AI models in production, has raised $11 million in seed funding from a roster of prominent AI industry figures. The Paris-based startup attracted personal investments from Romain Huet of OpenAI, Durk Kingma (formerly OpenAI, now at Anthropic), Guillaume Lample of Mistral, Thomas Wolf of Hugging Face, Olivier Pomel of Datadog, François Chollet (creator of Keras), and David Cramer of Sentry
1
2
. The participation list reads as a who's who from the labs that build the very models White Circle is designed to police. The capital will fund product development and hiring across the US, UK, and Europe, with the company currently operating a team of 20, mostly engineers, distributed across London, France, Amsterdam, and elsewhere in Europe2
.
Source: Fortune
White Circle was founded by Denis Shilov, an engineer who gained widespread attention in late 2024 after discovering a universal jailbreak that bypassed the safety filters of every major AI model. While watching a crime thriller one evening, Shilov conceived a prompt that reframed AI models as API endpoints rather than chatbots with safety rules, making them comply with dangerous requests they were supposed to refuse
2
. His post on X reached 1.4 million views and prompted contact from Anthropic, OpenAI, and Hugging Face, leading to his participation in Anthropic's bug-bounty programme1
. The experience convinced Shilov that jailbreaks represented just one facet of a much larger problem facing companies integrating AI into their workflows. "In as many ways people can misbehave, models can misbehave too. Because these models are very smart, they can do a lot more harm," Shilov explained2
.The White Circle platform functions as an enforcement layer between users and AI models, providing real-time scanning of AI inputs and outputs against customer-defined policies. The single-API control layer detects harmful content, catches instances where models hallucinate, blocks prompt-injection attacks, flags model drift, and identifies abusive users
1
. If a user attempts to generate malware, scams, or other prohibited content, the system can flag or block the request. When a model starts leaking sensitive data, promising refunds it cannot issue, or taking destructive actions inside a software environment, White Circle's platform can intervene2
. Customers can set custom enforcement actions, including rate-limiting and bans, and feed labelled user feedback back into White Circle's models to improve accuracy over time. The platform supports 150 languages and is SOC 2 Type I and Type II certified and HIPAA-compliant1
.As companies transition from simple chatbots to autonomous AI agents that can write code, browse the web, access files, and take actions on behalf of users, the risks become more widespread. A customer service bot might promise a refund it is not authorized to give, a coding agent might install something dangerous on a virtual machine, or a model embedded in a fintech app might mishandle sensitive customer information
2
. Shilov argues that AI safety will not be solved entirely at the model-training stage, and that companies need to define and enforce what good AI behavior looks like inside their own products rather than relying solely on AI labs' safety testing. "We're actually enforcing behavior," Shilov said. "Model labs do some safety tuning, but it's very general and typically about the model refraining from answering questions about drugs and bioweapons. But in production, you end up having a lot more potential issues"2
.Related Stories
White Circle has released two pieces of research that underscore the accountability gap in AI deployments. CircleGuardBench, published in May 2025, is a benchmark that tests how AI moderation models perform under real-world conditions
1
. KillBench ran more than one million experiments across 15 AI models from OpenAI, Google, Anthropic, and xAI, finding preferences linked to nationality, religion, body type, and even phone brand when models were asked to make decisions about human lives. The study also documented that structured-output integrations, the standard for production AI deployments, caused refusal rates to collapse and biases to amplify1
. These findings highlight why companies need additional layers of control beyond what model providers offer.The platform has already served more than one billion API requests and lists Lovable, the vibe-coding startup, and two of the world's largest digital banks among its customers
1
2
. Shilov pointed out that model providers have mixed incentives to build the kind of real-time control layer White Circle provides. AI companies still charge for input and output tokens even when a model refuses a harmful request, which reduces the financial incentive to block abuse before it reaches the model. He also referenced what researchers call the alignment tax—the idea that training models to be safer can sometimes make them less performant on tasks such as coding2
. This dynamic creates an opening for third-party solutions that can prevent prompt-injection attacks and other security issues without compromising model performance. Ophelia Cai, partner at Tiny VC, noted that "Denis and the White Circle team have an unusual combination of deep technical credibility and a clear commercial instinct"1
. As businesses embed models into more products across healthcare, finance, legal, and coding platforms, the ability to control what an AI system is allowed to do in specific environments becomes increasingly critical for managing risk and maintaining compliance.Summarized by
Navi
[1]
1
Technology

2
Policy and Regulation

3
Policy and Regulation
