10 Sources
[1]
An OpenAI model solved a famous math problem that stumped humans for 80 years
In mid-May, OpenAI announced that an internal AI model had disproved the Erdős unit distance conjecture, a famous problem in discrete geometry that had stumped human mathematicians for the last 80 years. OpenAI gave several mathematicians early access to the result and published their reactions. Tim Gowers -- who won the Fields Medal, the most prestigious prize in mathematics -- wrote that "there is no doubt that the solution to the unit-distance problem is a milestone in AI mathematics." University of Toronto professor Daniel Litt wrote that "this is the first example of a result produced autonomously by an AI that I find exciting in itself, as opposed to as a leading indicator." It's arguably the first time that an AI system has found a proof resolving a major open conjecture. That's impressive, but I don't view it as a radical break from the previous trajectory of AI progress in mathematics. Three years ago, LLMs struggled to solve arithmetic problems. It was only last year that LLMs started acing high school mathematics competitions. When I attended the Joint Mathematics Meetings -- the largest annual mathematics conference in the world -- in January, I learned that AI systems were starting to contribute to mathematical research, but only in constrained settings. It took significant human interpretation to turn an AI output into a publishable theorem. OpenAI's new result is the next step in this progression. The AI model cleverly applied existing ideas drawn from several subfields of mathematics to create a full proof. But it didn't pioneer any genuinely new techniques. The result has since been cleaned up and extended by human mathematicians. This points to a medium-term future where human mathematicians and AI models complement each other: AIs have a broader knowledge of past work than any human alive and much more willingness to grind through tedious proof strategies that aren't likely to work. 
But humans can still think more deeply about any one problem and ask more interesting questions. That might not last. AI systems have been improving at math so rapidly that it's unclear what role, if any, human mathematicians will play a decade from now.
The unit distance problem
Paul Erdős was one of the most prolific mathematicians in history. He wrote over 1,500 papers in his lifetime, the most ever. One of his greatest talents was coming up with problems that are simple to state but have deep roots. In 1946, he introduced the unit distance problem. Imagine you have some points in a 2D plane and you measure the distance between each pair of points: In this diagram, there are five points and ten pairs of points. Three pairs happen to be exactly 1 unit apart: AD, BE, and CE. Can we rearrange the points so that more pairs of points are exactly 1 unit apart? Yes. For instance, we could move points A and D to be closer to the B, C, and E cluster. With a bit more work, we could further rearrange the points so that there are seven pairs exactly one unit apart. But that's the most we can do. We could do the same analysis with 6 points, 7 points, and so on. But as the number of points grows, the problem very quickly becomes too complicated to find the exact answer. So instead of asking exactly how many unit distances are possible for a given number of points, Erdős tried to calculate upper and lower bounds on the number of length-one lines for n points, assuming that n is a large number. To help calculate a lower bound, Erdős assumed that the points would be laid out in a grid. This is probably not the optimal layout, but if he could demonstrate that points in a grid have a certain number of pairs with unit distance, then the optimal arrangement must have at least that number. The simplest option is to space the grid so that every point is distance 1 from its neighbors directly above, below, left, and right.
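As a minimal sketch of the counting described here (not code from the article), the following Python brute-forces the number of unit-distance pairs in a set of points with integer coordinates. The names `count_unit_pairs` and `square_grid` are hypothetical, chosen only for illustration. Applied to a grid with spacing 1, it shows the count staying a little under twice the number of points, so this simplest layout grows only linearly.

```python
from itertools import combinations


def count_unit_pairs(points, unit_sq=1):
    """Count unordered pairs whose squared distance equals unit_sq.

    Assumes integer coordinates, so the comparison is exact.
    """
    return sum(
        1
        for (x1, y1), (x2, y2) in combinations(points, 2)
        if (x1 - x2) ** 2 + (y1 - y2) ** 2 == unit_sq
    )


def square_grid(side):
    """All points (x, y) of a side-by-side grid with spacing 1."""
    return [(x, y) for x in range(side) for y in range(side)]


for side in (2, 3, 10):
    pts = square_grid(side)
    print(len(pts), count_unit_pairs(pts))
# 4 4
# 9 12
# 100 180
```

For a side-by-side grid the count is 2 * side * (side - 1): each row and each column contributes side - 1 adjacent pairs.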
However, Erdős saw that you could do even better if you took diagonals into account. If you make the grid spacing smaller, you can make each point be distance 1 from a greater number of neighbors. In the diagram above, if the grid spacing is 1, then each individual point is one unit away from four neighbors (the left panel). Instead, if the grid spacing is ⅕ (as shown on the right), then each individual point is one unit away from 12 neighbors: OpenAI's write-up of its new result included a confusing diagram showing points in a grid with a bunch of lines connecting them. The diagram becomes easier to understand if we superimpose a circle like this: This works because of the Pythagorean theorem, which states that if we have a point that is a units to the right and b units above another point, the distance c between those two points satisfies a² + b² = c². The trick is to choose some number c² so that there are a whole bunch of pairs of whole numbers a and b such that a² + b² = c². Then, if we scale the grid down so that each point is 1/c from its neighbors, there will be a bunch of unit distances. For example, if we choose c² = 25, then the Pythagorean equation can be satisfied by either 0² + 5² = 25 or 3² + 4² = 25. This corresponds to the 12-grid-point circle I showed earlier, with points at (0,5), (3,4), (4,3), (5,0), (-4,3), (-3,4), and so forth. (Technically, these lengths should all be divided by 5 -- (⅗, ⅘) for example -- but I'm leaving the denominators out for clarity.) OpenAI's diagram is based on choosing c² = 65, which can be satisfied by either 1² + 8² = 65 or 4² + 7² = 65. This means that if the grid spacing is 1/√65, each point will be one unit away from 16 other points: (1,8), (4,7), (7,4), (8,1), (-1,8), (-4,7), and so forth. Larger values for c² -- if they're chosen carefully -- enable more whole-number diagonals and hence more unit-distance pairs. 
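The arithmetic above can be checked with a short Python sketch (an illustration, not code from OpenAI or the article). The hypothetical helper `lattice_vectors` lists the whole-number pairs (a, b) with a² + b² = c², and `unit_pairs_in_grid` counts how many pairs of points in a square integer grid are separated by one of those vectors. Working in unscaled integer coordinates is an assumption made for exactness: a distance of c grid steps corresponds to one unit after the grid is shrunk by a factor of c.

```python
from math import isqrt


def lattice_vectors(c_squared):
    """All integer vectors (a, b) with a*a + b*b == c_squared."""
    r = isqrt(c_squared)
    return [
        (a, b)
        for a in range(-r, r + 1)
        for b in range(-r, r + 1)
        if a * a + b * b == c_squared
    ]


def unit_pairs_in_grid(side, c_squared):
    """Unordered point pairs at squared distance c_squared in a side x side grid.

    A vector (a, b) fits at (side - |a|) * (side - |b|) starting positions.
    Every unordered pair is seen twice, once per direction, hence the // 2.
    """
    total = sum(
        max(side - abs(a), 0) * max(side - abs(b), 0)
        for a, b in lattice_vectors(c_squared)
    )
    return total // 2


for c_squared in (1, 25, 65):
    print(c_squared, len(lattice_vectors(c_squared)))
# 1 4
# 25 12
# 65 16

print(unit_pairs_in_grid(36, 65))
# 7632
```

The last number agrees with the 7,632 unit distances the article cites for a square 1,296-point (36 by 36) grid, which suggests, though the article does not say so, that the comparison used c² = 65. Points near the edge have fewer than 16 neighbors inside the grid, which is why the total is well below 16 × 1,296 / 2.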
However, if c² is too large compared to the number of points in the grid, then many of the potential one-unit-away neighbors will be outside the grid. In short, we want to choose a c² that's large enough but not too large. Using insights from number theory, including Jacobi's two-square theorem, Erdős was able to show that an optimally sized circle will enable the number of unit-distance pairs to grow faster than the number of points, but only barely. The question became "can you do better?" To find an upper bound, Erdős used an argument from a quite different area of mathematics called graph theory to show that you could only have so many unit distances. But his upper bound grows much, much faster than the best lower bound he was able to construct. Erdős's conjecture was that the actual optimum was much closer to the lower bound than the upper one. He predicted, but couldn't prove, that the maximum number of unit-distance pairs grows just barely faster than the number of points. To be more precise, Erdős conjectured that the number of unit distances would be n^(1+o(1)). In other words, for a sufficiently large n, the maximum number of unit distances would be less than n^(1+𝜖) for any 𝜖 > 0. That could end up growing a little faster than his lower-bound construction -- which was n^(1 + C/(log log n)) for some constant C -- but within the same general ballpark. Proving his guess became known as the unit distance problem. For the next 80 years, it looked like Erdős was right. Then an OpenAI model proved him wrong.
The AI's approach
Erdős's conjecture assumed that, at least for a large number of points, a square grid could yield about as many unit-distance pairs as organizing the points in other ways. OpenAI's AI proved this wrong by demonstrating that there was another, more complex way to organize n points that allowed more pairs to be exactly one unit apart. Precisely because the new pattern of points is more complicated, it's tricky to explain it concisely.
But you can think of it as a clever modification of Erdős's grid. The AI constructed a grid in a high-dimensional space and then projected this more complex structure into two dimensions. And instead of using a whole-number grid with points like (1,3) or (-3,6), the AI construction used something called algebraic integers to build this more complicated grid. It turns out that this kind of higher-dimensional grid has richer structure, which allows the AI to pack more unit distances into the same number of points. It's hard to illustrate this alternative arrangement of points because it only becomes advantageous with a very large number of points. But here's a simpler arrangement of points that was constructed in a similar way. It has 1,345 points and only produces 5,916 unit distances, fewer than the 7,632 unit distances that a square 1,296-point grid produces using the Erdős technique. But I think it gives a sense of how a pattern that isn't a grid could produce more unit distances than a square grid. The more complicated patterns pay off. While the OpenAI model's proof does not explicitly state how many unit-distance pairs are possible for n points, human mathematician Will Sawin was able to show that it grows at least at the rate of n^1.014. This might seem small, but as n gets really big, this number will become much larger than the counts produced by the Erdős approach. That being said, the AI's result doesn't completely resolve the problem. Our best upper bound for the number of unit distances is around n^1.333. More work is needed to close this gap.
How does this result fit into AI for mathematics?
If you'd asked me two weeks ago -- before OpenAI's announcement -- about the most novel contributions of LLMs to mathematics, I probably would have pointed to the AlphaEvolve system from Google DeepMind. AlphaEvolve harnesses LLMs to be the engine of an optimization process.
If you can turn a math problem into a piece of code to optimize, which you often can, the LLM might find better solutions than humans have for certain types of problems. In November, four mathematicians (including Terence Tao) released a paper that analyzed AlphaEvolve's performance on 67 optimization problems across the mathematical literature. They found that AlphaEvolve was able to improve on the established literature in some cases. This was a step up in autonomy from previous LLM contributions, such as literature review, but it still required humans to frame it as an optimization problem and turn the AI's output into usable mathematics. And only certain types of problems are amenable to this approach. More conceptual questions that don't include a number to optimize can't easily be studied with AlphaEvolve. So AI companies have been working to develop LLM systems that can directly output a correct solution to any math problem. OpenAI's result is a substantial step in that direction. But it also fits the pattern of previous AI-assisted mathematics. For one thing, other companies have also worked to solve Erdős problems. Because Erdős posed hundreds of problems over his career -- and because mathematician Thomas Bloom has organized an effort to compile all of them at www.erdosproblems.com -- AI companies have used them as a testing ground to evaluate AI systems. In January, Cambridge undergraduate Kevin Barreto worked with a friend to ask GPT-5.2 and Harmonic's Aristotle to produce the first autonomous solution of an Erdős problem. On May 22, two days after OpenAI's announcement, Google announced that its AI system had solved nine open Erdős problems, including two that had been open for over 50 years. To be clear, the problem that OpenAI solved is more impressive than any of the other work I just mentioned. But OpenAI's solution is more in line with past AI efforts than the headline result might suggest. 
One reason the unit distance problem was unsolved for 80 years, despite being so well known, is that most people thought Erdős's conjecture was true. But the mathematical tools we have are nowhere close to being able to prove Erdős's bound. So mathematicians expected that any proof of the conjecture would involve major new ideas or approaches. Instead, as we've seen, the AI disproved the conjecture by making an extension of Erdős's initial construction. It was a clever and nonobvious solution, but it also bore some similarity to the kind of optimization work done by a system like AlphaEvolve. This dynamic is reflected in some of the mathematicians' responses. Mathematician Tim Gowers wrote that when he first heard about the AI's result, he thought it had proved the theorem. "I spent the evening adjusting my world view: If the AI could come up with a proof like that, then maybe it would be all over for mathematicians very soon." But the next morning, Gowers and other external reviewers received an email about the result, and he realized that the LLM "had disproved the conjecture rather than proving it, which came as a big relief." OpenAI's solution also had two properties that played to the strengths of AI models relative to humans. First, the eventual solution relied on applying sophisticated techniques from a quite different area of mathematics: algebraic number theory. AI systems have been trained on huge swaths of mathematics -- and there's a lot of math out there -- so they have a broader knowledge of previous mathematical work than any human in the world. For a human to solve this, they would have needed to have the relevant algebraic number theory knowledge while also being interested in the unit distance problem, a rare combination. Second, the reasoning process was such a grind, and seemingly unlikely to succeed, that most humans would not have thought it worth the trouble. 
Jacob Tsimerman, a University of Toronto professor, remarked in the OpenAI document that he had briefly considered taking a similar approach to disprove the conjecture. But that type of technique "consumes much time and frequently doesn't work out," so he abandoned the project. An AI, on the other hand, can work through many proof strategies that don't work out before discovering one that does. OpenAI could have run the problem many times before a model found a solution. Indeed, an OpenAI chart revealed that even with the maximum token budget, the internal model solves the problem only half of the time. To be clear, what the AI system did is still impressive. "It's always tempting to look at a completed proof and declare it obvious after the fact," Tsimerman said later in his remark. But as I noted previously, it also played to the strengths of AI systems. In the short to medium term, this points to a world where AI models complement humans but do not replace them. AI systems will tackle lists of problems curated by human mathematicians or aid humans in finding relevant approaches from seemingly unrelated mathematical fields. But they won't immediately displace the human role in choosing which questions to ask or developing wholly new techniques. Even this result was very much a human-AI collaboration. While the AI system found the proof on its own, human mathematicians verified the result. Other humans came up with better-written proofs that extended the AI's initial ideas, like Will Sawin finding an explicit lower bound as I mentioned above. It's unclear how long this complementarity will last, however. Gowers spent the rest of his comment exploring whether the relief he felt on hearing that AI had disproved the conjecture was justified. 
He more or less concluded that it was, but in a footnote, he wrote that he would guess "that AI will soon reach a high level at other activities such as building theories, formulating definitions and asking interesting questions." In the past year, we've gone from AI systems that hadn't yet beaten high school mathematics competitions to ones that can advance mathematics in interesting ways. It seems likely that AI systems will continue to become more autonomous when working on mathematical problems. At the same time, we haven't fully explored what current models can achieve in math. Soon after OpenAI's announcement, University of Michigan postdoc Xiao Ma found that GPT-5.5 was also able to prove Erdős wrong if given a small hint. If a generally available model could disprove this famous conjecture and no one noticed, what other discoveries could happen today that no one has thought to try?
Kai Williams is a reporter for Understanding AI, a Substack newsletter founded by Ars Technica alum Timothy B. Lee. His work is supported by a Tarbell Fellowship.
[2]
Mathematicians sign declaration to rein in AI use
Last month many mathematicians were shocked by OpenAI's announcement that artificial intelligence had solved geometry's famous "unit distance" problem. For some, the achievement was exciting. But researchers also worry that AI technology, if left unchecked, will change their field for the worse. To address those fears, a group of mathematicians, computer scientists, and math historians have released guidelines to prevent AI from steamrolling their discipline. Among their most important prescriptions: disclose the use of AI in research, ensure all papers are peer-reviewed and level the playing field between academia and for-profit companies through, for instance, legal resources and public funding. The mathematicians have been working on the document since last fall, when around 60 researchers and policymakers convened at Leiden University's Lorentz Center in the Netherlands to discuss how technology will affect mathematics. At the top of many attendees' minds was the accelerating stream of proofs written partially or entirely by AI. Used responsibly, AI "can be extremely useful and helpful," says Ilka Agricola, a mathematician who chairs the Committee on Publishing at the International Mathematical Union (IMU) -- the foremost organization for global mathematics. "Unfortunately, this positive aspect is kind of getting small compared to the huge mess around it."
Journal editors' inboxes are filling with more AI proofs than they can vet. Large language models regurgitate human ideas, often without attribution. Some fear for the integrity of research itself. They worry that values like transparency and accessibility, which mathematicians have long prioritized, are in danger. For example, almost every modern paper in math can be read for free on arXiv.org, and the American Mathematical Society hosts its own curated repository of mathematical papers, books and reviews. Commitment to these principles allows anyone on Earth to see and build on new research, says Jim Portegies, a mathematician at the Eindhoven University of Technology in the Netherlands. But tech companies, he says, often keep key details private. For instance, when Google DeepMind announced in 2024 that its AI model AlphaProof had solved three difficult math competition problems, it took more than a year before the methods were published in a peer-reviewed journal. Often, when it comes to AI proofs, "we retreat behind closed doors because there is now a lot of commercial interest," Portegies says. To try to combat these trends, participants at the Leiden workshop decided to work together on a joint statement modeled after similar documents on open science and data management. They called it the "Leiden Declaration on Artificial Intelligence and Mathematics." Though all the authors shared some basic concerns, wrangling them into a statement that everyone was happy with was a challenge. "It was a long, arduous process with a lot of lively discussion," says Rodrigo Ochigame, an anthropologist of AI at Leiden University. "I don't think I've ever been part of a writing process that involved so much debate for such a short text." In the final 11-page document, the authors lay out what they value about mathematics research, how those values are threatened by AI and how to address the situation. 
For instance, one of their concerns was that, whereas a human-written proof can be verified by anyone with the right expertise, AI is given to subtle, hard-to-spot errors; policies that subject AI proofs to extra scrutiny can help catch such errors. And the goals of humans and AI in math aren't always the same: mathematicians pursue research questions based on the potential for new techniques and ideas to emerge, and tech companies may focus on questions that showcase their AI models but have limited impact in mathematics. Independent funding can help ensure mathematicians still have a say in how their field develops. Some of the recommendations, such as disclosing AI use and properly attributing previous research, are up to individuals or AI companies. Others, like the recommendation to regulate the AI industry, require large-scale organization or government intervention. Most crucial for Ochigame is the call for commercial AI companies to adhere to the declaration's principles. "Mathematicians who never intended to contribute to AI development are having their work used for this purpose without their consent," he says. "I think that's a deeply concerning situation." The IMU plans to endorse the declaration, and Portegies, who led the declaration project, will speak about it at the organization's upcoming conference this summer. "They did an immense favor to the whole community, because now we have a starting point for decision making, for discussion," Agricola says. "I love it."
[3]
A golden age of maths is dawning and mathematicians are freaking out
Mathematicians are stunned at the progress AI is making in solving advanced problems, leaving some questioning whether there will still be room for humans
I am attempting to solve a mathematical conundrum that has stumped many of humanity's greatest thinkers. I have zero mathematical training, apart from a distant undergraduate physics degree, which should put my odds of success at slim to none. But I also have a trick up my sleeve - a kind of mathematical genie that can conjure arcane secrets seemingly out of thin air. I make a short request concerning an esoteric conjecture in number theory, then cross my fingers. Perhaps "genie" is a bit too strong - I'm simply using GPT 5.5 Pro, the latest iteration of OpenAI's flagship model. But for mathematicians, modern AI models appear to have a spark of magic. Even in an era of rapid progress, the growth in AI's mathematical ability is stunning. In just a few months, many prominent mathematicians have walked back previous scepticism and replaced it with sweeping predictions, whispering behind closed doors about job concerns and whether it is even worth embarking on a particular research project if AI might get there first. In April, I visited San Francisco, where the future often seems to arrive fastest, to attend a hastily organised meeting between mathematicians and AI researchers. There was an air of excitement and curiosity in the room, but also an undeniable feeling of existential dread. If someone like me could produce mathematics at the press of a button, what would that mean for the professionals? Will we even need human mathematicians? And will the machines crack problems that no human could? The answers may have profound consequences for the millennia-old practice of mathematics, and it feels like mathematicians have only a brief window to prepare.
"I think AI is going to come in a big way, and it will significantly revolutionise the field," says Jacob Tsimerman at the University of Toronto, Canada, who helped organise the conference. Opinions on the future are divided. "We are running out of places to hide," wrote Jeremy Avigad at Carnegie Mellon University in Pennsylvania in a recent essay. "We have to face up to the fact that AI will soon be able to prove theorems better than we can." Some mathematicians are welcoming the mechanisation of mathematics. Terence Tao at the University of California, Los Angeles, has said the field is moving from an era of "proof scarcity" to one of abundance that could see many once-thorny problems fall to AI. Rather than focusing on being the first person to find a proof, mathematicians might instead race to be the first to understand it, he argues. Artificial intelligence isn't terra incognita to mathematicians, but it is only in the past few years that it has started producing useful contributions. At first, these were artisan operations, using individually crafted neural networks to crack particular problems. These bespoke AI models proved difficult to apply across different mathematical disciplines, and remained of interest to only a tiny fraction of working mathematicians. Even when ChatGPT launched in 2022, mathematicians remained unimpressed - large language models like GPT-3.5, which powered the first version of OpenAI's chatbot, struggled to perform even basic arithmetic and spouted confident nonsense when asked to solve research-level mathematical problems. But as LLMs scaled up and were trained on increasing amounts of mathematical data, they began to yield results. One of the first signals that AI was becoming more adept came when AI models were tasked with attempting the International Mathematical Olympiad (IMO), an elite test for high-school students consisting of just six questions of devilish difficulty. 
The mathematical intuition and range of disciplines required to succeed at the test meant many researchers saw it as a benchmark for mathematical AI, but thought it would take years, possibly a decade, for it to score highly. They were wrong. In July 2024, Google DeepMind announced that its AlphaProof AI system could solve four out of six questions from that year's IMO, enough for a silver-level performance. This was impressive, but AlphaProof wasn't a strict large language model and had been fine-tuned for IMO-style questions, such as geometry, and it was unclear how much further it might go. But just a year later, Google and OpenAI announced they had achieved a gold-level performance, with OpenAI in particular using a less maths-focused model. The results made mathematicians sit up. "People's eyes really opened," says Ravi Vakil at Stanford University in California.
Problem solving
It wasn't long before these capabilities were made available to the public, where they quickly found use beyond high-school competitions and began encroaching on research-level mathematics. Thomas Bloom at the University of Manchester, UK, first noticed the impact of these newer models in the last months of 2025. He runs a website that tracks progress on a set of more than a thousand problems posed by the famous mathematician Paul Erdős. They tend to be simple to state, but range in complexity from relatively straightforward to very difficult, and many of them are seen as signposts for mathematical progress. Bloom started getting comments on the site from people he didn't recognise. At first, they were just using GPT-5, then recently released, to dig out obscure references in the literature that might help with a particular problem. But in a matter of months, the release of more powerful models like GPT 5.2 Pro saw people posting full-blown solutions with AI assistance, some of which were verified by Bloom and his colleagues as correct.
These solutions took "non-trivial effort", Bloom told me at the time. "It's incredible that AI is capable of that." What's more, some of these solutions weren't coming from professional mathematicians, but amateurs and novices. Kevin Barreto, who is in his second year of an undergraduate mathematics degree at the University of Cambridge, has solved numerous Erdős problems using AI, frequently with his collaborator Liam Price, who has no maths degree or formal training. Inspired by their success, I wanted to try autonomous mathematics for myself. While these tools can, in theory, be used by anyone, Barreto and Price seem to have a magic touch in prodding the genie to produce useful answers, so I asked for help. The trick isn't just asking the model to produce a proof, Barreto tells me, but bizarrely giving it a certain level of support, like "try your best" or "don't give up". "You try to encourage the model," he says. "You try to hint it into believing the problem is of an easier difficulty than it actually is." Even so, success wasn't guaranteed. Solving certain problems has often taken Barreto numerous attempts, if he succeeds at all. "Coaxing the correct proof strategy out of it is essentially like trying to play the lottery," he says. Still, I wanted to try my hand and spin the wheel in the mathematical proof casino. I chose an unsolved Erdős problem, number 710, which concerns a list of requirements that must be satisfied by a set of numbers, with the goal being to find a set with the smallest difference between the lowest and highest numbers. It is a bit like having a list of picky hotel guests, who insist on having a room with a bath or a sea view, for instance, and needing to find the shortest block of rooms that will satisfy them all. Mindful that I needed to use the most powerful AI model available, I asked OpenAI for access to ChatGPT 5.5 Pro, which normally costs $200 a month but was provided for free for this article. 
Like Barreto suggested, my prompt for the AI hints that the solution is within reach and that "it just takes a few clever tricks". As I left the AI crunching away, I turned to consider the most recent developments in this mathematical revolution. If solving Erdős problems is AI creeping up on the door of research-level mathematics, the past few months have seen it kicked down. A steady stream of mathematical papers are claiming to solve real, cutting-edge problems. In January, Vakil and his colleagues uploaded one such paper, noting that "the proof of this result was obtained in conjunction with Google Gemini and related tools". The proof focuses on a particularly thorny problem concerning how certain sphere-like shapes can be linked to other mathematical objects called flag spaces, which can be thought of as collections of nesting-doll-like objects. This would provide an important link between topology, which concerns the more general properties of shapes, and algebraic geometry, which deals with the precise shapes themselves. The task is made difficult by the multitude of ways in which the flag spaces and sphere-like shapes can correspond. Vakil and his colleagues first gave a simpler version of what they wanted to prove to a custom AI model from Google DeepMind. The model found a mathematical structure they hadn't previously seen, making it clear to them how to generalise and write the entire argument, which turned out to be simpler than it initially seemed.
Human and machine
"There's no way the AI could do it by itself because it wouldn't know the [correct] question. We absolutely told it what to do," says Vakil. At the same time, the AI provided a shortcut. "The paper might never have happened because we might never have had the time to get together and figure out the argument," he says. "It's more how things will happen. The future will be some combination of human and machine." This line is already becoming increasingly blurry, however.
The very same month as Vakil's paper, Tony Feng at the University of California, Berkeley, who also works with Google DeepMind, published a paper detailing how he had used Google's Aletheia AI to calculate a previously unknown collection of numbers that are vital for translating between two disparate mathematical disciplines, algebraic geometry and number theory. Building such bridges is an important goal in the Langlands programme, often seen as a grand unified theory of mathematics. According to Feng, the "core mathematical content" was generated entirely by Aletheia. The biggest result yet in AI mathematics came just a few weeks ago in May, when OpenAI announced that it had used an unreleased model to solve an 80-year-old maths conjecture called the planar unit distance problem. The firm didn't provide full details of the model, other than to say it was a general-purpose AI, rather than one trained specifically to do mathematics. The reaction among mathematicians has been one of stunned disbelief. It is becoming difficult to keep track of the torrent of mathematical research assisted by AI, not least for professional mathematicians, who are themselves busily attempting problems using AI that they may not have previously had time to do. "It opens up a world of possibility," says Alex Kontorovich at Rutgers University in New Jersey. "I can imagine projects I could undertake this summer, things that I know would have taken me five years that I would never have even started." Could those new possibilities even include a solution for the Riemann hypothesis, a deep question about the origin of prime numbers that is one of the Millennium Prize Problems, which are often seen as the greatest challenges in maths? Several mathematicians working for AI companies told me they thought we might see one of these problems fall in the next several years, while others cautioned that they are in a wildly different class of difficulty from those problems that had been solved so far. 
The San Francisco conference I attended in April was an attempt to map these possible futures. It took place in a nondescript building owned by a venture capital firm, the only clue that it existed an unmarked pink door and a video doorbell. As I waited for the door to open, I was joined by a former maths professor who now works for a hedge fund, stepping out from a driverless car. Once inside, I found eminent mathematicians like Vakil and Kontorovich mingling with employees from companies like OpenAI and Google. The ostensible goal of the meeting was to come up with a way to track AI's mathematical progress and where it might be headed. Attendees had their own personal priorities, however. "My hope was to understand a little bit more about where the models are and where they're going in terms of mathematical capability," says Daniel Litt, another conference organiser, also at the University of Toronto. "It's clear that the models are, in some sense, missing some capabilities that mathematicians have." In the past, the most common way to test an AI model's mathematical ability was to run it on a benchmark, a collection of problems that typically require simple and easy-to-verify solutions, like a single number. This was convenient for AI companies, because they could present their models' progress as a clean, rising line on a graph. But many mathematical tasks aren't so neat and tidy, requiring proofs that need interpretation by an expert. What's more, prowess in one area of maths doesn't imply a human-like mathematical ability in general, says Melanie Wood at Harvard University. "One big mistake that people make when they think about AI and math is to take the correlation of these skills in humans and think that it's going to match some correlation in AI."
A button-pushing future
Mathematicians at the conference worked in small groups to come up with a better way to track AI's mathematical ability and finished the week with a working draft. 
But boiling down all the things a working mathematician does into a short document wasn't easy, and there was still disagreement over the best way forward. A large part of the conference consisted of free-flowing group discussions primarily between the mathematicians, hashing out the details of what an AI-led mathematics might look like. Would it be of humans and machines working in lockstep, like Vakil thought, or would it be more like a slot machine, pressing a button that sometimes produced an interesting result in full? For Tsimerman, who grew up taking part in maths competitions like the IMO, the latter didn't have much appeal. "My experience of math is the act of solving problems, and if I don't do that anymore, I think I might prefer playing music or doing theatre or learning something else," he says. At one point in a group discussion, Tsimerman asked people in the room to indicate whether, in his button-pushing vision of the future, they would want to continue being mathematicians. Only around half raised their hand. Not everyone agreed that this was a useful exercise, however, or that solving problems was the most important mathematical activity. "What I actually care about is understanding things and figuring out what's true," says Litt. "One can do that by posing and proving a conjecture, but you can also do that by going over to your friend and asking them a question." And even if these tools can solve difficult and thorny problems, many mathematicians remained adamant that it was only humans that could decide what was interesting to work on or what the important problems to tackle should be. Maths isn't about solving puzzles just for the sake of it, points out Wood, and mathematicians generally look for solutions that push the field forward. "Does it suggest a way to solve a lot of other problems, or is it only a solution for that particular problem?" she says. On the conference's third day, excited murmurs rippled among the attendees. 
Overnight, it appeared that another Erdős problem had been cracked, one that was qualitatively different from the others. Jared Lichtman at Stanford University, who happened to be at the conference, had spent a considerable portion of his PhD wrestling with a closely related problem, after many mathematicians had spent decades trying to solve it. "It was a problem I was already independently very passionate about," he says. Price had elicited a solution to the problem, known as Erdős 1196, from a single request to ChatGPT 5.5 Pro. It concerns "primitive" sets of numbers that are similar to prime numbers, in that no number in the set can divide another. Erdős had come up with a number calculated from these sets that helped order them, and argued that the largest this number could be for any primitive set was 1.6. Lichtman had proved that Erdős was correct in this case, but wanted to do the same for a more restricted family of primitive sets. Erdős suspected the highest value this number could take for that family was 1, but proving it remained a tougher nut to crack. The AI took an entirely different approach, using a mathematical tool that all previous attempts had missed, called the von Mangoldt function. "You can use the von Mangoldt function to circumvent a lot of technical difficulties that all these previous approaches had used," says Lichtman. Working with others, including Price, Barreto and Tao, he later adapted this technique to solve a related 60-year-old conjecture by Erdős. "This is perhaps one of the first examples of an AI-generated proof having downstream impacts, which we are still exploring," Lichtman said when posting about the work on social media. Meanwhile, I was finally ready to explore my own AI-generated proof. After "thinking" for 22 minutes and 18 seconds, ChatGPT pinged me with a response. "Here is the clean proof," it wrote, followed by dozens of lines of impenetrable mathematics. I felt a jolt of excitement. 
Had I solved a decades-old problem, cementing my name in the mathematical history books? I fed the answer back into ChatGPT, and soon received confirmation: "Yes -- the main argument is correct." I was growing even more confident. I dashed off an email to Barreto, asking whether I might be on to something. But as quickly as my excitement had arrived, it vanished. "It doesn't look like it solves the problem," he replied. I had missed that the AI had actually proven something different from the formula Erdős had hoped for, which had already been discovered by Erdős himself years ago. It was something that an expert mathematician might have quickly caught, but for me it was lost in the noise. Perhaps there is a future for mathematicians after all, even if only to help humans understand what an AI produces. "I still want to know what's going on," says Litt. "A model can't understand something for you."
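The objects behind Erdős 1196 are easy to experiment with. The sketch below, in Python, checks whether a set is primitive (no element divides another) and evaluates the von Mangoldt function. The article does not spell out either formula, so two standard textbook definitions are assumed here: the von Mangoldt function equals log p when n is a power of a prime p and 0 otherwise, and the "number calculated from these sets" is taken to be the sum of 1/(a log a) over the set. The function names are illustrative, and nothing here reproduces the AI's proof.

```python
import math
from itertools import combinations


def is_primitive(numbers):
    """Return True if no element of the set divides another element."""
    distinct = sorted(set(numbers))
    # After sorting, a < b in every pair, so only "a divides b" can occur.
    return all(b % a != 0 for a, b in combinations(distinct, 2))


def von_mangoldt(n):
    """Assumed textbook definition: log p if n = p**k for a prime p, else 0."""
    if n < 2:
        return 0.0
    # Find the smallest prime factor by trial division.
    p = next((d for d in range(2, math.isqrt(n) + 1) if n % d == 0), n)
    # Strip every factor of p; n was a prime power only if nothing is left.
    while n % p == 0:
        n //= p
    return math.log(p) if n == 1 else 0.0


def erdos_sum(numbers):
    """Assumed definition: sum of 1 / (a * log a); every a must exceed 1."""
    return sum(1.0 / (a * math.log(a)) for a in set(numbers))
```

The primes form a primitive set, while {2, 3, 6} does not, because 2 divides 6. Summing `von_mangoldt(d)` over the divisors d of a number n gives log n, a classical identity that hints at why the function is useful for questions about divisibility.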
[4]
An AI math breakthrough sparks calls for new guardrails
The model disproved a famous conjecture, raising questions about trust, credit and access
Think about placing dots on a flat surface. You want as many pairs as possible to be separated by the same distance. For any number of dots, what is the greatest possible number of pairs that can be exactly that far apart? The question, what mathematicians call the unit distance problem, seems simple. The answer is tricky. Eighty years ago, in 1946, the famous mathematician Paul Erdős proposed what he thought was the answer, but no one had been able to prove or disprove his conjecture. At least, not until now. Researchers at OpenAI gave an AI model Erdős' conjecture and walked away. When they returned, they discovered a breakthrough: The model had disproved the conjecture in a mathematical proof posted May 20 on OpenAI.com. "It's a beautiful piece of mathematics that has been discovered," says Melanie Matchett Wood of Harvard University, who contributed remarks to an accompanying paper in which outside experts reviewed the AI's result. The discovery bolsters hopes that AI can contribute to scientific understanding. But the AI proof relied on perseverance rather than creative insight and has raised concerns about how mathematics will be done going forward. On June 2 a group of experts published a declaration calling for tight guardrails around AI in mathematical research. As of June 5, the declaration has 1,590 signatures.
A breakthrough for math, but maybe not for AI
The AI model that produced this result isn't publicly available yet, but OpenAI says it is a general-purpose large language model trained for reasoning. It did not use any math-specific tools or software. And "we didn't guide the model in any particular way," says OpenAI researcher Sébastien Bubeck. The original prompt, composed by AI, described the conjecture and instructed the model that a complete solution must either prove or disprove it. Mathematicians had believed the conjecture was true. 
Yet the model tried to disprove it instead. Wood sees the result as a breakthrough for mathematics. The AI came up with a counterexample using tools from two of the oldest and most foundational mathematical fields: algebra and number theory. It seems that these areas shouldn't have anything to do with this geometry question, Wood says. But the result "shows that tools from one part of mathematics can be applied really fruitfully in this other area of mathematics." She thinks this result will inspire mathematicians to think of new ways to apply those same tools. She's not convinced, however, that this is a breakthrough in artificial intelligence. When she read the solution, it seemed to her that the latest, publicly available AI models could have come up with it. (In fact, one researcher posted on X that he had reproduced the proof using a publicly available model.) Mathematician Thomas Bloom of the University of Manchester in England had a similar reaction. He noted in the paper from outside experts on the achievement that it would have been "truly incredible" if the AI had managed to prove the conjecture, as that kind of solution would require creative insight. Bubeck concedes that "this proof isn't exactly the spark of genius that we see sometimes in mathematics." AI still struggles to make leaps of discovery. But the tech can patiently slog through a huge number of unlikely strategies.
Who will check AI's work?
Still, as the declaration calling for guardrails notes, AI technology also threatens our ability to produce responsible, verifiable and ethical mathematics. For one thing, AI's reasoning can be unreliable. In this case, the AI model's proof happened to be relatively easy for a human expert to verify, Bloom says. But he has seen people on the internet who claim they have a solution to some open problem. These people have used AI to generate hundreds of pages of math that they can't understand or even read. "It could be right. It could be nonsense. 
Who's going to be able to check this?" Bloom says. If mathematicians knew the probability that an AI-generated proof was correct, that would help. But as Wood notes, OpenAI does not share all the times their internal model failed to solve an open problem in math or, even worse, produced an incorrect solution with flawed reasoning. OpenAI's Bubeck says that the team ran their prompt on the Erdős conjecture through the same model multiple times, and it produced the correct solution in 50 percent of those trials. His colleague Lijie Chen says that the new model is better than current models at generating an "I cannot solve it" response when it runs into difficulty on a problem. But data to support these claims have not been released or peer-reviewed. And OpenAI will not reveal how much time the model spent working on its solution. Wood, Bloom and those who signed the declaration have other concerns too. Right now, AI generates mathematical reasoning without showing what work inspired the ideas. That clashes with mathematicians' standard practice of giving credit to the work that inspired a breakthrough. "LLMs have read ALL the papers. They have read all the commentary and notes, and everything that's online.... It's not clear that there's a way for [AI] to reasonably attribute the source of the ideas," Wood says. Access is another concern, Bloom says. If the most powerful tools are expensive and private, mathematics could become less open and democratic, and some people may question why they should learn math at all, he says. Wood, Bloom, and some other mathematicians are cautiously optimistic, however. "I do think [AI] is going to become an indispensable tool in mathematics," Wood says.
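The unit distance question that opens this article can be explored by brute force for small point sets. The Python sketch below counts the pairs of points that sit at one chosen distance from each other; it works on integer coordinates and compares squared distances so that the arithmetic stays exact. The helper names and the square-grid test case are illustrative assumptions, not anything taken from OpenAI's proof.

```python
from itertools import combinations


def count_pairs_at_distance(points, dist_sq):
    """Count unordered pairs of points whose squared distance equals dist_sq."""
    return sum(
        1
        for (x1, y1), (x2, y2) in combinations(points, 2)
        if (x1 - x2) ** 2 + (y1 - y2) ** 2 == dist_sq
    )


def square_grid(side):
    """Hypothetical test arrangement: a side-by-side grid of integer points."""
    return [(x, y) for x in range(side) for y in range(side)]


# Three dots forming a right angle: two of the three pairs are one unit apart.
corner = [(0, 0), (1, 0), (0, 1)]
print(count_pairs_at_distance(corner, 1))

# A 3-by-3 grid, counting pairs exactly one unit apart.
print(count_pairs_at_distance(square_grid(3), 1))
```

Checking every pair is only practical for a handful of dots, and it says nothing about which arrangement is best. That gap between counting pairs in one arrangement and bounding the count over all arrangements is what makes the general question hard.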
[5]
As A.I. Makes Strides in Mathematics, Mathematicians Urge Caution
A week after OpenAI made headlines with an A.I.-generated proof, a new "declaration" by 16 experts raises concerns that the technology threatens math as a discipline. Recently there are signs that some branches of higher mathematics, among the most rarefied realms of human achievement, are vulnerable to a shake-up by artificial intelligence. Mathematicians, in turn, have been thinking about how to respond. On Tuesday, a group of 16 mathematicians, in consultation with colleagues and math organizations worldwide, published the Leiden Declaration on Artificial Intelligence and Mathematics. It aims to "frame the conversation about future directions," said Dame Ursula Martin, one of the authors, and a mathematician and computer scientist at Oxford. This effort comes as A.I. models have been making headlines with successful results in research-level mathematics. In late May, OpenAI, the maker of ChatGPT, announced that one of its models had disproved a notable 80-year-old mathematics conjecture in the field of combinatorial geometry. The conjecture is one of some 1,200 problems posed by the Hungarian mathematician Paul Erdos. While some of these "Erdos problems" are considered throwaway questions of narrow interest, others have proved influential and field shaping. Along with a research paper describing the proof, OpenAI released a companion paper by several independent mathematicians. Jacob Tsimerman of the University of Toronto, an expert in the adjacent subfield of number theory, commented: "This is a really impressive piece of work, and I would accept it for any journal without hesitation." Other figures in the field were less sanguine. Melanie Matchett Wood, a Harvard mathematician, was enthusiastic but raised concerns. For instance, she commented that the OpenAI paper did not appropriately reference "a history of closely related ideas in the literature." "It is a powerful tool, and I think it will be a great tool to accelerate mathematics research," Dr. 
Matchett Wood said in an interview. But she noted that the community needs to figure out how to use A.I. "in a way that will maintain human understanding of the mathematics." Among the potential threats that the Leiden Declaration authors articulate are accuracy and reliability: Journal editors are already complaining about a flood of plausible-seeming A.I.-generated papers and proofs that have turned out to be incorrect, and in ways that are difficult for mathematicians to discern. Perhaps most pointedly, the authors raise the question of whether the many A.I. companies tackling mathematics -- major players such as OpenAI, Google DeepMind and Anthropic, or start-ups such as Harmonic, Math, Inc. and Axiom Math -- are keeping the field's best interests in mind. "Technology companies' involvement in research," they write, "raises the risk that research questions are prioritized and incentivized because of their amenability to A.I. methods and models, rather than their deeper significance to understanding." In turn, they point out, this disadvantages researchers who choose not to use the technology, and those who do not have access to it. For Rodrigo Ochigame, a historian and anthropologist of computing and artificial intelligence at Leiden University in the Netherlands, and one of the statement's authors, the latest OpenAI proof illustrates why this sort of collective reckoning in the discipline is necessary. "The story follows the same pattern as many other announcements by commercial A.I. developers," Dr. Ochigame said. "The A.I. model is proprietary and unavailable to anyone outside the company. We get a flashy promotional video, while basic information needed to assess the scientific meaning of the result is kept secret. The company disclosed nothing about the methods, human-written prompts, training data, or computational resources consumed." 
The declaration is being endorsed by the International Mathematical Union, and has a slot on the program at the International Congress of Mathematicians, which will be held this July in Philadelphia. And it is now open for signing by individuals and organizations such as national mathematical societies. The following conversation -- conducted by videoconference and email with Dr. Ochigame, Dr. Martin and mathematician Michael Harris of Columbia University, author of the Substack newsletter "Silicon Reckoner" and another member of the declaration's working group -- was edited and condensed for clarity. What is the Leiden Declaration? MARTIN It's a provocation, a stimulus for debate. There are more and more press stories about the mathematical achievements of A.I., and many mathematicians feel uneasy. What OpenAI has done is throw a great deal of resources at finding a counterexample to this particular conjecture. That's remarkable, and impressed the experts. We are not told about the model's failures. If you put vast quantities of human effort into this problem, you likely would've solved it in the same way. But in math human effort is scarce, and just tends to be spent on different things. To think of mathematics in terms of precise and neatly stated problems, like high school exams or the list of Erdos problems, is to misunderstand and diminish what makes mathematics so powerful and significant. Mathematics is not just about solving problems -- it is also the cultivation of ideas, understanding, judgment, and human insight. HARRIS The purpose, from my perspective, is to recover control of the narrative about the values and the goals of mathematics from the A.I. industry. Mathematicians are concerned that the values of the profession are being misrepresented, not intentionally but due to the media campaign on the part of the industry, which seems to want to promote the belief that they are in a position to transform mathematics -- "the A.I. 
revolution in math," as one headline put it not long ago. If the people who make the decisions about funding base their decisions only on what's being reported in most of the articles in the press, they could easily get the impression that A.I. is where the future of mathematics is. We want to affirm certain values that have characterized the profession: openness, honesty, giving credit where credit is due, sharing, transparency about methodologies, and access for independent verification of results. An aspect of mathematics that is cherished by mathematicians is that it is one of few successful examples of a gift economy -- that is to say, its economy is somehow an island of idealism in our society. As director of graduate studies in the mathematics department at Columbia, I read the personal statements of all the applicants every year, several hundred, and they are still idealists. The tech industry proceeds in accordance with commercial logic, which is antithetical to the values of mathematics. OCHIGAME Several A.I. companies are investing in dedicated teams focusing on mathematics, using problems as benchmarks and publications as training data. They are training their models to prove theorems not because they want to advance mathematical knowledge, but because they hope that such training will improve the models' reasoning abilities more generally. Those companies have repeatedly articulated this strategy in pitches to investors, so it is perhaps not a coincidence that OpenAI's announcement about the unit distance conjecture came out the same day the news broke that the company is preparing to file for an I.P.O. This situation has put mathematicians in a troubling ethical position. Without their consent, their published work is being used as strategic training data for the development of general-purpose A.I. The resulting models are being commercialized for many purposes, including military applications, that raise grave ethical concerns. 
Most mathematicians never imagined, much less consented, that their work would be used for such purposes. MARTIN It's important not to lose sight of the fact that what the A.I. companies are doing, what you can achieve with this technology, is absolutely extraordinary. I don't think we're challenging that. We're challenging the framing, we're challenging the behaviors around it. OCHIGAME The group of authors included many people who are excited about the potential of new mathematical results, and people who have even contributed to the development of the technology. But the public discourse is so heavily tilted toward the very effective P.R. campaigns of the A.I. companies and the narratives they are pushing. We feel an obligation to be the voice for expressing the critical concern. But we certainly understand the enthusiasm. HARRIS I would add that a lot of the enthusiasm and excitement is artificially generated by the corporations. The declaration warns against that: "Don't believe the hype." It is important for the mathematical community to have the last word on what, mathematically, is, and what is not, exciting. Do you worry that the declaration might be seen as mathematicians embarking on a futile effort -- circling the wagons in order to save an outdated profession that A.I. is threatening with obsolescence? HARRIS Depending in part on how the story is reported, industry supporters are likely to frame it this way, but such a framing is hostile to mathematics as an intellectual achievement and not merely to mathematicians. MARTIN It's not either/or but both/and. Centuries of enduring work by mathematicians underpin every aspect of modern science, life and society. The authors of the declaration, alongside the world's mathematical organizations, are committed to ensuring that mathematics continues to flourish through both intellectual rigor and practical application. We welcome A.I. companies as responsible partners in the spirit of the declaration. 
OCHIGAME Mathematics is a rich form of cultural expression with an ancient history, and I am not worried that any technology will ever render it obsolete. Its most precious aspects, such as the collective quest to understand beautifully intricate ideas, and to explore the limits of the human imagination, cannot ever be automated. What I am worried about is that a handful of corporations are mobilizing their vast financial resources to impose an impoverished view of mathematics so forcefully -- at a moment when scientific research is already under political attack -- that they may well end up destroying the social institutions that allow mathematics to flourish. What could be futile about resisting that?
[6]
A New Declaration Warns AI Could Threaten the Foundations of Mathematics
Mathematicians are setting some boundaries.
Today, 16 mathematicians in consultation with peers and relevant organizations published the Leiden Declaration on Artificial Intelligence and Mathematics. The declaration, which had attracted more than 130 signatories by the time of publication, outlines key challenges that widespread AI use poses to mathematics research, as well as recommendations for individual researchers, organizations, governments, and commercial enterprises. "I do not expect every colleague to agree with every sentence of the declaration," Christoph Sorger, secretary general of the International Mathematical Union (IMU), wrote in a column in IMU's endorsement of the declaration. "It asks the mathematical community to respond in a way that is transparent and guided by the values of our discipline." "It was not easy to reach consensus on a complete text, and the process tested everyone's patience," Rodrigo Ochigame, an anthropologist of AI at Leiden University in the Netherlands, who was involved in the declaration, told Gizmodo. "We did this the hard way: we decided to publish the text only when we reached full consensus, after gathering extensive feedback from a wide range of people and debating every point in detail."
Laying things out
The 11-page document emerged from a workshop held in September of last year. To be clear, the declaration isn't denouncing the use of AI in mathematical research. Rather, it questions what it really means to use AI "responsibly," in the context of values such as accuracy, transparency, and the weight of human judgment and creativity behind mathematical breakthroughs. Unchecked, the advance of AI on mathematics puts the "autonomy of mathematics under threat," reads the declaration. For instance, the declaration argues that AI-generated proofs are difficult to incorporate into established procedures for ideating, presenting, and validating both formal and informal arguments in mathematics. 
It also warns that, when such results are promoted through informal press releases or blog posts without rigorous validation, it's difficult for mathematicians to rectify information that's already out there, should there be significant errors in the AI's work. "There's a rush to announce results that aren't often checked or contextualized correctly from a number of AI math startups," Daniel Litt, a mathematician at the University of Toronto who wasn't involved in the declaration, told Gizmodo. "By and large, those are mostly correct and also not very interesting. Of course, companies also have financial incentives to overstate how interesting they are." Another major concern is that AI agents scrape the literature -- arXiv, for example -- to concoct their answers, but rarely give proper citations to the human work they build on. While repositories like arXiv are meant to be accessible, tech companies often abstain from sharing key details on how the AI reached its conclusions, Jim Portegies, a mathematician at the Eindhoven University of Technology in the Netherlands, told Scientific American.
An action plan
Some key recommendations of the declaration include the disclosure of AI use in research, stricter peer-review processes, and investments in public computational infrastructure to level the playing field against big tech firms. Again, the declaration stresses that greater focus should be placed on humans -- whether or not they use AI in the way they engage with mathematics. "Mathematics is, and should always remain, a profoundly human endeavor," Ulrike Tillmann, IMU's vice president, said in her endorsement comments. Among the recommendations, Ochigame told Gizmodo that the easiest item to implement might be to disclose tool use and, by extension, develop clearer instructions for AI disclosure in math. In addition, regulations on the AI industry "affect so much more than mathematics," so that should also be prioritized, he added. 
The declaration "certainly looks timely, and a lot of what's on there echoes my own thoughts," said Litt, who was also among the experts consulted for OpenAI's recent disproof of a longstanding mathematical conjecture. "I do think [AI] is a very important and powerful technology that has the potential to help us with a lot of interesting math... [but] I don't think the tools will do that on their own." Sorger added that the reactions from the mathematical community "already show exactly why the declaration is useful, prompting consideration and discussion of what we want to protect, what we are willing to change, and where we need more clarity." Indeed, the primary goal of the declaration is to initiate serious discussions on AI's influence on mathematics -- an area of fundamental research that has supported virtually every aspect of science, if you really think about it. And that's due to continue next month, as top mathematicians will convene in Philadelphia for the International Congress of Mathematicians hosted by the IMU.
[7]
Mathematicians issue Leiden Declaration against AI misuse of their work
The Leiden Declaration on Artificial Intelligence and Mathematics, endorsed by the International Mathematical Union and signed by Fields Medal recipient Peter Scholze, calls on mathematicians to confront how AI companies are using published research without consent, bypassing peer review, and threatening the integrity of proof and attribution.
A coalition of mathematicians from institutions including Oxford, Cambridge, ETH Zurich, Columbia, and Northwestern has published a formal declaration calling on the mathematical community to confront the threats that artificial intelligence poses to their discipline. The Leiden Declaration on Artificial Intelligence and Mathematics, released on Monday and endorsed by the International Mathematical Union, is the most significant collective response from a major academic discipline to the way AI companies are using, and in some cases exploiting, published research. The 11-page document does not oppose AI in mathematics. It opposes the way AI companies are treating mathematical work: training models on published papers without consent, announcing results through press releases instead of peer review, undermining attribution, and reshaping research priorities to serve commercial interests rather than intellectual significance. "Mathematics is, and should always remain, a profoundly human endeavour," said Ulrike Tillmann, vice president of the IMU.
Five threats to mathematical research
The Declaration identifies five specific ways AI threatens the values that make mathematics trustworthy. First, current AI systems produce plausible but unreliable arguments that are difficult to distinguish from correct proofs. This applies not only to informal reasoning but also to formal computer-encoded proofs, where the difficulty lies in translating between machine and human representations of concepts. 
The problem of AI-generated content that looks authoritative but contains subtle errors is not unique to mathematics, but in a discipline built on certainty, it is existential. Second, AI models trained on published mathematical work do not properly cite the human contributions they synthesise. The Declaration notes that much training data was obtained by "systematically exploiting licences and access arrangements that were not made with artificial intelligence in mind, or indeed by simply violating copyright protections." Third, the use of AI is becoming incentivised for its own sake, distorting hiring, funding, and recognition. Fourth, results are increasingly communicated through press releases and blog posts rather than peer-reviewed journals, seeking publicity "on market timelines before the accepted processes of community evaluation in mathematics can take place." The Declaration cites Google DeepMind's AlphaProof, which solved three International Mathematical Olympiad problems in 2024 but took more than a year to publish its methods in a peer-reviewed venue. Google's broader AI strategy relies on mathematical reasoning capabilities as evidence of general intelligence, creating commercial incentives to announce results before the mathematical community can properly evaluate them. Fifth, the autonomy of mathematics is under threat. Research questions may come to be prioritised because they are amenable to automation rather than because experts judge them to be deeply significant. "Indeed, broader understanding of the field may be permanently lost in the process of automation," the Declaration warns.
What it recommends
The Declaration makes recommendations at four levels. Individual mathematicians should disclose all AI tool use in papers, retain personal responsibility for the correctness of results, refuse to grant authorship to AI systems, and "consider carefully which tools to use" based on whether their developers align with the Declaration's values. 
Mathematical organisations should insist that results obtained by automated techniques meet standards that address the specific risks those techniques introduce, protect authors' rights by developing licensing agreements that prevent use of published work as training data without consent, and demand that results continue to be published through peer-reviewed venues. European regulatory frameworks provide a model, but the Declaration argues that the mathematical community must also set its own standards independently of government.

For policymakers, the recommendations are blunt. "Don't believe the hype," the Declaration states. "There is currently a strong commercial incentive on the part of the technology industry to overstate the capabilities of their products." It calls for significantly increased public oversight of the AI industry and investment in public computational infrastructure as an alternative to proprietary systems.

Who signed it

The Declaration carries significant weight because of its signatories. Peter Scholze, a Fields Medal recipient and director of the Max Planck Institute for Mathematics, endorsed it with a personal statement: "I am pondering my mathematical ideas without use of AI, and generally avoid reading AI-generated text as best as I can." Other endorsements came from Robbert Dijkgraaf, former Dutch minister of education and president-elect of the International Science Council, and Steven Strogatz, Cornell's distinguished professor for the public understanding of science and mathematics. Kevin Buzzard, the Imperial College professor who has been one of the most prominent advocates for formalised mathematics, called it "a well-thought-through response to what is currently happening, as AI continues to disrupt this space."
The tension between AI capability and research integrity that the Declaration describes is not limited to mathematics, but mathematicians are among the first academic communities to respond with a coordinated, institution-backed statement.

The deeper argument

The Declaration's most provocative section addresses AI companies directly. It argues that tech companies are attracted to mathematics because formalised proofs can be checked automatically, creating an "effectively unlimited source of feedback for training artificial intelligence models." The strategy rests on an assumption that capabilities developed through mathematical theorem proving will extend to broader general reasoning, an assumption the Declaration treats sceptically. "Some of the resulting general-purpose models are being commercialised for applications that raise grave ethical concerns," the authors write, "including warfare, oppression, mass surveillance, and the undermining of democracy." The intersection of AI research and military applications has become one of the defining tensions of 2026, and the Leiden Declaration makes clear that mathematicians do not want their work used as training data for systems deployed in those contexts without their consent.

The Declaration was developed over eight months by a 17-member working group following a September 2025 workshop at the Lorentz Center in Leiden. It had 37 verified signatories on its first day and is open for additional signatures from the mathematical community.
[8]
An OpenAI Model 'Disproved' a Famous Math Conjecture. This Mathematician Couldn't Leave It Alone
Will Sawin got OpenAI's email on a Friday night. Or Saturday morning. Either way, Sawin, a professional mathematician, spent his entire weekend thinking about that email. By next Monday, he decided to write up a paper that essentially improved what was given to him -- an AI's "proof" of Paul Erdős's unit-distance problem, an infamous conjecture from 1946. Last week, OpenAI published a blog post on the AI's proof. The paper came with a companion piece containing comments from nine renowned mathematicians uninvolved with OpenAI, including Sawin. Many prominent mathematicians praised the work, with Fields medalist Tim Gowers calling it a "milestone in AI mathematics." This result is just one of dozens of AI-derived solutions to long-time mathematical riddles. All this has us asking: Could AI usher in a new era of mathematical advancements? The answer, if one even exists, is likely a nuanced one. There are certainly computational advantages that AI brings to the equation (no pun intended). But what does this really mean? Does that represent some tangible revolution, or is it a "misconception" stemming from AI's data-driven imitation of human intelligence, to quote Pope Leo's recent encyclical? We'll for sure continue to see more AI solutions pop up -- after all, impossible math conjectures come by the hundreds -- and each time, human mathematicians will be summoned to check the computer's work. To OpenAI's credit, its blog post closes with this pleasant sentiment: "People choose the problems that matter, interpret the results, and decide what questions to pursue next." So, Gizmodo reached out to Will Sawin, who appears to have done just that: interpret the results and pursue a relevant question. We wanted to know what the experience was like. Sawin is a Fernholz Professor of mathematics at Princeton University. He began his academic career at Yale when he was 10 years old and has since worked in number theory, algebraic geometry, and combinatorics. 
During the conversation, Gizmodo asked Sawin about his own experience reviewing AI-derived mathematical proofs, the reality of using AI in mathematics -- and, most importantly, what it is and is not doing. The following conversation has been edited for grammar and clarity.

Gayoung Lee, Gizmodo: Let's start with the headline of OpenAI's blog post: "an OpenAI model has disproved a central conjecture in discrete geometry." What's the conjecture we're talking about here?

Will Sawin: We start with the problem in simple form: if you have a set of points in the plane, how many pairs of points can have a distance exactly 1 [unit distance] from each other? You can play around with different constructions. If you try it by hand, the best you're going to find is some kind of grid, like a triangular grid of points. Each point will be the next unit distance from 6 other points. That's pretty good for small numbers of points. The mathematical question is for each n number of points, "What's the greatest number of pairs of unit distances you can get with that many points?" For Erdős, the problem was, how does this grow as a function of n? And the conjecture he made was that it can grow slower than every power of n [with exponent] > 1. So, slower than n [raised to some exponent above 1], and slower than n [raised to a smaller exponent still above 1], and so on. This is a purely asymptotic question, so not about any particular value of n. I think one thing that people were disappointed by is that OpenAI's paper, our paper explaining the proof, and my paper on an optimized version -- none of these papers had an example of the construction for a particular value of n. That's because of the asymptotic nature of the problem. When Erdős first brought up the problem 80 years ago, he was not sure it should be true. But he seems to have gotten a little more confident in this over time, as nobody figured out a way to make it grow faster than that. There are some conjectures in mathematics that, if you disproved them, it would be a really big shock to the mathematical community.
It wasn't such a huge shock that this statement wasn't true, but it was the opposite of what people generally believed.

Gizmodo: You mentioned that Erdős grew more confident in his question. To ask a more metaphysical question -- what does it mean to prove or disprove a conjecture in mathematics? What does it mean exactly for a problem to be "solved"?

Sawin: To answer your first question, a proof or a disproof in mathematics is an argument that is completely convincing, leaving no room for doubt. What mathematicians have considered to be a proof has changed over time. Hundreds of years ago, you might see people use some kind of physical reasoning and consider that to be about the real world and consider that to be a mathematical proof. Now people have different standards. One standard is a formal proof. There exist formal proof systems with logical rules for what statement you're allowed to introduce from another statement. One common belief is that a proof really should be a formal proof. If you have an informal proof using English words, it's only a valid proof if it's an explanation of why there exists a formal proof. Other people disagree and say there's something about informal proofs that is not completely captured by formal proofs. Let me not take a position on that philosophical question right now (laughs). But the [OpenAI] proof in question is an informal proof that you could eventually turn into a formal proof if you wanted. And this is usually what mathematicians mean when they say something's been "proved."

Gizmodo: So OpenAI's proof is an informal proof?

Sawin: It's an informal proof that looks very similar to the informal proofs that mathematicians produce. There are some things about the way it's organized that, if you've seen a lot of mathematical proofs, you can tell aren't exactly the same as how a mathematician would write them. But I would say someone who has not read a lot of mathematical proofs might not be able to tell the difference.
Gizmodo: Could you unpack for me the content of this informal proof? How did the AI arrive at its conclusions, from what you can tell?

Sawin: If you were trying to prove what Erdős said -- an upper bound for the number of unit distances -- you'd have to reason about any possible collection of points on the plane to make some argument that's valid for any set of points on the plane. Which would be hard. So the disproof, in some sense, is easier, because you have to come up with a specific sequence of points in the plane. So it's a question about the limit as the size goes to infinity, and you need to construct a sequence and show it has a lot of unit distances. The way the AI did this was to use algebraic number theory to use a ring of integers in an algebraic number field. When I describe this idea at that level of generality, it's not so foreign to what mathematicians had already tried. I would say the key thing that AI realized that humans didn't is not just that you can use algebraic number fields but that you can use algebraic number fields of growing degrees. You let the degree of the field grow, which basically increases the kind of complexity of the numbers you're working with. And that makes the number of unit distances grow very rapidly and more rapidly than Erdős expected.

Gizmodo: I'm admittedly somewhat skeptical about news that AI did this impossible thing in math, since my thought tends to go to how AI is just a really great computing device. But here, it sounds like OpenAI's model looked at what human mathematicians were doing and kind of... made a logical decision as opposed to pure computation. Is that something we've seen before?

Sawin: It depends on what you mean by "before." In terms of the cleverness of the mathematical reasoning, there's not a big gap between this and the most impressive previous cases of OpenAI and mathematics that I have seen. Erdős had a huge number of problems, and Thomas Bloom collected over a thousand on a website.
And he has not collected all the problems that Erdős asked; there are even more than that. Over the last few months, a lot of people have been trying to use AI to generate solutions to Erdős problems. Some solutions aren't incredibly impressive. Some of them are, like, the AI discovers there's a paper where somebody solved the problem already, but they just didn't know about the problem. So they didn't know, but the [AI realizes] their paper just immediately solves the problem. But in some cases, AI introduces some new ideas that humans didn't use. Sometimes, the idea isn't very interesting. Other times, the idea is interesting and leads humans to wonder what they could do with this idea. That is not too dissimilar to what happened here [with OpenAI]. I think this was a more technically intricate idea than previous problems that I've seen. It definitely was a bigger problem that more people knew about and more people had worked on. I can tell you reasons for skepticism. So certainly, this is an idea that, as far as we can tell, humans did not come up with. This is not an idea that humans couldn't have come up with. I can see why it was hard for somebody to come up with that idea, but people come up with ideas that are hard for people to come up with all the time. It's definitely not a situation where AI is doing something that humans can't do. We don't know the amount of computing power that was used, the amount of problems that OpenAI tried it on, or how the AI system is set up. I mean, OpenAI did say some things about this, but certainly not at the level of detail that they used to have when AI was not such a big deal. So exactly how hard it is for the AI to do this is something we don't fully know.

Gizmodo: On that note, let's talk about your own preprint. This "refined" the AI's proof. What does that mean?

Sawin: Yeah. So the conjecture is that this function grows slower than any power of n > 1.
The OpenAI proof said that it actually grows faster than a power of n > 1. But it doesn't tell you which power of n > 1, just "some" power of n > 1. I wanted to try getting an explicit value that's reasonably good, that's not a really tiny distance away from 1. The value I ended up getting is a little more than 1.01, so not a big difference from 1. Other people have improved it, and it's now more like 1.03. And as I was reading the argument of the AI and trying to understand each step, I was thinking, 'How would I do each step in an efficient way?' A lot of things about the AI's argument were inefficient in some way, which I think is not surprising. Like, a human wouldn't have written it that way. But the AI was clearly not trying to be efficient. There was definitely nothing wrong with the original argument. It achieved this goal. I had a somewhat different goal. But basically, almost every step of the argument had to be changed in some way to support this goal of getting a reasonable, explicit constant.

Gizmodo: And OpenAI was fine that you did this?

Sawin: Oh yeah, I told them. And they were fine with it. Their announcement mentioned my paper. No concerns.

Gizmodo: Has this experience influenced at all how you view AI's impact on your work, or the perception of your work, as a mathematician?

Sawin: There are two things that these AIs are very effective at now. One is searching the literature. If you want to know if a certain theorem exists, you're much more likely to get the result by asking an LLM than you are by typing it into a mathematics-specific search engine. The other thing it's good at is reading and proofreading a paper -- not just typos or grammatical errors but also mathematical errors that could affect the result. I mean, I do still read and proofread my paper as a human, because, you know, that's what's appropriate.
And I think most other mathematicians don't want to let it write their papers for them, either, for reasons of personal credibility and accountability, but also because one still shouldn't fully trust the AI. They've gotten a lot better, but there are still blind spots. If you're using an AI-generated idea, you should still express your understanding of it as a human. As for generating ideas, I haven't found AI very effective at that. I think that's partly because it varies from one field of mathematics to another. It's easier for AI to generate new ideas in combinatorics than in some areas where there's more technical background and fewer prior works to use as a point of comparison.

Gizmodo: And as you mentioned, sometimes it's because an AI having an idea doesn't necessarily mean that it's an idea that humans would never have thought about.

Sawin: Yeah. Like, if I were to ask AI for an idea, I wouldn't be asking so I could throw out the ones I already came up with for a question.

Gizmodo: So to you, a mathematician, is AI a partner or a search engine tool?

Sawin: Currently, I'd say tool. The people that are having the most success are using AI as a tool. It's completely conceivable that soon somebody will prompt an AI to look at the recent literature, come up with its own problems, and solve those problems. And it'll come up with interesting stuff. But that hasn't happened yet, as far as I'm aware. I don't want to try to make any predictions about the future, but that's what's true currently.

Gizmodo: Okay. So I'd like to ask you for some advice. When we see another AI-written, fantastical proof, what do you think is the most important question we should ask ourselves?

Sawin: I'd say that it is probably less exciting than it sounds (laughs). But it's probably still somewhat exciting. If somebody's like, "Oh, AI solved this impossible math problem," it's definitely not that. It's not that AI solved an impossible math problem, but it's not nothing.
It's somewhere in between. I've definitely seen people whose first reaction is to assume that if AI solved an impossible math problem, like, human mathematics is over. And I've definitely seen people whose reaction is to assume that it's nothing, that AI really didn't do anything. It's definitely somewhere in between those two. And I'm not sure exactly where it is, but it's somewhere in between those two.
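The small-scale version of the problem that Sawin describes, the triangular grid in which each point has six neighbours at unit distance, is easy to check by hand or by machine. The sketch below is only an illustration of that hand-made construction, not of the number-field construction in the OpenAI proof. The function names are made up for this example, and the grid is stored in lattice coordinates (i, j), meaning i steps along one unit vector and j steps along a second unit vector at 60 degrees to the first, so that squared distances are exact integers.

```python
from itertools import combinations


def triangular_patch(rows, cols):
    """Return a rows x cols patch of the triangular grid in lattice coordinates.

    The point (i, j) stands for i*a + j*b in the plane, where a and b are
    unit vectors 60 degrees apart (an assumption of this sketch).
    """
    return [(i, j) for j in range(rows) for i in range(cols)]


def squared_distance(p, q):
    """Exact squared Euclidean distance between two lattice points.

    With a.a = b.b = 1 and a.b = 1/2, the squared length of di*a + dj*b
    is di^2 + di*dj + dj^2.
    """
    di, dj = p[0] - q[0], p[1] - q[1]
    return di * di + di * dj + dj * dj


def count_unit_pairs(points):
    """Count unordered pairs of points that are exactly one unit apart."""
    return sum(1 for p, q in combinations(points, 2)
               if squared_distance(p, q) == 1)


def unit_neighbours(point, points):
    """Count the points lying exactly one unit away from the given point."""
    return sum(1 for q in points if squared_distance(point, q) == 1)


if __name__ == "__main__":
    for side in (3, 5, 10, 20):
        pts = triangular_patch(side, side)
        print(f"n = {len(pts):4d} points, unit-distance pairs = {count_unit_pairs(pts)}")
```

For a square patch the pair count always stays below 3n, so this construction only gives a number of unit distances that grows in proportion to n. The conjecture and its disproof concern growth rates above that, which is why no small example settles the question.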
[9]
Over 150 Mathematicians Warn Governments Not to "Believe the Hype" About AI
Earlier this year, a 23-year-old without any formal mathematics training made headlines by claiming he'd used OpenAI's ChatGPT to solve one of the "Erdős problems" -- a database of challenging conjectures left behind by Hungarian mathematician Paul Erdős. Then, last month, scholars were taken aback when OpenAI claimed its AI had disproved an 80-year-old "unit distance" conjecture, also devised by Erdős. "This marks the first time AI has autonomously solved a prominent open problem central to a field of mathematics," OpenAI boasted at the time. But whether the frontier AI models powering tools like ChatGPT really represent a major leap in our ability to solve problems that have been plaguing mathematicians for decades remains hotly debated among experts. In perhaps the strongest public rebuke yet, a new declaration signed by over 150 mathematics experts from around the world warned governments not to "believe the hype" when it comes to AI's capabilities to solve complex mathematical problems, throwing cold water on claims of a revolution in the field. In a statement accompanying the 11-page "Leiden Declaration on AI and Mathematics," International Mathematical Union vice president Ulrike Tillmann argued that AI "raises questions that cannot be left unexamined." "The future of mathematical research must be guided by human judgment, fair and transparent practices, and the shared values of the global mathematical community," Tillmann said. "There is currently a strong commercial incentive on the part of the technology industry to overstate the capabilities of their products," the declaration reads, advising policymakers to "consult with experts, including mathematicians, in forming policy decisions rather than relying on press releases or popular reporting of mathematical results." Worse yet, AI models may produce convincing-sounding solutions that don't actually withstand scrutiny.
"Current automated techniques can produce plausible but unreliable (or even incorrect) arguments which are difficult to distinguish from correct mathematical proofs," said signee and University of Oxford head of computer science Leslie Ann Goldberg in a statement. "This is a serious problem: research in mathematics (and in mathematical disciplines like theoretical Computer Science) almost always builds on previous research, so it is essential for researchers to know that the results in the literature are correct." The declaration highlighted the highly precarious position many academics are finding themselves in. Attracting new funding has proven difficult, while interest in AI continues to soar, often forcing them to endorse the tech at all costs. "We recognize that industry has offered lucrative jobs, monetary rewards, computing resources, and intellectually stimulating opportunities that some mathematicians have found attractive," it reads. "This has taken place in an era of underfunding of higher education and precarious academic employment." The document also noted there were plenty of other reasons to call for regulatory oversight beyond the field of mathematics, noting the AI industry's "involvement in military and mass surveillance programs, development of technologies which promote misinformation and undermine democracy, and environmental costs." In short, it's a ringing denunciation of the persistent hype surrounding AI, and a call for reining in its use that reverberates far beyond the world of mathematics. The broader scientific community has been reeling from a flood of papers that make heavy use of AI, risking contaminating the peer-review process with hallucinations. It's also a pertinent reminder that AI models are being trained on cutting-edge research, often without sign-off from the original authors. 
"Mathematicians who never intended to contribute to AI development are having their work used for this purpose without their consent," Leiden University anthropologist of AI Rodrigo Ochigame, who helped draft the declaration, told Scientific American. "I think that's a deeply concerning situation."
[10]
Mathematicians say 'don't believe hype' on AI capabilities
Mathematicians are urging caution about artificial intelligence claims. Over 150 professors signed a declaration warning governments not to believe the hype surrounding AI's math skills. They highlight commercial incentives to overstate capabilities. The declaration emphasizes guiding mathematical research with human judgment and transparency. Concerns include AI producing incorrect proofs and undermining research credit. Dozens of mathematicians signed a declaration Tuesday calling for the discipline to resist beating the drum for artificial intelligence developers. The "Leiden Declaration", backed by over 150 professors from across the world including Europe, Japan and the US, warned governments especially not to "believe the hype" about systems' maths abilities. Their intervention follows claims of increasing capability from AI firms, including performance in elite international competitions and alleged solutions to thorny open questions in the field. AI "opens new and exciting opportunities, but it also raises questions that cannot be left unexamined," International Mathematical Union (IMU) vice-president Ulrike Tillmann wrote in an endorsement. "The future of mathematical research must be guided by human judgment, fair and transparent practices, and the shared values of the global mathematical community," she added. AI developers face "a strong commercial incentive... to overstate the capabilities of their products," the declaration read. Released "on market timelines" rather than at the pace of human-reviewed science, AI publicity can "misleadingly use specific mathematical tasks as metrics for the general reasoning capacities of commercial models", it added. With hundreds of billions of dollars in investor cash up for grabs, companies are scrambling to paint AI models in a glowing light. "There is a competition to the death on the part of the main labs... they are trying, using mathematics... 
to attract investment so that each of them will be left standing," Columbia University professor Michael Harris, one of the declaration's co-authors, told AFP. In recent days, both SpaceX -- the Elon Musk-owned rocket firm which includes subsidiary xAI -- and Anthropic have advanced towards stock market listings, while industry standard-bearer OpenAI is believed to be close behind. Just last week, OpenAI published to social media a video in which UCLA professor Terence Tao, a past winner of the IMU's prestigious Fields Medal, vaunted its products' potential to support research. Tao is "a very generous person, he gives an enormous amount to mathematics, but it's not healthy... to keep turning to the same person as if that person is the voice through which mathematics speaks," Harris said. Beyond the risk of maths being enlisted for commercial gain, the declaration authors said AI systems could produce plausible-seeming but incorrect proofs that are hard for humans to verify, or undermine attribution and credit to researchers on whose work it builds. They fear increased use of AI in maths could incentivise bandwagon-chasing research that takes advantage of the new tools at the expense of other problems, short-circuit peer review systems and put researchers at the service of AI developers, rather than self-directed free inquiry as in universities. AI also has potential harms in the shape of "warfare, mass surveillance, political disruption and environmental damage," the authors wrote. They urged individual mathematicians to "evaluate the ethical consequences of your research, and if necessary withdraw from harmful work".
OpenAI's AI model disproved the famous Erdős unit distance conjecture that stumped mathematicians for eight decades. But the breakthrough sparked concern across the field. In response, 16 experts released the Leiden Declaration on Artificial Intelligence and Mathematics, calling for transparency, proper attribution, and guardrails to protect research integrity as AI reshapes their discipline.
In mid-May, OpenAI announced that an internal AI model had disproved the Erdős unit distance conjecture, a famous problem in discrete geometry that had remained unsolved for 80 years. The unit distance problem, introduced by prolific mathematician Paul Erdős in 1946, asks a deceptively simple question: for any number of points on a flat surface, what is the greatest possible number of pairs that can be exactly one unit apart? [4] While mathematicians had believed the conjecture was true, the OpenAI model tried to disprove it instead, and succeeded [4]. Tim Gowers, who won the Fields Medal, wrote that "there is no doubt that the solution to the unit-distance problem is a milestone in AI mathematics". University of Toronto professor Daniel Litt called it "the first example of a result produced autonomously by an AI that I find exciting in itself".
The achievement represents a significant step in the progression of AI in mathematics. Just three years ago, large language models struggled with basic arithmetic problems, and only last year did they start acing high school mathematics competitions. The rapid acceleration continued when Google DeepMind's AlphaProof achieved silver-level performance on the International Mathematical Olympiad in July 2024, solving four out of six questions [3]. A year later, both Google and OpenAI announced gold-level performance, making mathematicians "sit up" and realize the technology's advancing capabilities [3]. However, the OpenAI proof relied on perseverance rather than creative insight [4]. Melanie Matchett Wood of Harvard University noted that while it's "a beautiful piece of mathematics," she wasn't convinced this represents a breakthrough in artificial intelligence itself [4]. Thomas Bloom of the University of Manchester observed that it would have been "truly incredible" if the AI had managed to prove the conjecture, as that would require creative insight [4].
The OpenAI announcement triggered swift action from the mathematical community. On June 2, just days after the AI-generated proofs made headlines, a group of 16 mathematicians, computer scientists, and math historians published the Leiden Declaration on Artificial Intelligence and Mathematics [2][5]. The declaration aims to "frame the conversation about future directions," according to Dame Ursula Martin, one of the authors and a mathematician at Oxford [5]. As of June 5, the declaration had gathered 1,590 signatures [4]. The 11-page document lays out what mathematicians value about their research, how those values are threatened by AI, and how to address the situation [2]. Among the most important prescriptions: disclose the use of AI in research, ensure all papers receive peer review, and level the playing field between academia and for-profit companies through legal resources and public funding [2].
The declaration addresses multiple threats to research integrity. Journal editors are already complaining about a flood of plausible-seeming AI-generated proofs that turn out to be incorrect in ways that are difficult for mathematicians to discern [5]. Bloom has seen people claim solutions to open problems using AI to generate "hundreds of pages of math that they can't understand or even read," raising the question: "Who's going to be able to check this?" [4]. The declaration also highlights how AI generates mathematical reasoning without showing what work inspired the ideas [4]. Wood raised concerns that the OpenAI paper did not appropriately reference "a history of closely related ideas in the literature" [5]. Rodrigo Ochigame, an anthropologist of AI at Leiden University, criticized the pattern: "The AI model is proprietary and unavailable to anyone outside the company. We get a flashy promotional video, while basic information needed to assess the scientific meaning of the result is kept secret" [5].
The rapid progress has left many mathematicians questioning their future role. At a hastily organized meeting in San Francisco in April, there was "an air of excitement and curiosity," but also "an undeniable feeling of existential dread," according to Jacob Tsimerman at the University of Toronto [3]. Jeremy Avigad at Carnegie Mellon University wrote: "We are running out of places to hide. We have to face up to the fact that AI will soon be able to prove theorems better than we can" [3]. However, some see a complementary future. The current results point to a medium-term scenario where collaboration between humans and AI becomes standard: AI systems have broader knowledge of past work and more willingness to grind through tedious proof strategies, while humans can think more deeply about any one problem and ask more interesting questions. Terence Tao at UCLA suggests the field is moving from an era of "proof scarcity" to one of abundance, where mathematicians might race to be the first to understand a proof rather than find it [3].

The declaration's authors argue that technology companies' involvement raises the risk that research questions are prioritized because of their amenability to AI methods rather than their deeper significance to understanding [5]. This disadvantages researchers who choose not to use the technology or lack access to it [5]. Jim Portegies at Eindhoven University of Technology notes that while almost every modern paper in math can be read for free on arXiv.org, tech companies often keep key details private: when Google DeepMind announced AlphaProof in 2024, it took more than a year before methods were published in a peer-reviewed journal [2]. The declaration is being endorsed by the International Mathematical Union and has a slot at the International Congress of Mathematicians in July in Philadelphia [5]. Most crucial, according to Ochigame, is the call for commercial AI companies to adhere to the declaration's principles and implement proper guardrails [2].