Google DeepMind's AlphaEvolve: A Breakthrough in AI-Driven Scientific Discovery

16 Sources

[1]

DeepMind unveils 'spectacular' general-purpose science AI

Google DeepMind has used chatbot models to come up with solutions to major problems in mathematics and computer science. The system, called AlphaEvolve, combines the creativity of a large language model (LLM) with algorithms that can scrutinize the model's suggestions and to filter and improve solutions. It was described in a preprint released by the company on 14 May. "The paper is quite spectacular," says Mario Krenn, who leads the Artificial Scientist Lab at the Max Planck Institute for the Science of Light in Erlangen, Germany. "I think AlphaEvolve is the first successful demonstration of new discoveries based on general-purpose LLMs." As well as using the system to discover solutions to open maths problems, DeepMind has already applied the artificial intelligence (AI) technique to its own practical challenges, says Pushmeet Kohli, head of science at the firm in London. AlphaEvolve has helped to improve the design of the firm's next generation of tensor processing units -- computing chips developed specially for AI -- and has found a way to more efficiently exploit Google's worldwide computing capacity, saving 0.7% of total resources. "It has had substantial impact," says Kohli. Most of the successful applications of AI in science so far -- including the protein-designing tool AlphaFold -- have involved a learning algorithm that was hand-crafted for its task, says Krenn. But AlphaEvolve is general-purpose, tapping the abilities of LLMs to generate code to solve problems in a wide range of domains. DeepMind describes AlphaEvolve as an 'agent', because it involves using interacting AI models. But it targets a different point in the scientific process from many other 'agentic' AI science systems, which have been used to review the literature and suggest hypotheses. AlphaEvolve is based on the firm's Gemini family of LLMs. Each task starts with the user inputting a question, criteria for evaluation and a suggested solution, for which the LLM proposes hundreds or thousands of modifications. An 'evaluator' algorithm then assesses the modifications against the metrics for a good solution (for example, in the task of assigning Google's computing jobs, researchers want to waste fewer resources). On the basis of which solutions are judged to be the best, the LLM suggests fresh ideas and over time the system evolves a population of stronger algorithms, says Matej Balog, an AI scientist at DeepMind who co-led the research. "We explore this diverse set of possibilities of how the problem can be solved," he says. AlphaEvolve builds on the firm's FunSearch system, which in 2023 was shown to use a similar evolutionary approach to outdo humans in unsolved problems in mathematics. Compared to FunSearch, AlphaEvolve can handle much larger pieces of code and tackle more complex algorithms across a wide range of scientific domains, says Balog. DeepMind says that AlphaEvolve has come up with a way to perform a mathematics calculation known as matrix multiplication that in some cases is faster than the fastest-known method, which was developed by German mathematician Volker Strassen in 1969. Such calculations involve multiplying numbers in grids and are used to train neural networks. Despite being general-purpose, AlphaEvolve outperformed AlphaTensor, an AI tool described by the firm in 2022 and designed specifically for matrix mechanics. The approach could be used to tackle optimization problems, says Krenn, or anywhere in science where there are concrete metrics, or simulations, to evaluate what makes a good solution. This could include designing new microscopes, telescope or even materials, he adds. In mathematics, AlphaEvolve seems to allow significant speed-ups in tackling some problems, says Simon Frieder, a mathematician and AI researcher at the University of Oxford, UK. But it will probably only be applied to the "narrow slice" of tasks that can be presented as problems to be solved through code, he says. Other researchers are reserving judgement about the tool's usefulness until has been trialled outside DeepMind. "Until the systems have been tested by a broader community, I would stay sceptical and take the reported results with a grain of salt," says Huan Sun, an AI researcher at the Ohio State University in Columbus. Frieder says he will wait until an open-source version is recreated by researchers, rather than a rely on DeepMind's proprietary system, which could be withdrawn or changed. Although AlphaEvolve requires less computing power to run than AlphaTensor, it is still too resource-intensive to be made freely available on DeepMind's servers, says Kohli. But the company hopes that announcing the system will encourage scientists to suggest areas of science in which to apply AlphaEvolve. "We are definitely committed to make sure that the most people in the scientific community get access to it," says Kohli.

[2]

Ars Technica

Google DeepMind creates super-advanced AI that can invent new algorithms

Google's DeepMind research division claims its newest AI agent marks a significant step toward using the technology to tackle big problems in math and science. The system, known as AlphaEvolve, is based on the company's Gemini large language models (LLMs), with the addition of an "evolutionary" approach that evaluates and improves algorithms across a range of use cases. AlphaEvolve is essentially an AI coding agent, but it goes deeper than a standard Gemini chatbot. When you talk to Gemini, there is always a risk of hallucination, where the AI makes up details due to the non-deterministic nature of the underlying technology. AlphaEvolve uses an interesting approach to increase its accuracy when handling complex algorithmic problems. According to DeepMind, this AI uses an automatic evaluation system. When a researcher interacts with AlphaEvolve, they input a problem along with possible solutions and avenues to explore. The model generates multiple possible solutions, using the efficient Gemini Flash and the more detail-oriented Gemini Pro, and then each solution is analyzed by the evaluator. An evolutionary framework allows AlphaEvolve to focus on the best solution and improve upon it. Many of the company's past AI systems, for example, the protein-folding AlphaFold, were trained extensively on a single domain of knowledge. AlphaEvolve, however, is more dynamic. DeepMind says AlphaEvolve is a general-purpose AI that can aid research in any programming or algorithmic problem. And Google has already started to deploy it across its sprawling business with positive results.

[3]

IEEE Spectrum

Can Large Language Models Actually Discover New Algorithms?

Dina Genkina is the computing and hardware editor at IEEE Spectrum There's a mathematical concept called the 'kissing number.' Somewhat disappointingly, it's got nothing to do with actual kissing; It enumerates how many spheres can touch (or 'kiss') a single sphere of equal size without crossing it. In one dimension, the kissing number is two. In two dimensions it's 6 (think the New York Times'spelling bee puzzle configuration). As the number of dimensions grows, the answer becomes less obvious: For most dimensionalities over 4, only upper and lower bounds on the kissing number are known. Now, an AI agent developed by Google DeepMind called AlphaEvolve has made its contribution to the problem, increasing the lower bound on the kissing number in 11 dimensions from 592 to 593. This may seem like an incremental improvement on the problem, especially given that the upper bound on the kissing number in 11 dimensions is 868, so the unknown range is still quite large. But it represents a novel mathematical discovery by an AI agent, and challenges the idea that large language models are not capable of original scientific contributions. And this is just one example of what AlphaEvolve has accomplished. "We applied AlphaEvolve across a range of open problems in research mathematics, and we deliberately picked problems from different parts of math: analysis, combinatorics, geometry," says Matej Balog, a research scientist at DeepMind that worked on the project. They found that for 75 percent of the problems, the AI model replicated the already known optimal solution. In 20 percent of cases, it found a new optimum that surpassed any known solution. "Every single such case is a new discovery," Balog says. (In the other 5 percent of cases, the AI converged on a solution that was worse than the known optimal one.) The model also developed a new algorithm for matrix multiplication -- the operation that underlies much of machine learning. A previous version of DeepMind's AI model, called AlphaTensor, had already beat the previous best known algorithm, discovered in 1969, for multiplying 4 by 4 matrices. AlphaEvolve found a more general version of that improved algorithm. In addition to abstract math, the team also applied their model to practical problems Google as a company faces every day. The AI was also used to optimize data center orchestration to gain 1 percent improvement, to optimize the design of the next Google tensor processing unit, and to discover an improvement to a kernel used in Gemini training leading to a 1 percent reduction in training time. "It's very surprising that you can do so many different things with a single system," says Alexander Novikov, a senior research scientist at DeepMind who also worked on AlphaEvolve. AlphaEvolve is able to be so general because it can be applied to almost any problem that can be expressed as code, and which can be checked by another piece of code. The user supplies an initial stab at the problem -- a program that solves the problem at hand, however suboptimally -- and a verifier program that checks how well a piece of code meets the required criteria. Then, a large language model, in this case Gemini, comes up with other candidate programs to solve the same problem, and each one is tested by the verifier. From there, AlphaEvolve uses a genetic algorithm such that the 'fittest' of the proposed solutions survive and evolve to the next generation. This process repeats until the solutions stop improving. "Large language models came around, and we started asking ourselves, is it the case that they are only going to add what's in the training data, or can we actually use them to discover something completely new, new algorithms or new knowledge?" Balog says. This research, Balog claims, shows that "if you use the large language models in the right way, then you can, in a very precise sense, get something that's provably new and provably correct in the form of an algorithm." For AlphaEvolve, however, the team broke from the reinforcement learning tradition in favor of the genetic algorithm. "The system is much simpler," Balog says. "And that actually has consequences, that it's much easier to set up on a wide range of problems." The team behind AlphaEvolve hopes to evolve their system in two ways. First, they want to apply it to a broader range of problems, including those in the natural sciences. To pursue this goal, they are planning to open up an early access program for interested academics to use AlphaEvolve in their research. It may be harder to adapt the system to the natural sciences, as verification of proposed solutions may be less straightforward. But, Balog says, "we know that in the natural sciences, there are plenty of simulators for different types of problems, and then those can be used within AlphaEvolve as well. And we are, in the future, very much interested in broadening the scope in this direction." Second, they want to improve the system itself, perhaps by coupling it with another DeepMind project: the AI co-scientist. This AI also uses an LLM and a genetic algorithm, but it focuses on hypothesis generation in natural language. "They develop these higher-level ideas and hypotheses," Balog says. "Incorporating this component into AlphaEvolve-like systems, I believe, will allow us to go to higher levels of abstraction." These prospects are exciting, but for some they may also sound menacing -- for example, AlphaEvolve's optimization of Gemini training may be seen as the beginning of recursively self-improving AI, which some worry would lead to a runaway intelligence explosion referred to as the singularity. The DeepMind team maintains that that is not their goal, of course. "We are excited to contribute to advancing AI that benefits humanity," Novikov says.

[4]

TechCrunch

DeepMind claims its newest AI tool is a whiz at math and science problems | TechCrunch

Google's AI R&D lab, DeepMind says it has developed a new AI system to tackle problems with "machine-gradeable" solutions. In experiments, the system, called AlphaEvolve, could help optimize some of the infrastructure Google uses to train its AI models, DeepMind said. The company says it's building a user interface for interacting with AlphaEvolve, and plans to launch an early access program for selected academics ahead of a possible broader rollout. Most AI models hallucinate. Owing to their probabilistic architectures, they confidently make things up sometimes. In fact, newer AI models like OpenAI's o3 hallucinate more than their predecessors, illustrating the challenging nature of the issue. AlphaEvolve introduces a clever mechanism to cut down on hallucinations: an automatic evaluation system. The system uses models to generate, critique and arrive at a pool of possible answers to a question, and automatically evaluates and scores the answers for accuracy. AlphaEvolve isn't the first system to take this tack. Researchers, including a team at DeepMind several years ago, have applied similar techniques in various math domains. But DeepMind claims AlphaEvolve's use of "state-of-the-art" models -- specifically Gemini models -- makes it significantly more capable than earlier instances of AI. To use AlphaEvolve, users must prompt the system with a problem, optionally including details like instructions, equations, code snippets and relevant literature. They must also provide a mechanism for automatically assessing the system's answers in the form of a formula. Because AlphaEvolve can only solve problems that it can self-evaluate, the system can only work with certain types of problems -- specifically those in fields like computer science and system optimization. In another major limitation, AlphaEvolve can only describe solutions as algorithms, making it a poor fit for problems that aren't numerical. To benchmark AlphaEvolve, DeepMind had the system attempt a curated set of around 50 math problems spanning branches from geometry to combinatorics. AlphaEvolve managed to "rediscover" the best-known answers to the problems 75% of the time and uncover improved solutions in 20% of cases, claims DeepMind. DeepMind also evaluated AlphaEvolve on practical problems, like boosting the efficiency of Google's data centers, and speeding up model training runs. According to the lab, AlphaEvolve generated an algorithm that continuously recovers 0.7% of Google's worldwide compute resources on average. The system also suggested an optimization that reduced the overall time it takes Google to train its Gemini models by 1%. To be clear, AlphaEvolve isn't making breakthrough discoveries. In one experiment, the system was able to find an improvement for Google's TPU AI accelerator chip design that had been flagged by other tools earlier. DeepMind, however, is making the same case that many AI labs do for their systems: that AlphaEvolve can save time while freeing up experts to focus on other, more important work.

[5]

MIT Technology Review

Google DeepMind's new AI uses large language models to crack real-world problems

Google DeepMind's new tool, called AlphaEvolve, uses the Gemini 2.0 family of large language models (LLMs) to produce code for a wide range of different tasks. LLMs are known to be hit and miss at coding. The twist here is that AlphaEvolve scores each of Gemini's suggestions, throwing out the bad and tweaking the good, in an iterative process, until it has produced the best algorithm it can. In many cases, the results are more efficient or more accurate than the best existing (human-written) solutions. "You can see it as a sort of super coding agent," says Pushmeet Kohli, a vice president at Google DeepMind who leads its AI for Science teams. "It doesn't just propose a piece of code or an edit, it actually produces a result that maybe nobody was aware of." In particular, AlphaEvolve came up with a way to improve the software Google uses to allocate jobs to its many millions of servers around the world. Google DeepMind claims the company has been using this new software across all of its data centers for more than a year, freeing up 0.7% of Google's computing resources. That might not sound like much, but at Google's scale it's huge. Jakob Moosbauer, a mathematician at the University of Warwick in the UK, is impressed. He says the way AlphaEvolve searches for algorithms that produce specific solutions -- rather than searching for the solutions themselves -- makes it especially powerful. "It makes the approach applicable to such a wide range of problems," he says. "AI is becoming a tool that will be essential in mathematics and computer science." AlphaEvolve continues a line of work that Google DeepMind has been pursuing for years. Its vision is that AI can help to advance human knowledge across math and science. In 2022, it developed AlphaTensor, a model that found a faster way to solve matrix multiplications -- a fundamental problem in computer science -- beating a record that had stood for more than 50 years. In 2023, it revealed AlphaDev, which discovered faster ways to perform a number of basic calculations performed by computers trillions of times a day. AlphaTensor and AlphaDev both turn math problems into a kind of game, then search for a winning series of moves. FunSearch, which arrived in late 2023, swapped out game-playing AI and replaced it with LLMs that can generate code. Because LLMs can carry out a range of tasks, FunSearch can take on a wider variety of problems than its predecessors, which were trained to play just one type of game. The tool was used to crack a famous unsolved problem in pure mathematics. AlphaEvolve is the next generation of FunSearch. Instead of coming up with short snippets of code to solve a specific problem, as FunSearch did, it can produce programs that are hundreds of lines long. This makes it applicable to a much wider variety of problems. In theory, AlphaEvolve could be applied to any problem that can be described in code and that has solutions that can be evaluated by a computer. "Algorithms run the world around us, so the impact of that is huge," says Matej Balog, a researcher at Google DeepMind who leads the algorithm discovery team.

[6]

Scientific American

New Google AI Chatbot Tackles Complex Math and Science

A Google DeepMind system improves chip designs and addresses unsolved math problems but has not been rolled out to researchers outside the company Google DeepMind has used chatbot models to come up with solutions to major problems in mathematics and computer science. The system, called AlphaEvolve, combines the creativity of a large language model (LLM) with algorithms that can scrutinize the model's suggestions to filter and improve solutions. It was described in a white paper released by the company on 14 May. "The paper is quite spectacular," says Mario Krenn, who leads the Artificial Scientist Lab at the Max Planck Institute for the Science of Light in Erlangen, Germany. "I think AlphaEvolve is the first successful demonstration of new discoveries based on general-purpose LLMs." If you're enjoying this article, consider supporting our award-winning journalism by subscribing. By purchasing a subscription you are helping to ensure the future of impactful stories about the discoveries and ideas shaping our world today. As well as using the system to discover solutions to open maths problems, DeepMind has already applied the artificial intelligence (AI) technique to its own practical challenges, says Pushmeet Kohli, head of science at the firm in London. AlphaEvolve has helped to improve the design of the company's next generation of tensor processing units -- computing chips developed specially for AI -- and has found a way to more efficiently exploit Google's worldwide computing capacity, saving 0.7% of total resources. "It has had substantial impact," says Kohli. Most of the successful applications of AI in science so far -- including the protein-designing tool AlphaFold -- have involved a learning algorithm that was hand-crafted for its task, says Krenn. But AlphaEvolve is general-purpose, tapping the abilities of LLMs to generate code to solve problems in a wide range of domains. DeepMind describes AlphaEvolve as an 'agent', because it involves using interacting AI models. But it targets a different point in the scientific process from many other 'agentic' AI science systems, which have been used to review the literature and suggest hypotheses. AlphaEvolve is based on the firm's Gemini family of LLMs. Each task starts with the user inputting a question, criteria for evaluation and a suggested solution, for which the LLM proposes hundreds or thousands of modifications. An 'evaluator' algorithm then assesses the modifications against the metrics for a good solution (for example, in the task of assigning Google's computing jobs, researchers want to waste fewer resources). On the basis of which solutions are judged to be the best, the LLM suggests fresh ideas and over time the system evolves a population of stronger algorithms, says Matej Balog, an AI scientist at DeepMind who co-led the research. "We explore this diverse set of possibilities of how the problem can be solved," he says. AlphaEvolve builds on the firm's FunSearch system, which in 2023 was shown to use a similar evolutionary approach to outdo humans in unsolved problems in maths. Compared with FunSearch, AlphaEvolve can handle much larger pieces of code and tackle more complex algorithms across a wide range of scientific domains, says Balog. DeepMind says that AlphaEvolve has come up with a way to perform a calculation, known as matrix multiplication, that in some cases is faster than the fastest-known method, which was developed by German mathematician Volker Strassen in 1969. Such calculations involve multiplying numbers in grids and are used to train neural networks. Despite being general-purpose, AlphaEvolve outperformed AlphaTensor, an AI tool described by the firm in 2022 and designed specifically for matrix mechanics. The approach could be used to tackle optimization problems, says Krenn, or anywhere in science where there are concrete metrics, or simulations, to evaluate what makes a good solution. This could include designing new microscopes, telescope or even materials, he adds. In mathematics, AlphaEvolve seems to allow significant speed-ups in tackling some problems, says Simon Frieder, a mathematician and AI researcher at the University of Oxford, UK. But it will probably be applied only to the "narrow slice" of tasks that can be presented as problems to be solved through code, he says. Other researchers are reserving judgement about the tool's usefulness until has been trialled outside DeepMind. "Until the systems have been tested by a broader community, I would stay sceptical and take the reported results with a grain of salt," says Huan Sun, an AI researcher at the Ohio State University in Columbus. Frieder says he will wait until an open-source version is recreated by researchers, rather than a rely on DeepMind's proprietary system, which could be withdrawn or changed. Although AlphaEvolve requires less computing power to run than AlphaTensor, it is still too resource-intensive to be made freely available on DeepMind's servers, says Kohli. But the company hopes that announcing the system will encourage researchers to suggest areas of science in which to apply AlphaEvolve. "We are definitely committed to make sure that the most people in the scientific community get access to it," says Kohli.

[7]

Wired

Google DeepMind's AI Agent Dreams Up Algorithms Beyond Human Expertise

A key question in artificial intelligence is how often models go beyond just regurgitating and remixing what they have learned and produce truly novel ideas or insights. A new project from Google DeepMind shows that with a few clever tweaks these models can at least surpass human expertise designing certain types of algorithms -- including ones that are useful for advancing AI itself. The company's latest AI project, called AlphaEvolve, combines the coding skills of its Gemini AI model with a method for testing the effectiveness of new algorithms and an evolutionary method for producing new designs. AlphaEvolve came up with more efficient algorithms for several kinds of computation, including a method for calculations involving matrices that betters an approach called the Strassen algorithm that has been relied upon for 56 years. The new approach improves the computational efficiency by reducing the number of calculations required to produce a result. DeepMind also used AlphaEvolve to come up with better algorithms for several real-world problems including scheduling tasks inside datacenters, sketching out the design of computer chips, and optimizing the design of the algorithms used to build large language models like Gemini itself. "These are three critical elements of the modern AI ecosystem," says Pushmeet Kohli, head of AI for science at DeepMind. "This superhuman coding agent is able to take on certain tasks and go much beyond what is known in terms of solutions for them." Matej Balog, one of the research leads on AlphaEvolve, says that it is often difficult to know if a large language model has come up with a truly novel piece of writing or code, but it is possible to show that no person has come up with a better solution to certain problems. "We have shown very precisely that you can discover something that's provably new and provably correct," Balog says. "You can be really certain that what you have found couldn't have been in the training data." Sanjeev Arora, a scientist at Princeton University specializing in algorithm design, says that the advancements made by AlphaEvolve are relatively small and only apply to algorithms that involve searching through a space of potential answers. But he adds: "search is a pretty general idea applicable to many settings." AI-powered coding is starting to change the way developers and companies write software. The latest AI models make it trivial for novices to build simple apps and websites, and some experienced developers are using AI to automate more of their work. AlphaEvolve demonstrates the potential for AI to come up with completely novel ideas through continual experimentation and evaluation. DeepMind and other AI companies hope that AI agents will gradually learn to exhibit more general ingenuity in many areas, perhaps eventually generating ingenious solutions to a business problem or novel insights when given a particular problem. Josh Alman, an assistant professor at Columbia University who works on algorithm design, says that AlphaEvolve does appear to be generating novel ideas rather than remixing stuff it's learned during training. "It has to be doing something new and not just regurgitating," he says.

[8]

The Register

Google DeepMind debuts algorithm evolving agent, AlphaEvolve

AlphaEvolve may optimize your code in ways you hadn't thought possible. Or not. Not is possible, too Google's AI shop DeepMind has unveiled AlphaEvolve, its "evolutionary coding agent" powered by large language models to discover and optimize algorithms. Computer algorithms are sets of instructions used to solve complex problems. AlphaEvolve is pitched as a useful tool for mathematicians, scientists, and engineers working on algorithmic tasks, ranging from abstract mathematical proofs to scheduling jobs across datacenters. It promises to evaluate the performance of code using automated metrics, then proposes improvements by evolving new versions of the algorithm. For example, in an effort to improve matrix multiplication - a core operation in machine learning - AlphaEvolve discovered a new algorithm for multiplying 4×4 complex-valued matrices using just 48 scalar multiplications, surpassing Strassen's 1969 result, Google explains. Because AlphaEvolve focuses on code improvement and evaluation rather than representing hypotheses in natural language like Google's AI co-scientist system, hallucination is less of a concern. Inside Google, researchers say AlphaEvolve has improved the efficiency of data center scheduling, chip design, and AI training. They also credit it with helping design faster matrix multiplication algorithms and generating new solutions to long-standing math problems. "AlphaEvolve pairs the creative problem-solving capabilities of our Gemini models with automated evaluators that verify answers, and uses an evolutionary framework to improve upon the most promising ideas," the AlphaEvolve team explains in a blog post. DeepMind's use of the term "evolve" makes the coding agent's technological process sound organic. The accompanying paper [PDF] also uses terms with biological associations: "AlphaEvolve extends a long tradition of research on evolutionary or genetic programming, where one repeatedly uses a set of mutation and crossover operators to evolve a pool of programs." this is one more sign that neurosymbolic techniques that combine neural networks with ideas from classical AI, is the way of the future Asked whether Google's description of agent is overly anthropomorphic, Gary Marcus, an AI expert, author, and critic, told The Register that the terminology is fair enough. "The use of the term is fine, standard in that field and not unreasonable," he said. "It's great to see DeepMind think outside the box of pure large language models, and this is one more sign that neurosymbolic techniques that combine neural networks with ideas from classical AI, is the way of the future." Stuart Battersby, CTO of AI firm Chatterbox Labs, expressed optimism about AlphaEvolve's potential, while also emphasizing the need to keep security in mind during any AI deployment. "The development of AI algorithms needs to happen at pace, and so it is great to see AlphaEvolve helping to automate this process," he told The Register. "This means that AI solutions not only get through the development cycle quicker, but hopefully produce better results too - it seems that the AlphaEvolve team have provided evidence of this." Google has used AlphaEvolve to optimize the performance of its Borg compute cluster management system in its datacenters. According to the researchers, the coding agent proposed a heuristic function for online compute job scheduling that outperformed one running in production. "This solution, now in production for over a year, continuously recovers, on average, 0.7 percent of Google's worldwide compute resources," the researchers claim. The DeepMind team also note that AlphaEvolve helped optimize matrix multiplication operations involved in the training of Google's Gemini model family by speeding up its Pallas kernel 23 percent, for a training time reduction of 1 percent. To evaluate AlphaEvolve's utility, the DeepMind team gave it more than 50 open problems in mathematical analysis, geometry, combinatorics, and number theory. "In roughly 75 percent of cases, it rediscovered state-of-the-art solutions, to the best of our knowledge," the researchers claim. "And in 20 percent of cases, AlphaEvolve improved the previously best known solutions, making progress on the corresponding open problems." Google is planning to offer early access to academics. Those interested can apply here. ®

[9]

The Next Web

5 impressive feats of DeepMind's new self-evolving AI coding agent

The UK-based lab today unveiled its latest advancement: AlphaEvolve, an AI coding agent that makes large language models (LLMs) like Gemini better at solving complex computing and mathematical problems. AlphaEvolve is powered by the same models that it's trying to improve. Using Gemini, the agent proposes programs -- written in code -- that try to solve a given problem. It runs each code snippet through automated tests that evaluate how accurate, efficient, or novel it is. AlphaEvolve keeps the top-performing code snippets and uses them as the basis for the next round of generation. Over many cycles, this process "evolves" better and better solutions. In essence, it is a self-evolving AI. DeepMind has already used AlphaEvolve to tackle data centre energy use, design better chips, and speed up AI training. Here are five of its top feats so far. 1. It discovered new solutions to some of the world's toughest maths problems AlphaEvolve was put to the test on over 50 open problems in maths, from combinatorics to number theory. In 20% of cases, it improved on the best-known solutions to them. One of those was the 300-year-old kissing number problem. In 11-dimensional space, AlphaEvolve discovered a new lower bound with a configuration of 593 spheres -- progress that even expert mathematicians hadn't reached. 2. It made Google's data centres more efficient The AI agent devised a way to better manage power scheduling at Google's data centers. That has allowed the tech giant to improve its data centre energy efficiency by 0.7% over the last year -- a significant cost and energy saver given the size of its data centre operation. 3. It helped train Gemini faster AlphaEvolve improved the way matrix multiplications are split into subproblems, a core operation in training AI models like Gemini. That optimisation sped up the process by 23%, reducing Gemini's total training time by 1%. In the world of generative AI, every percentage point can translate into cost and energy savings. 4. It co-designed part of Google's next AI chip The agent is also using its code-writing skills to rewire things in the physical world. It rewrote a portion of an arithmetic circuit in Verilog -- a language used for chip design -- making it more efficient. That same logic is now being used to develop Google's future TPU (Tensor Processing Unit), an advanced chip for machine learning. 5. It beat a legendary algorithm from 1969 For decades, Strassen's algorithm was the gold standard for multiplying 4×4 complex matrices. AlphaEvolve found a more efficient solution -- using fewer scalar multiplications. This could lead to more advanced LLMs, which rely heavily on matrix multiplication to function. According to DeepMind, these feats are just the tip of the iceberg for AlphaEvolve. The lab envisions the agent solving countless problems, from discovering new materials and drugs to streamlining business operations.

[10]

VentureBeat

Meet AlphaEvolve, the Google AI that writes its own code -- and just saved millions in computing costs

Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More Google DeepMind today pulled the curtain back on AlphaEvolve, an artificial-intelligence agent that can invent brand-new computer algorithms -- then put them straight to work inside the company's vast computing empire. AlphaEvolve pairs Google's Gemini large language models with an evolutionary approach that tests, refines, and improves algorithms automatically. The system has already been deployed across Google's data centers, chip designs, and AI training systems -- boosting efficiency and solving mathematical problems that have stumped researchers for decades. "AlphaEvolve is a Gemini-powered AI coding agent that is able to make new discoveries in computing and mathematics," explained Matej Balog, a researcher at Google DeepMind, in an interview with VentureBeat. "It can discover algorithms of remarkable complexity -- spanning hundreds of lines of code with sophisticated logical structures that go far beyond simple functions." The system dramatically expands upon Google's previous work with FunSearch by evolving entire codebases rather than single functions. It represents a major leap in AI's ability to develop sophisticated algorithms for both scientific challenges and everyday computing problems. Inside Google's 0.7% efficiency boost: How AI-crafted algorithms run the company's data centers AlphaEvolve has been quietly at work inside Google for over a year. The results are already significant. One algorithm it discovered has been powering Borg, Google's massive cluster management system. This scheduling heuristic recovers an average of 0.7% of Google's worldwide computing resources continuously -- a staggering efficiency gain at Google's scale. The discovery directly targets "stranded resources" -- machines that have run out of one resource type (like memory) while still having others (like CPU) available. AlphaEvolve's solution is especially valuable because it produces simple, human-readable code that engineers can easily interpret, debug, and deploy. The AI agent hasn't stopped at data centers. It rewrote part of Google's hardware design, finding a way to eliminate unnecessary bits in a crucial arithmetic circuit for Tensor Processing Units (TPUs). TPU designers validated the change for correctness, and it's now headed into an upcoming chip design. Perhaps most impressively, AlphaEvolve improved the very systems that power itself. It optimized a matrix multiplication kernel used to train Gemini models, achieving a 23% speedup for that operation and cutting overall training time by 1%. For AI systems that train on massive computational grids, this efficiency gain translates to substantial energy and resource savings. "We try to identify critical pieces that can be accelerated and have as much impact as possible," said Alexander Novikov, another DeepMind researcher, in an interview with VentureBeat. "We were able to optimize the practical running time of [a vital kernel] by 23%, which translated into 1% end-to-end savings on the entire Gemini training card." Breaking Strassen's 56-year-old matrix multiplication record: AI solves what humans couldn't AlphaEvolve solves mathematical problems that stumped human experts for decades while advancing existing systems. The system designed a novel gradient-based optimization procedure that discovered multiple new matrix multiplication algorithms. One discovery toppled a mathematical record that had stood for 56 years. "What we found, to our surprise, to be honest, is that AlphaEvolve, despite being a more general technology, obtained even better results than AlphaTensor," said Balog, referring to DeepMind's previous specialized matrix multiplication system. "For these four by four matrices, AlphaEvolve found an algorithm that surpasses Strassen's algorithm from 1969 for the first time in that setting." The breakthrough allows two 4×4 complex-valued matrices to be multiplied using 48 scalar multiplications instead of 49 -- a discovery that had eluded mathematicians since Volker Strassen's landmark work. According to the research paper, AlphaEvolve "improves the state of the art for 14 matrix multiplication algorithms." The system's mathematical reach extends far beyond matrix multiplication. When tested against over 50 open problems in mathematical analysis, geometry, combinatorics, and number theory, AlphaEvolve matched state-of-the-art solutions in about 75% of cases. In approximately 20% of cases, it improved upon the best known solutions. One victory came in the "kissing number problem" -- a centuries-old geometric challenge to determine how many non-overlapping unit spheres can simultaneously touch a central sphere. In 11 dimensions, AlphaEvolve found a configuration with 593 spheres, breaking the previous record of 592. How it works: Gemini language models plus evolution create a digital algorithm factory What makes AlphaEvolve different from other AI coding systems is its evolutionary approach. The system deploys both Gemini Flash (for speed) and Gemini Pro (for depth) to propose changes to existing code. These changes get tested by automated evaluators that score each variation. The most successful algorithms then guide the next round of evolution. AlphaEvolve doesn't just generate code from its training data. It actively explores the solution space, discovers novel approaches, and refines them through an automated evaluation process -- creating solutions humans might never have conceived. "One critical idea in our approach is that we focus on problems with clear evaluators. For any proposed solution or piece of code, we can automatically verify its validity and measure its quality," Novikov explained. "This allows us to establish fast and reliable feedback loops to improve the system." This approach is particularly valuable because the system can work on any problem with a clear evaluation metric -- whether it's energy efficiency in a data center or the elegance of a mathematical proof. From cloud computing to drug discovery: Where Google's algorithm-inventing AI goes next While currently deployed within Google's infrastructure and mathematical research, AlphaEvolve's potential reaches much further. Google DeepMind envisions applications in material sciences, drug discovery, and other fields requiring complex algorithmic solutions. "The best human-AI collaboration can help solve open scientific challenges and also apply them at Google scale," said Novikov, highlighting the system's collaborative potential. Google DeepMind is now developing a user interface with its People + AI Research team and plans to launch an Early Access Program for selected academic researchers. The company is also exploring broader availability. The system's flexibility marks a significant advantage. Balog noted that "at least previously, when I worked in machine learning research, it wasn't my experience that you could build a scientific tool and immediately see real-world impact at this scale. This is quite unusual." As large language models advance, AlphaEvolve's capabilities will grow alongside them. The system demonstrates an intriguing evolution in AI itself -- starting within the digital confines of Google's servers, optimizing the very hardware and software that gives it life, and now reaching outward to solve problems that have challenged human intellect for decades or centuries.

[11]

Google DeepMind

AlphaEvolve: A Gemini-powered coding agent for designing advanced algorithms

New AI agent evolves algorithms for math and practical applications in computing by combining the creativity of large language models with automated evaluators Large language models (LLMs) are remarkably versatile. They can summarize documents, generate code or even brainstorm new ideas. And now we've expanded these capabilities to target fundamental and highly complex problems in mathematics and modern computing. Today, we're announcing AlphaEvolve, an evolutionary coding agent powered by large language models for general-purpose algorithm discovery and optimization. AlphaEvolve pairs the creative problem-solving capabilities of our Gemini models with automated evaluators that verify answers, and uses an evolutionary framework to improve upon the most promising ideas. AlphaEvolve enhanced the efficiency of Google's data centers, chip design and AI training processes -- including training the large language models underlying AlphaEvolve itself. It has also helped design faster matrix multiplication algorithms and find new solutions to open mathematical problems, showing incredible promise for application across many areas.

[12]

SiliconANGLE

Google DeepMind develops AlphaEvolve AI agent optimized for coding and math - SiliconANGLE

Google DeepMind develops AlphaEvolve AI agent optimized for coding and math Alphabet Inc.'s Google DeepMind unit today detailed AlphaEvolve, an artificial intelligence agent that can tackle complex programming and math challenges. The company says that it has used AlphaEvolve to make its data centers more efficient. Additionally, the AI agent is showing promise as a tool for mathematical research and chip development. AlphaEvolve carries out processing in multiple steps. When it's given a programming task, the agent uses Google LLC's lightweight Gemini 2.0 Flash language model to generate multiple pieces of code. An automated evaluation mechanism then ranks those code snippets by quality. From there, AlphaEvolve takes the best code snippets and asks Gemini 2.0 Flash to improve them. The agent makes optimizations to the AI-generated code over multiple rounds. When Gemini 2.0 Flash can no longer suggest improvements, AlphaEvolve switches to Gemini 2.0 Pro, a more capable model that trades off some speed for increased output quality. "The evolutionary process in AlphaEvolve leverages modern LLMs' ability to respond to feedback, enabling the discovery of candidates that are substantially different from the initial candidate pool in syntax and function," DeepMind researchers detailed in a research paper. Google has already put AlphaEvolve to use in multiple internal projects. Several of these initiatives focused on matrix multiplications, the mathematical operations that AI models use to process data. A matrix is a collection of numbers organized into spreadsheet-like rows and columns. Chip designers don't draw processor blueprints but rather write them using a programming syntax called Verilog. In one project, AlphaEvolve helped Google engineers enhance the Verilog code for a circuit optimized to perform matrix multiplications. The company has incorporated the circuit into an upcoming addition to its TPU line of AI processors. In another internal project, AlphaEvolve developed methods that allow Google's Gemini models to break down matrix multiplications into smaller, more manageable calculations. The search giant says that those improvements sped up one of Gemini's most important components by 23%. AlphaGo has also helped the company make its data centers more efficient. Google manages its infrastructure resources using a software platform called Borg. AlphaEvolve suggested an improvement to the platform that currently "recovers on average 0.7% of Google's fleet-wide compute resources," DeepMind's researchers detailed. According to the search giant, the reasoning capabilities that enable AlphaEvolve to optimize data centers and chip designs make it useful for mathematical research. "To investigate AlphaEvolve's breadth, we applied the system to over 50 open problems in mathematical analysis, geometry, combinatorics and number theory," the researchers wrote in a blog post that accompanied the paper. "The system's flexibility enabled us to set up most experiments in a matter of hours. In roughly 75% of cases, it rediscovered state-of-the-art solutions, to the best of our knowledge." Google plans to make the AI agent available to academics through an early access program. Additionally, the company is studying the possibility of broadening access to additional users down the line. "While AlphaEvolve is currently being applied across math and computing, its general nature means it can be applied to any problem whose solution can be described as an algorithm, and automatically verified," DeepMind's researchers wrote. "We believe AlphaEvolve could be transformative across many more areas such as material science, drug discovery, sustainability and wider technological and business applications."

[13]

Analytics India Magazine

Google DeepMind Launches AlphaEvolve, New AI Coding Agent for Maths and Science | AIM

Google DeepMind is planning an early access programme for selected academic users and is also exploring ways to make AlphaEvolve more broadly available. Google DeepMind has launched AlphaEvolve, a new coding agent that uses large language models to evolve and optimise algorithms across computing and mathematics. Powered by Gemini Flash and Gemini Pro, AlphaEvolve pairs model-generated code with automated evaluators to verify, score, and evolve high-performing solutions. "AlphaEvolve is an agent that can go beyond single function discovery to evolve entire codebases and develop much more complex algorithms," stated Google DeepMind in its blog post. Google DeepMind is planning an early access programme for selected academic users and is also exploring ways to make AlphaEvolve more broadly available. Interested users can register their interest through a dedicated form. The company believes AlphaEvolve could be transformative across multiple fields, including materials science, drug discovery, sustainability, and broader technological and business applications. The system integrates prompt sampling, language model outputs, and program evaluation through an evolutionary algorithm framework. Over the past year, AlphaEvolve has been used to improve data centre scheduling, hardware design, and AI training workflows across Google. One deployment optimised Borg, Google's data centre orchestrator, recovering 0.7% of compute resources globally. "This solution, now in production for over a year, continuously recovers, on average, 0.7% of Google's worldwide compute resources," the company said. AlphaEvolve also contributed to a Tensor Processing Unit (TPU) design. It suggested a Verilog-level change that removed redundant bits in a key arithmetic circuit. Google said this proposal passed verification tests and was integrated into an upcoming TPU release. In AI training, AlphaEvolve optimised matrix multiplication in the Gemini architecture, speeding up a core kernel by 23% and cutting training time by 1%. It also improved FlashAttention kernel performance by 32.5%, a domain typically untouched by human engineers due to compiler-level optimisation. Beyond infrastructure, AlphaEvolve tackled algorithmic challenges in mathematics. It discovered a new method to multiply 4×4 complex-valued matrices using 48 scalar multiplications, improving on the 1969 Strassen algorithm. "This finding demonstrates a significant advance over our previous work, AlphaTensor," the company said. Applied to over 50 open problems across mathematics, AlphaEvolve rediscovered known solutions in 75% of cases and improved on 20%. One of its advances was in the kissing number problem, where it found a configuration of 593 spheres touching a unit sphere in 11 dimensions, establishing a new lower bound.

[14]

Dataconomy

AlphaEvolve: How Google's new AI aims for truth with self-correction

Google's AI research and development lab, DeepMind, has unveiled AlphaEvolve, an AI system designed to tackle complex problems in math and science with "machine-gradable" solutions. The system leverages "state-of-the-art" models, specifically Gemini models, to generate, critique, and evaluate possible answers to a given problem. AlphaEvolve introduces a mechanism to reduce hallucinations in AI models by using an automatic evaluation system. This system scores the generated answers for accuracy, allowing it to work effectively on problems that can be self-evaluated, particularly in fields like computer science and system optimization. To utilize AlphaEvolve, users must provide a problem statement along with optional details such as instructions, equations, and relevant literature. They must also supply a mechanism for automatically assessing the system's answers, typically in the form of a formula. The system's capability is limited to describing solutions as algorithms, making it less suitable for non-numerical problems. In benchmarking tests, AlphaEvolve was presented with around 50 math problems across various branches, including geometry and combinatorics. The system successfully "rediscovered" the best-known answers 75% of the time and uncovered improved solutions in 20% of cases. DeepMind also applied AlphaEvolve to practical problems, such as optimizing Google's data center efficiency and speeding up model training runs. Video: Google DeepMind According to DeepMind, AlphaEvolve generated an algorithm that recovered 0.7% of Google's worldwide compute resources on average and suggested an optimization that reduced the overall time to train Gemini models by 1%. While AlphaEvolve isn't making groundbreaking discoveries, it is claimed to save time and free up experts to focus on more critical tasks. DeepMind plans to build a user interface for AlphaEvolve and launch an early access program for selected academics before considering a broader rollout. The lab asserts that AlphaEvolve's capabilities make it a valuable tool for domain experts.

[15]

NDTV Gadgets 360

Google's AlphaEvolve Coding Agent Can Help Fix AI Hallucinations

Google said AlphaEvolve has enhanced the efficiency of its data centres Google DeepMind announced a new artificial intelligence (AI) coding agent on Wednesday that can enhance the capabilities of AI models. Dubbed AlphaEvolve, it is designed to discover and optimise algorithms across complex computing and mathematical tasks. The powerful AI system is built on the Mountain View-based tech giant's Gemini models, and it combines outputs generated by large language models with automated evaluators to ground the responses in reality and reduce the risk of hallucinations. Beyond this, the system is also said to have shown potential in solving and optimising mathematical problems. In a blog post, DeepMind detailed the new technology it has been working on. AlphaEvolve is not an AI model, instead, it is a complex AI system with agentic capabilities. One of the primary functions the system performs is algorithm discovery and optimisation. AI models, at a fundamental level, are a series of code. These code bases process and compile information, break it down, and use probabilistic algorithms to generate an output. However, since AI systems are highly complex, their code bases are massive. This large size often causes optimisation and efficiency-based issues. AlphaEvolve can help with that, the company said. AlphaEvolve uses automated evaluation metrics, and using these parameters, it verifies, runs, and scores responses generated by AI models. Google said this method allows the system to quantifiably assess responses from multiple AI models and reduce the risk of hallucinations. Additionally, the system can also fix and improve code that allows such hallucinations. The tech giant said that AlphaEvolve has improved the efficiency of Google's data centres, chip design, and AI training processes. Interestingly, it was also able to improve the training of its own base LLM. In one case, it discovered a new scheduling method that recovers around 0.7 percent of Google's global compute resources -- a massive gain when applied across the company's massive infrastructure. Since AlphaEvolve works with code bases and algorithms, it is also said to have high potential in different areas of mathematical problem solving, the company said. It is said to have discovered a faster method to multiply 4x4 complex matrices, beating a solution that had stood for more than 50 years. In tests across 50 open mathematical problems, AlphaEvolve matched the current best solutions in most cases, and even improved on them in about 20 percent of problems, the post added.

[16]

Economic Times

Google DeepMind's new AI coding tool can solve complex math problems, design algorithms

One of the major threats facing the nascent AI world is hallucinations by chatbots. Google DeepMind's AlphaEvolve has the versatility of LLMs -- to summarise documents, generate code, and generate new ideas -- and also has the ability to verify answers through automated evaluators. Google's artificial intelligence (AI) research lab DeepMind has unveiled an advanced agent, AlphaEvolve, which can target fundamental and complex mathematics and computing problems. It has the versatility of large language models (LLMs), which can summarise documents, generate code, and generate new ideas. It also goes a step ahead by verifying answers through automated evaluators. One of the major threats facing the nascent AI world is hallucinations by chatbots. What AlphaEvolve does is that it uses LLMs to generate answers to prompts, and automatically evaluates and scores these answers for accuracy.

Twitter

Facebook

Copy Link

Google DeepMind unveils AlphaEvolve, a general-purpose AI system that combines large language models with evolutionary algorithms to solve complex problems in mathematics, computer science, and practical applications.

Google DeepMind Unveils AlphaEvolve: A New Frontier in AI-Driven Scientific Discovery

Google DeepMind has introduced AlphaEvolve, a groundbreaking AI system that combines the power of large language models (LLMs) with evolutionary algorithms to tackle complex problems across various scientific domains. This general-purpose AI tool marks a significant advancement in using artificial intelligence for scientific discovery and practical problem-solving 1.

How AlphaEvolve Works

AlphaEvolve utilizes Google's Gemini family of LLMs as its foundation. The system operates by:

Accepting user input in the form of a problem statement, evaluation criteria, and a suggested solution.
Generating hundreds or thousands of potential solutions using the LLM.
Employing an 'evaluator' algorithm to assess these solutions against specified metrics.
Evolving a population of stronger algorithms based on the best-performing solutions 1.

This approach allows AlphaEvolve to explore a diverse set of possibilities for solving complex problems, often surpassing human-developed solutions 2.

Achievements and Applications

AlphaEvolve has demonstrated impressive capabilities across various fields:

Mathematics: The system has made novel discoveries, including improving the lower bound on the "kissing number" in 11 dimensions and developing new algorithms for matrix multiplication 3.
Computer Science: AlphaEvolve has outperformed specialized AI tools like AlphaTensor in certain matrix multiplication tasks 1.
Practical Applications: Within Google, AlphaEvolve has:
- Optimized data center orchestration, saving 0.7% of total resources.
- Improved the design of Google's next-generation tensor processing units.
- Enhanced Gemini model training efficiency by 1% 3 4.

Implications and Future Directions

The development of AlphaEvolve represents a significant step towards using AI for original scientific contributions. Unlike previous AI systems designed for specific tasks, AlphaEvolve's general-purpose nature allows it to tackle a wide range of problems that can be expressed as code and verified programmatically 5.

Google DeepMind plans to expand AlphaEvolve's applications to natural sciences and is developing a user interface for broader access. The company also intends to launch an early access program for selected academics 4.

Challenges and Limitations

While AlphaEvolve shows great promise, it faces some limitations:

Resource Intensity: The system requires significant computing power, making it challenging to make freely available 1.
Problem Scope: AlphaEvolve is currently limited to problems that can be expressed as code and have clear evaluation criteria 5.
Verification: Some researchers emphasize the need for broader community testing and open-source recreation to fully validate AlphaEvolve's capabilities 1.

As AlphaEvolve continues to evolve, it promises to reshape the landscape of AI-driven scientific discovery, potentially accelerating breakthroughs across various fields of study and practical applications.

References

Summarized by

Navi

[1]

Nature

|DeepMind unveils 'spectacular' general-purpose science AI

[2]

Ars Technica

|Google DeepMind creates super-advanced AI that can invent new algorithms

[3]

IEEE Spectrum

|Can Large Language Models Actually Discover New Algorithms?

[4]

TechCrunch

|DeepMind claims its newest AI tool is a whiz at math and science problems | TechCrunch

[5]

MIT Technology Review

|Google DeepMind's new AI uses large language models to crack real-world problems

Explore today's top stories

Samsung Secures $16.5 Billion Deal with Tesla for Next-Gen AI Chips

Samsung Electronics has signed a $16.5 billion contract with Tesla to manufacture next-generation AI chips for self-driving cars, with production set to take place in a new Texas facility.

16 Sources

Technology

12 hrs ago

Samsung Secures $16.5 Billion Deal with Tesla for Next-Gen

16 Sources

Technology

12 hrs ago

The Rise of AI Deepfakes: A Growing Threat and the AI-Powered Fight Back

As AI-generated deepfakes become increasingly sophisticated and prevalent, posing significant threats to national security and corporate integrity, experts are turning to AI itself as the most promising solution to combat this emerging challenge.

3 Sources

Technology

4 hrs ago

The Rise of AI Deepfakes: A Growing Threat and the

3 Sources

Technology

4 hrs ago

Trump Administration Pauses Tech Export Controls to China, Sparking AI Security Concerns

The U.S. has temporarily halted restrictions on technology exports to China, aiming to facilitate trade talks and a potential meeting between Presidents Trump and Xi. This move has raised concerns about its impact on America's AI advantage.

3 Sources

Policy and Regulation

4 hrs ago

Trump Administration Pauses Tech Export Controls to China,

3 Sources

Policy and Regulation

4 hrs ago

Google Rolls Out AI Mode Shortcut in Search Widget for Android Users

Google has widely released an AI Mode shortcut in its Search widget for Android devices, making it easier for users to access AI-powered search features directly from their home screens.

6 Sources

Technology

20 hrs ago

Google Rolls Out AI Mode Shortcut in Search Widget for

6 Sources

Technology

20 hrs ago

Chinese AI Firms Form Alliances to Build Domestic Ecosystem Amid US Curbs

Chinese AI companies announce two new industry alliances to develop a domestic ecosystem, reducing dependence on foreign tech amid US export restrictions. The World Artificial Intelligence Conference in Shanghai showcases innovative AI products and technologies.

2 Sources

Technology

4 hrs ago

Chinese AI Firms Form Alliances to Build Domestic Ecosystem

2 Sources

Technology

4 hrs ago

Your Daily Dose of Curated AI News

Don’t drown in AI news. We cut through the noise - filtering, ranking and summarizing the most important AI news, breakthroughs and research daily. Spend less time searching for the latest in AI and get straight to action.

The Outpost

Top stories

News

About