AI achieves silver-medal standard solving International Mathematical Olympiad Problems

Breakthrough models AlphaProof and AlphaGeometry 2 solve advanced reasoning problems in mathematics.

This marks a major advance in AI’s capacity for mathematical reasoning.

Artificial general intelligence (AGI) with advanced mathematical reasoning has the potential to unlock new frontiers in science and technology.

We’ve made great progress building AI systems that help mathematicians discover new insights, novel algorithms and answers to open problems. But current AI systems still struggle with solving general math problems because of limitations in reasoning skills and training data.

DeepMind has introduced AlphaProof, a new reinforcement-learning-based system for formal math reasoning, and AlphaGeometry 2, an improved version of its geometry-solving system. Together, these systems solved four out of six problems from this year’s International Mathematical Olympiad (IMO), achieving the same level as a silver medalist in the competition for the first time.

The annual IMO competition has also become widely recognized as a grand challenge in machine learning and an aspirational benchmark for measuring an AI system’s advanced mathematical reasoning capabilities.

This year, DeepMind applied its combined AI system to the competition problems, which were provided by the IMO organizers. The solutions were scored according to the IMO’s point-awarding rules by prominent mathematicians Prof Sir Timothy Gowers, an IMO gold medalist and Fields Medal winner, and Dr Joseph Myers, a two-time IMO gold medalist and Chair of the IMO 2024 Problem Selection Committee.

“The fact that the program can come up with a non-obvious construction like this is very impressive, and well beyond what I thought was state of the art.”

Prof Sir Timothy Gowers

AlphaProof: a formal approach to reasoning

AlphaProof is a system that trains itself to prove mathematical statements in the formal language Lean. It couples a pre-trained language model with the AlphaZero reinforcement learning algorithm, which previously taught itself how to master the games of chess, shogi and Go.

Formal languages offer the critical advantage that proofs involving mathematical reasoning can be formally verified for correctness. Their use in machine learning has, however, previously been constrained by the very limited amount of human-written data available.
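As a small illustration of what “formally verified” means here, the following Lean 4 snippet states and proves that addition of natural numbers is commutative. It is only a sketch for readers unfamiliar with Lean and is not taken from AlphaProof; the theorem name `add_comm_example` is made up for this example, while `Nat.add_comm` is a lemma from Lean’s core library.

```lean
-- Lean's kernel checks this proof term against the statement.
-- If the proof did not match the statement, Lean would reject it.
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```

Because the checker accepts or rejects the proof mechanically, there is no room for a plausible-sounding but wrong step to slip through.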

In contrast, natural-language-based approaches can hallucinate plausible but incorrect intermediate reasoning steps and solutions, despite having access to orders of magnitude more data. We established a bridge between these two complementary spheres by fine-tuning a Gemini model to automatically translate natural language problem statements into formal statements, creating a large library of formal problems of varying difficulty.
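To make the idea of translating a problem statement concrete, here is a hand-written Lean 4 sketch of what a formalization might look like. The informal statement, the theorem name `even_add_even` and the choice to express evenness with `% 2` are all assumptions made for this example; it is not output from the Gemini-based formalizer. The proof is deliberately left as `sorry`, since a formalization supplies only the statement and the proof still has to be found.

```lean
-- Informal statement: "The sum of two even natural numbers is even."
-- Formal statement, with the proof left open for a prover to fill in.
theorem even_add_even (a b : Nat)
    (ha : a % 2 = 0) (hb : b % 2 = 0) : (a + b) % 2 = 0 := by
  sorry
```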

When presented with a problem, AlphaProof generates solution candidates and then proves or disproves them by searching over possible proof steps in Lean. Each proof that was found and verified is used to reinforce AlphaProof’s language model, enhancing its ability to solve subsequent, more challenging problems.

DeepMind trained AlphaProof for the IMO by proving or disproving millions of problems, covering a wide range of difficulties and mathematical topic areas, over a period of weeks leading up to the competition. The training loop was also applied during the contest, reinforcing proofs of self-generated variations of the contest problems until a full solution could be found.

Process infographic of AlphaProof’s reinforcement learning training loop: Around one million informal math problems are translated into a formal math language by a formalizer network. Then a solver network searches for proofs or disproofs of the problems, progressively training itself via the AlphaZero algorithm to solve more challenging problems.
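The search, verify and reinforce cycle described above can be sketched in a few lines of Python. This is a toy under strong assumptions, not AlphaProof’s method: the “problems” are target integers, a “proof” is a sequence of arithmetic steps that turns 1 into the target, weighted random sampling stands in for the AlphaZero search, and a replay function stands in for the Lean checker. All names (`STEPS`, `verify`, `search`, `train`) are hypothetical.

```python
import random

# Toy "proof steps": each one transforms the current value.
STEPS = {"inc": lambda x: x + 1, "double": lambda x: x * 2}

def verify(target, proof):
    """Stand-in for a formal checker: replay the steps from 1 and compare."""
    value = 1
    for step in proof:
        value = STEPS[step](value)
    return value == target

def search(target, weights, rng, attempts=200, max_len=8):
    """Sample candidate step sequences, biased by the current step weights."""
    names = list(STEPS)
    for _ in range(attempts):
        length = rng.randint(1, max_len)
        proof = rng.choices(names, weights=[weights[n] for n in names], k=length)
        if verify(target, proof):
            return proof
    return None

def train(problems, rounds=3, seed=0):
    """Repeatedly attempt every problem; reinforce only verified proofs."""
    rng = random.Random(seed)
    weights = {name: 1.0 for name in STEPS}
    solved = {}
    for _ in range(rounds):
        for target in problems:
            proof = search(target, weights, rng)
            if proof is not None:
                solved[target] = proof
                for step in proof:
                    weights[step] += 0.1  # steps used in verified proofs become more likely
    return weights, solved
```

The point of the sketch is the feedback rule: only candidates that pass `verify` ever change the weights, so incorrect attempts cannot reinforce themselves.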

A more competitive AlphaGeometry 2

AlphaGeometry 2 is a significantly improved version of AlphaGeometry. It’s a neuro-symbolic hybrid system in which the language model was based on Gemini and trained from scratch on an order of magnitude more synthetic data than its predecessor. This helped the model tackle much more challenging geometry problems, including problems about movements of objects and equations of angles, ratios or distances.

AlphaGeometry 2 employs a symbolic engine that is two orders of magnitude faster than its predecessor. When presented with a new problem, a novel knowledge-sharing mechanism is used to enable advanced combinations of different search trees to tackle more complex problems.
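For readers wondering what a symbolic engine does in general, the following Python sketch applies a single geometric rule (if line a is parallel to b and b is parallel to c, then a is parallel to c) repeatedly until no new facts appear. It is a generic forward-chaining illustration, not AlphaGeometry 2’s engine, and it does not model the knowledge-sharing mechanism; the function name `deductive_closure` and the fact encoding are assumptions for this example.

```python
def deductive_closure(parallel_pairs):
    """Derive every parallel(a, b) fact implied by symmetry and transitivity."""
    known = set()
    for a, b in parallel_pairs:
        known |= {(a, b), (b, a)}  # parallelism is symmetric
    while True:
        # One round of the transitivity rule over all pairs of known facts.
        derived = {(a, d)
                   for (a, b) in known
                   for (c, d) in known
                   if b == c and a != d}
        if derived <= known:  # fixpoint: nothing new was deduced
            return known
        known |= derived

facts = deductive_closure([("AB", "CD"), ("CD", "EF")])
print(("AB", "EF") in facts)
```

A real engine works with many rules and far larger fact sets, which is why its speed matters so much for search.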

Before this year’s competition, AlphaGeometry 2 could solve 83% of all historical IMO geometry problems from the past 25 years, compared to the 53% rate achieved by its predecessor. For IMO 2024, AlphaGeometry 2 solved Problem 4 within 19 seconds after receiving its formalization.

Brian Wang is a Futurist Thought Leader and a popular Science blogger with 1 million readers per month. His blog Nextbigfuture.com is ranked #1 Science News Blog. It covers many disruptive technologies and trends including Space, Robotics, Artificial Intelligence, Medicine, Anti-aging Biotechnology, and Nanotechnology.

Known for identifying cutting edge technologies, he is currently a Co-Founder of a startup and fundraiser for high potential early-stage companies. He is the Head of Research for Allocations for deep technology investments and an Angel Investor at Space Angels.

A frequent speaker at corporations, he has been a TEDx speaker, a Singularity University speaker and a guest on numerous radio shows and podcasts. He is open to public speaking and advising engagements.

