AI achieves silver-medal standard solving International Mathematical Olympiad Problems

Breakthrough models AlphaProof and AlphaGeometry 2 solve advanced reasoning problems in mathematics.

This is a major advance for AI in reasoning and mathematical problem-solving.

Artificial general intelligence (AGI) with advanced mathematical reasoning has the potential to unlock new frontiers in science and technology.

We’ve made great progress building AI systems that help mathematicians discover new insights, novel algorithms and answers to open problems. But current AI systems still struggle with solving general math problems because of limitations in reasoning skills and training data.

DeepMind has introduced AlphaProof, a new reinforcement-learning-based system for formal math reasoning, and AlphaGeometry 2, an improved version of its geometry-solving system. Together, these systems solved four out of six problems from this year’s International Mathematical Olympiad (IMO), achieving the same level as a silver medalist in the competition for the first time.

The annual IMO competition has become widely recognized as a grand challenge in machine learning and an aspirational benchmark for measuring an AI system’s advanced mathematical reasoning capabilities.

This year, DeepMind applied its combined AI system to the competition problems, which were provided by the IMO organizers. The solutions were scored according to the IMO’s point-awarding rules by prominent mathematicians Prof Sir Timothy Gowers, an IMO gold medalist and Fields Medal winner, and Dr Joseph Myers, a two-time IMO gold medalist and Chair of the IMO 2024 Problem Selection Committee.

“The fact that the program can come up with a non-obvious construction like this is very impressive, and well beyond what I thought was state of the art.”

Prof Sir Timothy Gowers

AlphaProof: a formal approach to reasoning

AlphaProof is a system that trains itself to prove mathematical statements in the formal language Lean. It couples a pre-trained language model with the AlphaZero reinforcement learning algorithm, which previously taught itself how to master the games of chess, shogi and Go.

Formal languages offer the critical advantage that proofs involving mathematical reasoning can be formally verified for correctness. Their use in machine learning has, however, previously been constrained by the very limited amount of human-written data available.
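
As a small illustration of what formal verification means in practice, here is a minimal Lean 4 sketch. The theorem name is made up for this example and is unrelated to AlphaProof's data. The point is only that Lean accepts the file if, and only if, the proof term really establishes the stated claim.

```lean
-- Hypothetical example: addition of natural numbers is commutative.
-- Lean checks that `Nat.add_comm a b` has exactly the stated type,
-- so an incorrect proof would be rejected at compile time.
theorem sum_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```

If the right-hand side were changed to `b + a + 1`, the same proof term would no longer type-check, which is the kind of guarantee a natural-language argument cannot give.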

In contrast, natural-language-based approaches can hallucinate plausible but incorrect intermediate reasoning steps and solutions, despite having access to orders of magnitude more data. DeepMind established a bridge between these two complementary spheres by fine-tuning a Gemini model to automatically translate natural language problem statements into formal statements, creating a large library of formal problems of varying difficulty.
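
To make the translation step concrete, here is a hedged sketch of what a formalized problem might look like. The informal problem, the theorem name, and the way evenness is encoded are all assumptions chosen for illustration, not output from the fine-tuned model. The informal statement is "the sum of two even natural numbers is even."

```lean
-- Hypothetical formalization of: "the sum of two even natural numbers is even".
-- Evenness is spelled out as "equal to 2 * k for some k" to avoid extra imports.
-- `sorry` marks the proof as still missing: only the statement has been
-- formalized, and finding the proof would be the solver's job.
theorem even_add_even_example (a b : Nat)
    (ha : ∃ k, a = 2 * k) (hb : ∃ k, b = 2 * k) :
    ∃ k, a + b = 2 * k := by
  sorry
```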

When presented with a problem, AlphaProof generates solution candidates and then proves or disproves them by searching over possible proof steps in Lean. Each proof that was found and verified is used to reinforce AlphaProof’s language model, enhancing its ability to solve subsequent, more challenging problems.
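
The search, verify and reinforce cycle described above can be sketched with a toy example. This is not AlphaProof's algorithm: the "proof steps" here are two made-up integer operations, the search is a plain breadth-first search rather than AlphaZero-style search, and the "model" is just a table of step weights. All names are hypothetical.

```python
from collections import deque

# Hypothetical toy "proof steps": each maps one integer state to another.
STEPS = {"add1": lambda n: n + 1, "double": lambda n: n * 2}


def verify(start, target, proof):
    """Replay the steps; accept the proof only if it really reaches the target."""
    state = start
    for name in proof:
        state = STEPS[name](state)
    return state == target


def search(start, target, weights, max_depth=12):
    """Breadth-first search over step sequences, trying higher-weighted steps first."""
    order = sorted(STEPS, key=lambda name: -weights[name])
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, proof = queue.popleft()
        if state == target:
            return proof
        if len(proof) >= max_depth:
            continue
        for name in order:
            nxt = STEPS[name](state)
            # Both steps only increase positive numbers, so overshooting is a dead end.
            if nxt <= target and nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, proof + [name]))
    return None


def train(problems):
    """Solve problems in order; every verified proof reinforces the steps it used."""
    weights = {name: 1.0 for name in STEPS}
    solved = {}
    for start, target in problems:
        proof = search(start, target, weights)
        if proof is not None and verify(start, target, proof):
            solved[(start, target)] = proof
            for name in proof:
                weights[name] += 1.0
    return weights, solved


weights, solved = train([(1, 4), (1, 10), (3, 24)])
for problem, proof in solved.items():
    print(problem, proof)
```

The design point carried over from the text is that only proofs passing the independent `verify` check are allowed to update the weights, so the learner is never reinforced on an unverified answer.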

DeepMind trained AlphaProof for the IMO by having it prove or disprove millions of problems, covering a wide range of difficulties and mathematical topic areas, over a period of weeks leading up to the competition. The training loop was also applied during the contest, reinforcing proofs of self-generated variations of the contest problems until a full solution could be found.

Figure: AlphaProof’s reinforcement learning training loop. Around one million informal math problems are translated into a formal math language by a formalizer network. Then a solver network searches for proofs or disproofs of the problems, progressively training itself via the AlphaZero algorithm to solve more challenging problems.

A more competitive AlphaGeometry 2

AlphaGeometry 2 is a significantly improved version of AlphaGeometry. It’s a neuro-symbolic hybrid system in which the language model was based on Gemini and trained from scratch on an order of magnitude more synthetic data than its predecessor. This helped the model tackle much more challenging geometry problems, including problems about movements of objects and equations of angles, ratios or distances.

AlphaGeometry 2 employs a symbolic engine that is two orders of magnitude faster than its predecessor. When the system is presented with a new problem, a novel knowledge-sharing mechanism enables advanced combinations of different search trees to tackle more complex problems.
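
The article does not describe the engine's internals. As a rough illustration of what a symbolic deduction engine does in general, the sketch below applies a few hand-written geometric rules repeatedly until no new fact appears. The fact encoding, the rule set and the function name are assumptions made for this example and are far simpler than anything AlphaGeometry 2 uses.

```python
from itertools import permutations


def deduction_closure(facts):
    """Apply the rules below until a fixed point is reached.

    Facts are tuples (relation, line_a, line_b), where relation is
    "para" (parallel) or "perp" (perpendicular).
    """
    known = set(facts)
    changed = True
    while changed:
        changed = False
        new = set()
        # Both relations are symmetric: a ~ b implies b ~ a.
        for rel, a, b in known:
            new.add((rel, b, a))
        for (r1, a, b), (r2, c, d) in permutations(known, 2):
            # Transitivity of parallelism: a ∥ b and b ∥ d imply a ∥ d.
            if r1 == "para" and r2 == "para" and b == c and a != d:
                new.add(("para", a, d))
            # A line parallel to b is perpendicular to whatever b is perpendicular to.
            if r1 == "para" and r2 == "perp" and b == c:
                new.add(("perp", a, d))
        if not new <= known:
            known |= new
            changed = True
    return known


closure = deduction_closure({
    ("para", "l1", "l2"),
    ("para", "l2", "l3"),
    ("perp", "l3", "l4"),
})
print(("perp", "l1", "l4") in closure)
```

In a neuro-symbolic setup of the kind described, an engine like this would exhaust the purely mechanical deductions, while the language model's role would be to propose new constructions when the closure does not yet contain the goal.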

Before this year’s competition, AlphaGeometry 2 could solve 83% of all historical IMO geometry problems from the past 25 years, compared to the 53% rate achieved by its predecessor. For IMO 2024, AlphaGeometry 2 solved Problem 4 within 19 seconds after receiving its formalization.

Brian Wang is a Futurist Thought Leader and a popular Science blogger with 1 million readers per month. His blog Nextbigfuture.com is ranked #1 Science News Blog. It covers many disruptive technologies and trends including Space, Robotics, Artificial Intelligence, Medicine, Anti-aging Biotechnology, and Nanotechnology.

