AI achieves silver-medal standard solving International Mathematical Olympiad Problems

Breakthrough models AlphaProof and AlphaGeometry 2 solve advanced reasoning problems in mathematics.

This is a major advance in AI reasoning and mathematical problem solving.

Artificial general intelligence (AGI) with advanced mathematical reasoning has the potential to unlock new frontiers in science and technology.

DeepMind has made great progress building AI systems that help mathematicians discover new insights, novel algorithms and answers to open problems. But current AI systems still struggle with solving general math problems because of limitations in reasoning skills and training data.

DeepMind presented AlphaProof, a new reinforcement-learning-based system for formal math reasoning, and AlphaGeometry 2, an improved version of its geometry-solving system. Together, these systems solved four out of six problems from this year’s International Mathematical Olympiad (IMO), achieving the same level as a silver medalist in the competition for the first time.

The annual International Mathematical Olympiad (IMO) has become widely recognized as a grand challenge in machine learning and an aspirational benchmark for measuring an AI system’s advanced mathematical reasoning capabilities.

This year, DeepMind applied its combined AI system to the competition problems, provided by the IMO organizers. The solutions were scored according to the IMO’s point-awarding rules by prominent mathematicians Prof Sir Timothy Gowers, an IMO gold medalist and Fields Medal winner, and Dr Joseph Myers, a two-time IMO gold medalist and Chair of the IMO 2024 Problem Selection Committee.

“The fact that the program can come up with a non-obvious construction like this is very impressive, and well beyond what I thought was state of the art.”

Prof Sir Timothy Gowers

AlphaProof: a formal approach to reasoning

AlphaProof is a system that trains itself to prove mathematical statements in the formal language Lean. It couples a pre-trained language model with the AlphaZero reinforcement learning algorithm, which previously taught itself how to master the games of chess, shogi and Go.

Formal languages offer the critical advantage that proofs involving mathematical reasoning can be formally verified for correctness. Their use in machine learning has, however, previously been constrained by the very limited amount of human-written data available.
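To give a feel for what "formally verified" means, here is a minimal sketch in Lean 4 using only the core library. The theorem names are made up for illustration and have nothing to do with the IMO problems; the point is only that Lean's checker accepts a proof when every step is justified and rejects it otherwise.

```lean
-- Lean 4, core library only. Theorem names are hypothetical.

-- Accepted: both sides compute to the same number.
theorem two_add_two : 2 + 2 = 4 := rfl

-- Accepted: justified by an existing core lemma.
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b

-- A false claim such as `2 + 2 = 5 := rfl` would be rejected by the checker.
```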

In contrast, natural-language-based approaches can hallucinate plausible but incorrect intermediate reasoning steps and solutions, despite having access to orders of magnitude more data. DeepMind established a bridge between these two complementary spheres by fine-tuning a Gemini model to automatically translate natural language problem statements into formal statements, creating a large library of formal problems of varying difficulty.
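As a rough illustration of what translating a natural language problem into a formal statement can look like, the Lean 4 sketch below formalizes a simple textbook claim. The example problem and the theorem name are assumptions chosen for illustration, not output from the fine-tuned model. The proof is deliberately left as `sorry`, Lean's placeholder for a missing proof, since a formalized problem is only a statement until a prover supplies the proof.

```lean
-- Informal problem: "Show that the sum of two even natural numbers is even."
-- One possible formal statement (Lean 4, core library only).
theorem sum_of_evens_is_even (a b : Nat)
    (ha : ∃ k, a = 2 * k) (hb : ∃ m, b = 2 * m) :
    ∃ n, a + b = 2 * n := by
  sorry -- left open on purpose; finding this proof is the prover's job
```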

When presented with a problem, AlphaProof generates solution candidates and then proves or disproves them by searching over possible proof steps in Lean. Each proof that was found and verified is used to reinforce AlphaProof’s language model, enhancing its ability to solve subsequent, more challenging problems.
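The search, verify and reinforce cycle described above can be sketched on a toy puzzle. This is not AlphaProof, AlphaZero or Lean: the "proof steps" are three hypothetical integer operations, the "checker" simply replays them, and a table of step weights stands in for the learned language model. It only shows the shape of the loop, in which verified solutions feed back into how later searches are ordered.

```python
from collections import deque

# Hypothetical toy "proof steps": each maps one integer state to another.
STEPS = {
    "add1": lambda n: n + 1,
    "double": lambda n: n * 2,
    "square": lambda n: n * n,
}

def verify(target, proof, start=1):
    """Replay the steps from `start`; plays the role of the formal checker."""
    state = start
    for name in proof:
        state = STEPS[name](state)
    return state == target

def search(target, weights, max_depth=6, start=1):
    """Breadth-first search over step sequences, trying higher-weighted steps first."""
    order = sorted(STEPS, key=lambda name: -weights[name])
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, proof = queue.popleft()
        if state == target:
            return proof
        if len(proof) == max_depth:
            continue
        for name in order:
            nxt = STEPS[name](state)
            # Steps never decrease a positive state, so overshooting is a dead end.
            if nxt <= target and nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, proof + [name]))
    return None

def train(targets):
    """Solve targets in order; each verified proof raises the weights of its steps."""
    weights = {name: 1.0 for name in STEPS}
    solved = {}
    for target in targets:
        proof = search(target, weights)
        if proof is not None and verify(target, proof):
            solved[target] = proof
            for name in proof:
                weights[name] += 1.0
    return weights, solved

weights, solved = train([2, 4, 16, 17])
for target, proof in solved.items():
    print(target, proof)
```

Only proofs that pass `verify` change the weights, which mirrors the idea that unverified candidates are never used for reinforcement.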

DeepMind trained AlphaProof for the IMO by proving or disproving millions of problems, covering a wide range of difficulties and mathematical topic areas, over a period of weeks leading up to the competition. The training loop was also applied during the contest, reinforcing proofs of self-generated variations of the contest problems until a full solution could be found.

Figure: Process infographic of AlphaProof’s reinforcement learning training loop. Around one million informal math problems are translated into a formal math language by a formalizer network. Then a solver network searches for proofs or disproofs of the problems, progressively training itself via the AlphaZero algorithm to solve more challenging problems.

A more competitive AlphaGeometry 2

AlphaGeometry 2 is a significantly improved version of AlphaGeometry. It’s a neuro-symbolic hybrid system in which the language model was based on Gemini and trained from scratch on an order of magnitude more synthetic data than its predecessor. This helped the model tackle much more challenging geometry problems, including problems about movements of objects and equations of angles, ratios or distances.

AlphaGeometry 2 employs a symbolic engine that is two orders of magnitude faster than its predecessor. When the system is presented with a new problem, a novel knowledge-sharing mechanism enables advanced combinations of different search trees to tackle more complex problems.
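The article does not describe the internals of the symbolic engine. As a generic illustration of what a symbolic deduction engine does, the sketch below runs simple forward chaining: it keeps applying rules to known facts until nothing new can be derived. The facts, rules and function name are hypothetical and written as plain strings; a real geometry engine works on structured geometric objects and is far more elaborate.

```python
def forward_chain(facts, rules):
    """Apply rules until no new fact appears; return the set of derived facts."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and all(p in known for p in premises):
                known.add(conclusion)
                changed = True
    return known

# Hypothetical rules: (premises, conclusion), all written as plain strings.
RULES = [
    # Base angles of an isosceles triangle are equal.
    (("AB = AC",), "angle ABC = angle ACB"),
    # Transitivity of equality, spelled out for this one case.
    (("angle ABC = angle ACB", "angle ACB = angle DEF"), "angle ABC = angle DEF"),
]

goal = "angle ABC = angle DEF"
closure = forward_chain({"AB = AC", "angle ACB = angle DEF"}, RULES)
print(goal in closure)
```

When the goal is not in the derived set, a neuro-symbolic system needs something beyond deduction, such as a model proposing an extra construction, before the engine can make further progress.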

Before this year’s competition, AlphaGeometry 2 could solve 83% of all historical IMO geometry problems from the past 25 years, compared to the 53% rate achieved by its predecessor. For IMO 2024, AlphaGeometry 2 solved Problem 4 within 19 seconds after receiving its formalization.

Brian Wang is a Futurist Thought Leader and a popular Science blogger with 1 million readers per month. His blog Nextbigfuture.com is ranked #1 Science News Blog. It covers many disruptive technologies and trends including Space, Robotics, Artificial Intelligence, Medicine, Anti-aging Biotechnology, and Nanotechnology.

Known for identifying cutting edge technologies, he is currently a Co-Founder of a startup and fundraiser for high potential early-stage companies. He is the Head of Research for Allocations for deep technology investments and an Angel Investor at Space Angels.

A frequent speaker at corporations, he has been a TEDx speaker, a Singularity University speaker and guest at numerous interviews for radio and podcasts. He is open to public speaking and advising engagements.

