DeepMind AI gets silver medal at International Mathematical Olympiad

DeepMind’s AlphaProof AI can tackle a range of mathematical problems
Google DeepMind

An AI from Google DeepMind has achieved a silver medal score at this year’s International Mathematical Olympiad (IMO), the first time any AI has made it to the podium.

The IMO is considered the world’s most prestigious competition for young mathematicians. Correctly answering its test questions requires mathematical ability that AI systems typically lack.

In January, Google DeepMind demonstrated AlphaGeometry, an AI system that could answer some IMO geometry questions as well as humans. However, this was not from a live competition, and it couldn’t answer questions from other mathematical disciplines, such as number theory, algebra and combinatorics, all of which are needed to win an IMO medal.

Google DeepMind has now released a new AI, called AlphaProof, which can solve a wider range of mathematical problems, and an improved version of AlphaGeometry, which can solve more geometry questions.

When the team tested both systems together on this year’s IMO questions, they answered four out of six questions correctly, giving them a score of 28 out of a possible 42 points. This was enough for a silver medal, and just one point below this year’s gold medal threshold.

At the contest in Bath, UK, last week, 58 entrants won a gold medal and 123 won a silver medal.

“We are all very much aware that AI will eventually be better than humans at solving most mathematical problems, but the rate at which AI is improving is breathtaking,” says Gregor Dolinar, the IMO president. “Missing the gold medal at IMO 2024 by just one point a few days ago is truly impressive.”

At a press conference, Timothy Gowers at the University of Cambridge, who helped mark AlphaProof’s answers, said the AI’s performance was surprising and it appeared to find “magic keys” to answer problems in a similar way to humans. “I thought that these magic keys would probably be a little bit beyond what it could do, so it came as quite a surprise in one or two instances when the program had indeed found these keys,” said Gowers.

AlphaProof works similarly to Google DeepMind’s previous AIs that can beat the best humans at chess and Go. All of these AIs rely on a trial-and-error approach called reinforcement learning, where the system finds its own way to solve a problem over many attempts. However, this method requires a large set of problems written in a language that the AI can understand and verify, whereas most IMO-like problems are written in English.

To get around this, Thomas Hubert at DeepMind and his colleagues used Google’s Gemini AI, a language model like the one that powers ChatGPT, to translate these problems into a programming language called Lean so that the AI could learn how to solve them.

“At the beginning, it will be able to solve perhaps the simplest problems, and learn from solving those simpler problems to attack harder and harder problems,” Hubert said at the press conference. It also produces its answers in Lean, so they can be instantly verified as correct.
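To give a sense of what a machine-checkable statement looks like, here is a minimal Lean 4 sketch. It is purely illustrative and is not taken from AlphaProof’s training problems or output. The theorem name `add_comm_sketch` is hypothetical, while `Nat.add_comm` is a lemma that ships with Lean’s core library. The second line checks the arithmetic behind the score reported above, assuming each of the four solved questions earned seven points.

```lean
-- Illustrative only: a toy statement, far simpler than any IMO problem.
-- `add_comm_sketch` is a made-up name; `Nat.add_comm` is part of Lean 4 core.
theorem add_comm_sketch (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b

-- A concrete arithmetic fact, checked by direct computation:
-- four questions at seven points each give 28 points.
example : 4 * 7 = 28 := rfl
```

If Lean accepts the file without errors, both proofs have been verified by its checker, which is the sense in which answers written in Lean can be confirmed automatically.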

While AlphaProof’s performance is impressive, it works slowly, taking up to three days to find some solutions, whereas human competitors are allowed 4.5 hours for each set of three questions. It also failed to answer both questions on combinatorics, the study of counting and arranging objects. “We are still working to understand why this is, which will hopefully lead us to improve the system,” says Alex Davies at Google DeepMind.

It is also not clear how AlphaProof arrives at its answers or whether it uses the same kind of mathematical intuitions that humans do, said Gowers, but its ability to translate proofs from Lean into English makes it easy to check they are correct.

The result is impressive and a significant milestone, says Geordie Williamson at the University of Sydney, Australia. “There have been many previous attempts to do reinforcement learning on formal proofs and none have had much success.”

While a system like AlphaProof could be useful for working mathematicians in helping develop proofs, it obviously can’t help with identifying problems to solve and work on, which takes up a large portion of researchers’ time, says Yang-Hui He at the London Institute for Mathematical Sciences.

Hubert said his team hopes that AlphaProof will be able to help improve Google’s large language models, like Gemini, by reducing incorrect responses.

The trading company XTX Markets has offered a $5 million prize – called the AI Mathematical Olympiad – for an AI capable of achieving a gold medal at the IMO, but AlphaProof is not eligible because it is not publicly available. “We hope that DeepMind’s advances will inspire more teams to enter the AIMO Prize, and would of course welcome a public entry from DeepMind themselves,” says Alex Gerko at XTX Markets.


