Machines Learn Better if We Teach Them the Basics

Imagine that your neighbor calls to ask a favor: Could you please feed their pet rabbit some carrot slices? Easy enough, you’d think. You can imagine their kitchen, even if you’ve never been there — carrots in a fridge, a drawer holding various knives. It’s abstract knowledge: You don’t know what your neighbor’s carrots and knives look like exactly, but you won’t take a spoon to a cucumber.

Artificial intelligence programs can’t compete. What seems to you like an easy task is a huge undertaking for current algorithms.

An AI-trained robot can find a specified knife and carrot hiding in a familiar kitchen, but in a different kitchen it will lack the abstract skills to succeed. “They don’t generalize to new environments,” said Victor Zhong, a graduate student in computer science at the University of Washington. The machine fails because there’s simply too much to learn, and too vast a space to explore.

The problem is that these robots — and AI agents in general — don’t have a foundation of concepts to build on. They don’t know what a knife or a carrot really is, much less how to open a drawer, choose one and cut slices. This limitation is due in part to the fact that many advanced AI systems get trained with a method called reinforcement learning that’s essentially self-education through trial and error. AI agents trained with reinforcement learning can execute the job they were trained to do very well, in the environment they were trained to do it in. But change the job or the environment, and these systems will often fail.

To get around this limitation, computer scientists have begun to teach machines important concepts before setting them loose. It’s like reading a manual before using new software: You could try to explore without it, but you’ll learn far faster with it. “Humans learn through a combination of both doing and reading,” said Karthik Narasimhan, a computer scientist at Princeton University. “We want machines to do the same.”

New work from Zhong and others shows that priming a learning model in this way can supercharge learning in simulated environments, both online and in the real world with robots. And it doesn’t just make algorithms learn faster — it guides them toward skills they’d otherwise never learn. Researchers want these agents to become generalists, capable of learning anything from chess to shopping to cleaning. And as demonstrations become more practical, scientists think this approach might even change how humans can interact with robots.

“It’s been a pretty big breakthrough,” said Brian Ichter, a research scientist in robotics at Google. “It’s pretty unimaginable how far it’s come in a year and a half.”

Sparse Rewards

At first glance, machine learning has already been remarkably successful. Most models typically use reinforcement learning, where algorithms learn by getting rewards. They begin totally ignorant, but trial and error eventually becomes trial and triumph. Reinforcement learning agents can easily master simple games.

Consider the video game Snake, where players control a snake that grows longer as it eats digital apples. You want your snake to eat the most apples, stay within the boundaries and avoid running into its increasingly bulky body. Such clear right and wrong outcomes give a well-rewarded machine agent positive feedback, so enough attempts can take it from “noob” to High Score.

But suppose the rules change. Perhaps the same agent must play on a larger grid and in three dimensions. While a human player could adapt quickly, the machine can’t, because of two critical weaknesses. First, the larger space means it takes longer for the snake to stumble upon apples, and learning slows exponentially when rewards become sparse. Second, the new dimension provides a totally new experience, and reinforcement learning struggles to generalize to new challenges.
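The first weakness is easy to see in a toy setting. The sketch below is not taken from any of the research described here; it assumes a hypothetical agent that wanders a square board at random, with a single apple in the far corner, and counts how many moves pass before the first reward. The function names and the board layout are invented for illustration, and the growth it shows is not meant to match any particular rate.

```python
import random

def steps_to_first_apple(grid_size, rng, max_steps=1_000_000):
    """Random walk on a grid_size x grid_size board until the apple is found."""
    # Assumed layout: the agent starts in one corner, the apple sits in the opposite one.
    x, y = 0, 0
    apple = (grid_size - 1, grid_size - 1)
    moves = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    for step in range(1, max_steps + 1):
        dx, dy = rng.choice(moves)
        # Clamp the position so the agent stays within the boundaries.
        x = min(max(x + dx, 0), grid_size - 1)
        y = min(max(y + dy, 0), grid_size - 1)
        if (x, y) == apple:
            return step  # the first, and only, reward signal
    return max_steps

def mean_steps(grid_size, trials=200, seed=0):
    """Average number of moves before the first reward, over many attempts."""
    rng = random.Random(seed)
    total = sum(steps_to_first_apple(grid_size, rng) for _ in range(trials))
    return total / trials

if __name__ == "__main__":
    for size in (4, 8, 16):
        print(size, round(mean_steps(size)))
```

Each time the board doubles in width, the untrained agent waits far longer for any feedback at all, which is the sense in which rewards become sparse.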

Zhong says we don’t need to accept these obstacles. “Why is it that when we want to play chess” — another game that reinforcement learning has mastered — “we train a reinforcement learning agent from scratch?” Such approaches are inefficient. The agent wanders around aimlessly until it stumbles upon a good situation, such as a checkmate, and Zhong says it requires careful human design to get the agent to know what it means for a situation to be good. “Why do we have to do this when we already have so many books on how to play chess?”

Partly, it’s because machines have struggled to understand human language and decipher images in the first place. For a robot to complete vision-based tasks like finding and slicing carrots, for example, it must know what a carrot is — the image of a thing must be “grounded” in a more fundamental understanding of what that thing is. Until recently, there was no good way of doing that, but a boom in the speed and scale of language and image processing has made the new successes possible.

New natural language processing models allow machines to essentially learn the meaning behind words and sentences — to ground them in things in the world — rather than just store a simple (and limited) meaning like a digital dictionary.

Computer vision has seen a similar digital explosion. Around 2009, ImageNet debuted as a database of annotated images for computer vision research. Today it hosts over 14 million images of objects and places. And programs like OpenAI’s DALL·E generate new images upon command that look human-made, despite having no exact comparison to draw from.

It shows how machines only now have access to enough online data to really learn about the world, according to Anima Anandkumar, a computer scientist at the California Institute of Technology and Nvidia. And it’s a sign that they can learn from concepts as we do and use them for generation. “We are in such a great moment now,” she said. “Because once we can get generation, there is so much more we can do.”

Gaming the System

Researchers like Zhong decided machines didn’t have to embark on their explorations wholly uninformed anymore. Armed with sophisticated language models, the researchers could add a pre-training step where a program learned from online information before its trials and errors.

To test the idea, he and his colleagues compared the pre-training to traditional reinforcement learning in five different game-like settings where machine agents interpreted language commands to solve problems. Each simulated environment challenged the machine agent uniquely. One asked the agent to manipulate items in a 3D kitchen; another required reading text to learn a precise sequence of actions to fight monsters. But the most complicated setting was a real game, the 35-year-old NetHack, where the goal is to navigate a sophisticated dungeon to retrieve an amulet.

For the simple settings, automated pre-training meant simply grounding the important concepts: This is a carrot, that is a monster. For NetHack, the agent trained by watching humans play, using playthroughs uploaded to the internet by human players. These playthroughs didn’t even have to be that good — the agent only needed to build intuition for how humans behave. The agent wasn’t meant to become an expert, just a regular player. It would build intuition by watching — what would a human do in a given scenario? The agent would decide what moves were successful, formulating its own carrot and stick.

“Through pre-training, we form good priors for how to associate language descriptions with things that are happening in the world,” Zhong said. The agent would play better from the start and learn more quickly during subsequent reinforcement learning.
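One way to picture a prior built from mediocre playthroughs is the minimal sketch below. It is an illustration under stated assumptions, not the method Zhong's team used: the "game" is a hypothetical one-dimensional corridor with the goal at the far end, the "human players" are simulated and step toward the goal only 70 percent of the time, and every function name is invented. The sketch counts what those players did in each position, then compares an agent that explores with that learned habit against one that starts uninformed. It stops short of the reinforcement learning step that would follow.

```python
import random

RIGHT, LEFT = +1, -1

def noisy_demonstrations(length, n_games, skill, rng):
    """Hypothetical playthroughs: players move toward the goal with
    probability `skill` and wander backward otherwise."""
    games = []
    for _ in range(n_games):
        state, trace = 0, []
        while state < length - 1:
            action = RIGHT if rng.random() < skill else LEFT
            trace.append((state, action))
            state = max(state + action, 0)  # a wall sits at position 0
        games.append(trace)
    return games

def pretrain_prior(games, length):
    """Estimate P(move right | position) by counting what the players did."""
    right = [1] * length  # add-one smoothing so no action gets probability 0
    total = [2] * length
    for trace in games:
        for state, action in trace:
            total[state] += 1
            if action == RIGHT:
                right[state] += 1
    return [r / t for r, t in zip(right, total)]

def steps_to_goal(p_right, length, rng, max_steps=100_000):
    """Explore the corridor, sampling each move from the given prior."""
    state = 0
    for step in range(1, max_steps + 1):
        action = RIGHT if rng.random() < p_right[state] else LEFT
        state = max(state + action, 0)
        if state == length - 1:
            return step
    return max_steps

def average_steps(p_right, length, trials=200, seed=1):
    rng = random.Random(seed)
    return sum(steps_to_goal(p_right, length, rng) for _ in range(trials)) / trials

LENGTH = 20
demos = noisy_demonstrations(LENGTH, n_games=50, skill=0.7, rng=random.Random(0))
prior = pretrain_prior(demos, LENGTH)
uninformed = [0.5] * LENGTH  # an agent that begins with no preference

if __name__ == "__main__":
    print("with prior:   ", round(average_steps(prior, LENGTH)))
    print("without prior:", round(average_steps(uninformed, LENGTH)))
```

The demonstrations here are far from expert, yet the agent that imitates them should reach the goal in noticeably fewer moves, so any later trial-and-error learning would start from a better place.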

As a result, the pre-trained agent did outperform the traditionally trained one. “We get gains across the board in all five of these environments,” Zhong said. Simpler settings only showed a slight edge, but in NetHack’s complicated dungeons, the agent learned many times faster and reached a skill level that the classic approach couldn’t. “You might be getting a 10x performance because if you don’t do this, then you just don’t learn a good policy,” he said.

“These generalist agents are a big leap from what standard reinforcement learning does,” Anandkumar said.

Her team also pre-trains agents to get them to learn more quickly, achieving significant progress on the world’s bestselling video game, Minecraft. It’s known as a “sandbox” game, meaning it gives players a virtually infinite space in which to interact and create new worlds. It’s futile to program a reward function for thousands of tasks individually, so instead the team’s model (“MineDojo”) built its understanding of the game by watching captioned playthrough videos. No need to codify good behavior.

“We are getting automated reward functions,” Anandkumar said. “This is the first benchmark with thousands of tasks and the ability to do reinforcement learning with open-ended tasks specified through text prompts.”
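The idea of a reward that comes from a text prompt, rather than from hand-written rules, can be sketched in a few lines. What follows is a toy stand-in and not MineDojo's implementation: a real system would rely on a model learned from captioned video, whereas this sketch assumes each game observation arrives with a short caption and simply compares word counts. The names `embed` and `text_reward` and the example captions are hypothetical.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy embedding: a bag-of-words count vector.
    Assumption: this replaces the learned model a real system would use."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def text_reward(task_prompt, observation_caption):
    """Score how closely what just happened matches the task prompt."""
    return cosine(embed(task_prompt), embed(observation_caption))

if __name__ == "__main__":
    task = "milk a cow"
    print(text_reward(task, "the player uses a bucket to milk a cow"))
    print(text_reward(task, "the player chops down a tree"))
```

Because the score is computed from the prompt itself, the same function covers any task that can be written as a sentence, and nobody has to codify good behavior one task at a time.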

Beyond Games

Games were a great way to show that pre-training models could work, but they’re still simplified worlds. Training robots to handle the real world, where the possibilities are practically endless, is much harder. “We asked the question: Is there something in between?” Narasimhan said. So he decided to do some online shopping.

His team created WebShop. “It’s basically like a shopping butler,” Narasimhan said. Users can say something like “Give me a Nike shoe that’s white and under $100, and I want the reviews to state that they’re very comfortable for toddlers,” and the program finds and buys the shoe.

As with Zhong’s and Anandkumar’s games, WebShop developed an intuition by training with images and text, this time from Amazon pages. “Over time, it learns to understand the language and map it to actions it has to take on the website.”

At first glance, a shopping butler may not seem that futuristic. But while a cutting-edge chatbot can link you to a desired sneaker, interactions like placing the order require a wholly different skill set. And even though your bedside Alexa or Google Home speakers can place orders, they rely on proprietary software that carries out preordained tasks. WebShop navigates the web the way people do: by reading, typing and clicking.

“It’s a step closer toward general intelligence,” Narasimhan said.
