The Gradient

How to Train your Decision-Making AIs

How do humans transfer their knowledge and skills to artificial decision-making agents more efficiently? What kind of knowledge and skills should humans provide and in what format?

Dec 11, 2021
https://thegradient.pub/content/images/2021/12/main.png

The combination of deep learning and decision learning has led to several impressive success stories in decision-making AI research, including AIs that can play a variety of games (Atari video games, board games, and the complex real-time strategy game StarCraft II), control robots (in simulation and in the real world), and even fly a weather balloon. These are examples of sequential decision tasks, in which the AI agent needs to make a sequence of decisions to achieve its goal.

Today, the two main approaches for training such agents are reinforcement learning (RL) and imitation learning (IL). In reinforcement learning, humans provide rewards for completing discrete tasks, with the rewards typically being delayed and sparse. In imitation learning, humans instead demonstrate the desired behavior, and the agent learns to mimic it. However, success stories about RL and IL often depend on being able to train AIs in simulated environments with a large amount of training data.
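
To make the difference between the two kinds of human input concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration and does not come from the article: the six-state corridor, the names `step`, `rollout`, and `clone_behavior`, and the use of a simple majority vote as a stand-in for imitation learning. In the RL-style episode, the agent sees a reward only on the final step. In the IL-style setup, the human supplies a (state, action) pair for every step.

```python
from collections import Counter, defaultdict

GOAL = 5  # hypothetical corridor with states 0..5; the goal is the right end


def step(state, action):
    """Move left (-1) or right (+1). Reward is 1.0 only on reaching the goal."""
    next_state = max(0, min(GOAL, state + action))
    done = next_state == GOAL
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def rollout(policy, max_steps=20):
    """Run one episode from state 0 and return the rewards the agent observed."""
    state, rewards = 0, []
    for _ in range(max_steps):
        state, reward, done = step(state, policy(state))
        rewards.append(reward)
        if done:
            break
    return rewards


def clone_behavior(demonstrations):
    """Toy imitation: pick the most frequently demonstrated action per state."""
    votes = defaultdict(Counter)
    for state, action in demonstrations:
        votes[state][action] += 1
    table = {s: counts.most_common(1)[0][0] for s, counts in votes.items()}
    # Arbitrary fallback (move left) for states the human never demonstrated.
    return lambda state: table.get(state, -1)


# RL-style feedback: a single delayed reward at the end of the episode.
rl_rewards = rollout(lambda state: 1)

# IL-style feedback: the human shows which action to take in each state.
demonstrations = [(state, 1) for state in range(GOAL)]
cloned_policy = clone_behavior(demonstrations)
il_rewards = rollout(cloned_policy)

print(rl_rewards)  # [0.0, 0.0, 0.0, 0.0, 1.0]
print(il_rewards)  # [0.0, 0.0, 0.0, 0.0, 1.0]
```

Both episodes end with the same reward sequence, but the information the human provided differs: one number at the end of the episode in the RL case, versus one labeled action per step in the IL case.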

What if we don’t have a simulator for the learning agent to fool around in? What if these agents need to learn quickly and safely? What if the agents need to adapt to individual human needs? These concerns lead to our key questions: How can humans transfer their knowledge and skills to artificial decision-making agents more efficiently? What kind of knowledge and skills should humans provide, and in what format?
