Debbie Roth

This technique is called “inverse reinforcement learning” because these systems first try to learn the reward function they believe the skilled expert is optimizing for (i.e., their “intent”), and then these systems learn by trial and error, rewarding and punishing themselves using this inferred reward function. An inverse reinforcement learning algorithm starts from an observed behavior and produces its own reward function; whereas in standard reinforcement learning the reward function is hard-coded and not learned.
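The two phases described in the passage can be illustrated with a toy sketch. This is not the book's method or any particular library's API: it assumes a hypothetical five-state chain world, a single hand-written expert demonstration, and a simple feature-matching update (the reward weights are nudged toward states the expert visits more often than the learner's current policy does). Every name here (`infer_reward`, `q_learning`, `discounted_visits`, and so on) is invented for illustration.

```python
import random

N_STATES = 5            # hypothetical chain of states 0..4
ACTIONS = (-1, +1)      # move left or right
GAMMA = 0.9             # discount factor (an assumed value)


def step(state, action):
    """Deterministic move along the chain, clipped at both ends."""
    return min(max(state + action, 0), N_STATES - 1)


def discounted_visits(trajectory):
    """Discounted count of how often each state is visited."""
    counts = [0.0] * N_STATES
    for t, state in enumerate(trajectory):
        counts[state] += GAMMA ** t
    return counts


def q_learning(reward, episodes=500, horizon=20, seed=0):
    """Trial-and-error learning against a given per-state reward."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = rng.randrange(N_STATES)
        for _ in range(horizon):
            a = rng.randrange(len(ACTIONS))   # explore uniformly at random
            nxt = step(state, ACTIONS[a])
            # A learning rate of 1 is fine here because the world is deterministic.
            q[state][a] = reward[nxt] + GAMMA * max(q[nxt])
            state = nxt
    return q


def greedy_policy(q):
    """Best action per state; ties go to the first action (left)."""
    return [ACTIONS[row.index(max(row))] for row in q]


def rollout(policy, start=0, horizon=8):
    """Follow a policy from a start state and record the visited states."""
    trajectory = [start]
    for _ in range(horizon):
        trajectory.append(step(trajectory[-1], policy[trajectory[-1]]))
    return trajectory


def infer_reward(expert_trajectories, iterations=5):
    """Inverse step: start from observed behavior and produce a reward."""
    n = len(expert_trajectories)
    visits = [discounted_visits(t) for t in expert_trajectories]
    mu_expert = [sum(v[s] for v in visits) / n for s in range(N_STATES)]
    start = expert_trajectories[0][0]
    horizon = len(expert_trajectories[0]) - 1
    weights = [0.0] * N_STATES
    for _ in range(iterations):
        policy = greedy_policy(q_learning(weights))
        mu_policy = discounted_visits(rollout(policy, start, horizon))
        # Raise reward where the expert goes more often than the learner does.
        weights = [w + e - p for w, e, p in zip(weights, mu_expert, mu_policy)]
    return weights


# Assumed demonstration: the expert walks right and then stays at state 4.
expert_demos = [[0, 1, 2, 3, 4, 4, 4, 4, 4]]
learned_reward = infer_reward(expert_demos)
learned_policy = greedy_policy(q_learning(learned_reward))
print("inferred reward per state:", [round(r, 2) for r in learned_reward])
print("learned policy:", learned_policy)
```

In this sketch the reward is never written by hand: `infer_reward` derives it from the demonstration, and `q_learning` then rewards and punishes the learner with that inferred signal. Passing a fixed, hand-written `reward` list straight to `q_learning` would correspond to the standard reinforcement learning setup the passage contrasts it with.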
A Brief History of Intelligence: Evolution, AI, and the Five Breakthroughs That Made Our Brains