Debbie Roth

22%
Flag icon
Dopamine is not a signal for reward but for reinforcement. As Sutton found, reinforcement and reward must be decoupled for reinforcement learning to work. To solve the temporal credit assignment problem, brains must reinforce behaviors based on changes in predicted future rewards, not actual rewards.
A Brief History of Intelligence: Evolution, AI, and the Five Breakthroughs That Made Our Brains
Rate this book
Clear rating
Open Preview