Mark Gerstein

31%
Flag icon
The most basic model has several major components. The policy describes how actions are chosen in any particular state. In general, this is based on the estimated value of each possible action in that particular state. In our casino example, we would need to estimate the value of the winnings from each of the slot machines. Starting out we don’t really know these values, so we would just assume that they are all the same; the goal of the reinforcement learning model is to learn the values through experience. The simplest policy would be to just choose the machine that has the highest estimated ...more
Hard to Break: Why Our Brains Make Habits Stick
Rate this book
Clear rating
Open Preview