Error Pop-Up - Close Button Could not find Kindle Notes & Highlights for that user.

Juan  Luis  Cordero

9%
Flag icon
Regret is the result of comparing what we actually did with what would have been best in hindsight. In a multi-armed bandit, Barnard’s “inestimable loss” can in fact be measured precisely, and regret assigned a number: it’s the difference between the total payoff obtained by following a particular strategy and the total payoff that theoretically could have been obtained by just pulling the best arm every single time (had we only known from the start which one it was). We can calculate this number for different strategies, and search for those that minimize it.
Algorithms to Live By: The Computer Science of Human Decisions
Rate this book
Clear rating
Open Preview