The reinforcement learning model that we described above doesn’t have any knowledge about how the world works. It simply tries all of the possible actions and learns, from the outcomes it experiences, which one leads to the best outcome on average. Researchers refer to this (somewhat confusingly) as model-free reinforcement learning: the term does not mean that there is no computational model of learning, only that the learner itself does not have a model of how the world works.
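To make the idea concrete, here is a minimal sketch of a model-free learner, written under several assumptions that are not from the text: the function name `run_model_free_learner`, the three reward probabilities, and the exploration rate `epsilon` are all hypothetical, and occasional random exploration (often called epsilon-greedy) stands in for "trying all of the possible actions." The learner never sees the reward probabilities; it only keeps a running average of the outcomes it has experienced for each action.

```python
import random


def run_model_free_learner(reward_probs, n_trials=5000, epsilon=0.1, seed=0):
    """Learn the average payoff of each action purely from experience.

    reward_probs is the hidden "world": the chance that each action pays off.
    The learner never reads it directly; it only observes sampled rewards.
    All names and default values here are illustrative assumptions.
    """
    rng = random.Random(seed)          # seeded so the run is repeatable
    n_actions = len(reward_probs)
    value = [0.0] * n_actions          # running average reward per action
    counts = [0] * n_actions           # how often each action was tried

    for _ in range(n_trials):
        if rng.random() < epsilon:
            # Explore: occasionally try an action at random.
            action = rng.randrange(n_actions)
        else:
            # Exploit: pick the action with the best average so far.
            action = max(range(n_actions), key=lambda a: value[a])

        # The world responds with a reward of 1 or 0.
        reward = 1.0 if rng.random() < reward_probs[action] else 0.0

        # Nudge the running average toward the observed outcome.
        counts[action] += 1
        value[action] += (reward - value[action]) / counts[action]

    return value, counts


values, counts = run_model_free_learner([0.2, 0.5, 0.8])
best_action = max(range(len(values)), key=lambda a: values[a])
```

Nothing in this sketch represents how the world generates rewards: the learner stores only one number per action. That absence of an internal description of the world is what "model-free" refers to.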

