Reinforcement learning is similar to unsupervised learning in that the training data isn’t labeled. But when the system draws a conclusion about the data or acts upon it in some way, the outcome is graded, or rewarded, and the algorithm accordingly learns what actions are most valued.