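This approach uses the epsilon-decreasing strategy. Epsilon is the fraction of the time we spend exploring an alternative instead of showing the ad that currently looks best, so that we can confirm the alternative really is less effective. We decrease epsilon as our confidence in the better ad grows. Because the algorithm learns from the feedback each choice receives, with good outcomes reinforcing the choices that produced them, this technique belongs to a class of algorithms known as reinforcement learning.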
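
As a rough illustration, the strategy can be sketched in Python as below. This is a minimal sketch under stated assumptions, not a definitive implementation: the function name `epsilon_decreasing`, the simulated click rates in `true_rates`, and the multiplicative `decay` schedule are all hypothetical choices made for the example, since the text does not specify how epsilon should shrink.

```python
import random


def epsilon_decreasing(true_rates, rounds=10_000, epsilon0=1.0, decay=0.999, seed=0):
    """Simulate an epsilon-decreasing choice between ads.

    true_rates: hypothetical click probability of each ad (simulation only).
    epsilon0:   initial exploration probability.
    decay:      assumed multiplicative decay applied to epsilon every round.
    """
    rng = random.Random(seed)
    n_ads = len(true_rates)
    shows = [0] * n_ads   # how often each ad was shown
    clicks = [0] * n_ads  # how often each ad was clicked
    epsilon = epsilon0

    def observed_rate(i):
        # Ads that were never shown count as 0.0 until we have data for them.
        return clicks[i] / shows[i] if shows[i] else 0.0

    for _ in range(rounds):
        if rng.random() < epsilon:
            # Explore: show a randomly chosen ad.
            ad = rng.randrange(n_ads)
        else:
            # Exploit: show the ad with the best observed click rate so far.
            ad = max(range(n_ads), key=observed_rate)

        shows[ad] += 1
        if rng.random() < true_rates[ad]:
            clicks[ad] += 1

        # Explore less as evidence accumulates.
        epsilon *= decay

    return shows, clicks, epsilon
```

Calling `epsilon_decreasing([0.05, 0.10])` simulates two ads with assumed click rates of 5% and 10%. Early rounds are almost entirely exploration; by the end epsilon is close to zero, so nearly every round shows whichever ad has the higher observed click rate. Other schedules, such as epsilon proportional to 1/t, would fit the same description.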

