If a small P-value is observed, then either something very surprising has happened, or the null hypothesis is untrue: the smaller the P-value, the more evidence that the null hypothesis might be an inappropriate assumption. This was intended as a fairly informal procedure, but in the 1930s Neyman and Pearson developed a theory of inductive behaviour which attempted to put hypothesis testing on a more rigorous mathematical footing.
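The reasoning behind a P-value can be made concrete with a small calculation. The sketch below is only an illustration under assumed conditions, not a procedure taken from the text: it supposes a hypothetical experiment in which a coin is flipped 20 times and lands heads 16 times, takes "the coin is fair" as the null hypothesis, and computes the one-sided P-value as the probability of seeing 16 or more heads if that null hypothesis were true. The function name `one_sided_p_value` and the numbers used are invented for the example.

```python
from math import comb


def one_sided_p_value(n, k, p0=0.5):
    """Probability of observing k or more successes in n trials,
    assuming the null hypothesis that each trial succeeds with
    probability p0 (an exact binomial tail probability)."""
    return sum(
        comb(n, i) * p0**i * (1 - p0) ** (n - i)
        for i in range(k, n + 1)
    )


# Hypothetical data: 16 heads in 20 flips, null hypothesis of a fair coin.
p = one_sided_p_value(20, 16)
print(f"P-value: {p:.4f}")  # prints "P-value: 0.0059"
```

A value this small says that 16 or more heads would be a very surprising result from a fair coin, so either something unusual has happened or the assumption of fairness is questionable. By contrast, 10 heads in 20 flips gives a one-sided P-value well above one half, which offers no evidence against the null hypothesis.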