Given the current high-stakes uses of tests, we can be confident of the validity of inferences about improvement only if we have an additional type of validity evidence: a comparison to a second measure less threatened by the possibility of corruption (often called an audit test). The logic of using an audit test is simple: if gains on the tested sample generalize to the domain, they should generalize to other, similar samples from the domain. In this case, the similar sample is the audit test.

