Judges have also been found to underestimate the difficulty of hard items and overestimate the difficulty of easy items, which can lead them to set higher standards when the items they evaluate are more difficult.8 Changing the response probability used with the bookmark method, itself an arbitrary choice, can have dramatic effects on the placement of the standards.9
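
The sensitivity to the response probability can be illustrated with a minimal sketch. It assumes a two-parameter logistic item response model with no guessing parameter, which may differ from the model used in any particular assessment. The function name `rp_location` and the bookmarked item's difficulty of 0.0 are hypothetical, chosen only for illustration. For one fixed bookmarked item, the sketch computes the ability level at which an examinee answers the item correctly with the chosen response probability.

```python
import math


def rp_location(b, rp, a=1.0):
    """Ability (theta) at which an item is answered correctly with probability rp.

    Assumes a 2PL model: P(correct) = 1 / (1 + exp(-a * (theta - b))).
    Solving P(correct) = rp for theta gives theta = b + ln(rp / (1 - rp)) / a.
    """
    if not 0.0 < rp < 1.0:
        raise ValueError("rp must be strictly between 0 and 1")
    if a <= 0.0:
        raise ValueError("discrimination a must be positive")
    return b + math.log(rp / (1.0 - rp)) / a


# Hypothetical bookmarked item with difficulty 0.0 on the logit scale.
bookmarked_difficulty = 0.0

# Three illustrative response probabilities applied to the same item.
for rp in (0.50, 0.67, 0.80):
    cut = rp_location(bookmarked_difficulty, rp)
    print(f"RP = {rp:.2f} -> cut score at theta = {cut:.2f}")
```

Under these assumptions, moving the response probability from 0.50 to 0.67 raises the implied cut score by about 0.71 logits, and moving it to 0.80 raises it by about 1.39 logits, even though the bookmarked item is unchanged. The sketch holds the bookmark fixed; in an actual bookmark session the response probability also changes the order of items in the booklet, so judges might place the bookmark elsewhere.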

