This is a readable intro to test design. Author W. James Popham is careful to explain the difference between test scores and the judgements we make based on those scores. He also explains the statistical concepts needed to make sense of these ideas (e.g., reliability vs. validity) without getting bogged down in technical jargon or formulas. Many of his pointers are very concrete, although some seem painfully obvious (don't use double negatives in test questions; when writing multiple-choice questions, don't give away the answer by using the article "an" when only one option starts with a vowel). But many of his suggestions were helpful to me, such as how to use tests to decide what to teach and how to teach it, or his thoughtful examples of questions that a cursory reading will get wrong, without the questions themselves being too complicated.

The guidelines for rubrics were quite good as well (Can students use the rubric to self-assess? Is it general enough to be recycled for other assignments? Are the criteria actually teachable?). I also really liked the section on portfolio assessment. One suggestion that I will use is to require students to practice assessing themselves and each other using a rubric, and then to submit a filled-in rubric with their portfolio assignments, so that self-assessment can be part of our conversation. There's also a good explanation of how to create your own Likert inventories.

Popham advocates for standards-based testing in which results are reported standard by standard (rather than as an aggregate score). There's a great chapter that briefly explains the politics and economics of standardized tests, and he pulls no punches in criticizing inappropriate applications of them. But he also includes a chapter on how to create your own standardized tests (invite parents to help vet the test; choose a few high-priority standards to assess; report the results by individual standard; consider using a split-and-switch design when comparing post-tests to pre-tests). The book paints a picture of how to assemble credible evidence of instructional effectiveness that is clear even to a non-specialist like me.

