Definitely Maybe

A fascinating morning in a rather different sort of exam board to the ones I’ve been used to. Exeter, like (I imagine) most if not all other UK universities, is busy covering itself against the possibility that student performances were adversely affected by the strike action, and part of the response is a barrage of statistical analysis. Rather than looking at individual mark profiles, today was all about considering the patterns of marks at module level, with the aim of identifying anywhere the distribution differs from previous years to a statistically significant degree, so that the possibility of scaling can then be considered. Individual profiles will be reviewed once any additional number crunching has taken place.


This all led me to blow my cover as an amateur statistics nerd, as I not only found the whole thing fascinating but was happy to defend it against sceptical colleagues. The objections were interesting, and arguably revealing. Firstly, that this sort of cohort approach completely ignores the individual student experience, and telling a student who strongly feels their studies were disrupted that, no, we’ve done the numbers and they weren’t is not a good look. That’s certainly true in public relations terms, and it is definitely important that, when we come to look at individual (anonymised) profiles, it will be clear if a given student has had a lot of their modules disrupted and if their performance is substantially worse than last year. But it is surely not irrelevant to be able to say that, according to the numbers, performances across a given module were not significantly different from previous years, even if a given student feels that they ought to have been affected; and for me the advantage of this approach is that, if we had found a case where the numbers suggested a problem, any remedy would have benefitted all the affected students rather than just those who complained. There is always a risk of making concessions to those who make a fuss, neglecting those who just accept circumstances and decisions.


The second objection is about the marker, not the student: that it’s invidious to be compared with someone else’s marking practices. But this happens all the time (it’s what moderation and external examining are all about); it’s just that this is on a more solid evidential basis. I found it most interesting to realise that, while I hadn’t thought that one class was as good as last year’s lot, I’d actually marked them very similarly – my impression was clearly based on things other than their performances, or on a limited number of things (fewer really high firsts, perhaps) rather than a dispassionate overview of the whole cohort. I have a sense of myself as a relatively strict marker, because I tend to remember cases where I’m arguing down a colleague – but again the numbers don’t wholly support this impression.


If we were all expected to conform to a single pattern, and marks were automatically modified if they didn’t follow a normal distribution, there would clearly be a problem; there are lots of reasons, from the particular group of students to the group dynamic to changes in assessment, why performance might vary from year to year or module to module. It’s the start of a conversation, or of self-reflection, to be shown how e.g. students at the lower end perform less well in my final-year Thucydides class than in their other modules while those at the upper end do better. Yes, there’s something slightly disturbing about a colleague getting a p-value of 1 in a t-test comparing this year’s marks with last year’s – but we’re not about to start quoting p-values in staff review. We’ll just keep a close eye out for further evidence that he’s actually a replicant.
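
For anyone wondering what it takes to get a p-value of 1: a rough sketch, assuming the comparison is a two-sample t-test on the mean mark (the exact test used is not something I can vouch for, and the marks below are invented purely for illustration). The two-sided p-value reaches 1 only when the t statistic is exactly zero, which means the two years' average marks are identical, even if the individual marks differ.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Welch's t statistic for two independent samples (unequal variances allowed)."""
    mean_a = statistics.fmean(sample_a)
    mean_b = statistics.fmean(sample_b)
    var_a = statistics.variance(sample_a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(sample_b)
    standard_error = math.sqrt(var_a / len(sample_a) + var_b / len(sample_b))
    return (mean_a - mean_b) / standard_error


# Hypothetical marks, not real data: different individual marks, same total.
last_year = [52, 58, 61, 64, 65, 68, 72]
this_year = [55, 57, 60, 63, 66, 67, 72]

t = welch_t(this_year, last_year)
# Identical means give t = 0, the case in which a two-sided p-value equals 1.
print(t)
```

Any difference in the averages, however small, pushes the statistic away from zero and the p-value below 1, which is why a result of exactly 1 looks so uncanny.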


Is this an all too conventional humanities suspicion of numbers, and still more of what Certain People seek to do with numbers? A resistance to the idea of reducing individual uniqueness – of the student *and* the lecturer – to crude comparisons and cohort analysis? Of the implied loss of autonomy and denigration of authority? The simple fact is that, while I don’t doubt my own judgement in its own terms, I’m quite conscious that I value certain attributes and approaches more than others, and that’s something that does need to be checked; but it’s clear from this data that there are also patterns, and even biases, of which I am not so aware, and those also need to be checked.


You can’t assume that the numbers tell you all you need to know; but what they do tell you – crudely, this is what you actually did, regardless of what you thought you did, and this is how it compares – is not to be ignored.

Published on June 13, 2018 10:48


Neville Morley's Blog

Neville Morley