Definitely Maybe

As I’ve mentioned on here before, one of my most treasured possessions for many years was a mug, given to me by a group of ancient history students; white with blue lettering, reading “The simple answer is… we just don’t know”. This gift meant a huge amount to me partly because they’d obviously put so much thought into it and partly because it suggested that they had perfectly grasped what I had been trying to teach them – that if I were to sum up the core message of my course on historical theory and methodology, this would be it. Maybe it was something I said with hilarious or annoying frequency, maybe it was the centre square on a Neville bingo card, maybe in fact it drove them up the wall, but as far as I was concerned it showed that they got me.

The key point – and this might have been equally aggravating, if what they were looking for was a straight answer to a simple question – was that this line was only ever the beginning of the response. In itself, it’s clearly inadequate; what you then need is an explanation of why we don’t know, the limitations and difficulties of the evidence, the problems of the key concepts or the framing of the question, the inevitable uncertainties and ambiguities of any attempt at exploring the past. That’s the interesting and illuminating stuff, and understanding uncertainty is arguably one of the most important skills to be learned through the study of history; the initial statement, however, carries the general lesson that simple answers aren’t to be trusted, as it is usually more tricky and complicated than it appears, for most values of ‘it’.

I was reminded of this by a discussion (https://www.sciencealert.com/openai-has-a-fix-for-hallucinations-but-you-really-wont-like-it) this morning of a new OpenAI paper discussing the problem of ‘hallucinations’, i.e. the generation of false information by LLMs. In a system that generates text on the basis of strings of predictions of what word will follow another word, such errors are inevitable, especially in relation to terms and ideas that appear only rarely in training data (which is why they really can’t cope with anything new). As the article notes, processes of fine-tuning the model by providing feedback on its results currently treat a response of ‘I don’t know’ exactly the same as a wrong answer, which directly programmes the LLM to bullshit: “the expected score of guessing always exceeds the score of abstaining when an evaluation uses binary grading”.
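
To see why, a back-of-the-envelope sketch is enough (a few lines of Python of my own, not anything from the paper): under binary grading a correct answer scores one, while a wrong answer and an ‘I don’t know’ both score zero, so any non-zero chance of being right makes guessing the better bet.

```python
# A minimal sketch (my own illustration, not the paper's code) of why
# binary grading favours guessing: a correct answer scores 1, while a
# wrong answer and an "I don't know" both score 0.

def expected_scores(p_correct):
    """Expected score of guessing vs. abstaining under binary grading."""
    guess = p_correct * 1.0 + (1 - p_correct) * 0.0   # right earns 1, wrong earns 0
    abstain = 0.0                                      # "I don't know" also earns 0
    return guess, abstain

for p in (0.1, 0.3, 0.5, 0.9):
    guess, abstain = expected_scores(p)
    print(f"p(correct)={p:.1f}  guess={guess:.2f}  abstain={abstain:.2f}")

# Even a 10% chance of being right gives guessing an expected score of
# 0.10 against 0.00 for abstaining, so the optimal policy under this
# grading is always to produce a confident answer.
```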

The proposed solution is to establish confidence thresholds; if wrong answers are penalised more heavily than right answers are rewarded, and the LLM is instructed to express uncertainty rather than assert an answer if its confidence level is e.g. less than 75%, then it ought to offer less bullshit in future. One problem is that weighing up multiple possible responses, estimating confidence levels in different answers and asking clarifying questions all require significantly more computational power, making the process much more expensive; this might make sense for specific business uses where errors would be far too costly, but isn’t feasible for casual queries, student essay-generation and the like. Overall, confident bullshit is the only cost-effective approach, banking on users not knowing or not caring about the problems.
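
The same sketch, with illustrative numbers of my own rather than the paper’s exact scoring scheme, shows how an asymmetric penalty produces the threshold: if a correct answer earns +1 and a wrong answer costs −3, guessing only pays off once confidence passes 3/(1+3) = 75%.

```python
# A rough sketch of the threshold idea, with illustrative numbers rather
# than the paper's exact scheme: a correct answer earns +1, a wrong
# answer now costs -3, and "I don't know" scores 0.

REWARD_CORRECT = 1.0
PENALTY_WRONG = 3.0

def best_action(p_correct):
    """Answer only if the expected score of guessing beats abstaining (0)."""
    expected_guess = p_correct * REWARD_CORRECT - (1 - p_correct) * PENALTY_WRONG
    return "answer" if expected_guess > 0 else "say 'I don't know'"

for p in (0.5, 0.7, 0.75, 0.8, 0.95):
    print(f"confidence {p:.2f}: {best_action(p)}")

# The break-even point is PENALTY / (REWARD + PENALTY) = 3/4, so with
# these numbers the model only answers when it is more than 75% confident,
# matching the example threshold in the paragraph above.
```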

Banking on user ignorance or indifference could well be a reasonable assumption; the paper also notes the risk that an LLM that regularly – 30% of the time, maybe – declines to give an answer because of uncertainty would quickly alienate potential users. “Users accustomed to receiving confident answers to virtually any question would likely abandon such systems rapidly.” What the LLM doesn’t do – not least because this would also require a lot more expensive computational power – is develop the answer with an explanation of why we aren’t sure, and why this is actually interesting (often, more interesting than the original question). Yes, I imagine you could programme it to appear to do that – but of course it doesn’t know why it doesn’t know, it simply identifies a degree of uncertainty in the choice between possible outputs, and so if you demanded such an explanation you’d just be getting the same probabilistic string of text, subject to the same risks of inaccurate information.

This is not the LLM’s ‘fault’ in any meaningful sense, except insofar as it’s the ‘fault’ of a hammer that it doesn’t drill holes very effectively. It’s a matter of the expectations of users, wanting simple answers without having to make any effort and believing in the existence of simple answers even when these are manifestly unlikely to exist; and it’s a matter of the AI companies that have claimed to be able to offer them, instantly and at minimal cost.

It’s almost a variant of the Cretan Liar problem; the LLM will answer your question but you can’t trust it, the pedantic historian will not tell you what you want to know – but will perhaps tell you what you need to know…
