Damn That Parson Bayes and His Cursed Theorem: Red Flagging Red Flag Rules
In the aftermath of the El Paso and Dayton shootings, “red flag” rules are all the rage. Identify people who are at high risk of committing such atrocities, and prevent them from buying weapons.
Most of the arguments in favor of this rely on statements like “many mass shooters have characteristic X (e.g., mental illness), so let’s prevent those with characteristic X from buying guns.” As appealing as these arguments sound, they founder on a failure to understand fundamental probability concepts, which imply that for extremely rare events like mass shootings, red flags are extremely unreliable.
Most of the arguments in favor of red flags rely on estimates of P(X|M), i.e., the probability that someone who committed a mass murder (“M”) had characteristic X. For example, “70 percent of mass shooters present evidence of mental illness.” Or Y percent play violent video games or post racist rants online.
But what we really need to know in order to implement red flags that do not stigmatize, and deny the rights of, people who present a low risk of committing a mass shooting is P(M|X): “what is the probability that someone with characteristic X will commit a mass shooting?” Although most people argue as if P(X|M) and P(M|X) are interchangeable, they are not, as Thomas Bayes showed in the 18th century with the result now known as Bayes’ Theorem.
As Bayes showed, P(M|X)=P(X|M)P(M)/P(X) where P(M) is the unconditional probability someone is a mass shooter, and P(X) is the unconditional probability that someone has characteristic X.
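The arithmetic is easy to check directly. A minimal Python sketch (the function name is my own; the example numbers are the rough estimates used in this post):

```python
def posterior(p_x_given_m, p_m, p_x):
    """Bayes' Theorem: P(M|X) = P(X|M) * P(M) / P(X)."""
    return p_x_given_m * p_m / p_x

# Even a perfect signal, P(X|M) = 1, yields a tiny posterior
# whenever the base rate P(M) is far smaller than P(X).
print(posterior(1.0, 9e-7, 0.01))
```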
The problem with attempting to determine whether someone with X poses a risk is that mass shooters are extremely rare, and hence P(M) is extremely small.
USA Today estimated there were 270-odd mass shootings between 2005 and 2017. A Michael Bloomberg-funded anti-gun group counts 110. Given a population of around 300 million, even using the higher number a rough estimate of P(M) is 9e-7: a 9 with six zeros in front of it. Therefore, even if P(X|M)=1 (i.e., all mass shooters share some characteristic X), for any characteristic X that occurs fairly frequently in the population P(M|X) is extremely small.
Consider a characteristic for which there is fairly good data on P(X): schizophrenia. It is estimated that 1 percent of the population is schizophrenic. Plugging .01 in for P(X) gives a value of P(M|X) of 9e-5, or roughly 1 out of 11,000. Meaning that the likelihood a random schizophrenic will commit a mass shooting is about .009 percent.
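That calculation, using the rough numbers above (270 shootings, 300 million people, P(X)=.01, and the deliberately generous assumption P(X|M)=1), works out as follows in Python:

```python
p_m = 270 / 300e6   # rough base rate of mass shooters: 9e-7
p_x = 0.01          # roughly 1 percent of the population is schizophrenic
p_x_given_m = 1.0   # deliberately generous: assume every shooter has X

p_m_given_x = p_x_given_m * p_m / p_x
print(f"P(M|X) = {p_m_given_x:.1e}")          # on the order of 9e-05
print(f"       = {p_m_given_x * 100:.3f}%")   # about 0.009 percent
```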
This actually overstates matters, because P(X|M) is likely to be far less than one for most characteristics.
Things get even worse if one broadens the scope of the characteristic used to define the red flag. If instead of schizophrenia one uses serious mental illness, by some measures P(X)=.2. Increasing the denominator by a factor of 20 reduces P(M|X) by a factor of 20. So instead of a probability of about .009 percent, the probability is about .00045 percent.
And again, that is an exaggeration because it assumes P(X|M)=1. The upshot is that putting a red flag on schizophrenics, or on anyone who has experienced some mental illness, will be vastly overinclusive.
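The overinclusiveness is starkest in headcounts. A back-of-the-envelope Python sketch, again using the rough figures above:

```python
population = 300e6
p_x = 0.01            # prevalence of the red-flag characteristic
true_shooters = 270   # the higher of the two counts cited above

flagged = population * p_x   # everyone bearing the red flag
# Even if every single shooter had characteristic X, nearly all of
# the flagged people would be harmless:
false_positives = flagged - true_shooters
print(f"flagged:         {flagged:,.0f}")          # 3,000,000
print(f"false positives: {false_positives:,.0f}")  # 2,999,730
```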
Of course, life is a matter of trade-offs. One must weigh the costs imposed on those who are wrongly stigmatized (“false positives”) with the benefit of reducing mass shootings by imposing restrictions based on an overinclusive, but at least somewhat informative signal (i.e., a signal with P(X|M)>0).
For some there is no trade-off at all. For those, primarily on the left, who believe that guns are anathema and have no benefit whatsoever, even a 99.9995 percent false positive rate is not at all costly. However, for the very large number of Americans who do think bearing arms is beneficial, these false positives come at a high cost.
That’s where the debate should really focus: the rate of false positives and the cost of those false positives vs. the benefits of true positives (which would represent mass shootings avoided). What Bayes’ Theorem implies is that for an act that someone is extremely unlikely to commit, that false positive rate is likely to be extremely high. It also implies that debating in terms of P(X|M) provides very little insight. P(M) is small, and for any fairly common characteristic, P(X) is fairly large, so P(X|M) has relatively little impact on the rate of false positives.
Again, what Bayes’ Theorem tells us is that for a rare event like a mass shooting, vastly more innocent people than true risks will be red flagged. The costs of restricting those who pose no risk must be weighed against the benefits of modestly reducing the risk of a very rare event. Further, it must be recognized that implementing red flag rules is costly, and these costs include the invasions of privacy that they inevitably entail. Yet further, red flag rules are certain to be abused by those with a grudge. And yet further, many of those with characteristic X will escape detection, or will be able to evade the legal restrictions (and indeed have a high motivation to do so).
In the aftermath of mass shootings, there is a hue and cry to do something. The hard lesson taught by Parson Bayes is that there is not a lot we can do. Or put more precisely, those things that we can do will inevitably stigmatize and restrict vastly more innocent people than constrain malign ones.
Craig Pirrong's Blog