Allen B. Downey's Blog: Probably Overthinking It, page 6

September 19, 2023

The Overton Paradox in Three Graphs

Older people are more likely to say they are conservative.

And older people believe more conservative things.

But if you group people by decade of birth, most groups get more liberal as they get older.

So if people get more liberal, on average, why are they more likely to say they are conservative?

Now there are three ways to find out!

Today Brilliant launched an interactive article explaining the Overton Paradox. Also today, the SuperDataScience Podcast published a conversation about Probably Overthinking It, including a discussion of the Overton Paradox; you can also watch the video on YouTube. And I gave a talk about it at PyData NYC 2022, which you can watch here.

Since some people have asked, I should say that “Overton Paradox” is the name I am giving this phenomenon. It’s named after the Overton window, for reasons that will be clear if you read my explanation.

Published on September 19, 2023 07:19

September 4, 2023

How Principal Are Your Components?

In a previous post I explored the correlations between measurements in the ANSUR-II dataset, which includes 93 measurements from a sample of U.S. military personnel. I found that measurements of the head were weakly correlated with measurements from other parts of the body – and in particular the protrusion of the ears is almost entirely uncorrelated with anything else.

A friend of mine, and co-developer of the Modeling and Simulation class I taught at Olin, asked whether I had tried running principal component analysis (PCA). I had not, but now I have. Let’s look at the results.

Click here to run this notebook on Colab.

The ANSUR data is available from The OPEN Design Lab.

Explained Variance

Here’s a visualization of explained variance versus number of components.

With one component, we can capture 44% of the variation in the measurements. With two components, we’re up to 62%. After that, the gains are smaller (as we expect), but with 10 measurements, we get up to 78%.
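The explained-variance computation can be sketched in a few lines. This is a minimal illustration on synthetic data (the real analysis uses the 93 ANSUR-II measurements), using the SVD of the centered data rather than any particular PCA library:

```python
import numpy as np

# Synthetic stand-in for the ANSUR-II measurements: a few shared
# "body size" factors plus noise, so the columns are correlated.
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 3))
mixing = rng.normal(size=(3, 10))
X = latent @ mixing + 0.5 * rng.normal(size=(500, 10))

# PCA via SVD of the centered data: the squared singular values are
# proportional to the variance explained by each component.
Xc = X - X.mean(axis=0)
_, s, _ = np.linalg.svd(Xc, full_matrices=False)
explained = s**2 / np.sum(s**2)
cumulative = np.cumsum(explained)
print(cumulative[:3])  # fraction of variance captured by 1, 2, 3 components
```

With all components included, the cumulative fraction reaches 1 by construction; the interesting part is how quickly the curve rises at the start.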

Loadings

Looking at the loadings, we can see which measurements contribute the most to each of the components, so we can get a sense of which characteristics each component captures.

I won’t explain all of the measurements, but if there are any you are curious about, you can look them up in The Measurer’s Handbook, which includes details on “sampling strategy and measuring techniques” as well as descriptions and diagrams of the landmarks and measurements between them.

Principal Component 1:
0.135 suprasternaleheight
0.134 cervicaleheight
0.134 buttockkneelength
0.134 acromialheight
0.133 kneeheightsitting

Principal Component 2:
0.166 waistcircumference
-0.163 poplitealheight
0.163 abdominalextensiondepthsitting
0.161 waistdepth
0.159 buttockdepth

Principal Component 3:
0.338 elbowrestheight
0.31 eyeheightsitting
0.307 sittingheight
0.228 waistfrontlengthsitting
-0.225 heelbreadth

Principal Component 4:
0.247 balloffootcircumference
0.232 bimalleolarbreadth
0.22 footbreadthhorizontal
0.218 handbreadth
0.212 sittingheight

Principal Component 5:
0.319 interscyeii
0.292 biacromialbreadth
0.275 shoulderlength
0.273 interscyei
0.184 shouldercircumference

Principal Component 6:
-0.34 headcircumference
-0.321 headbreadth
0.316 shoulderlength
-0.277 tragiontopofhead
-0.262 interpupillarybreadth

Principal Component 7:
0.374 crotchlengthposterioromphalion
-0.321 earbreadth
-0.298 earlength
-0.284 waistbacklength
0.253 crotchlengthomphalion

Principal Component 8:
0.472 earprotrusion
0.346 earlength
0.215 crotchlengthposterioromphalion
-0.202 wristheight
0.195 overheadfingertipreachsitting

Principal Component 9:
-0.299 tragiontopofhead
0.294 crotchlengthposterioromphalion
-0.253 bicristalbreadth
-0.228 shoulderlength
0.189 neckcircumferencebase

Principal Component 10:
0.406 earbreadth
0.356 earprotrusion
-0.269 waistfrontlengthsitting
0.239 earlength
-0.228 waistbacklength
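Lists like the one above can be produced by sorting each component's loadings by magnitude. Here is a minimal sketch on made-up data with hypothetical column names (the real analysis uses the 93 ANSUR-II columns):

```python
import numpy as np
import pandas as pd

# Made-up data with hypothetical measurement names, for illustration only.
rng = np.random.default_rng(1)
df = pd.DataFrame(rng.normal(size=(200, 5)),
                  columns=["stature", "waist", "hand", "foot", "ear"])

# PCA via SVD of the centered data; the rows of Vt are the loadings.
Xc = df - df.mean()
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)

# For each component, sort the loadings by magnitude and show the top few.
for i in range(2):
    top = pd.Series(Vt[i], index=df.columns).sort_values(
        key=np.abs, ascending=False)
    print(f"Principal Component {i + 1}:")
    print(top.head(3))
```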

Here’s my interpretation of the first few components.

- Not surprisingly, the first component is loaded with measurements of height. If you want to predict someone’s measurements, and can only use one number, choose height.
- The second component is loaded with measurements of girth. No surprises so far.
- The third component seems to capture torso length. That makes sense — once you know how tall someone is, it helps to know how that height is split between torso and legs.
- The fourth component seems to capture hand and foot size (with sitting height thrown in just to remind us that PCA is not obligated to find components that align perfectly with the axes we expect).
- Component 5 is all about the shoulders.
- Component 6 is mostly about the head.

After that, things are not so neat. But two things are worth noting:

- Component 7 is mostly related to the dimensions of the pelvis, but…
- Components 7, 8, and 10 are surprisingly loaded up with ear measurements.

As we saw in the previous article, there seems to be something special about ears. Once you have exhausted the information carried by the most obvious measurements, the dimensions of the ear seem to be strangely salient.

Published on September 04, 2023 12:54

August 27, 2023

Taming Black Swans

At SciPy 2023 I presented a talk called “Taming Black Swans: Long-tailed distributions in the natural and engineered world“. Here’s the abstract:


Long-tailed distributions are common in natural and engineered systems; as a result, we encounter extreme values more often than we would expect from a short-tailed distribution. If we are not prepared for these “black swans”, they can be disastrous.


But we have statistical tools for identifying long-tailed distributions, estimating their parameters, and making better predictions about rare events.


In this talk, I present evidence of long-tailed distributions in a variety of datasets — including earthquakes, asteroids, and stock market crashes — discuss statistical methods for dealing with them, and show implementations using scientific Python libraries.


The video from the talk is on YouTube now:

I didn’t choose the thumbnail, but I like it.

Here are the slides, which have links to the resources I mentioned.

Don’t tell anyone, but this talk is part of my stealth book tour!

- It started in 2019, when I presented a talk at PyData NYC based on Chapter 2: Relay Races and Revolving Doors.
- In 2022, I presented another talk at PyData NYC, based on Chapter 12: Chasing the Overton Window.
- In May I presented a talk at ODSC East based on Chapter 7: Causation, Collision, and Confusion.
- And this talk is based on Chapter 8: The Long Tail of Disaster.

If things go according to plan, I’ll present Chapter 1 at a book event at the Needham Public Library on December 7.

More chapters coming soon!

Published on August 27, 2023 14:28

August 20, 2023

How Correlated Are You?

This post is an offshoot from Chapter 1 of Probably Overthinking It, which is available for pre-order now!

Suppose you measure the arm and leg lengths of 4082 people. You would expect those measurements to be correlated, and you would be right. In the ANSUR-II dataset, among male members of the armed forces, this correlation is about 0.75 — people with long arms tend to have long legs.

And how about arm length and chest circumference? You might expect those measurements to be correlated too, but not as strongly as arm and leg length, and you would be right again. The correlation is about 0.47.

So some pairs of measurements are more correlated than others. There are a total of 93 measurements in the ANSUR-II dataset, which means there are 93 × 92 / 2 = 4278 correlations between pairs of measurements. So here’s a question that caught my attention: Are there measurements that are uncorrelated (or only weakly correlated) with the others?

To answer that, I computed the average magnitude (positive or negative) of the correlation between each measurement and the other 92. The most correlated measurement is weight, with an average of 0.56. So if you have to choose one measurement, weight seems to provide the most information about all of the others.

The least correlated measurement turns out to be ear protrusion — its average correlation with the other measurements is only 0.03, which is not just small, it is substantially smaller than the next smallest, which is ear breadth, with an average correlation of 0.13.
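The computation described above — the average magnitude of each measurement's correlation with all the others — can be sketched like this, on made-up data standing in for ANSUR-II:

```python
import numpy as np
import pandas as pd

# Made-up data: three measurements driven by a shared "size" factor,
# plus one (ear protrusion) that is unrelated by construction.
rng = np.random.default_rng(2)
size = rng.normal(size=1000)
df = pd.DataFrame({
    "weight":         size + 0.3 * rng.normal(size=1000),
    "stature":        size + 0.4 * rng.normal(size=1000),
    "armspan":        size + 0.4 * rng.normal(size=1000),
    "ear_protrusion": rng.normal(size=1000),
})

# Average magnitude of each measurement's correlation with the others,
# excluding the self-correlation on the diagonal.
corr = df.corr().abs()
np.fill_diagonal(corr.values, np.nan)
avg_corr = corr.mean(axis=1).sort_values(ascending=False)
print(avg_corr)
```

On the real dataset, this is the computation that puts weight at the top of the ranking and ear protrusion at the bottom.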

Diagram showing where ear protrusion is measured, from The Measurer’s Handbook.
Diagram showing where ear breadth is measured, from The Measurer’s Handbook.

So it seems like there is something special about ears.

Beyond the averages

We can get a better sense of what’s going on by looking at the distribution of correlations for each measurement, rather than just the averages. I’ll use my two favorite data visualization tools: CDFs, which make it easy to identify outliers, and spaghetti plots, which make it easy to spot oddities.

This figure shows the CDF of correlations for each of the 93 measurements.
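A figure like this needs one empirical CDF per measurement. A minimal sketch of that computation (on made-up correlations in place of the real 92 values per measurement):

```python
import numpy as np

def empirical_cdf(values):
    """Return sorted values and cumulative probabilities for a CDF plot."""
    xs = np.sort(values)
    ps = np.arange(1, len(xs) + 1) / len(xs)
    return xs, ps

# Example: correlations of one measurement with the other 92 (made up).
rng = np.random.default_rng(3)
corrs = rng.uniform(0.2, 0.6, size=92)
xs, ps = empirical_cdf(corrs)
print(xs[0], ps[-1])  # smallest correlation; CDF reaches 1.0
```

Plotting `xs` against `ps` for all 93 measurements on one axis gives the spaghetti plot of CDFs.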

Here are the conclusions I draw from this figure:

Correlations are almost all positive

Almost all of the correlations are positive, as we’d expect. The exception is elbow rest height, which is negatively correlated with almost half of the other measurements. This oddity is explainable if we consider how the measurement is defined:

Diagram showing where elbow rest height is measured, from The Measurer’s Handbook.

All of the other measurements are based on the distance between two parts of the body; in contrast, elbow rest height is the distance from the elbow to the chair. It is negatively correlated with other measurements because it measures a negative space — in effect, it is the difference between two other measurements: torso length and upper arm length.

Many distributions are multimodal

Overall, most correlations are moderate, between 0.2 and 0.6, but there are a few clusters of higher correlations, between 0.6 and 1.0. Some of these high correlations are spurious because they represent multiple measurements of the same thing — for example when one measurement is the sum of another two, or nearly so.

A few distributions have low variance

The distributions I’ve colored and labeled have substantially lower variance than the others, which means that they are about equally correlated with all other measurements. Notably, all of them are located on the head. It seems that the dimensions of the head are weakly correlated with the dimensions of the rest of the body, and that correlation is remarkably consistent.

And finally…

Ear protrusion isn’t correlated with anything

Among the unusual measurements with low variance, ear protrusion is doubly unusual because its correlations are so consistently weak. The exceptions are ear length (0.22) and ear breadth (0.08) — which make sense — and posterior crotch length (0.11), shown here:

The others are small enough to be plausibly due to chance.

I have a conjecture about why: ear protrusion might depend on details of how the ear develops, which might depend on idiosyncratic details of the developmental environment, with little or no genetic contribution. In that sense, ear protrusion might be like fingerprints.

All of these patterns are the same for women

Here’s the same figure for the 1986 female ANSUR-II participants:

The results are qualitatively the same. The variance in correlation with ear protrusion is higher, but that is consistent with random chance and the smaller sample size.

In conclusion, when we look at correlations among human measurements, the head is different from the rest of the body, the ear is different from the head, and ear protrusion is uniquely uncorrelated with anything else.

Published on August 20, 2023 12:07

July 16, 2023

Homophobia and Religion

Two weeks ago I published an excerpt from Probably Overthinking It where I presented data from the General Social Survey showing a steep decrease in the percentage of people in the U.S. who think homosexuality is wrong.

Last week I followed up to answer a question about data from Pew Research showing a possible reversal of that trend.

Now I want to answer a question posed (or at least implied) on Twitter, “I’d love to see all this, including other less-salient changes, through the lens of the decline of religion.” If religious people are more likely to disapprove of homosexuality, and if religious affiliation is declining, how much of the decrease in homophobia is due to the decrease in religion?

To answer that question, I’ll use the most recent GSS data, released in May 2023. Here’s the long-term trend again:

The most recent point is a small uptick, but it follows an unusually large drop and returns to the long-term trend.

Here are the same results divided by strength of religious affiliation.

As expected, people who say they are strongly religious are more likely to disapprove of homosexuality, but levels of disapprobation have declined in all three groups.

Now here are the fractions of people in each group:

The fraction of people with no religious affiliation has increased substantially. The fraction with “not very strong” affiliation has dropped sharply. The fraction with strong affiliation has dropped more modestly. The most recent data points are out of line with the long-term trends in all three groups. Discrepancies like this are common in the 2021 data, due in part to the pandemic and in part to changes in the way the survey was administered. So we should not take them too seriously.

Now, to see how much of the decline in homophobia is due to the decline of religion, we can compute two counterfactual models:

- What if the fraction of people in each group was frozen in 1990 and carried forward to the present?
- What if the fraction of people in each group was frozen in 2021 (using the long-term trend line) and carried back to the past?
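The counterfactuals amount to a weighted average: overall disapproval is the within-group rate weighted by each group's share of the population, with the shares either observed or frozen at a reference year. A minimal sketch, with made-up numbers in place of the GSS estimates:

```python
import pandas as pd

# Made-up within-group disapproval rates and group fractions by year.
years = [1990, 2000, 2010, 2021]
rates = pd.DataFrame({
    "strong":     [0.85, 0.75, 0.60, 0.45],
    "not_strong": [0.75, 0.55, 0.35, 0.20],
    "none":       [0.55, 0.40, 0.25, 0.15],
}, index=years)
fracs = pd.DataFrame({
    "strong":     [0.40, 0.38, 0.36, 0.34],
    "not_strong": [0.45, 0.38, 0.30, 0.22],
    "none":       [0.15, 0.24, 0.34, 0.44],
}, index=years)

actual = (rates * fracs).sum(axis=1)
frozen_1990 = (rates * fracs.loc[1990]).sum(axis=1)  # shares frozen in 1990
frozen_2021 = (rates * fracs.loc[2021]).sum(axis=1)  # shares frozen in 2021
print(pd.DataFrame({"actual": actual,
                    "frozen 1990": frozen_1990,
                    "frozen 2021": frozen_2021}))
```

The gap between the two frozen-share curves isolates the part of the overall trend attributable to shifts between the groups.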

The following figure shows the results:

The orange line shows the long-term trend (smoothed by LOWESS). The green line shows the first counterfactual, with the levels of religious affiliation unchanged since 1990. The purple line shows the second counterfactual, with affiliation from 2021 carried back to the past.

The difference between the counterfactuals indicates the part of the decline of homophobia that is due to the decline of religion, and it turns out to be small. A large majority of the change since 1990 is due to changes within the groups — only a small part is due to shifts between the groups.

This result surprised me. But I have checked it carefully and I think I have an explanation.

- First, notice that the biggest shifts between the groups are (1) the decrease in “not so strong” and (2) the increase in “no religion”. The decrease in strong affiliation is relatively small.
- Second, notice that the decrease in homophobia is steepest among those with “not so strong” affiliation.

Taken together, these results indicate that there was a net shift away from the group with the fastest decline in disapprobation and toward a group with a somewhat slower decline. As a result, the decrease in religious affiliation makes only a modest contribution to the decrease in homophobia. Most of the change, as I argued previously, is due to changed minds and generational replacement.

Published on July 16, 2023 10:34

July 8, 2023

Backlash of Homophobia?

Last week I published an excerpt from Probably Overthinking It that showed a long-term decline in homophobic responses to questions in the General Social Survey, starting around 1990 and continuing in the most recent data.

Then I heard from a friend that Gallup published an article just a few weeks ago, with the title “Fewer in U.S. Say Same-Sex Relations Morally Acceptable”.

It features this graph, which shows that after a consistent increase from 2001 to 2022, the percentage of respondents who said same-sex relations are morally acceptable declined from 71% to 64% in 2023.

Looking at the whole time series, there are several reasons I don’t think this change reflects a long-term reversal in the population:

1) The variation from year to year is substantial. This year’s drop is bigger than most, but not an outlier. I conjecture that some of the variation from year to year is due to short-term period effects — like whatever people were reading about in the news in the interval before they were surveyed.

2) Even with the drop, the most recent point is not far below the long-term trend.

3) Last year was a record high, so a part of the drop is regression to the mean.

4) A large part of the trend is due to generational replacement, so unless young people die and are replaced by old people, that can’t go into reverse.

5) The other part of the trend is due to changed minds. While it’s possible for that to go into reverse, I start with a strong prior that it will not. In general, the moral circle expands.

Taken together, I would make a substantial bet that next year’s data point will be 3 or more percentage points higher, and I would not be surprised by 7-10.

The Data

Gallup makes it easy to download the data from the article, so I’ll use it to make my argument more quantitative. Here’s the time series.

The responses vary from year to year. Here is the distribution of the differences in percentage points.

Changes of 4 percentage points in either direction are not unusual. This year’s decrease of 7 points is bigger than what we’ve seen in the past, but not by much.

This figure shows the time series again, along with a smooth curve fit by local regression (LOWESS).

Since last year’s point was above the long-term trend, we would have expected this year’s point to be lower by about 1 percentage point, just by returning to the trend line.

That leaves 6 points unaccounted for. To get a sense of how unexpected a drop that size is, we can compute the average and standard deviation of the distances from the points to the regression line. The mean is 1.7 points, and the standard deviation is 1.3.

So a two-sigma event is a 4.2 point distance, and a three-sigma event is a 5.4 point distance.
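The LOWESS fit and the residual statistics can be sketched like this, with made-up numbers in place of the Gallup series (the real computation is in the linked notebook):

```python
import numpy as np
from statsmodels.nonparametric.smoothers_lowess import lowess

# Made-up stand-in for the Gallup time series.
rng = np.random.default_rng(4)
years = np.arange(2001, 2024)
pct = 40 + 1.4 * (years - 2001) + rng.normal(0, 1.7, size=len(years))

# Fit a local regression and measure how far the points fall from it.
smooth = lowess(pct, years, frac=0.5, return_sorted=False)
resid = np.abs(pct - smooth)
print(f"mean distance: {resid.mean():.1f}, std: {resid.std():.1f}")
```

The mean and standard deviation of those distances are what set the scale for calling a drop a two-sigma or three-sigma event.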

Of the 7-point drop:

- 1 point is what we’d expect from a return to the long-term trend.
- 4-5 points are within the range of random variation we’ve seen from year to year.

Which leaves 1-2 points that could be a genuine period effect.

But I think it’s likely to be short term. As the Gallup article notes, “From a longer-term perspective, Americans’ opinions of most of these issues have trended in a more liberal direction in the 20-plus years Gallup has asked about them.”

And there are two reasons I think they are likely to continue.

One reason is the expansion of the moral circle, an idea proposed by historian William Lecky in 1867. He wrote:


“At one time the benevolent affections embrace merely the family, soon the circle expanding includes first a class, then a nation, then a coalition of nations, then all humanity, and finally, its influence is felt in the dealings of man with the animal world.”

Lecky, A History of European Morals from Augustus to Charlemagne

Historically, the expansion of the moral circle seldom goes in reverse, and never for long.

The other reason is generational replacement. Older people are substantially more likely to think homosexuality is not moral. As they die, they are replaced by younger people who have no problem with it.

The only way for that trend to go in reverse is if a very large, long-term period effect somehow convinces Gen Z and their successors that they were mistaken and — actually — homosexuality is wrong.

I predict that next year’s data point will be substantially higher than this year’s.

Here’s the notebook where I created these plots.

Published on July 08, 2023 07:17

July 2, 2023

The Remarkable Decline of Homophobia


This article is an excerpt from the draft manuscript of Probably Overthinking It, to be published by the University of Chicago Press in December 2023. It is available for preorder now from Amazon and other booksellers.


If you would like to receive infrequent notifications about the book (and possibly a discount), please subscribe to my free newsletter on Substack.


This book is intended for a general audience, so I explain some things that might be familiar to readers of this blog – and I leave out the Python code. After the book is published, I will post the Jupyter notebooks with all of the details!


[This excerpt is from a chapter on moral progress. Previous examples explored responses to survey questions related to race and gender.]

The General Social Survey includes four questions related to sexual orientation.

1. What about sexual relations between two adults of the same sex – do you think it is always wrong, almost always wrong, wrong only sometimes, or not wrong at all?
2. And what about a man who admits that he is a homosexual? Should such a person be allowed to teach in a college or university, or not?
3. If some people in your community suggested that a book he wrote in favor of homosexuality should be taken out of your public library, would you favor removing this book, or not?
4. Suppose this admitted homosexual wanted to make a speech in your community. Should he be allowed to speak, or not?

If the wording of these questions seems dated, remember that they were written around 1970, when one might “admit” to homosexuality, and a large majority thought it was wrong, wrong, or wrong. In general, the GSS avoids changing the wording of questions, because subtle word choices can influence the results. But the price of this consistency is that a phrasing that might have been neutral in 1970 seems loaded today.

Nevertheless, let’s look at the results. The following figure shows the percentage of people who chose a homophobic response to these questions as a function of age.

It comes as no surprise that older people are more likely to hold homophobic beliefs. But that doesn’t mean people adopt these attitudes as they age. In fact, within every birth cohort, they become less homophobic with age.

The following figure shows the results from the first question, showing the percentage of respondents who said homosexuality was wrong (with or without an adverb).

There is clearly a cohort effect: each generation is substantially less homophobic than the one before. And in almost every cohort, homophobia declines with age. But that doesn’t mean there is an age effect; if there were, we would expect to see a change in all cohorts at about the same age. And there’s no sign of that.

So let’s see if it might be a period effect. The following figure shows the same results plotted over time rather than age.

If there is a period effect, we expect to see an inflection point in all cohorts at the same point in time. And there is some evidence of that. Reading from top to bottom:

- More than 90% of people born in the nineteen-oughts and the teens thought homosexuality was wrong, and they went to their graves without changing their minds.
- People born in the 1920s and 1930s might have softened their views, slightly, starting around 1990.
- Among people born in the 1940s and 1950s, there is a notable inflection point: before 1990, they were almost unchanged; after 1990, they became more tolerant over time.
- In the last four cohorts, there is a clear trend over time, but we did not observe these groups sufficiently before 1990 to identify an inflection point.

On the whole, this looks like a period effect. Also, looking at the overall trend, it declined slowly before 1990 and much more quickly thereafter. So we might wonder what happened in 1990.

What happened in 1990?

In general, questions like this are hard to answer. Societal changes are the result of interactions between many causes and effects. But in this case, I think there is an explanation that is at least plausible: advocacy for acceptance of homosexuality has been successful at changing people’s minds.

In 1989, Marshall Kirk and Hunter Madsen published a book called After the Ball with the prophetic subtitle How America Will Conquer Its Fear and Hatred of Gays in the ’90s. The authors, with backgrounds in psychology and advertising, outlined a strategy for changing beliefs about homosexuality, which I will paraphrase in two parts: make homosexuality visible, and make it boring. Toward the first goal, they encouraged people to come out and acknowledge their sexual orientation publicly. Toward the second, they proposed a media campaign to depict homosexuality as ordinary.

Some conservative opponents of gay rights latched onto this book as a textbook of propaganda and the written form of the “gay agenda”. Of course reality was more complicated than that: social change is the result of many people in many places, not a centrally-organized conspiracy.

It’s not clear whether Kirk and Madsen’s book caused America to conquer its fear in the 1990s, but what they proposed turned out to be a remarkable prediction of what happened. Among many milestones, the first National Coming Out Day was celebrated in 1988; the first Gay Pride Day Parade was in 1994 (although previous similar events had used different names); and in 1999, President Bill Clinton proclaimed June as Gay and Lesbian Pride month.

During this time, the number of people who came out to their friends and family grew exponentially, along with the number of openly gay public figures and the representation of gay characters on television and in movies.

And as surveys by the Pew Research Center have shown repeatedly, “familiarity is closely linked to tolerance”. People who have a gay friend or family member – and know it – are substantially more likely to hold positive attitudes about homosexuality and to support gay rights.

All of this adds up to a large period effect that has changed hearts and minds, especially among the most recent birth cohorts.

Cohort or period effect?

Since 1990, attitudes about homosexuality have changed due to

- A cohort effect: As old homophobes die, they are replaced by a more tolerant generation.
- A period effect: Within most cohorts, people became more tolerant over time.

These effects are additive, so the overall trend is steeper than the trend within the cohorts – like Simpson’s paradox in reverse. But that raises a question: how much of the overall trend is due to the cohort effect, and how much to the period effect?

To answer that, I used a model that estimates the contributions of the two effects separately (a logistic regression model, if you want the details). Then I used the model to generate predictions for two counterfactual scenarios: what if there had been no cohort effect, and what if there had been no period effect? The following figure shows the results.
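A minimal sketch of such a model, on synthetic data (the book's actual model and data are in the forthcoming notebooks): regress the response on year (period) and birth year (cohort), then generate a counterfactual by holding one predictor fixed.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic respondents: the probability of a homophobic response
# declines with both survey year (period) and birth year (cohort).
rng = np.random.default_rng(5)
n = 5000
year = rng.integers(1987, 2019, size=n)
cohort = year - rng.integers(18, 80, size=n)  # birth year
logit = 0.07 * (2000 - year) + 0.04 * (1950 - cohort)
p = 1 / (1 + np.exp(-logit))
df = pd.DataFrame({"year": year, "cohort": cohort,
                   "wrong": (rng.random(n) < p).astype(int)})

model = smf.logit("wrong ~ year + cohort", data=df).fit(disp=False)

# Counterfactual "no period effect": freeze year at its 1987 value.
pred = model.predict(df.assign(year=1987))
print(model.params)
```

Averaging the counterfactual predictions by year traces out the dotted and dashed lines in the figure.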

The circles show the actual data. The solid line shows the results from the model from 1987 to 2018, including both effects. The model plots a smooth course through the data, which confirms that it captures the overall trend during this interval. The total change is about 46 percentage points.

The dotted line shows what would have happened, according to the model, if there had been no period effect; the total change due to the cohort effect alone would have been about 12 percentage points.

The dashed line shows what would have happened if there had been no cohort effect; the total change due to the period effect alone would have been about 29 percentage points.

You might notice that the sum of 12 and 29 is only 41, not 46. That’s not an error; in a model like this, we don’t expect percentage points to add up (because it’s linear on a logistic scale, not a percentage scale).
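A toy example shows why the percentage points don't add up. Suppose (these numbers are invented for illustration) the baseline log-odds is 1.0, the cohort effect shifts it by -1.0, and the period effect by -2.0:

```python
import numpy as np

def expit(x):
    # Inverse of the logit function: maps log-odds to probability.
    return 1 / (1 + np.exp(-x))

base        = expit(1.0)              # baseline probability
cohort_only = expit(1.0 - 1.0)        # cohort shift alone
period_only = expit(1.0 - 2.0)        # period shift alone
both        = expit(1.0 - 1.0 - 2.0)  # both shifts together

# The drops in percentage points do not sum to the combined drop,
# because the shifts are additive on the log-odds scale, not the
# probability scale.
print(base - cohort_only, base - period_only, base - both)
```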

Nevertheless, we can conclude that the magnitude of the period effect is about twice the magnitude of the cohort effect. In other words, most of the change we’ve seen since 1987 has been due to changed minds, with the smaller part due to generational replacement.

No one knows that better than the San Francisco Gay Men’s Chorus. In July 2021, they performed a song by Tim Rosser and Charlie Sohne with the title, “A Message From the Gay Community”. It begins:

To those of you out there who are still working against equal rights, we have a message for you […]
You think that we’ll corrupt your kids, if our agenda goes unchecked.
Funny, just this once, you’re correct.
We’ll convert your children, happens bit by bit;
Quietly and subtly, and you will barely notice it.

Of course, the reference to the “gay agenda” is tongue-in-cheek, and the threat to “convert your children” is only scary to someone who thinks (wrongly) that gay people can convert straight people to homosexuality, and believes (wrongly) that having a gay child is bad. For everyone else, it is clearly a joke.

Then the refrain delivers the punchline:

We’ll convert your children; we’ll make them tolerant and fair.

For anyone who still doesn’t get it, later verses explain:

Turning your children into accepting, caring people;
We’ll convert your children; someone’s gotta teach them not to hate.
Your children will care about fairness and justice for others.

And finally,

Your kids will start converting you; the gay agenda is coming home.
We’ll convert your children; and make an ally of you yet.

The thesis of the song is that advocacy can change minds, especially among young people. Those changed minds create an environment where the next generation is more likely to be “tolerant and fair”, and where some older people change their minds, too.

The data show that this thesis is, “just this once, correct”.

Sources

- The General Social Survey (GSS) is a project of the independent research organization NORC at the University of Chicago, with principal funding from the National Science Foundation. The data is available from the GSS website.
- The Pew Research study showing that familiarity breeds acceptance is “Four-in-Ten Americans Have Close Friends or Relatives Who are Gay”.
- You can see a performance of “A Message From the Gay Community” on YouTube.
Published on July 02, 2023 12:25

June 25, 2023

Go Get the Data

My mantra when I was working on Probably Overthinking It was “Go Get the Data.” If I wanted to use a result from prior work, I would get the data whenever possible and make my own visualization. Of course, that’s more work than copying and pasting a figure, but there are a lot of benefits. One is that I can often get newer data. Another is that I can check the results.

I was reminded of these benefits a few months ago when I was reading a very good book that I won’t name because I’m about to point out an error, and I don’t want to be a jerk about it. The book includes the following figure:

Source: A book I’m not identifying because the point of this example is not to name and shame.

The top line got my attention because I have worked on several projects with data from the General Social Survey (GSS), and this particular question is one of the examples in my Political Alignment Case Study. Plotting the fraction of people who think homosexuality is wrong, I generated this figure:

Source: Downey, Political Alignment Case Study.

These results are not the same (even accounting for the inverted y-axis).

So I checked the source, which is Steven Pinker’s The Better Angels of Our Nature, specifically this figure:

Source: Page 452 of Pinker, The Better Angels of Our Nature

It looks like Pinker’s figure is consistent with mine. So what went wrong with the first figure?

Here’s what I think happened: they wanted to invert the y axis to show the percentage who think homosexuality is not wrong. But instead of flipping the y axis, they seem to have flipped the x axis, reversing the arrow of time! If you compare the other lines, it looks like they have all been reversed left-for-right.

Now, this is just one error in an otherwise excellent book. But it’s a reminder that this kind of thing can be avoided, if you Go Get the Data.

Published on June 25, 2023 14:23

June 10, 2023

Abstracts and keywords

As Probably Overthinking It approaches the finish line, there are just a few more tasks: I am working on the index and — as I have recently learned — I also have to write a 200-word abstract, a list of keywords for each chapter, and a 250-word abstract for the whole book. I have come to two conclusions:

There is no good software for making an index from a PDF document.

ChatGPT is pretty good at writing chapter abstracts.

By converting each chapter to a simple text file (no figures and tables) I was able to get most of them under the ~4000 token limit. Then I asked, “write a one-paragraph abstract for this chapter with no more than 180 words. write about the chapter in the third person”.
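A crude pre-flight check makes that workflow concrete. The following is a sketch, not the tool I used: the words-per-token ratio is a rough rule of thumb, and `split_for_summarizing` is a hypothetical helper that splits an oversized chapter at a paragraph boundary near the midpoint so each half can be summarized separately.

```python
def rough_token_count(text):
    # Rule of thumb: one token is roughly 0.75 English words,
    # so tokens are approximately words / 0.75.
    return int(len(text.split()) / 0.75)

def split_for_summarizing(text, limit=4000):
    """Return [text] if it fits under the token limit; otherwise
    split at a paragraph boundary near the midpoint, so the two
    halves can be summarized separately and the summaries combined."""
    if rough_token_count(text) <= limit:
        return [text]
    paragraphs = text.split("\n\n")
    mid = len(paragraphs) // 2
    return ["\n\n".join(paragraphs[:mid]),
            "\n\n".join(paragraphs[mid:])]
```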

Here’s the unedited abstract for the last chapter:


The chapter titled “Chasing the Overton Window” explores the phenomenon of people considering themselves more conservative as they age, despite their views actually becoming more liberal over time. The chapter begins by examining the trends of self-identification and political views across different age groups. It then delves into the concept of the Overton window, which represents the range of acceptable public opinion on a given issue. The chapter explains how the center of public opinion has shifted toward liberalism over the past 50 years and how the relative terms “liberal” and “conservative” depend on the perceived center. Furthermore, it demonstrates that both liberals and conservatives have become more liberal over time. The chapter concludes by proposing an explanation for why people think they are becoming more conservative, even though their views are becoming more liberal, which involves the interplay of generational effects, the shifting center of public opinion, and the connotations associated with political labels.

ChatGPT June 10, 2023

It’s not great prose, but I think I can revise it into something acceptable without much effort.

Three of the chapters exceeded the token limit, so I asked for a summary of the first half, then a summary of the second half, then I asked, “Combine the following two abstracts into a single paragraph with no more than 180 words”. Here’s the combined abstract of Chapter 8:


This chapter delves into the distribution of natural and human-caused disasters, investigating their sizes, costs, prevalence, and characteristics within long-tailed distributions. Understanding the probabilities of major disasters is crucial for effective preparedness and response, despite the challenge of comprehending rare and large-scale events. By analyzing a dataset of 125 disasters, including hurricanes, earthquakes, floods, nuclear disasters, and terror attacks, the author demonstrates a pattern where doubling the rank of a disaster corresponds to halving its costs when plotted on a logarithmic scale. While exploring the limitations of the lognormal distribution in predicting the probabilities of large disasters, the author introduces Student’s t-distribution as a more suitable model for estimating the probabilities of extreme events. The chapter also examines lunar craters and their abundance and sizes, revealing the prevalence of long-tailed distributions and their connection to asteroid sizes. Additionally, it explores the occurrence of long-tailed distributions in stock market crashes and introduces the concept of black swans to emphasize their relevance in understanding rare and impactful events. Concluding the chapter, it discusses the challenges associated with predicting and comprehending rare, large events in a long-tailed world, with a specific focus on earthquake magnitudes and a comparison of prediction models.

ChatGPT June 10, 2023

Again, I think that’s editing distance away from acceptable — and a near-perfect 198 words.

It does pretty well with keywords, too:

Disasters, Long-tailed distributions, Probabilities, Preparedness, Response, Natural events, Human-made incidents, Lognormal distribution, Student’s t-distribution, Black swans

To generate the abstract for the whole book, I pasted the chapter abstracts and keywords into a single document and asked, “the following are chapter abstracts for a book called ‘Probably Overthinking It’. write a one-paragraph abstract for the whole book, more than 200 words, but not more than 250 words. write about the book in the third person. not too much like an ad”

Here’s the unedited result:


“Probably Overthinking It” is a thought-provoking book that delves into various statistical and cognitive paradoxes, challenging common intuitions and shedding light on the complexities of human behavior and measurement. Through a series of captivating chapters, the author explores topics such as normalcy and weirdness, inspection paradoxes, family size and population growth, lognormal distributions, expected remaining time, Berkson’s paradox, low birthweight paradox, long-tailed distributions, fairness and fallacy in algorithmic decision-making, Simpson’s paradox, the moral circle, the evolution of societal values, and the shifting perceptions of political ideologies. By analyzing real-world data and employing statistical models, the book uncovers the nuances and contradictions that emerge when multiple factors interact, leading to counterintuitive outcomes. Readers will gain a deeper understanding of statistical biases, the limitations of measurement, and the challenges inherent in making accurate predictions and drawing meaningful conclusions. “Probably Overthinking It” encourages readers to question their assumptions, approach complex problems with skepticism, and develop a more nuanced and informed perspective on the intricacies of human behavior and the world around us.

ChatGPT June 10, 2023

Surprisingly, the book abstract is only 169 words, despite my prompt. In general, the word counts are not precise, and more often over than under (which is why I asked for 180, not 200).

So, I have some editing to do, but I’d say ChatGPT saved me at least a few hours — and spared me from exactly the kind of writing I dislike the most.

Published on June 10, 2023 09:23

May 15, 2023

Absence of Evidence

If anyone tells you that absence of evidence is not evidence of absence, you have my permission to slap them. Of course, my permission will not prevent you from getting slapped back or charged with assault. Regardless, absence of evidence is very often evidence of absence, and sometimes strong evidence.

To make this claim precise, I propose we use the Bayesian definition of evidence:

If an observation, D, would be more likely under a hypothesis, H, than under the alternative hypothesis, then D is evidence in favor of H. Conversely, if D is less likely under H than under the alternative, D is evidence against H.

As an example, suppose H is the hypothesis that unicorns exist. Since people have explored most of the world’s land mass, I’d say there’s a 99% chance we would have found unicorns if they existed.

So if D is the fact that we have not found unicorns, the probability of D is only 1% if unicorns exist, and 100% if they don’t. Therefore, D is evidence that unicorns don’t exist, with a likelihood ratio of 100:1.
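The update can be sketched in a few lines, using the numbers from the unicorn example (the 99% figure is a rough guess, and the 50% prior is arbitrary):

```python
# H = "unicorns exist"; D = "we have not found unicorns"
p_d_given_h = 0.01       # P(D | H): 1% chance we'd have missed them
p_d_given_not_h = 1.0    # P(D | not H): certain to find none if none exist

# D is 100 times more likely if unicorns don't exist
likelihood_ratio = p_d_given_not_h / p_d_given_h

def posterior(prior, p_d_given_h, p_d_given_not_h):
    """Bayes's rule: P(H | D) from the prior and the two likelihoods."""
    numer = prior * p_d_given_h
    denom = numer + (1 - prior) * p_d_given_not_h
    return numer / denom

# Even starting from a generous 50% prior, the posterior
# probability that unicorns exist falls below 1%.
p = posterior(0.5, p_d_given_h, p_d_given_not_h)
```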

Let’s consider a more realistic example. In a recent article, The Economist discusses the hypothesis that social media use is a major cause of recent increases in rates of self-harm and suicide among teenage girls. To test this hypothesis, they propose an experiment:

Because smartphones were adopted at different rates in different countries, the timing of any increases they caused in suicides or self-harm should vary on this basis.

But their experiment came up empty:

[W]e could not find any statistical link between changes over time in the prevalence of either mobile-internet subscriptions or self-reported social-media use in a country, and changes over time in that country’s suicide or self-harm hospitalisation rates, for either boys or girls.

They conclude:

But if social media were the sole or main cause of rising levels of suicide or self-harm—rather than just one part of a complex problem—country-level data would probably show signs of their effect.

Since it did not, this negative result is evidence against the hypothesis. It may not be strong evidence; there are other reasons the experiment might have failed. And in light of other evidence, it is still plausible that social media is harmful to mental health.

Nevertheless, in this example, as in any reasonable experiment, absence of evidence is evidence of absence.

[In this 2015 article, I made a similar claim that we should stop saying correlation does not imply causation.]

Published on May 15, 2023 06:04

Probably Overthinking It

Allen B. Downey
Probably Overthinking It is a blog about data science, Bayesian Statistics, and occasional other topics.