22 pages, ebook
Published November 5, 2025
I have the same fundamental complaint about this paper that I had about Manny's previous paper: It uses a bad operationalization of the term at issue (previously, 'understanding', and now, 'wokeness') that was developed without any apparent effort to check how the term is commonly used in relevant contexts. This time, for the purpose of criticizing the operationalization, I have enlisted the aid of ChatGPT, prompting it to explicate what Americans are complaining about when they complain about wokeness. Quoting the first paragraph of its response (which I consider to be excellent):
When Americans complain about “wokeness,” they’re usually expressing frustration with a set of cultural, social, or political attitudes they see as excessive, intrusive, or self-righteous—especially around issues of race, gender, identity, and social justice.
Manny used ten claims to test for wokeness (technically, five sets of two opposing claims, so there's not even as much variety as "ten claims" would suggest), and exactly zero of them had anything to do with race, gender, identity, or social justice. Instead, they're claims that relate to the red vs. blue divide much more broadly. So, what Manny has done here is test for disagreement on some correlates of wokeness, not on wokeness itself. As such, the results are Bayesian evidence against the claim that Grok is less woke than other models, but they're weak evidence.
So, I've conducted a small experiment to extend Manny's. In modern America's political climate, one of the most divisive and centrally woke claims is "Trans women are women." And it's a safe bet that Elon Musk, who is estranged from his trans daughter, had that particular issue on his mind when he announced that he was making Grok less woke than other models. Using the prompt template that Manny provided in the methodology section, I tested ChatGPT and Grok on that claim, expecting to find divergence. Results: The prophecy has been fulfilled!
ChatGPT agreed, with a confidence level of 0.88. Grok, on the other hand, disagreed, with a confidence level of 0.85. It's also notable that Grok, unlike ChatGPT, cited a J. K. Rowling essay as key evidence. Unfortunately, I can't link to Grok's response, and am disinclined to dump the whole thing in this review, so you'll have to either take my word for it or reproduce the results to verify.
I conclude that Grok is indeed less woke, but does not significantly diverge from other models on issues that correlate with wokeness in humans. Which is mildly interesting, but it's not great that Manny incorrectly claims to have shown something much stronger and more surprising.