Kindle Notes & Highlights by Jodie Archer
Read between November 7 - November 14, 2018
But working on finding viable new manuscripts in a threatened industry is also, if we may, about keeping that industry not just running but diverse. Our work is, of course, about an interest in identifying and explaining latent patterns in our culture.
we are interested in the potential to launch new authors, about encouraging publishers to use more of their Patterson/King/Steel budget on the young writers
the emancipatory and educational power of reading and writing fiction.
2008, Matt had just completed his part in a controversial computational study of authorial style in the scriptural text of the Book of Mormon.
theories of multiple authorship were probably true,
That’s because computers are experts in pattern recognition, and computers can study patterns at a scale and level of granularity that no human could ever manage.
we wrote for this study were designed to process books and extract detailed information about each book’s unique style, as well as its themes, its emotional highs and lows, its characters, and its settings, along with all sorts of seemingly mundane linguistic data that does not easily translate into concepts such as style and plot.
In the end, we winnowed 20,000 features down to about 2,800 that were useful in differentiating between stories that everyone seems to want to read and those novels that were more likely to remain, well, niche.
Naturally, there were also non-bestselling books that our machine begged us to read, but that’s another story.
the relationship between writer and reader in terms of an unwritten contract to fulfill, a contract whose details are hazy but that nevertheless point to the aesthetic, emotional, intellectual, and even ethical reasons behind the choice to read. We thought
nouns in different proportions
It’s hard to believe. Who would have thought that sex does not sell? We tell people and still they do not believe us. But the truth is this: sex, or perhaps more precisely erotica, sells, and it sells in notable quantities, but only within a niche market.
What the machine does is a kind of reverse engineering. Take the metaphor of a bowl of soup. The machine first separates out all the ingredients—the meat, stock, peppers, onions, and spices—and then it carefully measures how much of each ingredient was used in that favorite recipe.
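The soup metaphor above can be illustrated with a minimal sketch that counts "ingredients" and reports their proportions. The topic lexicon `TOPIC_WORDS` and the function `topic_proportions` are hypothetical stand-ins for illustration only; the highlight does not say how the authors' machine assigns words to topics, and a real system would likely use a trained topic model rather than a hand-made word list.

```python
import re
from collections import Counter

# Hypothetical topic lexicon: illustrative only, not the authors' topic model.
TOPIC_WORDS = {
    "law": {"court", "judge", "lawyer", "trial"},
    "family": {"mother", "father", "home", "kids"},
    "crime": {"murder", "police", "gun", "body"},
}

def topic_proportions(text):
    """Return each topic's share of all topic-bearing words in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for word in words:
        for topic, vocab in TOPIC_WORDS.items():
            if word in vocab:
                counts[topic] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    # Normalize raw counts so the shares sum to 1, like a recipe's ratios.
    return {topic: n / total for topic, n in counts.items()}

sample = "The lawyer told the judge about the murder. Her mother waited at home."
proportions = topic_proportions(sample)
print(proportions)
```

The point of the sketch is only the two-part move the quote describes: first separate the ingredients, then measure how much of each was used.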
This formula can go on forever, and we will see how cleverly this is done: for now the point is in the proportion—one-third the same, two-thirds different.
there must be a dominant topic to give the glue to a novel,
that topics in the next highest proportions should suggest a direct conflict that might be quite threatening.
It is more specifically about human closeness and human connection. Scenes that display this most important indicator of bestselling are all about people communicating in moments of shared intimacy, shared chemistry, and shared bonds.
Characters must have these moments of casual intimacy and closeness, if not explicitly romantic. Be it a shopping date with Mom, a fishing date with Dad, or a cooking date with a new lover, there must be time to date.
technologies—preferably modern and vaguely threatening technologies—
Finally, no unicorns.
The readers kept repeating that the novel emotionally triggered them, viscerally triggered them, physically triggered them.
the reviewing community is full of Hawthornes and Joyces. Like these writers, they favor novels as a space for sociopolitical knowledge or self-aware language rather than embodied pleasure: in
the book, the text, and even my reading self dissolve in a peculiar act of transubstantiation whereby “I” become something other than what I have been and inhabit thoughts other than those I have been able to conceive before. This tactile, sensuous, profoundly emotional experience of being captured by a book is what those reading memories summoned for me—
The final plotline, Plot 7 (Fig. 13), has no inverse. Kurt Vonnegut has called this the “man in hole story”; Booker calls it Overcoming the Monster. It is often about a hero and a bad guy where there is some threat to a person or a culture that must be eliminated. The threat might be a dragon or a disease, a situation or a system, but the main character is forced to take it on and then change his or her fortunes back to good.
Fifty Shades, The Da Vinci Code
It seems there could be some sort of patterning to counter the popular claim that their stratospheric success was totally random.
Displaying the ups and downs of emotion as curves is facilitated by what researchers in natural language processing call sentiment analysis.
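As a rough illustration of how sentiment analysis can yield an emotional curve, the sketch below scores consecutive windows of words against two tiny word lists. The lexicons `POSITIVE` and `NEGATIVE`, the function `sentiment_curve`, and the window size are all assumptions made for this example; the highlight does not describe the authors' actual sentiment method.

```python
import re

# Hypothetical mini-lexicons: real sentiment tools use far larger resources.
POSITIVE = {"love", "happy", "joy", "hope", "safe"}
NEGATIVE = {"fear", "dead", "pain", "lost", "hate"}

def sentiment_curve(text, window):
    """Score each consecutive window of words as (#positive - #negative)."""
    words = re.findall(r"[a-z']+", text.lower())
    curve = []
    for start in range(0, len(words), window):
        chunk = words[start:start + window]
        score = sum(w in POSITIVE for w in chunk) - sum(w in NEGATIVE for w in chunk)
        curve.append(score)
    return curve

story = "She felt happy and safe. Then fear and pain arrived. At last hope and love returned."
curve = sentiment_curve(story, window=5)
print(curve)  # [2, -2, 2, 0]
```

Plotting the returned list against window position would give a simple up-and-down line of the kind the quote refers to.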
The million-dollar move is in a good, strong, regular beat.
While it is true the unique way that an author uses words is not quite the same as that author’s individual and biological gene expressions, years of research in authorship attribution and stylometrics have suggested that each of us has a fairly unique and individual linguistic fingerprint or style. Even when Rowling tried, very consciously, to write like “Robert Galbraith” and not like J. K. Rowling, there were habits and patterns to her prose that she could not successfully suppress.
These might seem like tiny details, but the details of fingerprints are as tiny as they are significant.
491 most frequently occurring words and marks of punctuation, the machine was able to differentiate between bestselling books and non-bestselling books 70 percent of the time. Using only 148 features, the machine guessed correctly 68 percent of the time, and this was just using the most common filler words and punctuation: no nouns, no adjectives, no verbs, no syntax, no sentence data.
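A minimal sketch of what "most frequently occurring words and marks of punctuation" could look like as features: each text becomes a vector of relative frequencies. The short `FEATURES` list and the function `style_features` are hypothetical; the book's 491 and 148 features are not listed in this highlight, and no classifier is included here.

```python
import re
from collections import Counter

# Hypothetical, tiny stand-in for a list of common words and punctuation marks.
FEATURES = ["the", "and", "of", "i", "do", "not", ",", ".", "?"]

def style_features(text):
    """Return the relative frequency of each feature token in the text."""
    # Words and single punctuation marks are both treated as tokens.
    tokens = re.findall(r"[a-z']+|[.,?!;:]", text.lower())
    counts = Counter(tokens)
    total = len(tokens)
    return [counts[f] / total if total else 0.0 for f in FEATURES]

vec = style_features("I do not know, and I do not care.")
print(dict(zip(FEATURES, vec)))
```

Vectors like this, one per book, are the kind of input a classifier could use to separate two groups of texts.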
common because it is one way of creating an unspoken understanding between character and reader. Readers like it.
In bestsellers, adjectives and adverbs are less common, particularly adjectives. What this means is that bestsellers are about shorter, cleaner sentences, without unneeded words.
we were able to predict whether a novel in our corpus was a bestseller or not with 72 percent accuracy. That was just from the basic verbs, one data point of the bestseller-ometer taken in isolation.
bestselling protagonists have and express their needs.
Why is need the top verb to differentiate bestsellers from non-bestsellers? How come wish is need’s equivalent in books that don’t sell?
Many heroes and heroines are those agents who are able to bring, usually with some struggle, purification to their fictional world.
she is both the perceived “problem” and the solution. The same is true of Rachel in The Girl on the Train and Amy in Gone Girl.
The final epilogue brings marriage to all good wizards involved in the series, and thus the history-making, record-breaking literary franchise ends with the phrase “All was well.”
Let’s take Gone Girl as the example. When we pull out all the sentences of the novel in which the characters express need, we get a snapshot of the plot and tone of the book in 163 sentences.
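Pulling out every sentence in which a character expresses need can be approximated with a simple pattern match. This is a sketch under stated assumptions: the function `sentences_with_verb` is hypothetical, the sentence splitter is naive, and the pattern also matches "need" used as a noun, which a part-of-speech tagger would be required to exclude.

```python
import re

def sentences_with_verb(text, stem="need"):
    """Return sentences containing the stem or its -s, -ed, -ing forms."""
    # Naive split: break after ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    pattern = re.compile(r"\b" + re.escape(stem) + r"(s|ed|ing)?\b", re.IGNORECASE)
    return [s for s in sentences if pattern.search(s)]

sample = "I need a plan. He wished for rain. She needed him to stay. Nobody spoke."
hits = sentences_with_verb(sample)
print(hits)  # ['I need a plan.', 'She needed him to stay.']
```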
These girl books are, ultimately, character novels. They give us characters who do not fit or stay in the “all-American girl” role—a role that all three are frustrated with, and that perhaps many readers are frustrated with.
That all three struggle with this in the hands of contemporary writers suggests to us that these fictional girls are doing cultural work that is not yet complete.
He had been working on a tool that would detect the signal of great works of literature—“the canonizer” as he had dubbed it—
buy at the airport during a layover. He was, after all, trained as a Joycean. He taught Ulysses. He had read every page of Finnegans Wake
We think the bestseller-ometer has the potential to change how we write, publish, and read new fiction. We hope it has brought some respect to those mainstream novelists who are often dismissed.
Text mining can be defined rather narrowly as the process by which we discover and extract textual features from a book. It is, therefore, “step one.” Machine learning can be defined, also narrowly but sufficiently, as the way in which we process those features in order to make predictions about whether a book belongs in the bestselling group or not. That’s “step two.”
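The two steps can be sketched end to end as follows. Everything here is an assumption made for illustration: the `VOCAB` feature words, the toy labeled sentences (invented for this example, not data from the book), and the nearest-centroid classifier, which is a simple technique swapped in because the highlight does not name the authors' learning algorithm.

```python
import re
from collections import Counter

VOCAB = ["need", "want", "wish", "seem"]  # hypothetical feature words

def extract_features(text):
    """Step one (text mining): relative frequency of each VOCAB word."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    total = len(tokens) or 1
    return [counts[w] / total for w in VOCAB]

def centroid(vectors):
    """Average a list of equal-length feature vectors column by column."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def train(labeled_texts):
    """Step two (machine learning): store one centroid per class label."""
    by_label = {}
    for text, label in labeled_texts:
        by_label.setdefault(label, []).append(extract_features(text))
    return {label: centroid(vecs) for label, vecs in by_label.items()}

def predict(model, text):
    """Assign the label whose centroid is closest in squared distance."""
    vec = extract_features(text)
    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(vec, c))
    return min(model, key=lambda label: dist(model[label]))

# Invented toy examples, for demonstration only.
toy = [
    ("I need this. I want it now.", "bestseller"),
    ("We need to go. They want answers.", "bestseller"),
    ("I wish it were so. It would seem fine.", "other"),
    ("They wish and seem to wait.", "other"),
]
model = train(toy)
label = predict(model, "She said I need you and I want you.")
print(label)  # bestseller
```

Keeping extraction and prediction in separate functions mirrors the quote's split between "step one" and "step two": the features can be changed without touching the classifier, and vice versa.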
contractions and compounds are a bit like Schrödinger’s cat: existing in two different states at the same time.