

Mark Larson

Steven, this is awesome! I hadn't noticed this on Amazon until you pointed it out.

I'm sure there's got to be some correlation with sales here. I wonder what the median and range are for something like the NYT bestseller lists. Just as each author has a personal sweet spot, surely book buyers as a whole, and within their niches, have their own.

Beyond that, I'm curious about re-readability, which of course is tougher to measure. Though I got a few hours' enjoyment the first time around, I don't think I'll ever go back to read a Godin or Gladwell book. Ever. But for books in the Johnson/Pinker/Hitchens/et al range and beyond, I'm nearly certain I will.

Jason Mittell

Another cool toy to play with on Amazon pages is the Concordance feature - it lists the most common words (aside from "the" "and" etc.) in tag cloud format. There's nothing quite like seeing a book you spent years writing boiled down to 100 key words, but it's an interesting interface into the text.
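For anyone who wants to try the same idea on their own text, here is a minimal sketch of a concordance: count every word, skip the very common ones, and keep the top entries. The stop-word list and the function name `concordance` are assumptions for illustration; Amazon's actual word list and method are not described here.

```python
import re
from collections import Counter

# Assumed stop-word list for illustration only; the list Amazon uses is not given.
STOP_WORDS = {
    "the", "and", "a", "an", "of", "to", "in", "is", "it",
    "that", "was", "for", "on", "with", "as",
}

def concordance(text, top_n=100):
    """Return the top_n most frequent non-stop words with their counts."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return counts.most_common(top_n)

sample = "The old man and the sea. The old man was thin, and the sea was calm."
print(concordance(sample, top_n=3))
```

A tag cloud is then just these words drawn at a size proportional to their counts.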

Michael Patrick Gibson

Interesting tool! Tolstoy, Dostoyevsky, Austen all write at about 18-20 words per sentence, with words usually around 1.5 syllables. But then I thought, what about Hemingway? Surely his terse style would yield a different outcome. And yet it didn't. After a cursory search, I find Hemingway's numbers are the same as these other writers.

So what's being lost in the aggregation of this data?

Tim Walker

Interesting stuff, Steven. To my baseball-addled mind, the short-sentence trick works just like a change-up for a pitcher: just when you think you've got him all figured out and can take him for granted, pow!, he changes the rhythm.

Re Michael's comment, "Surely his terse style would yield a different outcome." I once had a professor who walked our writing class through an analysis of Hemingway's sentence length. The surprising fact was that Hemingway *often* used sentences that were quite long -- above 50 words, even above 80 words. But they would still be in his terse style, as they would often consist of several short independent clauses joined by "and." It makes some intuitive sense when you remember the opening sentence of ~The Old Man and the Sea~, which includes three independent clauses in fewer than 30 words, but Hemingway used even bigger run-ons of sentencelets (?) in some of his earlier work.

Michael Patrick Gibson

I emend my Hemingway stat. The books I looked at were A Farewell to Arms and For Whom the Bell Tolls. So to broaden the search, I went to the complete edition of his short stories. (I figured since it collects from across his career, it'd be fairly representative.) On this, he's even terser than I thought:

Syllables per Word: 1.3
Words per Sentence: 10.3

Isabel Lugo

"you often hear people complain about the impenetrable jargon of critical theory, but it looks here like the sentence length is at least as much of a culprit."

This may be because people tend to read one sentence at a time. It seems natural to pause at the end of a sentence and think "what did that just say?", but not so natural to pause mid-sentence. So in text composed of long sentences, one is more likely to get lost -- a 50-word sentence is more likely to throw someone off the track than two 25-word sentences.

Rikard Linde

Great stuff, Steven. Isn't this related to what you wrote in Interface Culture on Apple's V-Twin search tool? If I remember correctly, V-Twin picked out words with more than six letters, or maybe seven, in all the documents on a computer. This enabled comparisons between documents, and V-Twin could tell which ones were "related": the ones containing the same long words.
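The long-word idea as recalled above can be sketched in a few lines. This is not V-Twin's actual algorithm, which isn't described here; the function names and the choice of Jaccard overlap (shared long words divided by all long words in either document) are assumptions for illustration.

```python
import re

def long_words(text, min_letters=7):
    """Set of distinct words with at least min_letters letters."""
    words = re.findall(r"[a-z]+", text.lower())
    return {w for w in words if len(w) >= min_letters}

def relatedness(doc_a, doc_b, min_letters=7):
    """Jaccard overlap of the two documents' long-word sets (0.0 to 1.0)."""
    a = long_words(doc_a, min_letters)
    b = long_words(doc_b, min_letters)
    if not a and not b:
        return 0.0  # no long words at all: treat as unrelated
    return len(a & b) / len(a | b)

doc1 = "Emergence describes complexity in cities and colonies."
doc2 = "Complexity and emergence appear in software too."
doc3 = "The cat sat on the mat."

print(relatedness(doc1, doc2))  # shares "emergence" and "complexity"
print(relatedness(doc1, doc3))  # no long words in common
```

Short words are mostly grammatical glue shared by every document, so restricting the comparison to long words is a cheap way to focus on topic vocabulary.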

Eric H

I'm surprised that Hannah Arendt doesn't score higher. The Portable HA only hits 17% and 28.6, while her personal correspondence scored lower (12%, 16.8, though possibly mixed with others' writing). One should always keep Mark Twain in mind when reading anything written by someone with a Germanic background.




One thing that is probably being lost in Hemingway's case is that he practiced two rather different styles. He was terse at times, but as Tim notes, he could also tease a sentence along when he wanted to. Hemingway was a more bi-modal writer than most. Another thing lost, not in aggregation but rather in translation, is Tolstoy's and Dostoyevsky's syllable count. The originals may have very different syllable counts from the translations, whether because of the nature of the language or because of the style of the translator. The same may be true of sentence length.


If you want to test your own writing (without getting a book published and sold on Amazon), you can do it within Microsoft Word (2000 or later, I think).

1. Under the Tools menu, choose Options.
2. On the Spelling and Grammar tab, check the boxes toward the bottom for "Check Grammar" and "Show readability statistics".
3. Click OK.
4. Under Tools, choose "Check Spelling and Grammar".
5. Click through all of the grammar mistakes that Word found in your document.
6. When it is done checking your grammar, Word will display readability statistics. It includes Words Per Sentence, but not syllables per word (although it does contain the Flesch Reading Ease score and Flesch-Kincaid score, which are partially based on both of these metrics).
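If you already know the two averages, the two Flesch scores are simple arithmetic. The sketch below uses the standard published formulas; the function names are mine, and it takes the averages as inputs rather than counting syllables itself, since syllable counting is the hard part.

```python
def flesch_reading_ease(words_per_sentence, syllables_per_word):
    """Flesch Reading Ease: higher scores mean easier text."""
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word

def flesch_kincaid_grade(words_per_sentence, syllables_per_word):
    """Flesch-Kincaid Grade Level: approximate US school grade."""
    return 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59

# Averages quoted earlier in this thread for Hemingway's collected short stories.
wps, spw = 10.3, 1.3
print(round(flesch_reading_ease(wps, spw), 1))
print(round(flesch_kincaid_grade(wps, spw), 1))
```

With those Hemingway figures the reading ease comes out around 86 and the grade level just under 4. Both formulas weight syllables per word far more heavily than sentence length, so a small change in word choice moves the score more than trimming a few words from each sentence.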


There are also more "industrial strength" tools developed by computational linguists for this kind of thing:


Best, Max

Seth Godin

Hey, I won!

What do I get?

Tim Peter

So that's why more people read Seth's blog than mine! I used to just think he was smarter. ;-)

Ed (NextInstinct)

Perhaps this is why a Seth Godin read always feels 'fresh', and never labored. The thought of reading his books and blog posts alike never brings with it a hesitation that "this is going to be draining", or "written solely for the sake of writing".

RE Hemingway: This is where the stats can be misleading. Averages misrepresent an author like Hemingway. While he may in fact have many long sentences, his 'voice' is clearly established by the short, redundant statements that often followed them.
This was his mastery: to SAY a lot in the lengthy sentences, capped by a brief repeat of a slice of the previous prose, which was how and what he wanted the reader to remember from each page.
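To make the point about averages concrete, here is a small sketch comparing two invented sets of sentence lengths that share the same mean. The numbers are hypothetical, not measured from any book, and the sentence splitter is a naive assumption (it breaks on ., ! and ? and will stumble on abbreviations).

```python
import re
from statistics import mean, median

def sentence_lengths(text):
    """Word count of each sentence, splitting naively on ., ! and ?."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return [len(s.split()) for s in sentences]

# Hypothetical lengths for a writer who alternates very short and very long sentences.
bimodal = [4, 5, 3, 48, 52, 4, 50, 6]
# Hypothetical lengths for a writer who stays close to the middle.
steady = [20, 22, 21, 23, 21, 22, 21, 22]

for name, lengths in (("bimodal", bimodal), ("steady", steady)):
    print(name, mean(lengths), median(lengths), min(lengths), max(lengths))
```

Both lists average 21.5 words per sentence, but the medians are 5.5 and 21.5. A single words-per-sentence figure cannot tell these two writers apart; the median or a histogram of `sentence_lengths(...)` can.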

Now then, where's that book with the long title? Oh yes, The Dip.

Thanks for your thoughtful post.

Joe Marier

Here's something a little scary: John Henry Cardinal Newman's Essay on the Development of Christian Doctrine sits right on top of... one of Christopher Hitchens' books, at 17% complex words and 31.8 words per sentence.

No, it's not his book on atheism.

Joel D Canfield

Ooh; so nice to be reminded of this *before* I send my tome off to the printers . . . although perhaps I should just change my last name to something beginning with 'G'

I wonder how much web-style has affected this trend? I do know that my blog posts tend to be punchy; the stuff intended for dead-tree versions tends to be, well, less punchy.

David Locke

Back in the late '80s there was an application, Corporate Voice, that let you feed it writing samples, and it would tell you how close you were to those samples. It worked. Unfortunately, the application was not a success in the market.


Short is good. It's almost hard to leave that sentence with so few words...

David Locke

Try Steinbeck. He is said to have experimented with his writing style.


Very interesting stuff, and a fantastic feature for Amazon to add - I'm sure they've been using it internally for ages.

The concordance thing is also interesting. I compared a few of Bill Bryson's travel books and the top 100 words are almost identical per book - I guess if you hit upon a winning formula, you stick with it:

Book A: http://tinyurl.com/2n4zce
Book B: http://tinyurl.com/2obmb3

I wonder what further information could be gleaned from this. It certainly bodes well for the essays I currently have to write (my average words/sentence is 18.9).

Andrew Robinson

Thanks for the tip. I'm doing more and more writing, which I'm glad for. I appreciate as many "rails" as I can get to help guide the process.
I'll keep your post for future reference.


Leonardo Kuba

Pretty impressive study. If an author could combine short sentences (aiming for high-selling books) with good content (to benefit the readers), then he/she would accomplish the perfect formula. Nice post.

Tara Jacobsen

I wonder how this translates to "hear-ability" as I really like Godin and Gladwell audiobooks when read by the author, but have a harder time reading them on the printed page.


We just had a demo at work last week of a tech writing plug-in for our docs that checks sentence length, among other things. It flags any sentence longer than 26 words as too long to be fully comprehended by our audience (which may include non-English readers).
When we ran this on our existing docs, practically every second sentence was too long by these standards.
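A check like that is easy to approximate. The sketch below is not the plug-in we saw, whose internals I don't know; the function name and the naive sentence splitter are assumptions, and only the 26-word threshold comes from the demo.

```python
import re

MAX_WORDS = 26  # threshold from the demo; anything longer gets flagged

def flag_long_sentences(text, max_words=MAX_WORDS):
    """Return (word_count, sentence) for each sentence longer than max_words."""
    # Naive split: break after ., ! or ? when followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    flagged = []
    for sentence in sentences:
        count = len(sentence.split())
        if count > max_words:
            flagged.append((count, sentence))
    return flagged

doc = "Short one. " + " ".join(["word"] * 27) + ". Another short one."
for count, sentence in flag_long_sentences(doc):
    print(count, sentence[:40] + "...")
```

Run over a whole document set, the ratio of flagged sentences to total sentences gives the "every second sentence" kind of figure directly.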


I'm just wondering what the graph would have looked like if you had plotted Gayatri Chakravorty Spivak on it...


I'm a father of three boys, husband of one wife, author of nine books, host of one television series, and co-founder of three web sites. We split our time between Brooklyn, NY and Marin County, CA. Personal correspondence should go to sbeej68 at gmail dot com. If you're interested in having me speak at an event, drop a line to Wesley Neff at the Leigh Bureau (WesN at Leighbureau dot com).

My Books

  • Steven Johnson: How We Got to Now: Six Innovations That Made the Modern World
    A history of innovation accompanied by a 6-part TV series on PBS and the BBC, this was the first of my books to crack the top 5 on the NY Times bestseller list. Appropriately for a book that celebrates diverse networks, this was the most collaborative of any of my books. (Available from IndieBound here.)

  • Steven Johnson: Future Perfect: The Case For Progress In A Networked Age
    My first book-length attempt to organize my writings about emergence and networks into something resembling a political philosophy, which I called Peer Progressivism. (Available from IndieBound here.)

  • Where Good Ideas Come From: The Natural History of Innovation
    An exploration of environments that lead to breakthrough innovation, in science, technology, business, and the arts. I conceived it as the closing book in a trilogy on innovative thinking, after Ghost Map and Invention. But in a way, it completes an investigation that runs through all the books, and laid the groundwork for How We Got To Now. (Available from IndieBound here.)

  • The Invention of Air
    The story of the British radical chemist Joseph Priestley, who ended up having a Zelig-like role in the American Revolution. My version of a founding fathers book, and a reminder that most of the Enlightenment was driven by open source ideals. (Available from IndieBound here.)

  • The Ghost Map
    The story of a terrifying outbreak of cholera in 1854 London that ended up changing the world. An idea book wrapped around a page-turner. I like to think of it as a sequel to Emergence if Emergence had been a disease thriller. You can see a trailer for the book here. (Available from IndieBound here.)

  • Everything Bad Is Good for You: How Today's Popular Culture Is Actually Making Us Smarter
    The title says it all. This one sparked a slightly insane international conversation about the state of pop culture -- and particularly games. There were more than a few dissenters, but the response was more positive than I had expected. And it got me on The Daily Show, which made it all worthwhile. (Available from IndieBound here.)

  • Mind Wide Open: Your Brain and the Neuroscience of Everyday Life
    My first best-seller, and the only book I've written in which I appear as a recurring character, subjecting myself to a battery of humiliating brain scans. The last chapter on Freud and the neuroscientific model of the mind is one of my personal favorites. (Available from IndieBound here.)

  • Emergence: The Connected Lives of Ants, Brains, Cities, and Software
    The story of bottom-up intelligence, from slime mold to Slashdot. Most of my books sold more copies than this one, but Emergence has influenced the most eclectic mix of fields: political campaigns, web business models, urban planning, the war on terror. (Available from IndieBound here.)

  • Interface Culture: How New Technology Transforms the Way We Create and Communicate
    My first. The book I wrote instead of finishing my dissertation, predicting the growing cultural significance of interface and information design. Still relevant, I think. But I haven't read it in a while, so who knows what's in there! (Available from IndieBound here.)
