Since 2004, Google has digitized more than 15 million books worldwide. The datasets we’re making available today to further humanities research are based on a subset of that corpus, weighing in at 500 billion words from 5.2 million books in Chinese, English, French, German, Russian, and Spanish. The datasets contain phrases of up to five words with counts of how often they occurred in each year.
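To make the shape of the data concrete, here is a minimal sketch of how one might tally a phrase's per-year counts from the raw files. The tab-separated column layout (ngram, year, match count) and the file name in the comment are assumptions for illustration; consult the documentation that accompanies the actual downloads.

```python
import csv
from collections import defaultdict

def yearly_counts(path, phrase):
    """Sum per-year occurrence counts for one phrase in an n-gram file.

    Assumes tab-separated rows beginning with: ngram, year, match_count
    (any further columns are ignored). The exact column layout is an
    assumption here -- check the README that ships with the downloads.
    """
    counts = defaultdict(int)
    with open(path, encoding="utf-8") as f:
        for row in csv.reader(f, delimiter="\t"):
            if row and row[0] == phrase:
                counts[int(row[1])] += int(row[2])
    return dict(counts)

# Hypothetical file name, purely for illustration:
# counts = yearly_counts("eng-5gram-sample.tsv", "balance of art and science")
```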
These datasets were the basis of a research project led by Harvard University's Jean-Baptiste Michel and Erez Lieberman Aiden, coauthored by several Googlers, and published today in Science. Their work offers several examples of how quantitative methods can yield insights into topics as diverse as the spread of innovations, the effects of youth and profession on fame, and trends in censorship.
The Ngram Viewer lets you graph and compare phrases from these datasets, showing how their usage has waxed and waned over the years. One of the advantages of having the data online is that it lowers the barrier to serendipity: you can stumble across something in these 500 billion words and be the first person ever to make that discovery. Below are a few queries to pique your interest.
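Under the hood, the waxing and waning each query shows is a normalized quantity: far more books are published in recent years, so raw counts would rise for almost any phrase, and the Viewer instead plots each year's count as a share of all the words published that year. A minimal sketch of that normalization, with made-up numbers purely for illustration:

```python
def relative_frequency(phrase_counts, total_counts):
    """Convert raw per-year counts into each year's share of all words
    published that year, so that the growth of the corpus itself does
    not masquerade as a trend in the phrase."""
    return {year: count / total_counts[year]
            for year, count in phrase_counts.items()
            if total_counts.get(year)}

# Illustrative numbers only: 12 hits out of 1M words in 1900, etc.
print(relative_frequency({1900: 12, 1950: 480},
                         {1900: 1_000_000, 1950: 8_000_000}))
# {1900: 1.2e-05, 1950: 6e-05}
```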
We know nothing can replace the balance of art and science that is the qualitative cornerstone of research in the humanities. But we hope the Google Books Ngram Viewer will spark new hypotheses ripe for in-depth investigation while inviting casual exploration at the same time. We've already begun working with some researchers via our Digital Humanities Research Awards, and we look forward to more such collaborations with like-minded researchers in the future.
Posted by Jon Orwant, Engineering Manager, Google Books