Monday, September 22, 2014

Dig into television and film corpuses with Bookworm Movies


One handy tool for cultural analysis is measuring how often words are used within a given set of texts, whether that's transcripts from Congress or every document ever written. The written word is much easier to search, for obvious reasons, which has left audio-visual media out of the content analysis process. Luckily, a very clever professor named Ben Schmidt has leveraged big data to make movies and television shows as searchable as books.

Schmidt's new service, Bookworm Movies, uses the Open Subtitles database to grab the scripts from thousands of movies and shows. Punch in any word or phrase (and, optionally, a specific show or medium) and Bookworm Movies will produce a detailed graph of how often each word is used relative to its entire corpus. As shown in the chart above, Scrubs uses the word "doctor" more frequently than many medical dramas, while it appears comparatively little in Grey's Anatomy. There are all sorts of angles you could take in analyzing that. This is a terrific starting point for seeing how television shows and movies change language over time in comparison with one another.
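For the curious, the basic idea of a relative word frequency can be sketched in a few lines of Python. This is only an illustration of the general technique, not Bookworm's actual code: the function name `words_per_million`, the simple tokenizer, and the two toy "subtitle" snippets are all made up for the example.

```python
import re
from collections import Counter


def words_per_million(text, term):
    """Return how often `term` appears per million words of `text`."""
    # Naive tokenizer (an assumption): lowercase runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    return Counter(tokens)[term.lower()] / len(tokens) * 1_000_000


# Toy placeholder lines, invented for illustration; not real subtitle data.
toy_subtitles = {
    "show_a": "The doctor will see you now. Thank you, doctor.",
    "show_b": "We need to operate tonight. Page the doctor on call.",
}

# Compare the relative frequency of one word across the two toy "shows".
rates = {
    name: words_per_million(text, "doctor")
    for name, text in toy_subtitles.items()
}

for name, rate in rates.items():
    print(f"{name}: {rate:,.0f} uses per million words")
```

Dividing by the total word count is what makes shows of very different lengths comparable: a long-running series will rack up more raw mentions of almost any word, so only the rate tells you anything.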

The best part? The entirety of The Simpsons is included as well. And thankfully, they haven't used the word "selfie" yet.
