I haven’t written about journalism uses of Amazon’s Mechanical Turk, or mTurk, since 2008, when I read something from Andy Baio, and I don’t think I have read much about its use by journalists in the intervening time.
But here’s Amanda Michel of ProPublica detailing how that news organization has been using mTurk, complete with a guide for journalists written by her and Srinivas Rao. There is also an example of a Human Intelligence Task (HIT) request template.
For those interested, Baio has written a few posts on using mTurk for data research projects including audio transcription and data entry.
Is anybody else using it, and if so, for what tasks? Does the quality meet your standards? Other thoughts about it? Seems economical if the results aren’t crap. Dan Kennedy wrote about his experience earlier this year.
Here’s a snippet from Michel’s piece.
For those unfamiliar with Mechanical Turk, it’s an online marketplace, set up by the online shopping site Amazon, where anyone can hire workers to complete short, simple tasks like quickly transcribing interviews, copying data from thousands of charts, and even sorting through satellite images in hopes of locating missing individuals. Amazon originally developed it as an in-house tool, and commercialized it in 2005. The mTurk workforce now numbers more than 100,000 workers in 200 countries.
At the urging of Panos Ipeirotis, a professor at New York University’s Stern School of Business, we began experimenting with mTurk last spring to clean, de-duplicate and reformat data. We’ve since used the tool to collect or proof more than 28,000 data points, from the names of companies that received stimulus money to the categorization of answers to our home loan modification questionnaire. We’re impressed with the speed and accuracy of its results. For example, a project we estimated would take a full-time staffer almost three days to finish was completed on mTurk overnight for $37, with 99 percent accuracy.
Mechanical Turk has proven to be more than a shortcut. It has freed up staff time for more complicated work. We’ve also used it to retrieve data from government databases that prohibit scraping.
Ran an MTurk pricing experiment, transcribing audio at an exploitative $0.25/minute. Disturbing result: A fast, near-perfect transcription.