Amazon Mechanical Turk: The Future of Data Collection

As technology has changed rapidly, scientists, social and biological alike, have been looking for ways to run surveys and gather new data that make the most of the latest advances. One of the newer tools they have started to use is Amazon Mechanical Turk, or AMT.

What is Amazon Mechanical Turk?

Yes, it’s owned and run by Amazon, which is also one of its largest users. Essentially, AMT is a crowdsourcing site for tasks that need human judgment. “Requesters” who need something done, be it a survey, data entry, or something else, create what are known as Human Intelligence Tasks, or HITs, which “Workers” can then complete. Workers are located all over the world and can log in to AMT at any time to see which HITs they would like to take on (and get paid for) that day. A micro-payment for each completed task is sent from the Requester to the Worker via AMT. Workers can pick and choose which tasks to complete, whenever they like.

So How Does AMT Help Data Scientists?

Because the Mechanical Turk community is made up of hundreds of thousands of people who can supply a variety of human responses, this human-powered network is well suited to producing the data that algorithms and software are built on. For example, if a number of pictures need to be rated for emotional content, ratings collected from AMT Workers (via HITs) can form a reliable data set that a software program can then use.
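
As a rough illustration of the picture-rating example, the sketch below averages several Workers' ratings of each image into a single label that software could then work with. The field names (`image_id`, `worker_id`, `rating`), the 1 to 5 scale, and the sample values are all assumptions made for illustration; they are not AMT's actual result format.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical HIT results: each row is one Worker's rating of one image
# on an assumed scale from 1 (negative) to 5 (positive) emotional content.
responses = [
    {"image_id": "img_01", "worker_id": "w1", "rating": 4},
    {"image_id": "img_01", "worker_id": "w2", "rating": 5},
    {"image_id": "img_01", "worker_id": "w3", "rating": 3},
    {"image_id": "img_02", "worker_id": "w1", "rating": 1},
    {"image_id": "img_02", "worker_id": "w2", "rating": 2},
    {"image_id": "img_02", "worker_id": "w3", "rating": 3},
]


def aggregate_ratings(rows):
    """Return the mean rating per image across all Workers who rated it."""
    by_image = defaultdict(list)
    for row in rows:
        by_image[row["image_id"]].append(row["rating"])
    return {image_id: mean(ratings) for image_id, ratings in by_image.items()}


labels = aggregate_ratings(responses)
print(labels)
```

Averaging is only one possible choice here; a majority vote or a median may suit categorical or skewed ratings better.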

Are There Downsides to AMT?

As with anything involving human decision making, much of the data received on Mechanical Turk is subjective; for instance, what one person sees as “ugly” may not match what another person sees. This makes control variables and a large sample size essential for any serious data science experiment.
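
To make the sample-size point concrete, here is a minimal sketch, with made-up ratings on an assumed 1 to 5 scale, of how the uncertainty around an average rating shrinks as more Workers rate the same picture. The `summarize` helper is hypothetical and uses only standard textbook formulas (sample standard deviation and standard error of the mean).

```python
from math import sqrt
from statistics import mean, stdev


def summarize(ratings):
    """Return (mean, sample standard deviation, standard error) for one item."""
    n = len(ratings)
    if n < 2:
        raise ValueError("need at least two ratings to estimate spread")
    sd = stdev(ratings)
    return mean(ratings), sd, sd / sqrt(n)


# Hypothetical ratings of the same picture; Workers clearly disagree.
small_sample = [1, 5, 2, 4]
# Same pattern of disagreement, but ten times as many Workers.
large_sample = small_sample * 10

for name, ratings in [("small", small_sample), ("large", large_sample)]:
    m, sd, se = summarize(ratings)
    print(f"{name}: n={len(ratings)} mean={m:.2f} sd={sd:.2f} se={se:.2f}")
```

The disagreement between individual Workers stays about the same in both samples, but the standard error of the average falls as the number of ratings grows, which is why a subjective task calls for many responses per item.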

Looking ahead, as it grows and diversifies, Amazon Mechanical Turk may become a truly great resource for researchers and data scientists to use for data validation, data mining and other experiments. The AMT crowdsourcing community can be tapped relatively inexpensively and is highly useful for helping to design and build new programs and algorithms for use in other areas of science and business. However, the drawbacks of human decision making mean that researchers must be careful about the data they collect and how they use it.


View full article: Mechanical Turk – the Data Scientist’s Best Friend

