Ever seen a Twitter activity map? Then you must consider Big-Data bias…

“Over the weekend of Apple’s April 3 release of the iPad, 73% of circulated tweets were favorable toward the iPad, but 26% expressed disappointment that the iPad could not replace the iPhone, according to a study.”

If you’re not careful, you could conclude that sentiment toward the iPad was largely favorable. But that conclusion would probably be biased.

This is the point that Kate Crawford makes in a recent Harvard Business Review article, “The Hidden Biases in Big Data.” With any data sample, it is critical to ask whether the sample is representative of the target population.

Thus, in the iPad sentiment example, a key question is: are the people who tweeted about Apple’s iPad over that weekend (the sample) representative of all the people who have, or even could have, interacted with the iPad during that time (the target population)?
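
To make the sample-versus-population gap concrete, here is a minimal Python sketch. Every count in it is hypothetical and invented purely for illustration (none of it comes from the study or from Crawford’s article); the `favorable_share` helper is likewise just an illustrative name. It assumes that people who liked the iPad were more likely to tweet about it than people who did not.

```python
def favorable_share(favorable, unfavorable):
    """Return the fraction of favorable opinions in a group."""
    return favorable / (favorable + unfavorable)


# Hypothetical target population: everyone who interacted with the iPad.
# These counts are made up for illustration only.
population = {"favorable": 4000, "unfavorable": 6000}

# Hypothetical sample: the subset of that population who tweeted.
# Assumption: fans tweet more readily than disappointed users.
tweeters = {"favorable": 730, "unfavorable": 270}

sample_share = favorable_share(**tweeters)
population_share = favorable_share(**population)

# The gap between the two shares is the bias introduced by
# treating the tweeters as if they were the whole population.
bias = sample_share - population_share

print(f"Favorable share among tweeters:   {sample_share:.0%}")
print(f"Favorable share in population:    {population_share:.0%}")
print(f"Overstatement from sampling bias: {bias:.0%}")
```

Under these made-up numbers the tweets look strongly favorable even though most of the population is not, which is exactly the kind of conclusion a non-representative sample can produce.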

Some excerpts from the article:

  • Hidden biases in both the collection and analysis stages present considerable risks, and are as important to the big-data equation as the numbers themselves.
  • Data and data sets are not objective; they are creations of human design.
  • We get a much richer sense of the world when we ask people the why and the how not just the “how many”.

Read the article here.


Image: TheAtlantic.com

