“Over the weekend of Apple’s April 3 release of the iPad, 73% of circulated tweets were favorable toward the iPad, but 26% expressed disappointment that the iPad could not replace the iPhone, according to a study.”
If you’re not careful, you could conclude that sentiment toward the iPad was largely favorable. But that conclusion would likely be biased.
This is the point that Harvard Business Review’s Kate Crawford makes in a recent article, “The Hidden Biases in Big Data.” With a data sample, it is always critical to ask whether the sample is representative of the target population.
Thus, considering the iPad sentiment example, a key question is: are the people who tweeted about Apple’s iPad over that weekend (the sample) representative of all the people who have, or even could have, interacted with the iPad during that time (the target population)?
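The gap between sample and target population can be made concrete with a toy simulation (the numbers below are hypothetical, not from the study): if favorable users are simply more likely to tweet than unhappy ones, the tweet-based favorability estimate will overstate favorability in the population as a whole.

```python
import random

random.seed(42)

# Hypothetical target population: 60% favorable toward the iPad.
population = [1] * 6000 + [0] * 4000  # 1 = favorable, 0 = not

def tweets(sentiment):
    # Assumed bias: favorable people tweet 3x as often as unfavorable ones.
    return random.random() < (0.30 if sentiment else 0.10)

# The "sample" is only those who chose to tweet.
sample = [s for s in population if tweets(s)]

pop_rate = sum(population) / len(population)
sample_rate = sum(sample) / len(sample)
print(f"population favorability:   {pop_rate:.0%}")
print(f"tweet-sample favorability: {sample_rate:.0%}")
```

Under these assumptions the tweet sample reads roughly 80% favorable even though the population is only 60% favorable; nothing in the tweet counts alone reveals the distortion.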
Some excerpts from the article:
- Hidden biases in both the collection and analysis stages present considerable risks, and are as important to the big-data equation as the numbers themselves.
- Data and data sets are not objective; they are creations of human design.
- We get a much richer sense of the world when we ask people the why and the how, not just the “how many”.
Read the article here.
Our bodies have limited resources, which need to be replenished. Taking breaks during the day and vacations during the year is a recipe for greater productivity. From a New York Times opinion piece:
- basketball players who slept for 10 hours a night increased their shooting percentages by 9%
- air traffic controllers who were given 40 minutes to nap performed much better on vigilance tests
- longer naps (60-90 minutes) can improve mental acuity even more
- full-vacationers were more likely to have higher ratings from their supervisors
“To maximize gains from long-term practice,” Dr. Ericsson concluded, “individuals must avoid exhaustion and must limit practice to an amount from which they can completely recover on a daily or weekly basis.”
Read it at The New York Times.
Steve Lohr reflects on the promise of Big Data, citing its growing buzz but also its first big failure:
“Many of the Big Data techniques of math modeling, predictive algorithms and artificial intelligence software were first widely applied on Wall Street.” And what happened there we all know.
A chief scientist from an ad-targeting startup:
“You can fool yourself with data like you can’t with anything else. I fear a Big Data bubble.”
“A major part of managing Big Data projects, he says, is asking the right questions: How do you define the problem? What data do you need? Where does it come from? What are the assumptions behind the model that the data is fed into? How is the model different from reality?”
“Models do not just predict, but they can make things happen,” says Rachel Schutt, who taught a data science course this year at Columbia.
A concern is that “the algorithms that are shaping my digital world are too simple-minded, rather than too smart.”
Read at The New York Times.