'Big Data' from Twitter, Facebook Doesn't Represent the Real World

Nov 29, 2014 12:45 PM EST | Jordan Ecarma


Scientists are warning that "big data," the social trends culled from Facebook, Twitter and other online spheres, isn't necessarily the goldmine of information it has been made out to be.

While some have said that social media sites can be used to forecast stock market changes and predict hit summer movies, Twitter and similar sites offer a limited view of society since they represent just part of the population, researchers said in a new study published in Science.

Coming from scientists at McGill University in Montreal and Carnegie Mellon University in Pittsburgh, the study said that data from Twitter, Pinterest and Facebook can't be used without allowing for "population bias," the Telegraph reported.

Tapping these social media spheres for widely applicable social trends is jumping the gun, the researchers said.

"People want to say something about what's happening in the world and social media is a quick way to tap into that. You get the behavior of millions of people--for free," said co-author Juergen Pfeffer of Carnegie Mellon, as quoted by the Telegraph. "Not everything that can be labelled as 'Big Data' is automatically great."

Some of the ways social media data can be skewed toward certain portions of the population, or otherwise distorted, include:

1. Far more men than women use Twitter.

2. Instagram is dominated by urban young people, appealing to "adults between the ages of 18 and 29, African-Americans, Latinos, women and urban dwellers," the researchers said.

3. Pinterest overwhelmingly attracts women who are between the ages of 25 and 34 with average household incomes of $100,000.

4. More women use Facebook than men; 76 percent of women on the Internet are on Facebook compared with 66 percent of men.

5. Spammers and bots on all of the sites can tamper with results when they are unwittingly added into measurements for real humans.

6. While Facebook has data from around a seventh of the world's population, it has no "dislike" counterpart to the "like" button, which could skew results.

"A common assumption underlying many large-scale social media-based studies of human behavior is that a large-enough sample of users will drown out noise introduced by peculiarities of the platform's population," said lead author Derek Ruths, an assistant professor in McGill's School of Computer Science, as quoted by the Telegraph.

"These sampling biases are rarely corrected for, if even acknowledged."
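
To make the idea of correcting for "population bias" concrete, here is a minimal sketch of one common approach, post-stratification, in which each group's average is reweighted by that group's share of the real population. This is an illustration only, not the method used in the study: the sample, the 50/50 population shares, and the function names `raw_mean` and `poststratified_mean` are all hypothetical.

```python
def raw_mean(sample):
    """Unweighted mean of the values in a list of (group, value) pairs."""
    return sum(value for _, value in sample) / len(sample)


def poststratified_mean(sample, population_share):
    """Mean in which each group's average is weighted by its population share.

    Assumes every group in the sample has an entry in population_share and
    that the shares of the sampled groups sum to 1.
    """
    totals, counts = {}, {}
    for group, value in sample:
        totals[group] = totals.get(group, 0.0) + value
        counts[group] = counts.get(group, 0) + 1
    return sum(population_share[g] * totals[g] / counts[g] for g in counts)


# Hypothetical platform sample: 80 men (half answer 1) and 20 women (80% answer 1).
sample = (
    [("men", 1)] * 40 + [("men", 0)] * 40
    + [("women", 1)] * 16 + [("women", 0)] * 4
)

# Assumed shares of each group in the real population (made up for illustration).
population_share = {"men": 0.5, "women": 0.5}

print(raw_mean(sample))                               # skewed toward the over-represented group
print(poststratified_mean(sample, population_share))  # reweighted to the assumed population
```

In this made-up sample the raw average leans toward the over-represented group, while the reweighted figure reflects the assumed population mix. Reweighting of this kind only addresses the groups that are measured; it does nothing about bots, spammers or platform quirks such as the missing "dislike" button.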

© 2020 Auto World News, All rights reserved. Do not reproduce without permission.