Affinity Workshop: WiML Workshop 1

Predicting Fake News and Real News Spreaders' Influence

Amy Zhang · Aaron Brookhouse · Francesca Spezzano


The spread of misinformation across social media is one of the biggest national security threats of the 21st century. Previous research has successfully identified misinformation spreaders on Twitter from user demographics and past tweet history, and other work has been relatively successful at predicting the number of retweets a given tweet will receive. However, predicting how many retweets a specific user's news-article tweet will receive, which determines the impact of an initial tweet containing misinformation, has not yet been tackled. We use data from FakeNewsNet, containing 43,119 known fake news spreaders and 135,234 real news spreaders, together with the past 500 tweets of each user, to build user profiles and predict the number of retweets a news-article tweet will receive. We present a Random Forest classifier that assigns the number of retweets a news tweet will receive to one of 5 ranges, using user profile characteristics and information about past tweets. The model achieves a weighted F1 score as high as 0.931 on the real news dataset and 0.853 on the fake news dataset, higher than existing models. We show that retweet prediction is harder for fake news because user characteristics vary little among fake news spreaders, and we propose graph-based models as a potential way to predict retweets of fake news more accurately.
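To make the evaluation setup concrete, the following is a minimal sketch of how raw retweet counts can be mapped to five ordinal classes and how a weighted F1 score is computed over them. It does not reproduce the authors' model or features. The bin edges in `BIN_EDGES` and the function names `retweet_class` and `weighted_f1` are assumptions for illustration only; the abstract does not state the actual ranges used.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical bin edges (the abstract does not give the real ranges).
# They yield five classes: 0, 1-9, 10-99, 100-999, 1000+ retweets.
BIN_EDGES = [1, 10, 100, 1000]


def retweet_class(count, edges=BIN_EDGES):
    """Map a raw retweet count to one of len(edges) + 1 ordinal classes."""
    return bisect_right(edges, count)


def weighted_f1(y_true, y_pred):
    """Average the per-class F1 scores, weighting each class by its
    share of the true labels (its support)."""
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += (n / total) * f1
    return score


# Toy example: true retweet counts versus a classifier's predicted classes.
true_counts = [0, 3, 7, 50, 250, 5000]
y_true = [retweet_class(c) for c in true_counts]
y_pred = [0, 1, 2, 2, 3, 4]  # one tweet in class 1 is mispredicted as class 2
print(y_true, round(weighted_f1(y_true, y_pred), 3))
```

Weighting by support means that frequent classes dominate the score, so a weighted F1 on an imbalanced retweet distribution can be high even when rare, high-retweet classes are predicted poorly.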
