Detecting Satire and Fake News with Machine Learning

Sometimes it is hard even for humans to tell whether a news article is real, fake, or satire. So I asked myself whether I could train a machine learning model to decide which class (real or satire) a given article belongs to. Websites like https://www.theonion.com publish satirical news every day; together with regular news sites, they can be used to collect training data for this classification problem.

Dataset

and from the satirical news sites:

for training and testing of the model. In total, I collected 63,868 articles from 2008 to 2018 and stored them in a local database.

Database of news articles
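
As a rough illustration of what storing labeled articles in a local database could look like, here is a minimal sketch using Python's built-in sqlite3 module. The database engine, the table name `articles`, the column names, and the two placeholder rows are all assumptions made for this example; they are not taken from the actual setup.

```python
import sqlite3

# Hypothetical schema: the table and column names are assumptions.
conn = sqlite3.connect(":memory:")  # a file path would make the database persistent
conn.execute(
    """
    CREATE TABLE articles (
        id INTEGER PRIMARY KEY,
        source TEXT NOT NULL,
        published TEXT NOT NULL,
        label TEXT NOT NULL CHECK (label IN ('real', 'satire')),
        body TEXT NOT NULL
    )
    """
)

# Placeholder rows for illustration only, not articles from the dataset.
rows = [
    ("example-news-site", "2015-03-01", "real", "Placeholder text of a regular article."),
    ("example-satire-site", "2015-03-02", "satire", "Placeholder text of a satirical article."),
]
conn.executemany(
    "INSERT INTO articles (source, published, label, body) VALUES (?, ?, ?, ?)",
    rows,
)
conn.commit()

# Count articles per class, e.g. to check class balance before training.
counts = dict(conn.execute("SELECT label, COUNT(*) FROM articles GROUP BY label"))
print(counts)
```

Keeping the class label next to the text makes it easy to pull balanced training and test sets with a single query.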

Implementation

Results

Confusion matrix
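
For readers unfamiliar with the term, the following sketch shows how a two-class confusion matrix can be tallied from true and predicted labels. The function name `confusion_matrix` and the toy label lists are hypothetical and chosen for illustration; they are not the results of the model described here.

```python
from collections import Counter

LABELS = ["real", "satire"]


def confusion_matrix(y_true, y_pred, labels=LABELS):
    """Return a matrix with rows = true class and columns = predicted class."""
    pairs = Counter(zip(y_true, y_pred))
    return [[pairs[(t, p)] for p in labels] for t in labels]


# Toy labels for illustration only (not the article's data).
y_true = ["real", "real", "real", "satire", "satire", "satire"]
y_pred = ["real", "real", "satire", "satire", "satire", "real"]

matrix = confusion_matrix(y_true, y_pred)
for label, row in zip(LABELS, matrix):
    print(f"{label:>6}: {row}")

# Accuracy is the share of articles on the diagonal (correctly classified).
accuracy = sum(matrix[i][i] for i in range(len(LABELS))) / len(y_true)
print(f"accuracy: {accuracy:.2f}")
```

The off-diagonal cells are the interesting ones: they separate real articles mistaken for satire from satire mistaken for real news.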

I think the presented method can also be applied to other languages, and I expect results similar to those for the German news.

Are computers better than humans in detecting satire in texts?

More details can be found in the article https://arxiv.org/abs/1810.00593

University of Applied Sciences Upper Austria / School of Informatics, Communications and Media http://www.stoeckl.ai/profil/
