An Analysis and Visualization with Pandas and Tableau

Map of Austria showing how many news articles in the regional newspapers dealt with the respective localities and towns. / Graphic by the author

In the field of daily newspapers, the Austrian media landscape is divided into publications that describe themselves as regional media (newspapers of the federal provinces) and national media. In this article, I would like to examine whether this self-image is also reflected in the reporting. As a research method, I choose a data-based approach that collects and analyses the news articles published online in the period from 2 February 2021 to 3 May 2021.

I collected 217,595 articles, distributed among the following 15 publishers:

National media


A Case Study with Kaggle Data

Photo: www.pexels.com Creative Commons CC0

The data that a potential customer (“lead”) leaves on a website can yield important insights into customer behavior. Machine learning is then used to create a prediction model from this data. In the case study presented here, such a prediction model reaches an accuracy of 90%.

An education company sells online courses to industry professionals. On any given day, thanks to marketing efforts, many professionals interested in the courses land on its website and browse for courses. Campaigns on social media channels, websites, or search engines such as Google are what attract these new prospects.


Step by Step

(Source: www.pexels.com)

In today’s digital age, “I saw it with my own eyes” can hardly be considered a valid argument. This is especially true when what people believe they have seen is digitally prepared content, such as online news or social media posts. The possibilities for manipulating digital content are simply too great.

A typical example to show what is possible is a video of an Obama speech that he never actually gave.


A Data Analysis of Coverage in Austrian Online News

Coverage over the last three months / Image by Author

Due to the extensive coverage of the COVID-19 pandemic, it is easy to get the impression that nothing else is being reported. In this article, I would like to use data to examine whether this subjective impression corresponds to reality. I will also take a look at how coverage of different topics has evolved over the last few months. Regional differences and the tone of the news coverage, i.e. whether it has been more positive or negative, will also be discussed.

For the analysis, I collected and evaluated a total of 148,991 articles published online in the Austrian…


Word Vectors for Chess Moves

Similarity map of chess moves / Image by Author

In this article, I want to analyze which moves in a game of chess are close, in the sense that they often occur in similar situations in games. If two moves often occur after or before the same moves, then these moves are similar in a certain sense.

For example, which move is close to the opening with the queenside pawn “d4”?

Is it possible to recognize a general structure and to represent it visually?
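
The notion of similarity described above can be sketched with plain co-occurrence counts. This is a minimal illustration, not the method used in the article: the toy games, the window size, and the helper names `context_vectors`, `cosine`, and `most_similar` are all assumptions, and raw counts stand in for trained word vectors.

```python
from collections import Counter, defaultdict
from math import sqrt

# Hypothetical toy games in algebraic notation (not taken from the Lichess data).
games = [
    ["d4", "d5", "c4", "e6", "Nc3", "Nf6"],
    ["c4", "e6", "d4", "d5", "Nc3", "Nf6"],
    ["e4", "e5", "Nf3", "Nc6", "Bb5", "a6"],
    ["e4", "c5", "Nf3", "d6", "d4", "cxd4"],
]


def context_vectors(games, window=1):
    """For every move, count the moves played within `window` plies before or after it."""
    vectors = defaultdict(Counter)
    for moves in games:
        for i, move in enumerate(moves):
            start, stop = max(0, i - window), min(len(moves), i + window + 1)
            for j in range(start, stop):
                if j != i:
                    vectors[move][moves[j]] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity of two sparse count vectors."""
    dot = sum(a[key] * b[key] for key in a if key in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def most_similar(move, vectors, top=3):
    """Rank all other moves by how similar their contexts are to those of `move`."""
    scores = [
        (other, cosine(vectors[move], vectors[other]))
        for other in vectors
        if other != move
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:top]


vectors = context_vectors(games)
print(most_similar("d4", vectors))
```

Two moves score highly here when they share many neighbouring moves, which is the same intuition that trained word-vector models build on, only with learned dense vectors instead of raw counts.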

Data source

My analyses are based on files of games played on the internet chess server Lichess. At https://database.lichess.org/ you can find all the games played…


Data analysis of online news

Length and volume of online news per weekday and time / Image by Author

The news published online by daily newspapers is an important source of information. Not only does it contain the statements to be disseminated, but it also implicitly carries other information about the publisher and its employees. This flow of information is usually not intended, and the publishers are not even aware of it.

These are not secret messages hidden in individual articles, of the kind some people believe they find in Beatles songs, but information that only becomes apparent when a large amount of data is viewed together and correctly combined. …


with “BigML”

Distributions and correlations

Machine Learning is an important technology for handling data in today’s world. It is used to derive models of reality from data. For example, you can use it to segment customer data in an online store or to optimize a performance marketing campaign.
This usually requires a programming language together with a large number of libraries written for it. Today, “Python” or “R” are very often used for this, along with libraries like “scikit-learn” and “TensorFlow”.

The platform “BigML” takes a different approach by offering a user interface that allows users to control all steps…


A visual approach with different machine learning classifiers

Image by the author

In this article, I would like to show how different machine learning methods can be used to classify customers as buyers or non-buyers using tracking data from an online shop. With features aggregated from the raw data, such as the number of visits and the number of page views, forecast models are trained and visualized.

Special attention is paid to the visual presentation of the forecast models with the help of 2-D plots and coloring of the decision boundaries. The peculiarities of the different methods become apparent, as do situations in which the models underfit or overfit. …
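
As a rough illustration of what coloring a decision boundary in two dimensions means, the following sketch classifies every cell of a small grid and prints the result as text. Everything in it is an assumption made for the example: the customer data are invented, k-nearest neighbours is chosen only because it is easy to write without libraries, and the names `knn_predict` and `decision_grid` are hypothetical.

```python
from collections import Counter

# Hypothetical aggregated features per customer: (visits, page_views, bought).
customers = [
    (1, 2, 0), (2, 3, 0), (1, 5, 0), (3, 4, 0),
    (6, 20, 1), (8, 25, 1), (7, 18, 1), (9, 30, 1),
]


def knn_predict(point, data, k=3):
    """Majority vote among the k training points closest to `point`.

    Uses squared Euclidean distance on the raw, unscaled features.
    """
    by_distance = sorted(
        data,
        key=lambda row: (row[0] - point[0]) ** 2 + (row[1] - point[1]) ** 2,
    )
    votes = Counter(label for _, _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


def decision_grid(data, visits_range, views_range, k=3):
    """Classify each grid cell: '#' for predicted buyers, '.' for non-buyers."""
    return [
        "".join("#" if knn_predict((v, p), data, k) else "." for v in visits_range)
        for p in views_range
    ]


# Print with the highest page-view row on top, like a conventional plot.
for row in reversed(decision_grid(customers, range(0, 11), range(0, 32, 4))):
    print(row)
```

The line where `#` turns into `.` is the decision boundary. Varying `k` changes its shape: very small values follow individual training points closely, while large values smooth the boundary out.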


Lessons learned from an Eye-Tracking Study

Adapted Pacman Version

In a recent paper for the ETRA ’20 ACM Symposium on Eye Tracking Research and Applications, we took a closer look at the gaze behavior of computer gamers. The gaze behavior of players in situations of varying difficulty is examined in order to gain potential insights for game design.

A comparative study was conducted in which the participants played the game Pac-Man at three difficulty levels while their gaze behavior was recorded with an eye-tracking device. …


The race for larger language models is entering the next round.

Image: www.pexels.com

Progress in NLP applications is driven by larger language models consisting of neural networks using the Transformer architecture. Prompted by the recently published results of the currently largest model, GPT-3 from OpenAI, I would like to take a closer look at these advances.

On May 28, 2020, a paper (https://arxiv.org/abs/2005.14165) by OpenAI researchers was published on arXiv about GPT-3, a language model that is capable of achieving good results in a number of benchmark language processing tasks ranging from language translation and news article writing to question answering. …

Andreas Stöckl

University of Applied Sciences Upper Austria / School of Informatics, Communications and Media http://www.stoeckl.ai/profil/
