Book “Applied Data Science in Tourism”

Data science is becoming increasingly important in tourism, as in many other fields. However, it is not always easy for practitioners in a specialized field to apply general-purpose methods to their own domain.

For the field of tourism, I had the privilege of working on a book last year that attempts to close this gap. The book “Applied Data Science in Tourism”, edited by Roman Egger and published by Springer, presents data science methods using tourism applications as examples. Together with colleagues, I wrote the chapters “Classification”, “Regression” and “Web Scraping”. In the following sections, I show an example application from each of these chapters where tourism benefits from data science.

Example classification

Classification using machine learning is one of the most important and widespread methods of data science. In the book, we show how to classify visitors to a hotel’s website into potential customers and uninterested web surfers.

Consider a hotel that focuses on certain target groups and offers different holiday packages for them. Let’s further assume that the hotel uses web tracking to record the page views of the offers on its website and also the bookings made.

Using this data, we develop a classification model that predicts, from a visitor’s behavior on the website, whether that visitor will make a booking. In the example, the prediction is based on the following inputs:
- Which pages did the visitor view, and how often?
- How many pages were viewed in total?
- Which categories did the viewed offers belong to?
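
To make these inputs concrete, here is a minimal sketch of how raw tracking records could be turned into one feature row per visitor. The record layout, the page paths and the category names are invented for illustration; they are not taken from the book.

```python
from collections import Counter

# Hypothetical tracking log: one (visitor_id, page, offer_category) tuple per page view.
page_views = [
    ("v1", "/offers/ski-week", "winter"),
    ("v1", "/offers/ski-week", "winter"),
    ("v1", "/offers/spa-weekend", "wellness"),
    ("v2", "/offers/family-summer", "family"),
]


def build_features(views):
    """Aggregate raw page views into one feature dict per visitor."""
    features = {}
    for visitor, page, category in views:
        row = features.setdefault(
            visitor,
            {"total_views": 0, "pages": Counter(), "categories": Counter()},
        )
        row["total_views"] += 1          # how many pages were viewed in total
        row["pages"][page] += 1          # which pages, and how often
        row["categories"][category] += 1  # which offer categories
    return features


features = build_features(page_views)
```

Each visitor ends up with a total view count plus per-page and per-category counts, which is roughly the shape of input a classifier would need.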

This example demonstrates the most common classification methods:
- k-nearest neighbor (KNN) classification
- Logistic regression
- Naïve Bayes classification
- Decision trees
- Random forest
- Gradient tree boosting
- Support vector machine classification
- Artificial neural networks
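
As an illustration of the first method in this list, the following sketch implements k-nearest neighbor classification in plain Python. The training data is made up, and the two features (total page views, number of distinct offers viewed) are assumptions chosen for this example. In practice one would more likely use a machine learning library.

```python
import math
from collections import Counter

# Made-up training data:
# (total page views, distinct offers viewed) -> 1 if the visitor booked, else 0.
training = [
    ((2, 1), 0),
    ((3, 1), 0),
    ((1, 1), 0),
    ((9, 4), 1),
    ((12, 5), 1),
    ((8, 3), 1),
]


def knn_predict(samples, query, k=3):
    """Return the majority label among the k training points closest to query."""
    # Sort training points by Euclidean distance to the query point.
    by_distance = sorted(samples, key=lambda sample: math.dist(sample[0], query))
    # Let the k nearest neighbors vote on the label.
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


prediction_browser = knn_predict(training, (2, 2))
prediction_buyer = knn_predict(training, (10, 4))
```

A visitor with few page views lands among the non-bookers, while one with many views of several offers is classified as a potential customer.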

Based on this classification, the hotel operator can then concentrate on the visitors classified as interested and, for example, target online marketing at these users or offer special packages.

Example regression

Using the same scenario, the next chapter presents another class of methods: regression. Here the aim is to predict a numerical value from the data. In the example, regression is used to predict the turnover generated by each visitor. In contrast to classification, visitors are not divided into fixed classes; instead, each is assigned a numerical value.

The following methods are presented:
- Linear regression
- Regression trees
- Random forest
- Gradient tree boosting
- Support vector machine regression
- Artificial neural networks
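
The simplest of these methods, linear regression with a single input, can be sketched in a few lines of plain Python using the closed-form least-squares solution. The numbers below are invented for illustration, and using page views as the only predictor of turnover is an assumption of this example.

```python
# Made-up data: total page views per visitor and the resulting turnover in euros.
page_counts = [1, 2, 3, 4, 5]
turnover = [40.0, 110.0, 140.0, 210.0, 250.0]


def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (covariance term) and sum of squared deviations of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(page_counts, turnover)

# Predicted turnover for a hypothetical visitor with six page views.
predicted = slope * 6 + intercept
```

Unlike the classifier, the model returns a continuous estimate rather than a class label.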

Example web scraping

Data science is often concerned not only with analyzing data but also with acquiring it. If the data is not available in structured form, web scraping makes it possible to extract it automatically from websites.

Using the example of extracting ratings from a hotel rating platform, the chapter shows the technical possibilities as well as the legal framework.
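
The extraction step can be sketched with Python's standard library alone. The HTML snippet and the `rating` class name below are invented; a real rating platform will use different markup, and whether its pages may be scraped at all depends on its terms of use and the applicable law.

```python
from html.parser import HTMLParser

# Invented HTML standing in for a downloaded review page (no network access here).
SAMPLE_HTML = """
<div class="review"><span class="rating">4.5</span><p>Great spa.</p></div>
<div class="review"><span class="rating">3.0</span><p>Noisy room.</p></div>
"""


class RatingParser(HTMLParser):
    """Collect the numeric text of every <span class="rating"> element."""

    def __init__(self):
        super().__init__()
        self.in_rating = False
        self.ratings = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs.
        if tag == "span" and ("class", "rating") in attrs:
            self.in_rating = True

    def handle_data(self, data):
        # Only text inside a rating span is converted to a number.
        if self.in_rating and data.strip():
            self.ratings.append(float(data.strip()))

    def handle_endtag(self, tag):
        if tag == "span":
            self.in_rating = False


parser = RatingParser()
parser.feed(SAMPLE_HTML)
ratings = parser.ratings
average_rating = sum(ratings) / len(ratings)
```

The result is a structured list of ratings that can feed directly into the kind of analysis described in the earlier sections.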



Andreas Stöckl

University of Applied Sciences Upper Austria / School of Informatics, Communications and Media