Natural Language Processing

The course is scheduled for the period May 4 to July 15 (day and time to be announced). Course locations: Zuidas Amsterdam and the Erasmus University Rotterdam campus.

Registration for this course

The registration deadline is April 19, 2020. Go to the registration page. Max. capacity: 25 students.

Course Content

Natural Language Processing (NLP) comprises statistical and machine learning tools for automatically analysing text data to derive useful insights from it. Vast amounts of information are stored in this form, and hence NLP has become one of the essential technologies of the big data age. In this course, core concepts and techniques from the area will be studied, with a focus on methods that are popular in business applications. These include n-gram models, word vectors, sentiment analysis and topic modelling.

This course offers students a theoretically informed understanding of NLP. It aims to broaden your knowledge of the methods involved in NLP and to provide hands-on experience with the steps that need to be taken in an NLP project. We focus on three aspects:

  • to create a deep(er) understanding of the main methods in NLP (n-grams, the lexicon approach, word2vec and other advanced machine learning methods);
  • to gain experience in scraping and cleaning the data yourself;
  • to apply this knowledge and experience in a group assignment that gives you the opportunity to show your creativity.

By the end of this course, you will be able to analyse and evaluate NLP approaches. Moreover, you will apply this knowledge and these skills in a real-life setting, translating theory into practice.

Topics covered:

  • Information theory, regular expressions and scraping (tokenization, stemming, lemmatization, parsing)
  • Word vectors and dimension reduction based on bag of words (n-grams, PCA)
  • Sentiment analysis (lexicon-based vs model-based)
  • Sentence completion (hidden Markov model, GPT and BERT)
  • Topic models (LDA) and word embeddings (GloVe, Word2Vec)
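
As an informal illustration of three of the topics listed above (regular-expression tokenization, n-grams and lexicon-based sentiment analysis), the sketch below uses only the Python standard library. It is not taken from the course material: the function names, the toy lexicon and the example sentence are all assumptions made up for this illustration, and a real project would typically rely on a much larger, validated lexicon.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and extract word tokens with a regular expression."""
    return re.findall(r"[a-z']+", text.lower())


def ngrams(tokens, n):
    """Return all runs of n consecutive tokens as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Hypothetical toy lexicon, invented for this example only.
LEXICON = {"good": 1, "great": 1, "bad": -1, "poor": -1}


def lexicon_sentiment(tokens, lexicon=LEXICON):
    """Sum the lexicon scores of the tokens; words not in the lexicon count as 0."""
    return sum(lexicon.get(tok, 0) for tok in tokens)


review = "Great service, good food, but the coffee was bad."
tokens = tokenize(review)
bigram_counts = Counter(ngrams(tokens, 2))  # bag of bigrams
score = lexicon_sentiment(tokens)           # positive minus negative hits
```

A score above zero suggests a mostly positive text under this toy lexicon. Model-based sentiment analysis, by contrast, would learn such weights from labelled data rather than fixing them by hand.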

Learning Objectives

By the end of the course students will be able to:

  • understand the fundamentals of natural language processing including different ways of representing text data for statistical analysis,
  • discuss and apply different sentiment analysis and topic modelling techniques,
  • program selected algorithms involved in these methods, and
  • be acquainted with NLP research areas. 


Coordinator/Lecturer: prof. dr. Bas Donkers (EUR)
Lecturer: dr. Meike Morren (VU)

Course Fees

Course fee for internal research master and PhD students: € 1.000,-
Course fee for external PhD students: € 1.250,-

Entrance Requirements

Recommended knowledge: Programming Basics, Mathematics, Statistics, Econometrics I, Supervised and Unsupervised Machine Learning
Required knowledge: Linear algebra, Regression, Machine Learning (e.g., classification, random forests, support vector machines)

Link to course manual

