
Parallel Computing and Big Data

July 12-16, 2021 Online

 


Faculty

Jeroen Engelberts has worked as a Research Software Engineer for the Rotterdam School of Management (RSM) at Erasmus University Rotterdam since 2018.
Meet the Lecturer

Course

In this course, we provide the theory and applications of state-of-the-art techniques for parallel computing. 

Level

The summer course welcomes Master’s and PhD students, alumni, and professionals in economics and related fields who are interested in parallel computing. The level is introductory: the course is aimed at participants who would like to familiarize themselves with the topic and acquire a solid basis from which to approach potential applications of parallel computing.

Topics Covered

Nowadays, even mobile phones and tablets have multi-core central processing units (CPUs), as do the simplest laptops and desktop PCs. Using their combined compute power, however, is not trivial. This is as true for small systems as it is for the world's largest compute systems. In data science, making efficient use of all available compute power is a skill that has to be learned. In this course you will be taught how to have all cores take part in a single task, or to have each core work on its own share of the total task.
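As an illustration of the second approach, where each core works on its own share of the total task, here is a minimal sketch using the `multiprocessing` module from the Python standard library. It is not taken from the course material: the toy task (a sum of squares) and the helper names `split_range`, `partial_sum` and `parallel_sum_of_squares` are assumptions chosen for the example.

```python
import os
from multiprocessing import Pool


def split_range(n, parts):
    """Split range(n) into `parts` contiguous (start, stop) pairs of near-equal size."""
    step, extra = divmod(n, parts)
    bounds, start = [], 0
    for i in range(parts):
        # The first `extra` shares get one additional element each.
        stop = start + step + (1 if i < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds


def partial_sum(bounds):
    """Work done by a single worker: the sum of squares over its own share."""
    start, stop = bounds
    return sum(i * i for i in range(start, stop))


def parallel_sum_of_squares(n, workers=None):
    """Give every worker process one share and combine the partial results."""
    workers = workers or os.cpu_count() or 1
    with Pool(processes=workers) as pool:
        return sum(pool.map(partial_sum, split_range(n, workers)))


if __name__ == "__main__":
    n = 1_000_000
    # Both results should agree; only the way the work is distributed differs.
    print(parallel_sum_of_squares(n) == partial_sum((0, n)))
```

For a task this small, the cost of starting worker processes can easily outweigh the gain. Splitting the work tends to pay off only when each share takes a substantial amount of time to compute.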

Modern-day researchers quite often have to rely on systems larger than their own. In the Netherlands, many use the national supercomputer clusters, Lisa and Cartesius, at SURFsara. Like most other large shared computer systems in research, these systems run UNIX or Linux as their operating system. On top of that, many of them use a batch system to give multiple users a fair share of the total resources. The Lisa system of SURFsara will be used during the first part of the course.

After working with Lisa, the different types of parallel programming will be taught with Python as the programming language. Although C and Fortran are very common in high-performance computing (HPC), it is also possible to use parallelism in Python, the language of choice for many researchers in the data science field.
The course comprises a Bash (Unix shell) course, a Python recap, an introduction to Jupyter Notebooks, and a programming course on working with different parallel modules and packages in Python. For the latter, the “Python Parallel Programming Cookbook” (2nd edition, 2019) is used. Although it is called a cookbook, it contains enough theory to build a foundation for a deeper understanding of parallel paradigms.
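To give an impression of what working with different parallel modules in Python can look like, the sketch below runs the same toy task on a thread pool and on a process pool through the common interface of the standard-library `concurrent.futures` module. It is an illustration only, not an excerpt from the course or the cookbook; the task `count_primes_below` and the helper `run_with` are assumptions made for the example.

```python
import math
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def count_primes_below(limit):
    """CPU-bound toy task: count the primes smaller than `limit` by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, math.isqrt(n) + 1)):
            count += 1
    return count


def run_with(executor_cls, limits, max_workers=4):
    """Run the same task list on any executor class; results keep the input order."""
    with executor_cls(max_workers=max_workers) as executor:
        return list(executor.map(count_primes_below, limits))


if __name__ == "__main__":
    limits = [10_000, 20_000, 30_000, 40_000]
    # Same call, two execution models: threads share one interpreter,
    # processes each run their own.
    print(run_with(ThreadPoolExecutor, limits))
    print(run_with(ProcessPoolExecutor, limits))
```

Both calls return the same counts. In standard CPython builds, the global interpreter lock generally prevents threads from speeding up CPU-bound work like this, so the process pool is usually the faster of the two here. Threads are typically a better fit for tasks that spend most of their time waiting on I/O.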

Literature:
Python Parallel Programming Cookbook, 2nd ed., 2019 (not included in course fee).

Admission requirements

Required knowledge: Programming Basics
Recommended: Mathematics, Statistics

Students are expected to have a background in calculus and linear algebra. Familiarity with an open-source language such as R or Python is a must.

 

Academic director: Jeroen Engelberts
Degree programme: Certificate
Credits: Participants who attend at least 80% of all sessions and hand in the assignment will receive a certificate of participation stating that the summer school is equivalent to a workload of 3 ECTS. Note that it is the student’s own responsibility to get these credits registered at their university.
Mode: Short-term
Language: English
Venue: Zoom
Capacity: 50 participants (minimum of 15)
Fees: Tuition Fees and Payments
Application deadline: June 14, 2021
Apply here: Application Form

Contact

Summer School