Explanatory Data Analysis group
Research projects
Below is a (non-exhaustive) list of concrete research projects in which our group is currently involved; also see the overview of research themes and valorisation examples.
Data Science for State-of-the-Art Blood Banking
- This project is part of the university-wide Data Science Research Programme.
- Project members
- Marieke Vinkenoog (Sanquin, LIACS), Matthijs van Leeuwen (LIACS), Aske Plaat (LIACS), Mart Janssen (Sanquin, LIACS; PI)
- External partner
- Sanquin (Amsterdam)
- Period
- 2018 – 2022
- Description
- The mission of Sanquin is to be a knowledge-driven organization that provides lifesaving products while committing itself to careful, responsible and efficient processing of the free and voluntary donor gift. At present, however, around 10% of donors are still deferred when tested at the donation site. Apart from a substantial loss of productivity, this leads to a substantial loss of donors. Recent advances in data science have the potential to substantially improve the understanding and control of blood bank processes by uncovering and utilizing known and unknown patterns in historical donation data. We will use (recurrent) neural networks to capture complex association structures within historical data and apply these to predict, for instance, Hb levels and/or no-show probabilities for future donations. A key challenge will be to correctly interpret and explain the obtained models and predictions.
Dementia back in the heart of the community
- This project is part of the university-wide Data Science Research Programme.
- Project members
- Daniela Gawehns (LIACS), Matthijs van Leeuwen (LIACS), Martine Huygens (NIVEL), Sandra van Beek (NIVEL), Peter Groenewegen (NIVEL), Joost Kok (UT), Janke de Groot (NIVEL; PI)
- External partners
- NIVEL (Utrecht), Stichting Maasduinen (Kaatsheuvel)
- Period
- 2018 – 2022
- Description
- Park Vossenberg is a long-term care organization that is currently rebuilding its facilities and redesigning the surrounding park area so that residents with dementia can use the outdoor park area freely. There will be no gates, and people from the surrounding residential area will also use the park. This project accompanies this change in care for people with dementia by monitoring activities and changes in persons with dementia, family members, nursing staff, volunteers, and people from the local community. We will do this through a process evaluation and a before-after design, with comparison to other locations where these changes have not yet been implemented. Specifically, we will analyse the activity patterns of persons with dementia as measured by sensors and observations. The results will provide a deeper understanding of interaction patterns between persons with dementia and their environment, and of how these interactions relate to the health and quality of life of people with dementia and to social cohesion in the local community.
The international tax system as a complex system
- This project is part of the university-wide Data Science Research Programme.
- Project members
- Manon Wintgens (Leiden Law School), Matthijs van Leeuwen (LIACS), Irma Mosquera Valderrama (Leiden Law School), Rex Arendsen (Leiden Law School; PI)
- Academic partner
- Leiden Law School (Leiden)
- Period
- 2017 – 2021
- Description
- The international tax system is composed of multiple layers, i.e., laws and regulations, jurisdictions, and businesses. Previously, these inherently different layers were often analysed from a fiscal perspective. In contrast, this data-driven research project aims to study the international tax system in its entirety from a complex systems perspective. The main goal will be to investigate if and how the international tax system can be defined and modelled as a complex system. Approaching the international tax system from this perspective is intended to yield new insights into all layers, e.g., 1) the effect of new and modified tax treaties; 2) the interaction between jurisdictions; and 3) the behaviour of business strategies over time. In addition, the project will address questions related to the existence of, e.g., tax gaps, legislative patterns, and tax havens, and how these are reflected in observational data. By applying, e.g., network modelling and pattern discovery, the researchers aim to understand the behaviour of the international tax system as a complex system.
Meta-modelling for privacy-preserving mining of medical data
- This project is part of the university-wide Data Science Research Programme.
- Project members
- Shannon Kroes (Sanquin, LIACS), Matthijs van Leeuwen (LIACS), Rolf Groenewold (LUMC), Rutger Middelburg (LUMC), Mart Janssen (Sanquin, LIACS; PI)
- External partner
- Sanquin (Leiden)
- Period
- 2017 – 2020
- Description
- In many domains, and in the medical domain in particular, it is important to protect the privacy of individuals. This implies that medical data often cannot be shared, even when scientific progress could strongly benefit from this. The goal of this ambitious project is to develop methods that construct meta-models of the data that 1) allow data analysis tasks, such as building predictive models, to be performed, while 2) not containing any sensitive information, thus guaranteeing privacy. By publishing these meta-models instead of the data, the scientific community can exploit the data without breaching the privacy of the individuals represented in the data.
SAPPAO – A Systems Approach towards Data Mining and Prediction in Airlines Operations
- Joint project with the Natural Computing Group.
- Project members
- Hugo Manuel Proença (LIACS), Sarang Kapoor (IIT Roorkee), Matthijs van Leeuwen (LIACS), Dhish Saxena (IIT Roorkee), Michael Emmerich (LIACS), Divyam Aggarwal (IIT Roorkee), Thomas Bäck (LIACS; PI)
- Industrial partner
- GE Aviation (Bangalore, India)
- Period
- 2016 – 2020
- Description
- By analysing historical flight data and data on the associated disruptive events on the flight network, the NWO-DeitY SAPPAO project aims to optimise the accuracy and reliability of predicting scheduled flight times, thereby potentially saving millions of euros through better utilisation of airplanes, decreased fuel consumption, decreased CO2 emissions, reduced ambient noise and better use of time for passengers and airports. At LIACS we will focus on feature construction for improved flight predictability and reduced airline operating cost. The challenge in this prediction is that it is not clear which features should be used to obtain the best estimates. There is a wide range of available data, including network data, time series data, and so on, which cannot be used straightforwardly in existing attribute-value based machine learning and statistical techniques. This project will deal with these challenges.
DAMIOSO – Data Mining on High Volume Simulation Output
- Joint project with the Natural Computing Group.
- Project members
- Sander van Rijn (LIACS), Matthijs van Leeuwen (LIACS), Stefan Manegold (LIACS, CWI), Michael Lew (LIACS), Thodoris Georgiou (LIACS), Pedro Holanda (CWI), Thomas Bäck (LIACS; PI)
- Other partners
- Honda Research Institute Europe (Offenbach, Germany)
- Period
- 2016 – 2020
- Description
- The DAMIOSO project, funded by NWO and Honda Research Institute Europe, focuses on developing algorithms and tools for data management, data mining and knowledge extraction from the massive volumes of data generated by modern simulation tools. Such tools are used in a wide range of industries (aerospace, automotive, shipping, and others). The aim is to deliver advanced design and process optimisation that supports engineers in their design processes.