Explanatory Data Analysis group

Matthijs van Leeuwen

Associate professor & group leader


The short (compressed) version

Matthijs likes data, patterns, algorithms, and information theory. He strives for data mining and machine learning methods and results that are principled, interpretable, and incorporate existing knowledge.

The longer version

Matthijs is associate professor and group leader at the Leiden Institute of Advanced Computer Science (LIACS), Leiden University. He is Programme Manager of the Master Computer Science, and affiliated with both the Leiden Centre of Data Science (LCDS) and the university-wide Data Science Research Programme (DSRP). His primary research interest is exploratory data mining: how can we enable domain experts to explore and analyse their data, to discover structure and, ultimately, novel knowledge?

For this, it is important that methods and results are explainable to domain experts, who may not be data scientists. His signature approach is to define and identify patterns that matter, i.e., succinct descriptions that characterise relevant structure present in the data. Which patterns matter strongly depends on the data and task at hand, hence defining the problem is one of the key challenges of exploratory data mining. Information-theoretic concepts such as the Minimum Description Length (MDL) principle have proven very useful to this end. Matthijs is also interested in interactive data mining, i.e., involving humans in the loop. Finally, he is interested in fundamental data mining research for real-world applications, both in science (e.g., life sciences, social sciences) and industry (e.g., manufacturing and engineering, aviation), as this is the best way to show that the theory works in practice.


Matthijs was previously a (tenure track) assistant professor (2017-2020) and senior researcher (2015-2017) at Leiden University, and a postdoctoral researcher at KU Leuven (2011-2015) and Universiteit Utrecht (2009-2011). He defended his Ph.D. thesis, titled Patterns that Matter, in February 2010, at Universiteit Utrecht. He has won several best paper awards at international conferences and was awarded NWO Rubicon, FWO Postdoc, and NWO TOP2 grants. He is General Chair of the IDA Council and editorial board member of Data Mining and Knowledge Discovery. Further, he has co-organised a number of international conferences and workshops, and co-lectured tutorials on 'Information Theoretic Methods in Data Mining'.

More information, including a CV, is available at www.patternsthatmatter.org

Selected recent publications

In press
Kroes, SKS, Janssen, MP, Groenwold, RHH & van Leeuwen, M Evaluating privacy of individuals in medical data. Health Informatics Journal, SAGE Publications
Marx, A, Yang, L & van Leeuwen, M Estimating Conditional Mutual Information for Discrete-Continuous Mixtures using Multi-Dimensional Adaptive Histograms. In: Proceedings of the SIAM Conference on Data Mining 2021 (SDM'21), SIAM, 2021.
Kapoor, S, Saxena, DK & van Leeuwen, M Online Summarization of Dynamic Graphs using Subjective Interestingness for Sequential Data. Data Mining and Knowledge Discovery vol.35(1), pp 88-126, 2021. (ECML PKDD journal track)
Gawehns, D & van Leeuwen, M Social Fluidity in Children's Face-to-Face Interaction Networks. In: Proceedings of the Graph Embedding and Mining (GEM) Workshop at ECML PKDD 2020, 2020.
Proença, HM, Grünwald, P, Bäck, T & van Leeuwen, M Discovering Outstanding Subgroup Lists for Numeric Targets using MDL. In: Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD 2020), Springer, 2020.
Vinkenoog, M, van den Hurk, K, van Kraaij, M, van Leeuwen, M & Janssen, M First results of a ferritin-based blood donor deferral policy in the Netherlands. Transfusion vol.60(8), pp 1785-1792, Wiley, 2020.
Faas, M & van Leeuwen, M Vouw: Geometric Pattern Mining using the MDL Principle. In: Proceedings of the Eighteenth International Symposium on Intelligent Data Analysis (IDA 2020), Springer, 2020.
Gautrais, C, Cellier, P, van Leeuwen, M & Termier, A Widening for MDL-based Retail Signature Discovery. In: Proceedings of the Eighteenth International Symposium on Intelligent Data Analysis (IDA 2020), Springer, 2020.
Kapoor, S, Saxena, DK & van Leeuwen, M Discovering Subjectively Interesting Multigraph Patterns. Machine Learning, pp 1-28, Springer
Proença, HM & van Leeuwen, M Interpretable multiclass classification by MDL-based rule lists. Information Sciences vol.512, pp 1372-1393, Elsevier, 2020.