Explanatory Data Analysis group

Matthijs van Leeuwen

Group leader & assistant professor

The short (compressed) version

Matthijs likes data, patterns, algorithms, and information theory. He strives for data mining and machine learning methods and results that are principled and interpretable, and that incorporate existing knowledge.

The longer version

Matthijs is assistant professor and group leader at the Leiden Institute of Advanced Computer Science (LIACS), the computer science institute of Leiden University. He is also affiliated with the Leiden Centre of Data Science (LCDS) and the university-wide Data Science Research Programme (DSRP). His primary research interest is exploratory data mining: how can we enable domain experts to explore and analyse their data, to discover structure and, ultimately, novel knowledge?

For this it is important that methods and results are explainable to domain experts, who may not be data scientists. His signature approach is to define and identify patterns that matter, i.e., succinct descriptions that characterise relevant structure present in the data. Which patterns matter strongly depends on the data and task at hand; hence, defining the problem is one of the key challenges of exploratory data mining. Information-theoretic concepts such as the Minimum Description Length (MDL) principle have proven very useful to this end. Matthijs is also interested in interactive data mining, i.e., involving humans in the loop. Finally, he is interested in fundamental data mining research for real-world applications, both in science (e.g., life sciences, social sciences) and industry (e.g., manufacturing and engineering, aviation), as this is the best way to show that the theory works in practice.


Matthijs was previously a senior researcher at Leiden University (2015-2017), and a postdoctoral researcher at KU Leuven (2011-2015) and Universiteit Utrecht (2009-2011). He defended his Ph.D. thesis, titled Patterns that Matter, in February 2010, at Universiteit Utrecht. He won several best paper awards at international conferences and was awarded NWO Rubicon, FWO Postdoc, and NWO TOP2 grants. He co-organised a number of international conferences and workshops, such as IDA and IDEA, and co-lectured tutorials on 'Information Theoretic Methods in Data Mining'.

More information, including CV, at www.patternsthatmatter.org

Selected recent publications

Gawehns, D, Veiga, G & van Leeuwen, M Focus on dynamics: a proof of principle in exploratory data mining of face-to-face interactions. In: Proceedings of the 5th International Conference on Computational Social Science (IC2S2), 2019. (Poster presentation)
Proença, HM, Klijn, R, Bäck, T & van Leeuwen, M Identifying flight delay patterns using diverse subgroup discovery. In: Proceedings of the Symposium Series on Computational Intelligence (SSCI'18), IEEE, 2018.
van Rijn, S, van Leeuwen, M, Schmitt, S, Olhofer, M & Bäck, T Multi-Fidelity Surrogate Model Approach to Optimization. In: Proceedings of the Genetic and Evolutionary Computation Conference (GECCO'18), ACM, 2018.
van Os, H, Ramos, L, Hilbert, A, van Leeuwen, M, van Walderveen, M, Kruyt, N, Dippel, D, Steyerberg, E, van der Schaaf, I, Lingsma, H, Schonewille, W, Majoie, C, Olabarriaga, S, Zwinderman, K, Venema, E, Marquering, H & Wermer, M Predicting outcome of endovascular treatment for acute ischemic stroke: potential value of machine learning algorithms. Frontiers in Neurology vol.9(784), Frontiers, 2018.
van Leeuwen, M, Chau, DH, Vreeken, J, Shahaf, D & Faloutsos, C Editorial: TKDD Special Issue on Interactive Data Exploration and Analytics. Transactions on Knowledge Discovery from Data vol.12(1), ACM, 2018.
Ukkonen, A, Dzyuba, V & van Leeuwen, M Explaining Deviating Subsets through Explanation Networks. In: Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECMLPKDD'17), Springer, 2017.
Dzyuba, V & van Leeuwen, M Learning what matters – Sampling interesting patterns. In: Proceedings of the Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD'17), pp 534-546, Springer, 2017.
Paramonov, S, van Leeuwen, M & De Raedt, L Relational Data Factorization. Machine Learning vol.106(12), pp 1867-1904, Springer, 2017.
Dzyuba, V, van Leeuwen, M & De Raedt, L Flexible constrained sampling with guarantees for pattern mining. Data Mining and Knowledge Discovery vol.31(5), pp 1266-1293, Springer, 2017. (ECMLPKDD'17 Special Issue)
Le Van, T, Nijssen, S, van Leeuwen, M & De Raedt, L Semiring Rank Matrix Factorisation. Transactions on Knowledge and Data Engineering vol.29(8), pp 1737-1750, IEEE, 2017.