Explanatory Data Analysis group

Matthijs van Leeuwen

Group leader & assistant professor

The short (compressed) version

Matthijs likes data, patterns, algorithms, and information theory. He thinks that data mining and machine learning results should be explainable and interpretable.

The longer version

Matthijs is an assistant professor and group leader at the Leiden Institute of Advanced Computer Science (LIACS), the computer science institute of Leiden University. He is also affiliated with the Leiden Centre of Data Science (LCDS). His primary research interest is exploratory data mining: how can we enable domain experts to explore and analyse their data, to discover structure and, ultimately, novel knowledge?

For this, it is essential that all methods and results are explainable to domain experts, who may not be data scientists. His approach is therefore to define and identify patterns that matter, i.e., succinct descriptions that characterise relevant structure present in the data. Which patterns matter depends strongly on the data and task at hand; hence, defining the problem is one of the key challenges of exploratory data mining. Information-theoretic concepts such as the Minimum Description Length (MDL) principle have proven very useful to this end. Matthijs is also interested in interactive data mining, i.e., involving humans in the loop. Finally, he enjoys doing fundamental data mining research for real-world applications, both in science (e.g., life sciences, social sciences) and industry (e.g., manufacturing and engineering, aviation).


Matthijs was previously a senior researcher at Leiden University (2015-2017), and a postdoctoral researcher at KU Leuven (2011-2015) and Universiteit Utrecht (2009-2011). He defended his Ph.D. thesis, titled Patterns that Matter, in February 2010 at Universiteit Utrecht. He won several best paper awards at international conferences and was awarded NWO Rubicon and FWO Postdoc grants. He co-organised a number of international conferences and workshops, such as IDA and IDEA, and co-lectured tutorials on 'Information Theoretic Methods in Data Mining'.

More information, including CV, at www.patternsthatmatter.org

Selected recent publications

In press
Paramonov, S, van Leeuwen, M & De Raedt, L Relational Data Factorization. Machine Learning, Springer
Ukkonen, A, Dzyuba, V & van Leeuwen, M Explaining Deviating Subsets through Explanation Networks. In: Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECMLPKDD'17), Springer, 2017.
Dzyuba, V & van Leeuwen, M Learning what matters – Sampling interesting patterns. In: Proceedings of the Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD'17), pp 534-546, Springer, 2017.
Dzyuba, V, van Leeuwen, M & De Raedt, L Flexible constrained sampling with guarantees for pattern mining. Data Mining and Knowledge Discovery vol.31(5), pp 1266-1293, Springer, 2017. (ECMLPKDD'17 Special Issue)
Le Van, T, Nijssen, S, van Leeuwen, M & De Raedt, L Semiring Rank Matrix Factorisation. Transactions on Knowledge and Data Engineering vol.29(8), pp 1737-1750, IEEE, 2017.
van Stein, B, van Leeuwen, M, Wang, H, Purr, S, Kreissl, S, Meinhardt, J & Bäck, T Towards Data Driven Process Control in Manufacturing Car Body Parts. In: Proceedings of IEEE International Conference on Computational Science and Computational Intelligence (IEEE CSCI-ISBD'16), IEEE, 2016.
van Rijn, S, Wang, H, van Leeuwen, M & Bäck, T Evolving the Structure of Evolution Strategies. In: Proceedings of IEEE Symposium Series on Computational Intelligence (IEEE SSCI'16), IEEE, 2016.
van Stein, B, van Leeuwen, M & Bäck, T Local Subspace-Based Outlier Detection using Global Neighbourhoods. In: Proceedings of IEEE International Conference on Big Data (IEEE BigData'16), IEEE, 2016.
van Leeuwen, M & Ukkonen, A Expect the Unexpected - On the Significance of Subgroups. In: Proceedings of Discovery Science (DS'16), pp 51-66, Springer, 2016.
Le Van, T, van Leeuwen, M, Fierro, AC, De Maeyer, D, Van den Eynden, J, Verbeke, L, De Raedt, L, Marchal, K & Nijssen, S Simultaneous discovery of cancer subtypes and subtype features by molecular data integration. Bioinformatics vol.32(17), pp 445-454, Oxford University Press, 2016.
Copmans, D, Meinl, T, Dietz, C, van Leeuwen, M, Ortmann, J, Berthold, M & de Witte, PAM A KNIME-based Analysis of the Zebrafish Photomotor Response Clusters the Phenotypes of 14 Classes of Neuroactive Molecules. Journal of Biomolecular Screening vol.21(5), pp 427-436, SAGE Publishing, 2016.
van Leeuwen, M, De Bie, T, Spyropoulou, E & Mesnage, C Subjective Interestingness of Subgraph Patterns. Machine Learning vol.105(1), pp 41-75, Springer, 2016.