Explanatory Data Analysis group

Publications

Publications by type (by year)

Journal articles

In press
Li, Z, Liang, , Shi, J & van Leeuwen, M Cross-Domain Graph Level Anomaly Detection. Transactions on Knowledge and Data Engineering, IEEE
2024
Salehkaleybar, S, Khorasani, S, Kiyavash, N, He, N & Thiran, P A unified experiment design approach for cyclic and acyclic causal models. Transactions on Machine Learning Research vol.1(1), pp 1-1, 2024.
Li, Z, Zhu, Y & van Leeuwen, M A Survey on Explainable Anomaly Detection. Transactions on Knowledge Discovery from Data vol.18(1), ACM, 2024.
2023
Mokhtarian, E, Salehkaleybar, S, Ghassami, A & Kiyavash, N A unified experiment design approach for cyclic and acyclic causal models. Journal of Machine Learning Research vol.24(354), pp 1-31, 2023.
Kroes, SKS, van Leeuwen, M, Groenwold, RHH & Janssen, MP Evaluating Cluster-Based Synthetic Data Generation for Blood-Transfusion Analysis. Journal of Cybersecurity and Privacy vol.3(4), pp 882-894, MDPI, 2023.
van Dijk, R, Gawehns, D & van Leeuwen, M WEARDA: recording wearable sensor data for human activity monitoring. Journal of Open Research Software vol.11(1), 2023.
Vinkenoog, M, Toivonen, J, van Leeuwen, M, Janssen, M & Arvas, M The added value of ferritin levels and genetic markers for the prediction of haemoglobin deferral. Vox Sanguinis vol.118(10), pp 825-834, 2023.
Li, Z & van Leeuwen, M Explainable Contextual Anomaly Detection using Quantile Regression Forests. Data Mining and Knowledge Discovery, Springer
van der Arend, B, Verhagen, I, van Leeuwen, M, van der Arend, M, van Casteren, D & Terwindt, G Defining migraine days, based on longitudinal E-diary data. Cephalalgia
Yang, L, Baratchi, M & van Leeuwen, M Unsupervised Discretization by Two-dimensional MDL-based Histogram. Machine Learning, Springer
Kroes, SKS, van Leeuwen, M, Groenwold, RHH & Janssen, MP Generating synthetic mixed discrete-continuous health records with mixed sum-product networks. Journal of the American Medical Informatics Association vol.30(1), Oxford University Press, 2023.
2022
Li, Z & van Leeuwen, M Feature Selection for Fault Detection and Prediction based on Event Log Analysis. ACM SIGKDD Explorations vol.24(2), ACM, 2022.
Proença, HM, Grünwald, P, Bäck, T & van Leeuwen, M Robust subgroup discovery - Discovering subgroup lists using MDL. Data Mining and Knowledge Discovery
van Rijn, S, Schmitt, S, van Leeuwen, M & Bäck, T Finding Efficient Trade-offs in Multi-Fidelity Response Surface Modeling. Engineering Optimization
Vinkenoog, M, Steenhuis, M, ten Brinke, A, van Hasselt, C, Janssen, M, van Leeuwen, M, Swaneveld, F, Vrielink, H, van de Watering, L, Quee, F, van den Hurk, K, Rispens, T, Hogema, B & van der Schoot, E Associations between symptoms, donor characteristics and IgG antibody response in 2082 COVID-19 convalescent plasma donors. Frontiers in Immunology, Frontiers
2021
Kroes, SKS, Janssen, MP, Groenwold, RHH & van Leeuwen, M Evaluating privacy of individuals in medical data. Health Informatics Journal, SAGE Publications
Kapoor, S, Saxena, DK & van Leeuwen, M Online Summarization of Dynamic Graphs using Subjective Interestingness for Sequential Data. Data Mining and Knowledge Discovery vol.35(1), pp 88-126, 2021. (ECML PKDD journal track)
2020
van Rijn, S & Schmitt, S MF2: A Collection of Multi-Fidelity Benchmark Functions in Python. Journal of Open Source Software vol.5(52), 2020.
Vinkenoog, M, van den Hurk, K, van Kraaij, M, van Leeuwen, M & Janssen, M First results of a ferritin-based blood donor deferral policy in the Netherlands. Transfusion vol.60(8), pp 1785-1792, Wiley, 2020.
Kapoor, S, Saxena, DK & van Leeuwen, M Discovering Subjectively Interesting Multigraph Patterns. Machine Learning, pp 1-28, Springer
Proença, HM & van Leeuwen, M Interpretable multiclass classification by MDL-based rule lists. Information Sciences vol.512, pp 1372-1393, Elsevier, 2020.
2019
van Leeuwen, M, Chau, DH, Vreeken, J, Shahaf, D & Faloutsos, C Addendum to the Special Issue on Interactive Data Exploration and Analytics (TKDD, Vol. 12, Iss. 1): Introduction by the Guest Editors. Transactions on Knowledge Discovery from Data vol.13(1), ACM, 2019.
2018
van Os, H, Ramos, L, Hilbert, A, van Leeuwen, M, van Walderveen, M, Kruyt, N, Dippel, D, Steyerberg, E, van der Schaaf, I, Lingsma, H, Schonewille, W, Majoie, C, Olabarriaga, S, Zwinderman, K, Venema, E, Marquering, H & Wermer, M Predicting outcome of endovascular treatment for acute ischemic stroke: potential value of machine learning algorithms. Frontiers in Neurology vol.9(784), Frontiers, 2018.
van Leeuwen, M, Chau, DH, Vreeken, J, Shahaf, D & Faloutsos, C Editorial: TKDD Special Issue on Interactive Data Exploration and Analytics. Transactions on Knowledge Discovery from Data vol.12(1), ACM, 2018.
2017
Paramonov, S, van Leeuwen, M & De Raedt, L Relational Data Factorization. Machine Learning vol.106(12), pp 1867-1904, Springer, 2017.
Dzyuba, V, van Leeuwen, M & De Raedt, L Flexible constrained sampling with guarantees for pattern mining. Data Mining and Knowledge Discovery vol.31(5), pp 1266-1293, Springer, 2017. (ECMLPKDD'17 Special Issue)
Le Van, T, Nijssen, S, van Leeuwen, M & De Raedt, L Semiring Rank Matrix Factorisation. Transactions on Knowledge and Data Engineering vol.29(8), pp 1737-1750, IEEE, 2017.
2016
Le Van, T, van Leeuwen, M, Fierro, AC, De Maeyer, D, Van den Eynden, J, Verbeke, L, De Raedt, L, Marchal, K & Nijssen, S Simultaneous discovery of cancer subtypes and subtype features by molecular data integration. Bioinformatics vol.32(17), pp 445-454, Oxford University Press, 2016.
Copmans, D, Meinl, T, Dietz, C, van Leeuwen, M, Ortmann, J, Berthold, M & de Witte, PAM A KNIME-based Analysis of the Zebrafish Photomotor Response Clusters the Phenotypes of 14 Classes of Neuroactive Molecules. Journal of Biomolecular Screening vol.21(5), pp 427-436, SAGE Publishing, 2016.
van Leeuwen, M, De Bie, T, Spyropoulou, E & Mesnage, C Subjective Interestingness of Subgraph Patterns. Machine Learning vol.105(1), pp 41-75, Springer, 2016.

Conference papers

2024
Yang, L & van Leeuwen, M Conditional Density Estimation with Histogram Trees. In: Proceedings of the Conference on Neural Information Processing Systems (NeurIPS 2024), 2024.
Yang, Y, Salehkaleybar, S & Kiyavash, N Learning Unknown Intervention Targets in Structural Causal Models from Heterogeneous Data. In: International Conference on Artificial Intelligence and Statistics, pp 3187-3195, PMLR, 2024.
Li, Z, Shi, J & van Leeuwen, M Graph Neural Networks based Log Anomaly Detection and Explanation. In: Proceedings of the 2024 IEEE/ACM 46th International Conference on Software Engineering: Companion Proceedings, pp 306-307, ACM, 2024.
2023
Lopez-Martinez-Carrasco, A, Proença, HM, Juarez, JM, van Leeuwen, M & Campos, M Novel approach for phenotyping based on diverse top-k subgroup lists. In: Proceedings of the Conference on Artificial Intelligence In Medicine (AIME 2023), Springer, 2023.
Lopez-Martinez-Carrasco, A, Proença, HM, Juarez, JM, van Leeuwen, M & Campos, M Discovering Diverse Top-k Characteristic Lists. In: Proceedings of the 21st International Symposium on Intelligent Data Analysis (IDA 2023), Springer, 2023.
Papagianni, I & van Leeuwen, M Discovering Rule Lists with Preferred Variables. In: Proceedings of the 21st International Symposium on Intelligent Data Analysis (IDA 2023), Springer, 2023.
2022
Yang, L & van Leeuwen, M Truly Unordered Probabilistic Rule Sets for Multi-class Classification. In: Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECMLPKDD 2022), Springer, 2022.
2021
Marx, A, Yang, L & van Leeuwen, M Estimating Conditional Mutual Information for Discrete-Continuous Mixtures using Multi-Dimensional Adaptive Histograms. In: Proceedings of the SIAM Conference on Data Mining 2021 (SDM'21), SIAM, 2021.
2020
Proença, HM, Grünwald, P, Bäck, T & van Leeuwen, M Discovering Outstanding Subgroup Lists for Numeric Targets using MDL. In: Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECMLPKDD 2020), Springer, 2020.
Faas, M & van Leeuwen, M Vouw: Geometric Pattern Mining using the MDL Principle. In: Proceedings of the Eighteenth International Symposium on Intelligent Data Analysis (IDA 2020), Springer, 2020.
Gautrais, C, Cellier, P, van Leeuwen, M & Termier, A Widening for MDL-based Retail Signature Discovery. In: Proceedings of the Eighteenth International Symposium on Intelligent Data Analysis (IDA 2020), Springer, 2020.
2018
Proença, HM, Klijn, R, Bäck, T & van Leeuwen, M Identifying flight delay patterns using diverse subgroup discovery. In: Proceedings of the Symposium Series on Computational Intelligence (SSCI'18), IEEE, 2018.
van Rijn, S, Doerr, C & Bäck, T Towards an Adaptive CMA-ES Configurator. In: Proceedings of the International Conference on Parallel Problem Solving from Nature (PPSN'18), 2018.
Doerr, C, Ye, F, van Rijn, S, Wang, H & Bäck, T Towards a Theory-Guided Benchmarking Suite for Discrete Black-Box Optimization: Profiling (1+λ) EA Variants on OneMax and LeadingOnes. In: Proceedings of the Genetic and Evolutionary Computation Conference (GECCO'18), pp 951-958, ACM, 2018.
van Rijn, S, van Leeuwen, M, Schmitt, S, Olhofer, M & Bäck, T Multi-Fidelity Surrogate Model Approach to Optimization. In: Proceedings of the Genetic and Evolutionary Computation Conference (GECCO'18), ACM, 2018.
2017
Ukkonen, A, Dzyuba, V & van Leeuwen, M Explaining Deviating Subsets through Explanation Networks. In: Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECMLPKDD'17), Springer, 2017.
van Rijn, S, Wang, H, van Stein, B & Bäck, T Algorithm configuration data mining for CMA evolution strategies. In: Proceedings of the Genetic and Evolutionary Computation Conference (GECCO'17), pp 737-744, ACM, 2017.
Dzyuba, V & van Leeuwen, M Learning what matters – Sampling interesting patterns. In: Proceedings of the Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD'17), pp 534-546, Springer, 2017.
2016
van Stein, B, van Leeuwen, M, Wang, H, Purr, S, Kreissl, S, Meinhardt, J & Bäck, T Towards Data Driven Process Control in Manufacturing Car Body Parts. In: Proceedings of IEEE International Conference on Computational Science and Computational Intelligence (IEEE CSCI-ISBD'16), IEEE, 2016.
van Rijn, S, Wang, H, van Leeuwen, M & Bäck, T Evolving the Structure of Evolution Strategies. In: Proceedings of IEEE Symposium Series on Computational Intelligence (IEEE SSCI'16), IEEE, 2016.
van Stein, B, van Leeuwen, M & Bäck, T Local Subspace-Based Outlier Detection using Global Neighbourhoods. In: Proceedings of IEEE International Conference on Big Data (IEEE BigData'16), IEEE, 2016.
van Leeuwen, M & Ukkonen, A Expect the Unexpected - On the Significance of Subgroups. In: Proceedings of Discovery Science (DS'16), pp 51-66, Springer, 2016.
van Leeuwen, M & Galbrun, E Association Discovery in Two-View Data (extended abstract). In: TKDE Poster Track of ICDE 2016, IEEE, 2016.

Workshop and demo papers

2024
Yang, L & van Leeuwen, M Human-guided Rule Learning for ICU Readmission Risk Analysis. In: Proceedings of the Workshop on AI and Data Science for Healthcare (AIDSH) at KDD 2024, 2024.
2022
Li, Z & van Leeuwen, M Feature Selection for Fault Detection and Prediction based on Log Analysis. In: Proceedings of the International Workshop on AI for Manufacturing at ECMLPKDD 2022, 2022.
Yang, L, Opdam, T & van Leeuwen, M Histogram-based Probabilistic Rule Lists for Numeric Targets. In: Proceedings of the 20th anniversary Workshop on Knowledge Discovery in Inductive Databases (KDID 2022) at ECMLPKDD 2022, CEUR Workshop Proceedings, 2022.
2020
Gawehns, D & van Leeuwen, M Social Fluidity in Children's Face-to-Face Interaction Networks. In: Proceedings of the Graph Embedding and Mining (GEM) Workshop at ECML PKDD 2020, 2020.
2019
Vinkenoog, M, Janssen, M & van Leeuwen, M Challenges and Limitations in Clustering Blood Donor Hemoglobin Trajectories. In: Proceedings of 4th Workshop on Advanced Analytics and Learning on Temporal Data at ECMLPKDD 2019, Springer, 2019.

Extended abstracts (peer-reviewed)

2022
Spaink, HA, Verhagen, IE, van Leeuwen, M & Terwindt, GM Methodological considerations in predicting migraine attacks using machine learning. In: MTIS 2022 Cephalalgia Abstracts, Sage Publications, 2022.
Yang, L & van Leeuwen, M Probabilistic Rule Sets Ready for Interactive Machine Learning. In: AAAI'22-Workshop on Interactive Machine Learning, 2022.
2019
Gawehns, D, Veiga, G & van Leeuwen, M Focus on dynamics: a proof of principle in exploratory data mining of face-to-face interactions. In: Proceedings of the 5th International Conference on Computational Social Science (IC2S2), 2019. (Poster presentation)

Proceedings (edited volumes)

2017
Chau, DH, Vreeken, J, van Leeuwen, M, Shahaf, D & Faloutsos, C (eds) Proceedings of the ACM SIGKDD Workshop on Interactive Data Exploration and Analytics (IDEA'17). 2017.
2016
Chau, DH, Vreeken, J, van Leeuwen, M, Shahaf, D & Faloutsos, C (eds) Proceedings of the ACM SIGKDD Workshop on Interactive Data Exploration and Analytics (IDEA'16). 2016.

Theses

2021
Proença, HM Robust rules for prediction and description. PhD thesis, Leiden University, 2021.
Kapoor, S Subjectively Interesting Patterns in Networks. PhD thesis, IIT Roorkee, 2021.

Technical reports

2022
Li, Z, Zhu, Y & van Leeuwen, M A Survey on Explainable Anomaly Detection. Technical Report arXiv:2210.06959, arXiv, 2022.
2021
Proença, HM, Bäck, T & van Leeuwen, M Robust subgroup discovery. Technical Report arXiv:2103.13686, arXiv, 2021.
van Rijn, S, Schmitt, S, van Leeuwen, M & Bäck, T Finding Efficient Trade-offs in Multi-Fidelity Response Surface Modeling. Technical Report arXiv:2103.03280, arXiv, 2021.
Marx, A, Yang, L & van Leeuwen, M Estimating Conditional Mutual Information for Discrete-Continuous Mixtures using Multi-Dimensional Adaptive Histograms. Technical Report arXiv:2101.05009, arXiv, 2021.
2020
Proença, HM, Grünwald, P, Bäck, T & van Leeuwen, M Discovering outstanding subgroup lists for numeric targets using MDL. Technical Report arXiv:2006.09186, arXiv, 2020.
Yang, L, Baratchi, M & van Leeuwen, M Unsupervised Discretization by Two-dimensional MDL-based Histogram. Technical Report arXiv:2006.01893, arXiv, 2020.
2019
Faas, M & van Leeuwen, M Vouw: Geometric Pattern Mining using the MDL Principle. Technical Report arXiv:1911.09587, arXiv, 2019.
Proença, HM & van Leeuwen, M Interpretable multiclass classification by MDL-based rule lists. Technical Report arXiv:1905.00328, arXiv, 2019.
2017
Dzyuba, V & van Leeuwen, M Learning what matters - Sampling interesting patterns. Technical Report arXiv:1702.01975, arXiv, 2017.
2016
van Stein, B, van Leeuwen, M & Bäck, T Local Subspace-Based Outlier Detection using Global Neighbourhoods. Technical Report arXiv:1611.00183, arXiv, 2016.
Dzyuba, V, van Leeuwen, M & De Raedt, L Flexible constrained sampling with guarantees for pattern mining. Technical Report arXiv:1610.09263, arXiv, 2016.
van Rijn, S, Wang, H, van Leeuwen, M & Bäck, T Evolving the Structure of Evolution Strategies. Technical Report arXiv:1610.05231, arXiv, 2016.