Explanatory Data Analysis group
Publications
Publications by year (by type)
In press
Interpretable Machine Learning for Identifying ICU Readmission Risk in Subgroups with Probabilistic Rules. Journal of the American Medical Informatics Association, Oxford Journals.
2025
Scalable, Explainable and Provably Robust Anomaly Detection with One-Step Flow Matching. In: Proceedings of the Conference on Neural Information Processing Systems (NeurIPS 2025), 2025.
Challenges and Algorithms for Knowledge Discovery from Data - Essays Dedicated to Arno Siebes on the Occasion of His 67th Birthday. Springer, 2025.
Snor: Simpler Descriptions Through Overlapping Patterns. In: van Leeuwen, M & Vreeken, J (eds) Challenges and Algorithms for Knowledge Discovery from Data, pp 56-74, Springer, 2025.
Discovering multiple antibiotic resistance phenotypes using diverse top-k subgroup list discovery. Artificial Intelligence In Medicine vol.167, Elsevier, 2025.
Efficiently Escaping Saddle Points for Non-Convex Policy Optimization. In: Proceedings of the Conference on Uncertainty in Artificial Intelligence (UAI 2025), 2025.
Causal Effect Identification in Heterogeneous Environments from Higher-Order Moments. In: Proceedings of the Conference on Uncertainty in Artificial Intelligence (UAI 2025), 2025.
MetaOptimize: A Framework for Optimizing Step Sizes and Other Meta-parameters. In: Proceedings of the International Conference on Machine Learning (ICML 2025), 2025.
Causal Effect Identification in LiNGAM from Higher-Order Cumulants. In: Proceedings of the International Conference on Machine Learning (ICML 2025), 2025.
Hierarchical Reinforcement Learning with Targeted Causal Interventions. In: Proceedings of the International Conference on Machine Learning (ICML 2025), 2025.
Towards Automated Self-Supervised Learning for Truly Unsupervised Graph Anomaly Detection. Data Mining and Knowledge Discovery vol.39(44), Springer, 2025.
Multi-Domain Causal Discovery in Bijective Causal Models. In: Proceedings of the Conference on Causal Learning and Reasoning (CLeaR 2025), 2025.
Using consumer wearables to estimate physical activity of nursing home residents with dementia. Exploration of Digital Health Technologies vol.3, Open Exploration, 2025.
2024
Conditional Density Estimation with Histogram Trees. In: Proceedings of the Conference on Neural Information Processing Systems (NeurIPS 2024), 2024.
Cross-Domain Graph Level Anomaly Detection. Transactions on Knowledge and Data Engineering vol.36(12), IEEE, 2024.
Causal Effect Identification in LiNGAM Models with Latent Confounders. In: Proceedings of the International Conference on Machine Learning (ICML 2024), 2024.
Learning Unknown Intervention Targets in Structural Causal Models from Heterogeneous Data. In: International Conference on Artificial Intelligence and Statistics, pp 3187-3195, PMLR, 2024.
A unified experiment design approach for cyclic and acyclic causal models. Transactions on Machine Learning Research vol.1(1), pp 1-1, 2024.
Momentum-Based Policy Gradient with Second-Order Information. Transactions on Machine Learning Research.
A Survey on Explainable Anomaly Detection. Transactions on Knowledge Discovery from Data vol.18(1), ACM, 2024.
Human-guided Rule Learning for ICU Readmission Risk Analysis. In: Proceedings of the Workshop on AI and Data Science for Healthcare (AIDSH) at KDD 2024, 2024.
Graph Neural Networks based Log Anomaly Detection and Explanation. In: Proceedings of the 2024 IEEE/ACM 46th International Conference on Software Engineering: Companion Proceedings, pp 306-307, ACM, 2024.
2023
A unified experiment design approach for cyclic and acyclic causal models. Journal of Machine Learning Research vol.24(354), pp 1-31, 2023.
Evaluating Cluster-Based Synthetic Data Generation for Blood-Transfusion Analysis. Journal of Cybersecurity and Privacy vol.3(4), pp 882-894, MDPI, 2023.
WEARDA: recording wearable sensor data for human activity monitoring. Journal of Open Research Software vol.11(1), 2023.
The added value of ferritin levels and genetic markers for the prediction of haemoglobin deferral. Vox Sanguinis vol.118(10), pp 825-834, 2023.
Explainable Contextual Anomaly Detection using Quantile Regression Forests. Data Mining and Knowledge Discovery, Springer.
Discovering Diverse Top-k Characteristic Lists. In: Proceedings of the 21st International Symposium on Intelligent Data Analysis (IDA 2023), Springer, 2023.
Discovering Rule Lists with Preferred Variables. In: Proceedings of the 21st International Symposium on Intelligent Data Analysis (IDA 2023), Springer, 2023.
Defining migraine days, based on longitudinal E-diary data. Cephalalgia.
Unsupervised Discretization by Two-dimensional MDL-based Histogram. Machine Learning, Springer.
Generating synthetic mixed discrete-continuous health records with mixed sum-product networks. Journal of the American Medical Informatics Association vol.30(1), Oxford University Press, 2023.
2022
Feature Selection for Fault Detection and Prediction based on Event Log Analysis. ACM SIGKDD Explorations vol.24(2), ACM, 2022.
A Survey on Explainable Anomaly Detection. Technical Report arXiv:2210.06959, arXiv, 2022.
Methodological considerations in predicting migraine attacks using machine learning. In: MTIS 2022 Cephalalgia Abstracts, Sage Publications, 2022.
Feature Selection for Fault Detection and Prediction based on Log Analysis. In: Proceedings of the international workshop on AI for Manufacturing Workshop at ECMLPKDD 2022, 2022.
Histogram-based Probabilistic Rule Lists for Numeric Targets. In: Proceedings of the 20th anniversary Workshop on Knowledge Discovery in Inductive Databases (KDID 2022) at ECMLPKDD 2022, CEUR Workshop Proceedings, 2022.
Truly Unordered Probabilistic Rule Sets for Multi-class Classification. In: Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECMLPKDD 2022), Springer, 2022.
Robust subgroup discovery - Discovering subgroup lists using MDL. Data Mining and Knowledge Discovery.
Finding Efficient Trade-offs in Multi-Fidelity Response Surface Modeling. Engineering Optimization.
Probabilistic Rule Sets Ready for Interactive Machine Learning. In: AAAI'22-Workshop on Interactive Machine Learning, 2022.
Associations between symptoms, donor characteristics and IgG antibody response in 2082 COVID-19 convalescent plasma donors. Frontiers in Immunology, Frontiers.
2021
Evaluating privacy of individuals in medical data. Health Informatics Journal, SAGE Publications.
Estimating Conditional Mutual Information for Discrete-Continuous Mixtures using Multi-Dimensional Adaptive Histograms. In: Proceedings of the SIAM Conference on Data Mining 2021 (SDM'21), SIAM, 2021.
Robust subgroup discovery. Technical Report arXiv:2103.13686, arXiv, 2021.
Finding Efficient Trade-offs in Multi-Fidelity Response Surface Modeling. Technical Report arXiv:2103.03280, arXiv, 2021.
Online Summarization of Dynamic Graphs using Subjective Interestingness for Sequential Data. Data Mining and Knowledge Discovery vol.35(1), pp 88-126, 2021. (ECML PKDD journal track)
Estimating Conditional Mutual Information for Discrete-Continuous Mixtures using Multi-Dimensional Adaptive Histograms. Technical Report arXiv:2101.05009, arXiv, 2021.
2020
Social Fluidity in Children's Face-to-Face Interaction Networks. In: Proceedings of the Graph Embedding and Mining (GEM) Workshop at ECML PKDD 2020, 2020.
Discovering Outstanding Subgroup Lists for Numeric Targets using MDL. In: Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECMLPKDD 2020), Springer, 2020.
MF2: A Collection of Multi-Fidelity Benchmark Functions in Python. Journal of Open Source Software vol.5(52), 2020.
First results of a ferritin-based blood donor deferral policy in the Netherlands. Transfusion vol.60(8), pp 1785-1792, Wiley, 2020.
Discovering outstanding subgroup lists for numeric targets using MDL. Technical Report arXiv:2006.09186, arXiv, 2020.
Unsupervised Discretization by Two-dimensional MDL-based Histogram. Technical Report arXiv:2006.01893, arXiv, 2020.
Vouw: Geometric Pattern Mining using the MDL Principle. In: Proceedings of the Eighteenth International Symposium on Intelligent Data Analysis (IDA 2020), Springer, 2020.
Widening for MDL-based Retail Signature Discovery. In: Proceedings of the Eighteenth International Symposium on Intelligent Data Analysis (IDA 2020), Springer, 2020.
Discovering Subjectively Interesting Multigraph Patterns. Machine Learning, pp 1-28, Springer.
Interpretable multiclass classification by MDL-based rule lists. Information Sciences vol.512, pp 1372-1393, Elsevier, 2020.
2019
Vouw: Geometric Pattern Mining using the MDL Principle. Technical Report arXiv:1911.09587, arXiv, 2019.
Challenges and Limitations in Clustering Blood Donor Hemoglobin Trajectories. In: Proceedings of 4th Workshop on Advanced Analytics and Learning on Temporal Data at ECMLPKDD 2019, Springer, 2019.
Focus on dynamics: a proof of principle in exploratory data mining of face-to-face interactions. In: Proceedings of the 5th International Conference on Computational Social Science (IC2S2), 2019. (Poster presentation)
Interpretable multiclass classification by MDL-based rule lists. Technical Report arXiv:1905.00328, arXiv, 2019.
Addendum to the Special Issue on Interactive Data Exploration and Analytics (TKDD, Vol. 12, Iss. 1): Introduction by the Guest Editors. Transactions on Knowledge Discovery from Data vol.13(1), ACM, 2019.
2018
Identifying flight delay patterns using diverse subgroup discovery. In: Proceedings of the Symposium Series on Computational Intelligence (SSCI'18), IEEE, 2018.
Predicting outcome of endovascular treatment for acute ischemic stroke: potential value of machine learning algorithms. Frontiers in Neurology vol.9(784), Frontiers, 2018.
Towards an Adaptive CMA-ES Configurator. In: Proceedings of the International Conference on Parallel Problem Solving from Nature (PPSN'18), 2018.
Towards a Theory-Guided Benchmarking Suite for Discrete Black-Box Optimization: Profiling (1+λ) EA Variants on OneMax and LeadingOnes. In: Proceedings of the Genetic and Evolutionary Computation Conference (GECCO'18), pp 951-958, ACM, 2018.
Multi-Fidelity Surrogate Model Approach to Optimization. In: Proceedings of the Genetic and Evolutionary Computation Conference (GECCO'18), ACM, 2018.
Editorial: TKDD Special Issue on Interactive Data Exploration and Analytics. Transactions on Knowledge Discovery from Data vol.12(1), ACM, 2018.
2017
Explaining Deviating Subsets through Explanation Networks. In: Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECMLPKDD'17), Springer, 2017.
Flexible constrained sampling with guarantees for pattern mining. Data Mining and Knowledge Discovery vol.31(5), pp 1266-1293, Springer, 2017. (ECMLPKDD'17 Special Issue)
Semiring Rank Matrix Factorisation. Transactions on Knowledge and Data Engineering vol.29(8), pp 1737-1750, IEEE, 2017.
Algorithm configuration data mining for CMA evolution strategies. In: Proceedings of the Genetic and Evolutionary Computation Conference (GECCO'17), pp 737-744, ACM, 2017.
Learning what matters – Sampling interesting patterns. In: Proceedings of the Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD'17), pp 534-546, Springer, 2017.
Learning what matters - Sampling interesting patterns. Technical Report arXiv:1702.01975, arXiv, 2017.
2016
Towards Data Driven Process Control in Manufacturing Car Body Parts. In: Proceedings of IEEE International Conference on Computational Science and Computational Intelligence (IEEE CSCI-ISBD'16), IEEE, 2016.
Evolving the Structure of Evolution Strategies. In: Proceedings of IEEE Symposium Series on Computational Intelligence (IEEE SSCI'16), IEEE, 2016.
Local Subspace-Based Outlier Detection using Global Neighbourhoods. In: Proceedings of IEEE International Conference on Big Data (IEEE BigData'16), IEEE, 2016.
Local Subspace-Based Outlier Detection using Global Neighbourhoods. Technical Report arXiv:1611.00183, arXiv, 2016.
Flexible constrained sampling with guarantees for pattern mining. Technical Report arXiv:1610.09263, arXiv, 2016.
Evolving the Structure of Evolution Strategies. Technical Report arXiv:1610.05231, arXiv, 2016.
Expect the Unexpected - On the Significance of Subgroups. In: Proceedings of Discovery Science (DS'16), pp 51-66, Springer, 2016.
Simultaneous discovery of cancer subtypes and subtype features by molecular data integration. Bioinformatics vol.32(17), pp 445-454, Oxford University Press, 2016.
A KNIME-based Analysis of the Zebrafish Photomotor Response Clusters the Phenotypes of 14 Classes of Neuroactive Molecules. Journal of Biomolecular Screening vol.21(5), pp 427-436, SAGE Publishing, 2016.
Association Discovery in Two-View Data (extended abstract). In: TKDE Poster Track of ICDE 2016, IEEE, 2016.
Subjective Interestingness of Subgraph Patterns. Machine Learning vol.105(1), pp 41-75, Springer, 2016.