Explanatory Data Analysis group
Publications
Publications by type (by year)
Journal articles
In press
Cross-Domain Graph Level Anomaly Detection. Transactions on Knowledge and Data Engineering, IEEE.
2024
A unified experiment design approach for cyclic and acyclic causal models. Transactions on Machine Learning Research vol.1(1), pp 1-1, 2024.
A Survey on Explainable Anomaly Detection. Transactions on Knowledge Discovery from Data vol.18(1), ACM, 2024.
2023
A unified experiment design approach for cyclic and acyclic causal models. Journal of Machine Learning Research vol.24(354), pp 1-31, 2023.
Evaluating Cluster-Based Synthetic Data Generation for Blood-Transfusion Analysis. Journal of Cybersecurity and Privacy vol.3(4), pp 882-894, MDPI, 2023.
WEARDA: recording wearable sensor data for human activity monitoring. Journal of Open Research Software vol.11(1), 2023.
The added value of ferritin levels and genetic markers for the prediction of haemoglobin deferral. Vox Sanguinis vol.118(10), pp 825-834, 2023.
Explainable Contextual Anomaly Detection using Quantile Regression Forests. Data Mining and Knowledge Discovery, Springer.
Defining migraine days, based on longitudinal E-diary data. Cephalalgia.
Unsupervised Discretization by Two-dimensional MDL-based Histogram. Machine Learning, Springer.
Generating synthetic mixed discrete-continuous health records with mixed sum-product networks. Journal of the American Medical Informatics Association vol.30(1), Oxford University Press, 2023.
2022
Feature Selection for Fault Detection and Prediction based on Event Log Analysis. ACM SIGKDD Explorations vol.24(2), ACM, 2022.
Robust subgroup discovery - Discovering subgroup lists using MDL. Data Mining and Knowledge Discovery.
Finding Efficient Trade-offs in Multi-Fidelity Response Surface Modeling. Engineering Optimization.
Associations between symptoms, donor characteristics and IgG antibody response in 2082 COVID-19 convalescent plasma donors. Frontiers in Immunology, Frontiers.
2021
Evaluating privacy of individuals in medical data. Health Informatics Journal, SAGE Publications.
Online Summarization of Dynamic Graphs using Subjective Interestingness for Sequential Data. Data Mining and Knowledge Discovery vol.35(1), pp 88-126, 2021. (ECML PKDD journal track)
2020
MF2: A Collection of Multi-Fidelity Benchmark Functions in Python. Journal of Open Source Software vol.5(52), 2020.
First results of a ferritin-based blood donor deferral policy in the Netherlands. Transfusion vol.60(8), pp 1785-1792, Wiley, 2020.
Discovering Subjectively Interesting Multigraph Patterns. Machine Learning, pp 1-28, Springer.
Interpretable multiclass classification by MDL-based rule lists. Information Sciences vol.512, pp 1372-1393, Elsevier, 2020.
2019
Addendum to the Special Issue on Interactive Data Exploration and Analytics (TKDD, Vol. 12, Iss. 1): Introduction by the Guest Editors. Transactions on Knowledge Discovery from Data vol.13(1), ACM, 2019.
2018
Predicting outcome of endovascular treatment for acute ischemic stroke: potential value of machine learning algorithms. Frontiers in Neurology vol.9(784), Frontiers, 2018.
Editorial: TKDD Special Issue on Interactive Data Exploration and Analytics. Transactions on Knowledge Discovery from Data vol.12(1), ACM, 2018.
2017
Flexible constrained sampling with guarantees for pattern mining. Data Mining and Knowledge Discovery vol.31(5), pp 1266-1293, Springer, 2017. (ECMLPKDD'17 Special Issue)
Semiring Rank Matrix Factorisation. Transactions on Knowledge and Data Engineering vol.29(8), pp 1737-1750, IEEE, 2017.
2016
Simultaneous discovery of cancer subtypes and subtype features by molecular data integration. Bioinformatics vol.32(17), pp 445-454, Oxford University Press, 2016.
A KNIME-based Analysis of the Zebrafish Photomotor Response Clusters the Phenotypes of 14 Classes of Neuroactive Molecules. Journal of Biomolecular Screening vol.21(5), pp 427-436, SAGE Publishing, 2016.
Subjective Interestingness of Subgraph Patterns. Machine Learning vol.105(1), pp 41-75, Springer, 2016. |
Conference papers
2024
Conditional Density Estimation with Histogram Trees. In: Proceedings of the Conference on Neural Information Processing Systems (NeurIPS 2024), 2024.
Learning Unknown Intervention Targets in Structural Causal Models from Heterogeneous Data. In: International Conference on Artificial Intelligence and Statistics, pp 3187-3195, PMLR, 2024.
Graph Neural Networks based Log Anomaly Detection and Explanation. In: Proceedings of the 2024 IEEE/ACM 46th International Conference on Software Engineering: Companion Proceedings, pp 306-307, ACM, 2024.
2023
Novel approach for phenotyping based on diverse top-k subgroup lists. In: Proceedings of the Conference on Artificial Intelligence In Medicine (AIME 2023), Springer, 2023.
Discovering Diverse Top-k Characteristic Lists. In: Proceedings of the 21st International Symposium on Intelligent Data Analysis (IDA 2023), Springer, 2023.
Discovering Rule Lists with Preferred Variables. In: Proceedings of the 21st International Symposium on Intelligent Data Analysis (IDA 2023), Springer, 2023.
2022
Truly Unordered Probabilistic Rule Sets for Multi-class Classification. In: Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECMLPKDD 2022), Springer, 2022.
2021
Estimating Conditional Mutual Information for Discrete-Continuous Mixtures using Multi-Dimensional Adaptive Histograms. In: Proceedings of the SIAM Conference on Data Mining 2021 (SDM'21), SIAM, 2021.
2020
Discovering Outstanding Subgroup Lists for Numeric Targets using MDL. In: Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECMLPKDD 2020), Springer, 2020.
Vouw: Geometric Pattern Mining using the MDL Principle. In: Proceedings of the Eighteenth International Symposium on Intelligent Data Analysis (IDA 2020), Springer, 2020.
Widening for MDL-based Retail Signature Discovery. In: Proceedings of the Eighteenth International Symposium on Intelligent Data Analysis (IDA 2020), Springer, 2020.
2018
Identifying flight delay patterns using diverse subgroup discovery. In: Proceedings of the Symposium Series on Computational Intelligence (SSCI'18), IEEE, 2018.
Towards an Adaptive CMA-ES Configurator. In: Proceedings of the International Conference on Parallel Problem Solving from Nature (PPSN'18), 2018.
Towards a Theory-Guided Benchmarking Suite for Discrete Black-Box Optimization: Profiling (1+λ) EA Variants on OneMax and LeadingOnes. In: Proceedings of the Genetic and Evolutionary Computation Conference (GECCO'18), pp 951-958, ACM, 2018.
Multi-Fidelity Surrogate Model Approach to Optimization. In: Proceedings of the Genetic and Evolutionary Computation Conference (GECCO'18), ACM, 2018.
2017
Explaining Deviating Subsets through Explanation Networks. In: Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECMLPKDD'17), Springer, 2017.
Algorithm configuration data mining for CMA evolution strategies. In: Proceedings of the Genetic and Evolutionary Computation Conference (GECCO'17), pp 737-744, ACM, 2017.
Learning what matters – Sampling interesting patterns. In: Proceedings of the Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD'17), pp 534-546, Springer, 2017.
2016
Towards Data Driven Process Control in Manufacturing Car Body Parts. In: Proceedings of IEEE International Conference on Computational Science and Computational Intelligence (IEEE CSCI-ISBD'16), IEEE, 2016.
Evolving the Structure of Evolution Strategies. In: Proceedings of IEEE Symposium Series on Computational Intelligence (IEEE SSCI'16), IEEE, 2016.
Local Subspace-Based Outlier Detection using Global Neighbourhoods. In: Proceedings of IEEE International Conference on Big Data (IEEE BigData'16), IEEE, 2016.
Expect the Unexpected - On the Significance of Subgroups. In: Proceedings of Discovery Science (DS'16), pp 51-66, Springer, 2016.
Association Discovery in Two-View Data (extended abstract). In: TKDE Poster Track of ICDE 2016, IEEE, 2016. |
Workshop and demo papers
2024
Human-guided Rule Learning for ICU Readmission Risk Analysis. In: Proceedings of the Workshop on AI and Data Science for Healthcare (AIDSH) at KDD 2024, 2024.
2022
Feature Selection for Fault Detection and Prediction based on Log Analysis. In: Proceedings of the International Workshop on AI for Manufacturing at ECMLPKDD 2022, 2022.
Histogram-based Probabilistic Rule Lists for Numeric Targets. In: Proceedings of the 20th anniversary Workshop on Knowledge Discovery in Inductive Databases (KDID 2022) at ECMLPKDD 2022, CEUR Workshop Proceedings, 2022.
2020
Social Fluidity in Children's Face-to-Face Interaction Networks. In: Proceedings of the Graph Embedding and Mining (GEM) Workshop at ECML PKDD 2020, 2020.
2019
Challenges and Limitations in Clustering Blood Donor Hemoglobin Trajectories. In: Proceedings of 4th Workshop on Advanced Analytics and Learning on Temporal Data at ECMLPKDD 2019, Springer, 2019. |
Extended abstracts (peer-reviewed)
2022
Methodological considerations in predicting migraine attacks using machine learning. In: MTIS 2022 Cephalalgia Abstracts, SAGE Publications, 2022.
Probabilistic Rule Sets Ready for Interactive Machine Learning. In: AAAI'22-Workshop on Interactive Machine Learning, 2022.
2019
Focus on dynamics: a proof of principle in exploratory data mining of face-to-face interactions. In: Proceedings of the 5th International Conference on Computational Social Science (IC2S2), 2019. (Poster presentation) |
Proceedings (edited volumes)
Theses
2021
Technical reports
2022
A Survey on Explainable Anomaly Detection. Technical Report arXiv:2210.06959, arXiv, 2022.
2021
Robust subgroup discovery. Technical Report arXiv:2103.13686, arXiv, 2021.
Finding Efficient Trade-offs in Multi-Fidelity Response Surface Modeling. Technical Report arXiv:2103.03280, arXiv, 2021.
Estimating Conditional Mutual Information for Discrete-Continuous Mixtures using Multi-Dimensional Adaptive Histograms. Technical Report arXiv:2101.05009, arXiv, 2021.
2020
Discovering outstanding subgroup lists for numeric targets using MDL. Technical Report arXiv:2006.09186, arXiv, 2020.
Unsupervised Discretization by Two-dimensional MDL-based Histogram. Technical Report arXiv:2006.01893, arXiv, 2020.
2019
Vouw: Geometric Pattern Mining using the MDL Principle. Technical Report arXiv:1911.09587, arXiv, 2019.
Interpretable multiclass classification by MDL-based rule lists. Technical Report arXiv:1905.00328, arXiv, 2019.
2017
Learning what matters - Sampling interesting patterns. Technical Report arXiv:1702.01975, arXiv, 2017.
2016
Local Subspace-Based Outlier Detection using Global Neighbourhoods. Technical Report arXiv:1611.00183, arXiv, 2016.
Flexible constrained sampling with guarantees for pattern mining. Technical Report arXiv:1610.09263, arXiv, 2016.
Evolving the Structure of Evolution Strategies. Technical Report arXiv:1610.05231, arXiv, 2016. |