EDML
Evaluation and Experimental Design in Data Mining and Machine Learning
Description
A vital part of proposing new machine learning and data mining approaches is evaluating them empirically to allow an assessment of their capabilities. Numerous choices go into setting up such experiments: how to choose the data, how to preprocess them (or not), potential problems associated with the selection of datasets, which other techniques to compare to (if any), which metrics to evaluate, and, last but not least, how to present and interpret the results. Learning how to make those choices on the job, often by copying the evaluation protocols used in the existing literature, can easily lead to the development of problematic habits. Numerous, albeit scattered, publications have called attention to those questions [1-5] and have occasionally called published results, or the usability of published methods, into question. At a time of intense discussion about a reproducibility crisis in the natural, social, and life sciences, and with conferences such as SIGMOD, KDD, and ECML/PKDD encouraging researchers to make their work as reproducible as possible, we feel that it is important to bring researchers together to discuss those issues on a fundamental level.
An issue directly related to the first choice mentioned above is the following: even the best-designed experiment carries only limited information if the underlying data are lacking. We therefore also want to discuss questions related to the availability of data: whether they are reliable and diverse, and whether they correspond to realistic and/or challenging problem settings.
Topics
In this workshop series, we mainly solicit contributions that discuss those questions on a fundamental level, take stock of the state of the art, offer theoretical arguments, or take well-argued positions, as well as actual evaluation papers that offer new insights, e.g. by calling published results into question or by shining a spotlight on the characteristics of existing benchmark datasets.
As such, topics include, but are not limited to:
- Benchmark datasets for data mining tasks: are they diverse/realistic/challenging?
- Impact of data quality (redundancy, errors, noise, bias, imbalance, ...) on qualitative evaluation
- Propagation/amplification of data quality issues in data mining results (including the interplay between data and algorithms)
- Evaluation of unsupervised data mining (dilemma between novelty and validity)
- Evaluation measures
- (Automatic) data quality evaluation tools: what are the aspects one should check before starting to apply algorithms to given data? (see the sketch after this list)
- Issues around runtime evaluation (algorithm vs. implementation, dependency on hardware, algorithm parameters, dataset characteristics)
- Design guidelines for crowd-sourced evaluations
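To make the topic of automatic data quality evaluation tools concrete, here is a minimal Python sketch of the kind of pre-analysis report such a tool might produce. It assumes a pandas DataFrame and a hypothetical label column name ("class" in the usage note); it is an illustration of the idea under those assumptions, not a reference implementation or a tool endorsed by the workshop.

```python
import pandas as pd

def basic_quality_report(df: pd.DataFrame, label_col: str) -> dict:
    """Minimal pre-analysis checks: missingness, redundancy, imbalance."""
    report = {}
    # Fraction of missing entries per column.
    report["missing_per_column"] = df.isna().mean().to_dict()
    # Exact duplicate rows can inflate the apparent performance of memorizing models.
    report["duplicate_row_fraction"] = df.duplicated().mean()
    # Class imbalance: ratio of majority to minority class frequency.
    counts = df[label_col].value_counts()
    report["imbalance_ratio"] = counts.max() / counts.min()
    return report

# Hypothetical usage on a dataset with a label column named "class":
# df = pd.read_csv("some_benchmark.csv")
# print(basic_quality_report(df, label_col="class"))
```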
Organizers
- Eirini Ntoutsi
Leibniz University Hannover & L3S Research Center, Germany
ntoutsi@kbs.uni-hannover.de
Eirini Ntoutsi is an Associate Professor in Intelligent Systems at the Faculty of Electrical Engineering and Computer Science at the Leibniz Universität Hannover (LUH) and a member of the L3S Research Center. Prior to joining LUH, she was a postdoctoral researcher at the Ludwig-Maximilians-University (LMU) in Munich, Germany. She obtained her PhD from the University of Piraeus, Greece, and she holds a diploma and a master's degree in computer science from the Computer Engineering & Informatics Department (CEID), University of Patras, Greece. Her research can be summarized as learning over complex data (high-dimensional, multi-view, with limited labels, ...) and data streams. She has published more than 70 papers at international data mining and machine learning venues. She has organized several workshops, including a Dagstuhl perspectives workshop, and served as publicity co-chair for ICDM 2017.
- Erich Schubert
Technical University Dortmund
erich.schubert@cs.tu-dortmund.de
Erich Schubert is an associate professor in the Artificial Intelligence group at the Technical University Dortmund, Germany. He joined TU Dortmund in 2018 after previous positions at Heidelberg University, Germany, and Ludwig-Maximilians-Universität München (LMU Munich), Germany, where he obtained his PhD with Prof. Kriegel. His research interests include unsupervised learning, in particular clustering and outlier detection, along with index-based acceleration techniques and evaluation methods for these approaches. He has published over 38 peer-reviewed papers at international conferences and in journals, served as proceedings chair for the SISAP 2016 conference, and assisted with the GIR'17 workshop in Heidelberg.
- Arthur Zimek
University of Southern Denmark
zimek@imada.sdu.dk
Arthur Zimek is Professor and Head of the Section "Data Science and Statistics" in the Department of Mathematics and Computer Science (IMADA) at the University of Southern Denmark (SDU) in Odense, Denmark. He joined SDU in 2016 after previous positions at Ludwig-Maximilians-University Munich (LMU), Germany, Technical University Vienna, Austria, and the University of Alberta, Edmonton, Canada. Arthur holds master-level degrees in bioinformatics, philosophy, and theology, involving studies at universities in Germany (TUM, HfPh, LMU Munich, and JGU Mainz) as well as Austria (LFU Innsbruck). His research interests include ensemble techniques for unsupervised learning, clustering, outlier detection, and high-dimensional data, covering both the development of data mining methods and evaluation methodology. He has published more than 80 papers at peer-reviewed international conferences and in international journals. He has co-organized several workshops and mini-symposia at SDM, KDD, and Shonan, and served as workshop co-chair for SDM 2017.
- Albrecht Zimmermann
University of Caen Normandy, France
albrecht.zimmermann@unicaen.fr
Albrecht Zimmermann is an associate professor in the CoDaG (Constraints, Data Mining, and Graphs) group at the University of Caen Normandy, France. He joined the group in 2015 after previous stays at INSA Lyon, France, the Catholic University of Leuven, Belgium, and the Albert-Ludwigs-University Freiburg, Germany. His research interests include pattern and pattern set mining and their applications to bio- and chemoinformatics settings, result verification of unsupervised data mining methods, data generation, and sports analytics. He has co-organized four editions of the "Machine Learning and Data Mining for Sports Analytics" workshop @ ECML/PKDD (MLSA 2013, 2015, 2017, 2018), as well as three editions of the invitation-only "Spring Workshop on Mining and Learning" (SMiLe 2008, 2010, 2012). He served as registration chair of ICDM 2012 and as workshop and tutorial co-chair of ECML/PKDD 2016.
References
[1] Basaran, Daniel, Eirini Ntoutsi, and Arthur Zimek. "Redundancies in Data and their Effect on the Evaluation of Recommendation Systems: A Case Study on the Amazon Reviews Datasets." Proceedings of the 2017 SIAM International Conference on Data Mining. SIAM, 2017.
[2] Kovács, Ferenc, Csaba Legány, and Attila Babos. "Cluster Validity Measurement Techniques." 6th International Symposium of Hungarian Researchers on Computational Intelligence. 2005.
[3] Kriegel, Hans-Peter, Erich Schubert, and Arthur Zimek. "The (Black) Art of Runtime Evaluation: Are We Comparing Algorithms or Implementations?" Knowledge and Information Systems 52.2 (2017): 341-378.
[4] Nijssen, Siegfried, and Joost Kok. "Frequent Subgraph Miners: Runtimes Don't Say Everything." Proceedings of the Workshop on Mining and Learning with Graphs. 2006.
[5] Zheng, Zijian, Ron Kohavi, and Llew Mason. "Real World Performance of Association Rule Algorithms." Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 2001.