Tutorial KDD 2018
Data Science for Health
TUTORIAL - Knowledge Discovery from Cohorts, Electronic Health Records and further Patient-related data
KDD 2018, London, from August 19 to August 23. The tutorials are on August 19.
Tutorialists: Panagiotis Papapetrou (Stockholm) and Myra Spiliopoulou (Magdeburg)
Data mining is intensively used in medicine and healthcare. Electronic Health Records (EHRs) are perceived as big patient data. On them, scientists strive to perform predictions on patients' progress, to understand and predict response to therapy, to detect adverse drug effects, and many other learning tasks. Medical researchers are also interested in learning from cohorts of population-based studies and of experiments. Learning tasks include the identification of disease predictors that can lead to new diagnostic tests and the acquisition of insights on interventions.
In this tutorial, we elaborate on data sources, methods, and case studies in medical mining. Next to conventional data sources, we address the potential of data from mobile devices. We discuss the learning problems that can be solved with those data, present case studies, and investigate the methods needed to prepare and mine those data and to present the results to a medical expert.
Medical research is largely hypothesis-driven: data collection, analysis and acquisition of insights are embedded into workflows that differ from the ways used by data mining scholars for (medical) data analysis. While medical researchers are often willing to offer their data for data-driven learning, it is the task of data mining scholars to analyze the data in a way that can be understood and exploited by medical researchers. The knowledge and techniques that will be presented in this tutorial will also serve as guidelines for novices and experienced data mining researchers, so that their methods and results when mining medical data will be useful to the medical domain and healthcare experts.
PART 1: Introduction (BOTH) – 30 mins
What are patient data? Electronic Health Records (EHRs), social data, data collected in cohort studies
What is a cohort?
Cohorts for clinical studies
Cohorts for population-based studies
PART 2: Learning from EHR data (PANOS) – 30 mins
Unsupervised learning from EHRs
Temporal data mining from EHRs
PART 3: Hypothesis-driven vs exploratory learning on patient data (MYRA) – 40 mins
Cohort specification from EHR data
Expert driven cohort refinement on EHR data
Expert inputs for learning on EHR data
Experiments on clinical cohorts
PART 4: Deep learning on EHR data (PANOS) – 40 mins
Neural networks for EHR data
Recurrent neural networks for diagnosis and treatment prediction
Convolutional neural networks for medical image processing
PART 5: Exploratory learning on patient mobile data (MYRA) – 30 mins
Learning from the data of mobile devices
Monitoring the ecological momentary assessments of patients
PART 6: Conclusions and open challenges – 10 mins
The challenge of finding the data
The challenge of seeing with the expert's eyes
The challenge of preparing the data
Challenges of learning
The challenge of explaining the results
Target audience and prerequisites
The tutorial is intended for all KDD participants, and especially for young researchers, who are interested in how data mining and machine learning can benefit healthcare and medicine.
Participants are expected to have basic knowledge within the areas of data mining, machine learning, and databases. The audience is expected to be familiar with standard concepts and methods, such as classification models, deep learning, density-based clustering, Hidden Markov Models, frequent pattern and rule mining. Such knowledge can be expected from KDD participants, including students.
Tutors' short bios and their expertise related to the tutorial
Myra Spiliopoulou is Professor of Business Information Systems at the Otto-von-Guericke-University Magdeburg. Her research is on mining dynamic complex data, with a focus on healthcare and social data. She is an action editor for DAMI and PC Chair of the Applied Data Science Track of KDD 2018. In the recent past, she was one of the four Journal Track Chairs for ECML PKDD 2017, Panel Chair of IEEE ICDM 2017 and PC Chair of the IEEE Symposium of Computer Based Medical Systems 2016. She has held tutorials on topics of data mining at KDD 2009 and 2015, PAKDD 2013 and 2016 and in many ECML PKDD conferences.
Panagiotis Papapetrou is Professor at the Department of Computer and Systems Sciences at Stockholm University and Adjunct Professor at the Computer Science Department at Aalto University. His area of expertise is algorithmic data mining, with a particular focus on mining and indexing temporal data and healthcare data. Panagiotis received his PhD in Computer Science from Boston University in 2009, was a post-doctoral researcher at Aalto University during 2009-2013, and a lecturer at the University of London during 2012-2013. He has participated in several national and international research projects. He is a board member of the Swedish AI Society.
Contact info of the tutors
Prof. Myra Spiliopoulou
Research Group on Knowledge Management and Discovery (KMD),
Faculty of Computer Science, Otto-von-Guericke-University Magdeburg,
PO Box 4120, 39016 Magdeburg, Germany
Prof. Panagiotis Papapetrou
Data Science group
Department of Computer and Systems Sciences
PO Box 7003, 164 07, Stockholm, Sweden