Tutorial KDD 2018



Data Science for Health

TUTORIAL - Knowledge Discovery from Cohorts, Electronic Health Records and further Patient-related data


KDD 2018, London, August 19 to August 23. The tutorials are on August 19.


Tutors: Panagiotis Papapetrou (Stockholm) and Myra Spiliopoulou (Magdeburg)


Data mining is used intensively in medicine and healthcare. Electronic Health Records (EHRs) are perceived as big patient data. Scientists use them to predict patients' progress, to understand and predict response to therapy, to detect adverse drug effects, and to address many other learning tasks. Medical researchers are also interested in learning from cohorts of population-based studies and of experiments. Learning tasks there include the identification of disease predictors that can lead to new diagnostic tests, and the acquisition of insights on interventions.

In this tutorial, we elaborate on data sources, methods, and case studies in medical mining. In addition to conventional data sources, we address the potential of data from mobile devices. We discuss the learning problems that can be solved with those data, present case studies, and investigate the methods needed to prepare and mine those data and to present the results to a medical expert.

Medical research is largely hypothesis-driven: data collection, analysis and acquisition of insights are embedded into workflows that differ from the ways data mining scholars analyze (medical) data. While medical researchers are often willing to offer their data for data-driven learning, it is the task of data mining scholars to analyze the data in a way that can be understood and exploited by medical researchers. The knowledge and techniques presented in this tutorial will also serve as guidelines for both novice and experienced data mining researchers, so that their methods and results when mining medical data are useful to medical and healthcare experts.



PART 1: Introduction (BOTH) – 30 mins

  1. What are patient data? Electronic Health Records (EHRs), social data, data collected in cohort studies

  2. What is a cohort?

  3. Cohorts for clinical studies

  4. Cohorts for population-based studies

PART 2: Learning from EHR data (PANOS) – 30 mins

  1. Supervised learning from EHRs

  2. Unsupervised learning from EHRs

  3. Temporal data mining from EHRs

PART 3: Hypothesis-driven vs exploratory learning on patient data (MYRA) – 40 mins

  1. Cohort specification from EHR data

  2. Expert-driven cohort refinement on EHR data

  3. Expert inputs for learning on EHR data

  4. Experiments on clinical cohorts

PART 4: Deep learning on EHR data (PANOS) – 40 mins

  1. Neural networks for EHR data

  2. Recurrent neural networks for diagnosis and treatment prediction

  3. Convolutional neural networks for medical image processing

PART 5: Exploratory learning on patient mobile data (MYRA) – 30 mins

  1. Learning from the data of mobile devices

  2. Monitoring the ecological momentary assessments of patients

PART 6: Conclusions and open challenges – 10 mins

  1. The challenge of finding the data

  2. The challenge of seeing with the expert's eyes

  3. The challenge of preparing the data

  4. Challenges of learning

  5. The challenge of explaining the results


Target audience and prerequisites

The tutorial is intended for all KDD participants, and especially for young researchers, who are interested in how data mining and machine learning can benefit healthcare and medicine.

Participants are expected to have basic knowledge within the areas of data mining, machine learning, and databases. The audience is expected to be familiar with standard concepts and methods, such as classification models, deep learning, density-based clustering, Hidden Markov Models, frequent pattern and rule mining. Such knowledge can be expected from KDD participants, including students.


Tutors' short bios and their expertise related to the tutorial

Myra Spiliopoulou is Professor of Business Information Systems at the Otto-von-Guericke-University Magdeburg. Her research is on mining dynamic complex data, with a focus on healthcare and social data. She is an action editor for DAMI and PC Chair of the Applied Data Science Track of KDD 2018. In the recent past, she was one of the four Journal Track Chairs for ECML PKDD 2017, Panel Chair of IEEE ICDM 2017 and PC Chair of the IEEE Symposium of Computer Based Medical Systems 2016. She has held tutorials on data mining topics at KDD 2009 and 2015, at PAKDD 2013 and 2016, and at many ECML PKDD conferences.

Panagiotis Papapetrou is Professor at the Department of Computer and Systems Sciences at Stockholm University and Adjunct Professor at the Computer Science Department at Aalto University. His area of expertise is algorithmic data mining, with a particular focus on mining and indexing temporal data and healthcare data. Panagiotis received his PhD in Computer Science from Boston University in 2009, was a post-doctoral researcher at Aalto University during 2009-2013, and was a lecturer at the University of London during 2012-2013. He has participated in several national and international research projects. He is a board member of the Swedish AI Society.


Contact info of the tutors

Prof. Myra Spiliopoulou

Research Group on Knowledge Management and Discovery (KMD),

Faculty of Computer Science, Otto-von-Guericke-University Magdeburg,

PO Box 4120, 39016 Magdeburg, Germany

Email: myra@ovgu.de

URL: http://www.kmd.ovgu.de/Team/Academic+Staff/Myra+Spiliopoulou.html


Prof. Panagiotis Papapetrou

Data Science group

Department of Computer and Systems Sciences

PO Box 7003, 164 07, Stockholm, Sweden

Email: panagiotis@dsv.su.se

URL: http://people.dsv.su.se/~panagiotis/



Last Modification: 23.04.2018