Introduction

Biologists are stepping up their efforts to understand the biological processes that underlie disease pathways in clinical contexts. This has resulted in a flood of biological and clinical data, ranging from genomic and protein sequences, DNA microarrays, and protein interactions to disease pathways, biomedical images, and electronic health records. Our capability to generate biomedical data has greatly surpassed our ability to mine and analyze it.

To exploit these data for discovering new knowledge that can be translated into clinical applications, fundamental data analysis difficulties have to be overcome. Practical issues such as handling noisy and incomplete data (e.g., protein interactions have high false positive and false negative rates), processing compute-intensive tasks (e.g., large-scale graph mining), and integrating various data sources (e.g., linking genomic and proteomic data with clinical databases) are new challenges faced by biologists in the post-genome era.

Data mining has been designed to handle such challenging data analysis problems. We can therefore expect data mining to play an increasingly crucial role in revolutionizing biological research. Data mining will be the next technical innovation employed by biologists to enable them to make meaningful observations and discoveries from a wide array of heterogeneous data from molecular biology to pharmaceutical and clinical domains.

As data mining is poised to become integrated into the next-generation biomedical discovery pipeline, there are unprecedented opportunities for data mining researchers from the computer science domain to join biologists and clinical scientists in this meaningful scientific pursuit. The mission of this workshop is therefore to disseminate research results and best practices of data mining approaches to cross-disciplinary researchers and practitioners from both the data mining disciplines and the life sciences domains. We encourage the submission of papers that use data mining techniques to address challenging issues in biological data analysis. We especially welcome submissions reporting data mining techniques in healthcare-related applications that integrate the use of biological data in a clinical context for translational research.

Call for Papers

Dear Colleagues,

We are writing to invite you to submit your papers to the ICDM-2011 workshop on Biological Data Mining and its Applications in Healthcare, which will be held in Vancouver, Canada on December 11, 2011. ICDM, the IEEE International Conference on Data Mining, is one of the premier conferences in the field of data mining.

By co-locating with ICDM 2011, we hope the workshop will bring better awareness of interesting and challenging biological and medical problems that inspire new data mining solutions, and attract the participation of researchers in the areas of data mining and machine learning who are interested in the real-world applications of data mining in computational biology and healthcare.

We look forward to your submissions. In addition, we will greatly appreciate it if you can distribute the Call for Papers to your colleagues, students and other community members and encourage them to contribute to the workshop.

Thank you!

Sincerely,

Workshop Co-Chairs

Xiao-Li Li, See-Kiong Ng and Jason T. L. Wang

Topics of interest

The topics of interest include (but are not limited to) the following:

Important Dates

August 5, 2011: Due date for paper submission

September 20, 2011: Notification of paper acceptance

October 11, 2011: Camera-ready versions of accepted papers

December 11, 2011: Workshop date

Submissions

Paper submissions are limited to a maximum of 8 pages (you can submit either an 8-page full paper or a 6-page short paper) in the IEEE 2-column format, which is the same as the camera-ready format (see the IEEE Computer Society Press Proceedings Author Guidelines). All papers will be reviewed by the Program Committee based on technical quality, relevance to data mining, originality, significance, and clarity. A double-blind reviewing process will be adopted, so authors should avoid using identifying information in the text of the paper. You are strongly encouraged to print and double-check your PDF file before submission, especially if your paper contains Asian/European language symbols (such as Chinese/Korean characters or English letters with European fonts). All papers should be submitted through the ICDM Workshop Submission Site.

All accepted workshop papers will appear in separate ICDM workshop proceedings published by the IEEE Computer Society Press. In addition, authors of accepted workshop papers may be invited to publish extended versions in the following two venues: a) as book chapters in an edited book to be published by World Scientific and b) as journal papers in the International Journal of Knowledge Discovery in Bioinformatics (IJKDB).

PC members

Zhang Aidong, State University of New York at Buffalo (UB), USA

Tatsuya Akutsu, Kyoto University, Japan

Zeyar Aung, Masdar Institute of Science and Technology, UAE

Vladimir Bajic, King Abdullah University of Science and Technology, Saudi Arabia

Jin Chen, Michigan State University, USA

Phoebe Chen, La Trobe University, Australia

Honnian Chua, Harvard University, USA

Juan Cui, University of Georgia, USA

Yang Dai, University of Illinois at Chicago, USA

Xin Gao, King Abdullah University of Science and Technology, Saudi Arabia

Xiaoxu Han, Eastern Michigan University, USA

David Hansen, Australian e-Health Research Centre, Australia

Wen-Lian Hsu, Academia Sinica, Taiwan

Jimmy Huang, York University, Canada

Raphael Isokpehi, Jackson State University, USA

Dawei Li, Yale University, USA

Haiquan Li, University of Chicago, USA

Igor Jurisica, University of Toronto, Canada

Daisuke Kihara, Purdue University, USA

Shonali Krishnaswamy, Monash University, Australia

Chee Keong Kwoh, Nanyang Technological University, Singapore

Hiroshi Mamitsuka, Kyoto University, Japan

Laxmi Parida, IBM T. J. Watson Research Center, USA

George Perry, University of Texas at San Antonio, USA

Mark A. Ragan, The University of Queensland, Australia

Raul Rabadan, Columbia University, USA

Jianhua Ruan, University of Texas at San Antonio, USA

Indra Neil Sarkar, University of Vermont, USA

Ambuj K Singh, University of California at Santa Barbara, USA

Narayanaswamy Srinivasan, Indian Institute of Science, India

Zeeshan Syed, University of Michigan, USA

Vincent S. Tseng, National Cheng Kung University, Taiwan

Alfonso Valencia, Spanish National Cancer Research Centre, Spain

Hong Yan, City University of Hong Kong, China

Philip S. Yu, University of Illinois at Chicago, USA

Erliang Zeng, University of Notre Dame, USA

Xiaoling Zhang, Boston University, USA

Marketa Zvelebil, Breakthrough Breast Cancer Research - ICR, UK

Co-Reviewer:

Keynotes

We have two keynotes for the workshop:

1. Title: Non-conventional approach to stem cell image classification

Keynote speaker: Prof. Ming Li, University of Waterloo, Canada

Abstract: What do we mean by two images being similar? We will give an ultimate mathematical answer to this question. We will show that our definition is optimal and then apply it to stem cell image classification. Previous methods for stem cell image classification require fluorescence to light up the nuclei (to allow the segmentation algorithms to work well). However, the fluorescence interferes with cell growth. Our method does not require this and does not perform cell segmentation. Our classification results are shown to be very comparable to those of the traditional approach.

Biography: Ming Li is a Canada Research Chair in Bioinformatics and a University Professor at the University of Waterloo. He is a fellow of the Royal Society of Canada, ACM, and IEEE. He is a recipient of the E.W.R. Steacie Fellowship Award in 1996, the 2001 Killam Fellowship, and the 2010 Killam Prize. Together with Paul Vitanyi, he co-authored the book "An Introduction to Kolmogorov Complexity and Its Applications". He is a co-managing editor of the Journal of Bioinformatics and Computational Biology and an Associate Editor-in-Chief of the Journal of Computer Science and Technology.

2. Title: Combinatorial Biomarker Discovery

Keynote speaker: Prof. Raymond Ng, University of British Columbia, Canada

Abstract: Personalized medicine has been hailed as one of the main directions for medical research in this century. In the first half of the talk, we give an overview of our projects that use gene expression, proteomics and DNA features for biomarker discovery. A biomarker panel is called a combinatorial panel if it includes more than one of the above types of features. In the second half of the talk, we review some of the challenges in interpreting and analyzing genomics data. The importance of data cleansing and pre-processing is often overlooked. Along this front, we give an overview of several of the techniques we have developed.

Biography: Dr. Raymond Ng is a professor in Computer Science at the University of British Columbia. His main research area for the past two decades has been data mining, with a specific focus on health informatics and text mining. He has published over 150 peer-reviewed publications on data clustering, outlier detection, OLAP processing, health informatics and text mining. He is the recipient of two best paper awards: from the 2001 ACM SIGKDD conference, which is the premier data mining conference worldwide, and the 2005 ACM SIGMOD conference, which is one of the top database conferences worldwide. He was one of the program co-chairs of the 2009 International Conference on Data Engineering, and one of the program co-chairs of the 2002 ACM SIGKDD conference. He was also one of the general co-chairs of the 2008 ACM SIGMOD conference. He was an editorial board member of the Very Large Database Journal and the IEEE Transactions on Knowledge and Data Engineering until 2008. For the past decade, Dr. Ng has co-led several large-scale genomic projects, funded by Genome Canada, Genome BC and NSERC. The total funding of those projects well exceeded 40 million Canadian dollars. He now holds the Chief Informatics Officer position of the PROOF Centre of Excellence, which focuses on biomarker development for end-stage organ failures.

Accepted Papers

1. Eshita Mutt and Ramanathan Sowdhamini, Mining protein sequence databases for remote homologues that can display considerable domain length variations

2. Sandra Ortega-Martorell, Paulo J.G. Lisboa, Alfredo Vellido, Rui V. Simoes, Margarida Julia-Sape, and Carles Arus, Brain tumor pathological area delimitation through Non-negative Matrix Factorization

3. Miao Wang, Xuequn Shang, Miao Miao, Zhanhuai Li, and Wenbin Liu, FTCluster: Efficient Mining Fault-Tolerant Biclusters in Microarray Dataset

4. Seth Long and Lawrence Holder, Graph Based Classification of MRI Data Based on the Ventricular System

5. Clyde Phelix, Richard LeBaron, George Perry, Rosa Villanueva, Greg Villareal, Sandra Siedlak, and Xiongwei Zhu, In Vivo and In Silico Evidence: Hippocampal Cholesterol Metabolism Decreases with Aging and Increases with Alzheimer’s Disease

6. Celia Goncalves, Rui Camacho, and Eugenio Oliveira, From sequences to Papers: an Information Retrieval Exercise

7. Ankit Agrawal and Alok Choudhary, Identifying HotSpots in Lung Cancer Data Using Association Rule Mining

8. Truc-Viet Le, Chee-Keong Kwoh, Eng-Soon Teo, and Kheng-Hock Lee, Analyzing trends of hospital length of stay using Phase-type distributions

9. Faisal Khan and Qiuhua Liu, Transduction of Semi-Supervised Regression Targets in Survival Analysis for Medical Prognosis

10. Shobeir Fakhraei, Hamid Soltanian-Zadeh, Kourosh Jafari-Khouzani, Kost Elisevich, and Farshad Fotouhi, Confident Surgical Decision Making in Temporal Lobe Epilepsy by Heterogeneous Classifier Ensembles

11. Yi Mao, Yixin Chen, Gregory Hackermann, Minmin Chen, Chenyang Lu, Marin Kollef, and Thomas Bailey, Medical Data Mining for Early Deterioration Warning in General Hospital Wards

12. Divya Syamaladevi, Naseer Pasha, and Ramanathan Sowdhamini, A three-step validation procedure in genome-wide data mining for myosin family members improves search efficiency

Workshop Program


Final Workshop Program (four audio presentations are available)

8:40 - 8:50 Opening: Xiao-Li Li, See-Kiong Ng and Jason T.L. Wang

8:50 - 9:35 Keynote 1: Non-conventional approach to stem cell image classification, Prof Ming Li (Chair: Dr. See-Kiong Ng)

Session 1: Data mining techniques and biomedical applications (1)

9:35 - 10:00 Yi Mao, Yixin Chen, Gregory Hackermann, Minmin Chen, Chenyang Lu, Marin Kollef, and Thomas Bailey, Medical Data Mining for Early Deterioration Warning in General Hospital Wards

10:00 - 10:30 Morning Coffee Break

10:30 - 10:55 Ankit Agrawal and Alok Choudhary, Identifying HotSpots in Lung Cancer Data Using Association Rule Mining

Session 2: Data mining techniques and biomedical applications (2) (Chair: Dr. Clyde F. Phelix)

10:55 - 11:20 Eshita Mutt and Ramanathan Sowdhamini, Mining protein sequence databases for remote homologues that can display considerable domain length variations

11:20 - 11:45 Miao Wang, Xuequn Shang, Miao Miao, Zhanhuai Li, and Wenbin Liu, FTCluster: Efficient Mining Fault-Tolerant Biclusters in Microarray Dataset

11:45 - 12:10 Divya Syamaladevi, Naseer Pasha, and Ramanathan Sowdhamini, A three-step validation procedure in genome-wide data mining for myosin family members improves search efficiency

12:10 - 12:35 Celia Goncalves, Rui Camacho, and Eugenio Oliveira, From sequences to Papers: an Information Retrieval Exercise

12:35 - 2:00 Lunch Break

2:00 - 2:45 Keynote 2: Combinatorial Biomarker Discovery, Prof. Raymond Ng (Chair: Dr. Xiaoli Li)

Session 2: Diseases, classification and Decision Making (1)

2:45 - 3:10 Clyde Phelix, Richard LeBaron, George Perry, Rosa Villanueva, Greg Villareal, Sandra Siedlak, and Xiongwei Zhu, In Vivo and In Silico Evidence: Hippocampal Cholesterol Metabolism Decreases with Aging and Increases with Alzheimer's Disease

3:10 - 3:35 Seth Long and Lawrence Holder, Graph Based Classification of MRI Data Based on the Ventricular System

Session 2: Diseases, classification and Decision Making (2) (Chair: Dr Yixin Chen)

3:35 - 4:00 Sandra Ortega-Martorell, Paulo J.G. Lisboa, Alfredo Vellido, Rui V. Simoes, Margarida Julia-Sape, and Carles Arus, Brain tumor pathological area delimitation through Non-negative Matrix Factorization

4:00 - 4:30 Afternoon Coffee Break

4:30 - 4:55 Shobeir Fakhraei, Hamid Soltanian-Zadeh, Kourosh Jafari-Khouzani, Kost Elisevich, and Farshad Fotouhi, Confident Surgical Decision Making in Temporal Lobe Epilepsy by Heterogeneous Classifier Ensembles

4:55 - 5:20 Truc-Viet Le, Chee-Keong Kwoh, Eng-Soon Teo, and Kheng-Hock Lee, Analyzing trends of hospital length of stay using Phase-type distributions

Contact us

For questions about submissions or suggestions/comments about the workshop, please contact the Workshop Co-Chairs:

Xiao-Li Li: xlli@i2r.a-star.edu.sg

See-Kiong Ng: skng@i2r.a-star.edu.sg

Jason T.L. Wang: wangj@njit.edu