Using Local Lexicalized Rules to Identify Heart Disease Risk Factors in Clinical Notes.
Affiliation
School of Computer Science, University of Manchester, Manchester, UK
Issue Date
2015-06-29
Abstract
Heart disease is the leading cause of death globally and a significant part of the human population lives with it. A number of risk factors have been recognised as contributing to the disease, including obesity, coronary artery disease (CAD), hypertension, hyperlipidemia, diabetes, smoking, and family history of premature CAD. This paper describes and evaluates a methodology to extract mentions of such risk factors from diabetic clinical notes, which was a task of the i2b2/UTHealth 2014 Challenge in Natural Language Processing for Clinical Data. The methodology is knowledge-driven and the system implements local lexicalised rules (based on syntactical patterns observed in notes) combined with manually constructed dictionaries that characterize the domain. A part of the task was also to detect the time interval in which the risk factors were present in a patient. The system was applied to an evaluation set of 514 unseen notes and achieved a micro-average F-score of 88% (with 86% precision and 90% recall). While the identification of CAD family history, medication and some of the related disease factors (e.g. hypertension, diabetes, hyperlipidemia) showed quite good results, the identification of CAD-specific indicators proved to be more challenging (F-score of 74%). Overall, the results are encouraging and suggest that automated text mining methods can be used to process clinical notes to identify risk factors and monitor progression of heart disease on a large scale, providing necessary data for clinical and epidemiological studies.
Citation
Using Local Lexicalized Rules to Identify Heart Disease Risk Factors in Clinical Notes. 2015: J Biomed Inform
Journal
Journal of Biomedical Informatics
DOI
10.1016/j.jbi.2015.06.013
PubMed ID
26133479
Type
Article
Language
en
ISSN
1532-0480
Related articles
- Agile text mining for the 2014 i2b2/UTHealth Cardiac risk factors challenge.
  - Authors: Cormack J, Nath C, Milward D, Raja K, Jonnalagadda SR
  - Issue date: 2015 Dec
- Coronary artery disease risk assessment from unstructured electronic health records using text mining.
  - Authors: Jonnagaddala J, Liaw ST, Ray P, Kumar M, Chang NW, Dai HJ
  - Issue date: 2015 Dec
- Identifying risk factors for heart disease over time: Overview of 2014 i2b2/UTHealth shared task Track 2.
  - Authors: Stubbs A, Kotfila C, Xu H, Uzuner Ö
  - Issue date: 2015 Dec
- Combining glass box and black box evaluations in the identification of heart disease risk factors and their temporal relations from clinical records.
  - Authors: Grouin C, Moriceau V, Zweigenbaum P
  - Issue date: 2015 Dec
- Mining heart disease risk factors in clinical text with named entity recognition and distributional semantic models.
  - Authors: Urbain J
  - Issue date: 2015 Dec