    Learning to identify Protected Health Information by integrating knowledge- and data-driven algorithms: A case study on psychiatric evaluation notes.

    Authors
    Dehghan, Azad
    Kovacevic, A
    Karystianis, G
    Keane, J
    Nenadic, G
    Affiliation
    School of Computer Science, University of Manchester, Manchester, UK
    Issue Date
    2017-06-07
    
    Abstract
    De-identification of clinical narratives is one of the main obstacles to making healthcare free text available for research. In this paper we describe our experience in expanding and tailoring two existing tools as part of the 2016 CEGS N-GRID Shared Tasks Track 1, which evaluated de-identification methods on a set of psychiatric evaluation notes for up to 25 different types of Protected Health Information (PHI). The methods we used rely on machine learning on either a large or a small feature space, with additional strategies, including two-pass tagging and multi-class models, both of which proved beneficial. The results show that integrating the proposed methods can identify PHI defined by the Health Insurance Portability and Accountability Act (HIPAA) with overall F1-scores of ∼90% and above. Yet some classes (Profession, Organization) again proved challenging, given the variability of the expressions used to refer to this information.
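The abstract does not say how two-pass tagging is implemented or how the F1-scores are computed, so the following is only a minimal sketch under stated assumptions. It assumes that the second pass re-scans the same document for further mentions of strings tagged in the first pass, and that F1 is computed over exact-match spans. The function names, the title-cue rule, and the example note are all hypothetical and are not taken from the paper.

```python
import re
from typing import Set, Tuple

Span = Tuple[int, int, str]  # (start offset, end offset, PHI type)


def first_pass(text: str) -> Set[Span]:
    """Hypothetical first-pass tagger: a title cue followed by a capitalised word."""
    spans: Set[Span] = set()
    for m in re.finditer(r"\b(?:Dr|Mr|Ms)\. ([A-Z][a-z]+)", text):
        spans.add((m.start(1), m.end(1), "NAME"))
    return spans


def second_pass(text: str, found: Set[Span]) -> Set[Span]:
    """Assumed second pass: tag every other mention of strings found in pass one."""
    spans = set(found)
    for start, end, label in found:
        surface = text[start:end]
        for m in re.finditer(r"\b" + re.escape(surface) + r"\b", text):
            spans.add((m.start(), m.end(), label))
    return spans


def f1_score(predicted: Set[Span], gold: Set[Span]) -> float:
    """Exact-match F1: harmonic mean of precision and recall over spans."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Invented example note and gold annotations, for illustration only.
note = "Dr. Smith saw the patient. Smith advised follow-up."
gold = {(4, 9, "NAME"), (27, 32, "NAME")}

pass_one = first_pass(note)             # finds only the mention after "Dr."
pass_two = second_pass(note, pass_one)  # also finds the bare "Smith"

print(f"F1 after pass one: {f1_score(pass_one, gold):.2f}")
print(f"F1 after pass two: {f1_score(pass_two, gold):.2f}")
```

In this toy case the second pass recovers a mention that lacks a contextual cue, which raises recall without changing the first-pass rule. Whether the same effect holds on real notes depends on how reliable the first-pass tags are, since errors are propagated as well.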
    Citation
    Learning to identify Protected Health Information by integrating knowledge- and data-driven algorithms: A case study on psychiatric evaluation notes. 2017 J Biomed Inform
    Journal
    Journal of Biomedical Informatics
    URI
    http://hdl.handle.net/10541/620440
    DOI
    10.1016/j.jbi.2017.06.005
    PubMed ID
    28602908
    Type
    Article
    Language
    en
    ISSN
    1532-0480
    Collections
    All Christie Publications


    Related articles

    • Automated de-identification of free-text medical records.
      Authors: Neamatullah I, Douglass MM, Lehman LW, Reisner A, Villarroel M, Long WJ, Szolovits P, Moody GB, Mark RG, Clifford GD
      Issue date: 2008 Jul 24
    • Combining knowledge- and data-driven methods for de-identification of clinical narratives.
      Authors: Dehghan A, Kovacevic A, Karystianis G, Keane JA, Nenadic G
      Issue date: 2015 Dec
    • Sensitive Data Detection with High-Throughput Machine Learning Models in Electrical Health Records.
      Authors: Zhang K, Jiang X
      Issue date: 2023
    • Automatic de-identification of textual documents in the electronic health record: a review of recent research.
      Authors: Meystre SM, Friedlin FJ, South BR, Shen S, Samore MH
      Issue date: 2010 Aug 2
    • De-identification of free text data containing personal health information: a scoping review of reviews.
      Authors: Negash B, Katz A, Neilson CJ, Moni M, Nesca M, Singer A, Enns JE
      Issue date: 2023