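Identifier
Computer science
Identification (biology)
Information retrieval
Unique identifier
Inclusion (mineral)
Sample (material)
Protection
Data science
Data mining
Medicine
Psychology
Plant
Social psychology
Biology
Chromatography
Nursing
Chemistry
Programming language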
Authors
Clete A. Kushida,Deborah L. Nichols,Rik Jadrnicek,Ric Miller,James J. Walsh,Kara S. Griffin
Source
Journal: Medical Care
[Ovid Technologies (Wolters Kluwer)]
Date: 2012-07-01
Volume/Issue: 50: S82-S101
Citations: 106
Identifier
DOI: 10.1097/mlr.0b013e3182585355
Abstract
Background: De-identification and anonymization are strategies that are used to remove patient identifiers in electronic health record data. The use of these strategies in multicenter research studies is paramount in importance, given the need to share electronic health record data across multiple environments and institutions while safeguarding patient privacy.
Methods: Systematic literature search using keywords of de-identify, deidentify, de-identification, deidentification, anonymize, anonymization, data scrubbing, and text scrubbing. Search was conducted up to June 30, 2011 and involved 6 different common literature databases. A total of 1798 prospective citations were identified, and 94 full-text articles met the criteria for review and the corresponding articles were obtained. Search results were supplemented by review of 26 additional full-text articles; a total of 120 full-text articles were reviewed.
Results: A final sample of 45 articles met inclusion criteria for review and discussion. Articles were grouped into text, images, and biological sample categories. For text-based strategies, the approaches were segregated into heuristic, lexical, and pattern-based systems versus statistical learning-based systems. For images, approaches that de-identified photographic facial images and magnetic resonance image data were described. For biological samples, approaches that managed the identifiers linked with these samples were discussed, particularly with respect to meeting the anonymization requirements needed for Institutional Review Board exemption under the Common Rule.
Conclusions: Current de-identification strategies have their limitations, and statistical learning-based systems have distinct advantages over other approaches for the de-identification of free text. True anonymization is challenging, and further work is needed in the areas of de-identification of datasets and protection of genetic information.