Computer science
Sentence
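Artificial intelligence
Natural language processing
Graph
Convolutional neural network
Graph database
Biomedical text mining
Text graph
Text mining
Information retrieval
Theoretical computer science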
Authors
Guishen Wang,Xiaoxue Lou,Fang Guo,Devin Kwok,Chen Cao
Source
Journal: IEEE Journal of Biomedical and Health Informatics
[Institute of Electrical and Electronics Engineers]
Date: 2023-12-22
Volume (Issue): 28 (3): 1668-1679
Citations: 3
Identifier
DOI:10.1109/jbhi.2023.3346210
Abstract
Text classification is a central part of natural language processing, with important applications in understanding the knowledge behind biomedical texts including electronic health records (EHR). In this article, we propose a novel heterogeneous graph convolutional network method for classifying EHR texts. Our method, called EHR-HGCN, is able to combine context-sensitive word and sentence embeddings with structural sentence-level and word-level relation information to perform text classification. EHR-HGCN reframes EHR text classification as a graph classification task to better capture structural information about the document using a heterogeneous graph. To mine contextual information from a document, EHR-HGCN first applies a bidirectional recurrent neural network (BiRNN) on word embeddings obtained via Global Vectors for word representation (GloVe) to obtain context-sensitive word-level and sentence-level embeddings. To mine structural relationships from the document, EHR-HGCN then constructs a heterogeneous graph over the word and sentence embeddings, where sentence-word and word-word relationships are represented by graph edges. Finally, a heterogeneous graph convolutional neural network is used to classify documents by their graph representation. We evaluate EHR-HGCN on a variety of standard text classification benchmarks and find that EHR-HGCN has higher accuracy and F1-score than other representative machine learning and deep learning methods. We also apply EHR-HGCN to the MedLit benchmark and find it performs with high accuracy and F1-score on the task of section classification in EHR texts. Our ablation experiments show that the heterogeneous graph construction and heterogeneous graph convolutional network are critical to the performance of EHR-HGCN.
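The graph-construction step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `build_hetero_graph`, the `window` parameter, and the choice to link words that co-occur within a sliding window are all assumptions made for illustration, since the abstract does not say how word-word edges are defined. The BiRNN and GloVe embedding stages and the graph convolution itself are omitted.

```python
def build_hetero_graph(sentences, window=2):
    """Build a toy heterogeneous document graph.

    sentences: list of token lists, one per sentence.
    Returns (vocab, sentence_word_edges, word_word_edges), where word nodes
    are indexed by their position in the sorted vocabulary.
    """
    vocab = sorted({tok for sent in sentences for tok in sent})
    word_id = {w: i for i, w in enumerate(vocab)}

    sw_edges = set()  # (sentence index, word index): sentence contains word
    ww_edges = set()  # (word index, word index): assumed co-occurrence rule

    for s_idx, sent in enumerate(sentences):
        for i, tok in enumerate(sent):
            sw_edges.add((s_idx, word_id[tok]))
            # Assumption: connect each word to the next `window` words.
            for other in sent[i + 1 : i + 1 + window]:
                if other != tok:
                    a, b = sorted((word_id[tok], word_id[other]))
                    ww_edges.add((a, b))

    return vocab, sorted(sw_edges), sorted(ww_edges)


# Hypothetical two-sentence document, already tokenized.
doc = [["patient", "denies", "chest", "pain"], ["chest", "xray", "normal"]]
vocab, sw_edges, ww_edges = build_hetero_graph(doc, window=1)
```

In this toy document the word "chest" appears in both sentences, so it is linked to both sentence nodes, which is the kind of structural relation the abstract says the heterogeneous graph is meant to capture.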