HyperED: A hierarchy-aware network based on hyperbolic geometry for event detection

Meng Zhang (ORCID 0000-0002-3649-0940; School of Computer Science, Wuhan University, Wuhan, China), Zhiwen Xie (School of Computer Science, Central China Normal University, Wuhan, China), Jin Liu (corresponding author; School of Computer Science, Wuhan University, 430072 Wuhan, China; Email: [email protected]), Xiao Liu (School of Information Technology, Deakin University, Geelong, Australia), Xiao Yu (School of Computer Science and Artificial Intelligence, Wuhan University of Technology, Wuhan, China), and Bo Huang (School of Electronic and Electrical Engineering, Shanghai University of Engineering Science, Shanghai, China)

Computational Intelligence, Volume 40, Issue 1, e12627 (February 2024). Special Issue Article. First published: 4 January 2024. https://doi.org/10.1111/coin.12627

Abstract

Event detection plays an essential role in the task of event extraction. It aims to identify event trigger words in a sentence and to classify their event types. In real-world scenarios, multiple event types are usually organized in a hierarchical structure, and the hierarchical correlations between event types can be exploited to enhance event detection performance. However, this kind of hierarchical information has received insufficient attention, which can lead to misclassification between event types. In addition, most existing methods perform event detection in Euclidean space, which cannot adequately represent hierarchical relationships.
To address these issues, we propose a novel event detection network, HyperED, which embeds the event context and event types in the Poincaré ball of hyperbolic geometry to help learn hierarchical features between events. Specifically, for the event detection context, we first leverage a pre-trained BERT or a BiLSTM encoder in Euclidean space to learn the semantic features of event detection sentences. Meanwhile, to make full use of dependency knowledge, a GNN-based model is applied when encoding event types to learn the correlations between events. We then use a simple neural transformation to project the embeddings into the Poincaré ball to capture hierarchical features, and a distance score in hyperbolic space is computed for prediction. Experiments on the MAVEN and ACE 2005 datasets demonstrate the effectiveness of the HyperED model and confirm the natural advantage of hyperbolic space for expressing hierarchies in an intuitive way.

CONFLICT OF INTEREST STATEMENT

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

DATA AVAILABILITY STATEMENT

The data that support the findings of this study are available from the corresponding author upon reasonable request.
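Illustrative note: the abstract describes projecting Euclidean encodings of the sentence and of the event-type graph into the Poincaré ball and classifying triggers with a hyperbolic distance score. The following is a minimal sketch of those two operations (the exponential map at the origin and the Poincaré distance) for curvature c = 1. The tensor names, dimensions, and the negative-distance argmax decoding rule are assumptions for illustration, not the paper's released implementation.

```python
# Minimal sketch (assumed, not the authors' code) of the hyperbolic machinery
# described in the abstract: project Euclidean features into the Poincaré ball
# and score tokens against event types by geodesic distance.
import torch

EPS = 1e-5  # numerical guard to keep points strictly inside the unit ball

def exp_map_zero(v: torch.Tensor) -> torch.Tensor:
    """Exponential map at the origin of the Poincaré ball (curvature c = 1)."""
    norm = v.norm(dim=-1, keepdim=True).clamp_min(EPS)
    return torch.tanh(norm) * v / norm

def poincare_distance(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    """Geodesic distance between points x and y inside the unit Poincaré ball."""
    diff2 = (x - y).pow(2).sum(dim=-1)
    denom = (1 - x.pow(2).sum(dim=-1)) * (1 - y.pow(2).sum(dim=-1))
    return torch.acosh(1 + 2 * diff2 / denom.clamp_min(EPS))

# Hypothetical inputs: `token_feats` stands in for BERT/BiLSTM token features and
# `type_feats` for GNN-encoded event-type features; both names are assumptions.
token_feats = torch.randn(8, 768) * 0.01    # 8 tokens, Euclidean features
type_feats = torch.randn(34, 768) * 0.01    # e.g., 33 ACE 2005 subtypes + a None class

tokens_h = exp_map_zero(token_feats)        # project tokens into the ball
types_h = exp_map_zero(type_feats)          # project event types into the ball

# Smaller hyperbolic distance -> higher compatibility with that event type.
scores = -poincare_distance(tokens_h.unsqueeze(1), types_h.unsqueeze(0))
pred = scores.argmax(dim=-1)                # predicted event type per token
```

The negative-distance score followed by an argmax is one common way to turn a hyperbolic distance into a classification decision; the paper's actual scoring and training details should be taken from the full text.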