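Latent Dirichlet allocation
Topic model
Computer science
Social media
Data science
Situation awareness
Natural disaster
Leverage (statistics)
Schema (genetic algorithms)
Latent semantic analysis
Visualization
Artificial intelligence
Machine learning
Information retrieval
World Wide Web
Meteorology
Aerospace engineering
Engineering
Physics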
Authors
Sulong Zhou, Pengyu Kan, Qunying Huang, Janet Silbernagel
Identifier
DOI: 10.1177/01655515211007724
Abstract
Natural disasters cause significant damage, casualties and economic losses. Twitter has been used to support prompt disaster response and management because people tend to communicate and spread information on public social media platforms during disaster events. To retrieve real-time situational awareness (SA) information from tweets, the most effective way to mine text is to use natural language processing (NLP). Among advanced NLP models, supervised approaches can classify tweets into different categories to gain insight and leverage useful SA information from social media data. However, high-performing supervised models require domain knowledge to specify categories and involve costly labelling tasks. This research proposes a guided latent Dirichlet allocation (LDA) workflow to investigate temporal latent topics in tweets posted during a recent disaster event, the 2020 Hurricane Laura. By integrating prior knowledge, a coherence model, LDA topic visualisation and validation against official reports, our guided approach reveals that most tweets contain several latent topics during the 10-day period of Hurricane Laura. This result indicates that state-of-the-art supervised models have not fully utilised tweet information because they assign each tweet only a single label. In contrast, our model not only identifies emerging topics during different disaster events but also provides multilabel references for the classification schema. In addition, our results can help to quickly identify and extract SA information for responders, stakeholders and the general public so that they can adopt timely response strategies and allocate resources wisely during hurricane events.
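
The contrast the abstract draws between single-label classification and multilabel topic references can be illustrated with a minimal Python sketch. It is not the authors' implementation: the topic names, the probabilities and the 0.2 threshold are hypothetical values chosen only for illustration, and the per-tweet topic distribution is assumed to come from an already fitted LDA model.

```python
from typing import Dict, List


def single_label(topic_probs: Dict[str, float]) -> str:
    """Single-label view: keep only the most probable topic."""
    return max(topic_probs, key=topic_probs.get)


def multi_label(topic_probs: Dict[str, float], threshold: float = 0.2) -> List[str]:
    """Multilabel view: keep every topic at or above the threshold,
    ordered from most to least probable. The threshold is an assumption."""
    kept = [topic for topic, prob in topic_probs.items() if prob >= threshold]
    return sorted(kept, key=topic_probs.get, reverse=True)


# Hypothetical topic distribution for one tweet (toy values, not from the paper).
tweet_topics = {
    "evacuation": 0.45,
    "power outage": 0.35,
    "donation": 0.15,
    "other": 0.05,
}

print(single_label(tweet_topics))  # evacuation
print(multi_label(tweet_topics))   # ['evacuation', 'power outage']
```

In this toy case the single-label view discards the second topic even though it carries a comparable share of the tweet, which is the kind of information loss the abstract attributes to single-label classifiers.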