Textual data transformations using natural language processing for risk assessment

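Keywords: Computer science; Robustness (evolution); Data mining; Natural language; Risk assessment; Natural language processing; Artificial intelligence; Data science; Risk analysis (engineering); Machine learning; Medicine; Biochemistry; Chemistry; Computer security; Gene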
Authors
Mohammad Zaid Kamil, Mohammed Taleb‐Berrouane, Faisal Khan, Paul Amyotte, Salim Ahmed
Source
Journal: Risk Analysis [Wiley]
Volume (issue): 43 (10): 2033-2052 Citations: 16
Identifier
DOI: 10.1111/risa.14100
Abstract

Underlying information about failure, including observations made in free text, can be a good source for understanding, analyzing, and extracting meaningful information for determining causation. The unstructured nature of natural language expression demands advanced methodology to identify its underlying features. There is no available solution to utilize unstructured data for risk assessment purposes. Due to the scarcity of relevant data, textual data can be a vital learning source for developing a risk assessment methodology. This work addresses the knowledge gap in extracting relevant features from textual data to develop cause-effect scenarios with minimal manual interpretation. This study applies natural language processing and text-mining techniques to extract features from past accident reports. The extracted features are transformed into parametric form with the help of fuzzy set theory and utilized in Bayesian networks as prior probabilities for risk assessment. An application of the proposed methodology is shown in microbiologically influenced corrosion-related incident reports available from the Pipeline and Hazardous Material Safety Administration database. In addition, the trained named entity recognition (NER) model is verified on eight incidents, showing a promising preliminary result for identifying all relevant features from textual data and demonstrating the robustness and applicability of the NER method. The proposed methodology can be used in domain-specific risk assessment to analyze, predict, and prevent future mishaps, ameliorating overall process safety.
