Screening patents of ICT in construction using deep learning and NLP techniques

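Computer science, Originality, Artificial intelligence, Trademark, Classifier (UML), Deep learning, Machine learning, Data science, Information retrieval, Natural language processing, Sociology, Social science, Operating system, Qualitative research
Authors
Hengqin Wu, Qiping Shen, Xue Lin, Minglei Li, Boyu Zhang, Clyde Zhengdao Li
Source
Journal: Engineering, Construction and Architectural Management [Emerald Publishing Limited]
Volume (Issue): 27 (8): 1891-1912 Cited by: 14
Identifier
DOI:10.1108/ecam-09-2019-0480
Abstract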

Purpose: This study proposes an approach to solve the fundamental problem in using query-based methods (i.e. search engines and patent retrieval tools) to screen patents of information and communication technology in construction (ICTC). The fundamental problem is that ICTC incorporates various techniques and thus cannot be simply represented by man-made queries. To address this problem, this study develops a binary classifier that uses deep learning and natural language processing (NLP) techniques to automatically identify whether a patent is relevant to ICTC, thus accurately screening a corpus of ICTC patents.

Design/methodology/approach: This study employs NLP techniques to convert the textual data of patents into numerical vectors. Then, a supervised deep learning model is developed to learn the relations between the input vectors and outputs.

Findings: The validation results indicate that (1) the proposed approach performs better in screening ICTC patents than traditional machine learning methods; (2) besides the United States Patent and Trademark Office (USPTO), which provides structured and well-written patents, the approach could also accurately screen patents from the Derwent Innovations Index (DIX), in which patents are written in different genres.

Practical implications: This study contributes a specific collection of ICTC patents, which is not provided by the patent offices.

Social implications: The proposed approach offers an alternative way of gathering a corpus of patents for domains like ICTC that neither exist as a searchable classification in patent offices nor are accurately represented by man-made queries.

Originality/value: A deep learning model with two layers of neurons is developed to learn the non-linear relations between the input features and outputs, providing better performance than traditional machine learning models. This study uses advanced NLP techniques, namely lemmatization and part-of-speech (POS) tagging, to process the textual data of ICTC patents.
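The pipeline described in the abstract (text to numerical vectors, then a two-layer supervised binary classifier) can be illustrated with a minimal sketch. Everything specific here is an assumption and not taken from the paper: the toy sentences and labels are invented, the vectors are binary bag-of-words, the suffix stripping is only a crude stand-in for lemmatization, POS tagging is omitted, and the tanh/sigmoid activations, hidden size and learning rate are arbitrary choices.

```python
import math
import random
import re

# Hypothetical toy data: (patent-like text, 1 if relevant to ICTC else 0).
SAMPLES = [
    ("Sensors monitor concrete curing on a building site", 1),
    ("BIM model exchanges data with construction scheduling software", 1),
    ("Wireless network tracks cranes on construction sites", 1),
    ("Pharmaceutical composition for treating headaches", 0),
    ("Gear assembly for a bicycle transmission", 0),
    ("Method for brewing coffee with a paper filter", 0),
]

def tokenize(text):
    """Lowercase, keep alphabetic runs, and strip a trailing 's'.
    The suffix stripping is only a crude stand-in for real lemmatization."""
    words = re.findall(r"[a-z]+", text.lower())
    return [w[:-1] if len(w) > 3 and w.endswith("s") else w for w in words]

def build_vocab(texts):
    vocab = sorted({tok for t in texts for tok in tokenize(t)})
    return {tok: i for i, tok in enumerate(vocab)}

def vectorize(text, vocab):
    """Binary bag-of-words vector; tokens outside the vocabulary are ignored."""
    vec = [0.0] * len(vocab)
    for tok in tokenize(text):
        if tok in vocab:
            vec[vocab[tok]] = 1.0
    return vec

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

class TwoLayerNet:
    """Hidden tanh layer followed by a single sigmoid output neuron."""
    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.1, 0.1) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, x):
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(self.w1, self.b1)]
        p = sigmoid(sum(w * hi for w, hi in zip(self.w2, h)) + self.b2)
        return h, p

    def train_step(self, x, y, lr):
        """One stochastic gradient step on the binary cross-entropy loss."""
        h, p = self.forward(x)
        d_out = p - y  # gradient w.r.t. the output pre-activation
        for j, hj in enumerate(h):
            d_hid = d_out * self.w2[j] * (1.0 - hj * hj)  # uses w2 before its update
            self.w2[j] -= lr * d_out * hj
            self.b1[j] -= lr * d_hid
            for i, xi in enumerate(x):
                self.w1[j][i] -= lr * d_hid * xi
        self.b2 -= lr * d_out

def mean_loss(net, X, Y):
    eps = 1e-12
    total = 0.0
    for x, y in zip(X, Y):
        p = net.forward(x)[1]
        total -= y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
    return total / len(X)

texts = [t for t, _ in SAMPLES]
labels = [y for _, y in SAMPLES]
vocab = build_vocab(texts)
X = [vectorize(t, vocab) for t in texts]

net = TwoLayerNet(n_in=len(vocab), n_hidden=4)
loss_before = mean_loss(net, X, labels)
for _ in range(200):
    for x, y in zip(X, labels):
        net.train_step(x, y, lr=0.1)
loss_after = mean_loss(net, X, labels)
```

Calling `net.forward(vectorize(new_text, vocab))[1]` gives the estimated probability that a new text is ICTC-relevant. A real screening task would need a labeled patent corpus, a proper lemmatizer and POS tagger, and held-out validation, none of which this sketch attempts.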