PathSim

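Computer science; Similarity (geometry); Path (computing); Theoretical computer science; Semantics (computer science); Semantic similarity; Object (grammar); Data mining; Information retrieval; Artificial intelligence; Image (mathematics); Programming language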
Authors
Yizhou Sun, Jiawei Han, Xifeng Yan, Philip S. Yu, Tianyi Wu
Source
Journal: Proceedings of the VLDB Endowment [VLDB Endowment]
Volume (issue): 4 (11): 992-1003 Citations: 1654
Identifier
DOI: 10.14778/3402707.3402736
Abstract

Similarity search is a primitive operation in database and Web search engines. With the advent of large-scale heterogeneous information networks that consist of multi-typed, interconnected objects, such as bibliographic networks and social media networks, it is important to study similarity search in such networks. Intuitively, two objects are similar if they are linked by many paths in the network. However, most existing similarity measures are defined for homogeneous networks and do not take into consideration the different semantic meanings behind paths, so they cannot be directly applied to heterogeneous networks. In this paper, we study similarity search that is defined among objects of the same type in heterogeneous networks. Moreover, by considering different linkage paths in a network, one can derive various similarity semantics. Therefore, we introduce the concept of meta path-based similarity, where a meta path is a path consisting of a sequence of relations defined between different object types (i.e., structural paths at the meta level). Whether a user would like to explicitly specify a path combination given sufficient domain knowledge, choose the best path by experimental trials, or simply provide training examples to learn it, the meta path forms a common base for a network-based similarity search engine. In particular, under the meta path framework we define a novel similarity measure called PathSim that is able to find peer objects in the network (e.g., find authors in a similar field and with similar reputation), which turns out to be more meaningful in many scenarios than random-walk based similarity measures. To support fast online query processing for PathSim queries, we develop an efficient solution that partially materializes short meta paths and then concatenates them online to compute top-k results. Experiments on real data sets demonstrate the effectiveness and efficiency of our proposed paradigm.
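As a rough illustration of the measure named in the abstract: for a symmetric meta path, PathSim scores a pair (x, y) as twice the number of path instances between x and y, divided by the number of path instances from x back to itself plus the number from y back to itself. The sketch below assumes the meta path author-paper-venue-paper-author and a hypothetical toy table of per-venue paper counts. The names AUTHOR_VENUE, path_count, pathsim, and top_k are illustrative and not taken from the authors' implementation. It ranks candidates by brute force and does not attempt the partial materialization or online concatenation strategy the abstract mentions.

```python
from typing import Dict, List, Tuple

# Hypothetical toy data (an assumption for illustration): the number of papers
# each author has published in each of four venues. Each row plays the role of
# a precomputed count vector for the short meta path author-paper-venue.
AUTHOR_VENUE: Dict[str, List[int]] = {
    "Mike": [2, 1, 0, 0],
    "Jim": [50, 20, 0, 0],
    "Mary": [2, 0, 1, 0],
    "Bob": [2, 1, 0, 0],
    "Ann": [0, 0, 1, 1],
}


def path_count(x: str, y: str) -> int:
    """Count path instances x -> venue -> y along the symmetric meta path.

    Joining the short path with its reverse reduces to a dot product of the
    two authors' venue-count vectors.
    """
    return sum(a * b for a, b in zip(AUTHOR_VENUE[x], AUTHOR_VENUE[y]))


def pathsim(x: str, y: str) -> float:
    """PathSim score: 2 * |x ~> y| / (|x ~> x| + |y ~> y|)."""
    denominator = path_count(x, x) + path_count(y, y)
    if denominator == 0:
        return 0.0  # neither object has any path instance
    return 2 * path_count(x, y) / denominator


def top_k(query: str, k: int) -> List[Tuple[str, float]]:
    """Brute-force top-k peers of `query`, ties broken alphabetically."""
    scores = [
        (other, pathsim(query, other))
        for other in AUTHOR_VENUE
        if other != query
    ]
    scores.sort(key=lambda pair: (-pair[1], pair[0]))
    return scores[:k]


if __name__ == "__main__":
    for name, score in top_k("Mike", 3):
        print(f"{name}: {score:.4f}")
```

With this toy data, Bob, whose counts match Mike's exactly, scores 1.0, while the far more prolific Jim scores much lower even though he publishes in the same venues. This normalization by self-path counts is what lets the measure favor peers of comparable visibility rather than simply the most connected objects.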
