PathSim

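Computer science; Similarity (geometry); Path (computing); Theoretical computer science; Semantics (computer science); Semantic similarity; Object (grammar); Data mining; Information retrieval; Artificial intelligence; Image (mathematics); Programming language
Authors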
Yizhou Sun,Jiawei Han,Xifeng Yan,Philip S. Yu,Tianyi Wu
Source
Journal: Proceedings of the VLDB Endowment [Association for Computing Machinery]
Volume (issue): 4 (11): 992-1003 Citations: 1654
Identifier
DOI:10.14778/3402707.3402736
Abstract

Similarity search is a primitive operation in database and Web search engines. With the advent of large-scale heterogeneous information networks that consist of multi-typed, interconnected objects, such as the bibliographic networks and social media networks, it is important to study similarity search in such networks. Intuitively, two objects are similar if they are linked by many paths in the network. However, most existing similarity measures are defined for homogeneous networks. Different semantic meanings behind paths are not taken into consideration. Thus they cannot be directly applied to heterogeneous networks. In this paper, we study similarity search that is defined among the same type of objects in heterogeneous networks. Moreover, by considering different linkage paths in a network, one could derive various similarity semantics. Therefore, we introduce the concept of meta path-based similarity, where a meta path is a path consisting of a sequence of relations defined between different object types (i.e., structural paths at the meta level). No matter whether a user would like to explicitly specify a path combination given sufficient domain knowledge, or choose the best path by experimental trials, or simply provide training examples to learn it, meta path forms a common base for a network-based similarity search engine. In particular, under the meta path framework we define a novel similarity measure called PathSim that is able to find peer objects in the network (e.g., find authors in the similar field and with similar reputation), which turns out to be more meaningful in many scenarios compared with random-walk based similarity measures. In order to support fast online query processing for PathSim queries, we develop an efficient solution that partially materializes short meta paths and then concatenates them online to compute top-k results. Experiments on real data sets demonstrate the effectiveness and efficiency of our proposed paradigm.
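
The abstract names the measure but does not state its formula. As a rough illustration, the sketch below assumes the commonly cited PathSim definition for a symmetric meta path, s(x, y) = 2 * M[x][y] / (M[x][x] + M[y][y]), where M[x][y] counts the path instances between x and y that follow the meta path. The toy author-venue counts and the helper names (matmul, transpose, pathsim, top_k) are hypothetical and are not taken from the paper's data or code. The commuting matrix for the long path is built by concatenating a short path matrix with its transpose, loosely mirroring the idea of materializing short meta paths and combining them at query time.

```python
# Illustrative sketch only: toy data and helper names are assumptions,
# not the authors' implementation.

def matmul(a, b):
    """Multiply two dense matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def transpose(m):
    """Return the transpose of a dense matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]

def pathsim(commuting, x, y):
    """Assumed PathSim score between objects x and y.

    commuting[i][j] is the number of path instances between i and j
    along a symmetric meta path.
    """
    denom = commuting[x][x] + commuting[y][y]
    return 2 * commuting[x][y] / denom if denom else 0.0

def top_k(commuting, x, k):
    """Return the k objects most similar to x as (object, score) pairs."""
    scores = [(pathsim(commuting, x, y), y)
              for y in range(len(commuting)) if y != x]
    scores.sort(key=lambda s: (-s[0], s[1]))
    return [(y, score) for score, y in scores[:k]]

# Hypothetical short meta path Author-Paper-Venue: rows are authors,
# columns are venues, entries are paper counts.
author_venue = [
    [2, 1],    # author 0: a few papers
    [50, 20],  # author 1: very prolific in the same venues
    [2, 0],    # author 2: a few papers, similar profile to author 0
]

# Concatenate the short path with its reverse to get the commuting
# matrix for Author-Paper-Venue-Paper-Author.
apvpa = matmul(author_venue, transpose(author_venue))

print(top_k(apvpa, 0, 2))
```

In this toy network, author 1 shares far more raw paths with author 0 (120 versus 4), yet author 2 ranks first because the score is normalized by each author's own path count. That is one way to read the abstract's notion of "peer objects" with similar reputation.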