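Computer science
Metric (unit)
Probabilistic logic
Nearest neighbor search
Data mining
Construct (python library)
k-nearest neighbors algorithm
Scale (ratio)
Sequence (biology)
Artificial intelligence
Machine learning
Biology
Physics
Quantum mechanics
Economics
Genetics
Programming language
Operations management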
Authors
Mehmet Oğuz Mülâyim, Josep Lluís Arcos
Identifier
DOI:10.1016/j.knosys.2020.106374
Abstract
This work is about speeding up retrieval in Case-Based Reasoning (CBR) for large-scale case bases (CBs) composed of temporally related cases in metric spaces. A typical example is a CB of electronic health records where consecutive sessions of a patient form a sequence of related cases. k-Nearest Neighbors (kNN) search is a widely used algorithm in CBR retrieval. However, brute-force kNN is impossible for large CBs. As a contribution to efforts to speed up kNN search, we introduce an anytime kNN search methodology and algorithm. Anytime Lazy kNN finds exact kNNs when allowed to run to completion, with a remarkable gain in execution time obtained by avoiding unnecessary neighbor assessments. For applications where the gain in exact kNN search may not suffice, it can be interrupted earlier, in which case it returns the best-so-far kNNs together with a confidence value attached to each neighbor. We describe the algorithm and the methodology to construct a probabilistic model that we use both to estimate confidence upon interruption and to automate the interruption at desired confidence thresholds. We present the results of experiments conducted with publicly available datasets. The results show superior gains compared to brute-force search. We reach an average gain of 87.18% with 0.98 confidence and 96.84% with 0.70 confidence.
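To illustrate the general idea of an interruptible ("anytime") kNN search mentioned in the abstract, here is a minimal Python sketch. It is not the authors' Anytime Lazy kNN: it is a plain linear scan that can be stopped after a fixed number of distance evaluations and then returns the best-so-far neighbors. The function name `anytime_knn`, the `max_evals` budget, and the use of Euclidean distance are all assumptions made for illustration, and the probabilistic confidence model described in the abstract is not reproduced here.

```python
import heapq
import math


def anytime_knn(query, cases, k, max_evals=None):
    """Hypothetical interruptible kNN scan (illustrative only).

    Returns (neighbors, complete), where neighbors is a list of
    (distance, case_index) pairs sorted by distance, and complete
    tells whether every case was assessed before stopping.
    """
    heap = []  # max-heap via negated distances; holds the k best so far
    evals = 0
    complete = True
    for idx, case in enumerate(cases):
        # Interrupt once the evaluation budget is spent (assumed stop rule).
        if max_evals is not None and evals >= max_evals:
            complete = False
            break
        d = math.dist(query, case)  # Euclidean distance, an assumption
        evals += 1
        if len(heap) < k:
            heapq.heappush(heap, (-d, idx))
        elif d < -heap[0][0]:
            # New case is closer than the current k-th neighbor.
            heapq.heapreplace(heap, (-d, idx))
    neighbors = sorted((-neg_d, idx) for neg_d, idx in heap)
    return neighbors, complete
```

When `max_evals` is left as `None`, the scan runs to completion and the result is the exact kNN set; with a smaller budget, the returned neighbors are only the best found so far, which is the situation in which the paper attaches a confidence value to each neighbor.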