次线性函数
组合数学
k-最近邻算法
数学
最近邻搜索
公制(单位)
图形
离散数学
近似算法
计算机科学
人工智能
运营管理
经济
作者
Artur Czumaj,Christian Sohler
摘要
.Let \((X,d)\) be an \(n\)-point metric space. We assume that \((X,d)\) is given in the distance oracle model, that is, \(X = \{1, \dots, n\}\) and for every pair of points \(x, y\) from \(X\) we can query their distance \(d(x,y)\) in constant time. A \(k\)-nearest neighbor (\(k\)-NN) graph for \((X,d)\) is a directed graph \(G = (V,E)\) that has an edge to each of \(v\)'s \(k\) nearest neighbors. We use \({\mathfrak{cost}}(G)\) to denote the sum of edge weights of \(G\). In this paper, we study the problem of approximating \({\mathfrak{cost}}(G)\) in sublinear time when we are given oracle access to the metric space \((X,d)\) that defines \(G\). Our goal is to develop an algorithm that solves this problem faster than the time required to compute \(G\). We first present an algorithm that in \(\widetilde{O}_{\varepsilon }(n^2/k)\) time with probability at least \(\frac 23\) approximates \({\mathfrak{cost}}(G)\) to within a factor of \(1 + \varepsilon\). Next, we present a more elaborate sublinear algorithm that in time \(\widetilde{O}_{\varepsilon }(\min \{n k^{3/2}, n^2/k\})\) computes an estimate \(\overline{\mathfrak{cost}}\) of \({\mathfrak{cost}}(G)\) that satisfies with probability at least \(\frac 23\) \(|{\mathfrak{cost}}(G) -{\overline{\mathfrak{cost}}} | \le \varepsilon \cdot ({\mathfrak{cost}}(G) +{\mathfrak{mst}}(X))\), where \({\mathfrak{mst}}(X)\) denotes the cost of the minimum spanning tree of \((X,d)\). Further, we complement these results with near matching lower bounds. We show that any algorithm that for a given metric space \((X,d)\) of size \(n\), with probability at least \(\frac 23\), estimates \({\mathfrak{cost}}(G)\) to within a \(1 + \varepsilon\) factor requires \(\Omega (n^2/k)\) time. Similarly, any algorithm that with probability at least \(\frac 23\) estimates \({\mathfrak{cost}}(G)\) to within an additive error term \(\varepsilon \cdot ({\mathfrak{mst}}(X) +{\mathfrak{cost}}(X))\) requires \(\Omega_{\varepsilon }(\min \{n k^{3/2}, n^2/k\})\) time.Keywordssublinear algorithmsapproximation algorithmnearest neighborsMSC codes68Q2568W0568W2068W2568W40
科研通智能强力驱动
Strongly Powered by AbleSci AI