NIPS Poster A Theory-Based Evaluation of Nearest Neighbor Models Put Into Practice

Hendrik Fichtenberger · Dennis Rohde

Room 210 #91

Keywords: [ Learning Theory ] [ Classification ] [ Computational Complexity ]

Abstract: In the $k$-nearest neighborhood model ($k$-NN), we are given a set of points $P$, and we shall answer queries $q$ by returning the $k$ nearest neighbors of $q$ in $P$ according to some metric. This concept is crucial in many areas of data analysis and data processing, e.g., computer vision, document retrieval and machine learning. Many $k$-NN algorithms have been published and implemented, but often the relation between parameters and accuracy of the computed $k$-NN is not explicit. We study property testing of $k$-NN graphs in theory and evaluate it empirically: given a point set $P \subset \mathbb{R}^\delta$ and a directed graph $G=(P,E)$, is $G$ a $k$-NN graph, i.e., every point $p \in P$ has outgoing edges to its $k$ nearest neighbors, or is it $\epsilon$-far from being a $k$-NN graph? Here, $\epsilon$-far means that one has to change more than an $\epsilon$-fraction of the edges in order to make $G$ a $k$-NN graph. We develop a randomized algorithm with one-sided error that decides this question, i.e., a property tester for the $k$-NN property, with complexity $O(\sqrt{n} k^2 / \epsilon^2)$ measured in terms of the number of vertices and edges it inspects, and we prove a lower bound of $\Omega(\sqrt{n / \epsilon k})$. We evaluate our tester empirically on the $k$-NN models computed by various algorithms and show that it can be used to detect $k$-NN models with bad accuracy in significantly less time than the building time of the $k$-NN model.
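To make the $k$-NN graph property concrete, the following is a minimal Python sketch of an exact check and a naive random spot check. It is not the tester described in the abstract and does not achieve the stated complexity, since each vertex check here scans all of $P$. The function names, the adjacency-list representation of $G$, the Euclidean metric, and the assumption that all pairwise distances are distinct (so the $k$-NN graph is unique) are illustrative choices, not taken from the paper.

```python
import math
import random


def k_nearest(points, i, k):
    """Indices of the k points closest to points[i].

    Assumes Euclidean distance and distinct pairwise distances;
    ties, if any, are broken by index (an illustrative choice).
    """
    others = [j for j in range(len(points)) if j != i]
    others.sort(key=lambda j: (math.dist(points[i], points[j]), j))
    return set(others[:k])


def is_knn_graph(points, edges, k):
    """Exact check: every vertex has out-edges to exactly its k nearest neighbors.

    edges[i] is the list of out-neighbors of vertex i.
    """
    return all(
        set(edges[i]) == k_nearest(points, i, k) for i in range(len(points))
    )


def sample_check(points, edges, k, samples, seed=0):
    """Naive spot check of `samples` random vertices.

    One-sided in the same sense as the abstract: a true k-NN graph is
    never rejected, but a bad graph may be accepted if no violating
    vertex happens to be sampled.
    """
    rng = random.Random(seed)
    n = len(points)
    for i in rng.sample(range(n), min(samples, n)):
        if set(edges[i]) != k_nearest(points, i, k):
            return False
    return True


# Four points on a line, k = 1.
pts = [(0.0,), (1.0,), (3.0,), (7.0,)]
good = [[1], [0], [1], [2]]  # each vertex points to its nearest neighbor
bad = [[1], [0], [1], [0]]   # vertex 3 points to the wrong neighbor
```

With these inputs, `is_knn_graph(pts, good, 1)` accepts and `is_knn_graph(pts, bad, 1)` rejects; `sample_check` with `samples=4` inspects every vertex and therefore agrees with the exact check.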