Two features of the FOA problem can help us to focus on what is known as
the Minkowski metric [Luenberger69] [Jain88]. First, the result of our
calculations below will be a {\em real-valued weight} associating a
keyword with a document or query, and we can assume that this is a
continuous quantity. Further, we can make the somewhat more questionable
assumption that these weights make ``natural'' use of zero, and so index
weights also fall on what is known as a ``ratio'' scale. With these two
assumptions, the Minkowski metrics are defined as:
\beq
\mathname{Sim}(\mathbf{q},\mathbf{d}) = \left( \sum_{k=1}^{NKw} \left| w_{qk} - w_{dk} \right|^{L} \right)^{1/L}
\eeq
where $L \geq 1$. The most common version is the $L=2$ (Euclidean) norm, and we use it below. The $L=1$ (``Manhattan
distance'') and $L=\infty$ (``sup'' norm, where $\sum_{k}$ is replaced
with $\max_{k}$) versions are also seen often.
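To make the definition concrete, here is a minimal Python sketch of the Minkowski metric for the three values of $L$ mentioned above. The function name \texttt{minkowski} and the example weight vectors are hypothetical illustrations, not part of the original text.

```python
import math

def minkowski(q, d, L=2.0):
    """Minkowski distance of order L between two equal-length weight vectors."""
    if len(q) != len(d):
        raise ValueError("vectors must have the same number of keyword weights")
    if L < 1:
        raise ValueError("L must be >= 1")
    # Absolute per-keyword differences |w_qk - w_dk|
    diffs = [abs(wq - wd) for wq, wd in zip(q, d)]
    if math.isinf(L):
        # sup norm: the sum over k is replaced by a maximum over k
        return max(diffs, default=0.0)
    return sum(x ** L for x in diffs) ** (1.0 / L)

# Hypothetical keyword weights for a query q and a document d
q = [0.0, 3.0, 1.0]
d = [4.0, 0.0, 1.0]

print(minkowski(q, d, 1))         # Manhattan distance: 7.0
print(minkowski(q, d, 2))         # Euclidean (L=2) distance: 5.0
print(minkowski(q, d, math.inf))  # sup norm: 4.0
```

Note that as $L$ grows, the largest single difference dominates the result, which is why the $L=\infty$ case reduces to a maximum.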
A metric is a scalar function over pairs
of points in the vector space. Minkowski metrics satisfy three
critical properties:
\begin{eqnarray}
\mathname{Sim}(x,y) & \geq & 0 \\
\mathname{Sim}(x,y) & = & \mathname{Sim}(y,x) \\
\mathname{Sim}(x,x) & = & \parallel x \parallel \\
 & \geq & \max_y \mathname{Sim}(x,y)
\end{eqnarray}
The
measure $\mathname{Sim}(x,x)$ of a vector with itself is what we typically think of as
the {\em length} of the vector, or more precisely, its {\em norm}
$\parallel x \parallel$.
Two other important features of metric spaces
follow from these axioms, and will also prove useful later:
\benum
\item $\mathname{Sim}(x,y) \leq \parallel x \parallel \cdot \parallel y \parallel$ ({\em Cauchy-Schwarz Inequality})
\item $\parallel x+y \parallel \leq \parallel x \parallel + \parallel y \parallel$ ({\em Triangle Inequality})
\eenum
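The two inequalities can be spot-checked numerically. The sketch below assumes the $L=2$ norm and, for the Cauchy-Schwarz check, assumes that the pairwise measure is the ordinary inner product (dot product); the source does not state this explicitly, so treat it as one possible reading. The helper names \texttt{norm}, \texttt{dot}, and \texttt{check\_inequalities} are hypothetical.

```python
import math
import random

def norm(x):
    """L=2 norm (length) of a vector."""
    return math.sqrt(sum(v * v for v in x))

def dot(x, y):
    """Inner product, used here as an ASSUMED reading of the pairwise measure."""
    return sum(a * b for a, b in zip(x, y))

def check_inequalities(x, y, tol=1e-9):
    """Return (cauchy_schwarz_holds, triangle_holds) for vectors x and y."""
    # Cauchy-Schwarz in its absolute-value form: |x . y| <= ||x|| * ||y||
    cauchy_schwarz = abs(dot(x, y)) <= norm(x) * norm(y) + tol
    # Triangle inequality: ||x + y|| <= ||x|| + ||y||
    triangle = norm([a + b for a, b in zip(x, y)]) <= norm(x) + norm(y) + tol
    return cauchy_schwarz, triangle

# Spot-check on random non-negative weight vectors (fixed seed for repeatability)
rng = random.Random(0)
for _ in range(1000):
    x = [rng.uniform(0.0, 1.0) for _ in range(5)]
    y = [rng.uniform(0.0, 1.0) for _ in range(5)]
    assert check_inequalities(x, y) == (True, True)
```

A random spot-check like this is not a proof; it only illustrates that both bounds hold on sampled weight vectors, with a small tolerance \texttt{tol} to absorb floating-point rounding.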
Formal notions of similarity