In Chapter 2 we first identified documents, and then lexical features to
be associated with each. Then we built an inverted keyword list to make
going from keywords to documents about those keywords as easy as
possible. Now we become specific about how we measure the similarity
between a document and a query.
The discussion of matching queries and
documents is simplified if we adopt the vector space perspective of
Section §3.4 and imagine both the query
$\mathbf{q}$ and all documents $\mathbf{d}$ to be vectors in a space
whose dimensionality equals $\mathname{NKw}$, the keyword vocabulary
size.
In this space, the answer to the question of which documents are
the best match for a query seems straightforward: those documents that
are most similar to the query, relative to some particular METRIC
$\mathname{Sim}$ measuring distance between points in the space.
Students of algebra and abstract vector spaces know that a wide range of
choices is possible; see Section §5.2.2.
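To make the vector space picture concrete, the following is a minimal sketch, not the method this text commits to. It assumes two familiar candidates for $\mathname{Sim}$, the inner product and the cosine of the angle between vectors; the names `inner_product`, `cosine` and `rank`, and the toy keyword weights, are hypothetical and chosen only for illustration. Vectors are stored sparsely as dictionaries, so any keyword absent from a dictionary is treated as a zero component in the full $\mathname{NKw}$-dimensional space.

```python
import math


def inner_product(q, d):
    """Sum of products over keywords; missing keywords count as weight 0."""
    return sum(weight * d.get(keyword, 0.0) for keyword, weight in q.items())


def cosine(q, d):
    """Inner product normalized by both vector lengths (0.0 if either is empty)."""
    q_len = math.sqrt(inner_product(q, q))
    d_len = math.sqrt(inner_product(d, d))
    if q_len == 0.0 or d_len == 0.0:
        return 0.0
    return inner_product(q, d) / (q_len * d_len)


def rank(query, docs, sim=cosine):
    """Return document names ordered from most to least similar to the query."""
    return sorted(docs, key=lambda name: sim(query, docs[name]), reverse=True)


# Hypothetical toy data: keyword -> weight for one query and three documents.
query = {"vector": 1.0, "space": 1.0}
docs = {
    "d1": {"vector": 2.0, "space": 2.0},
    "d2": {"vector": 1.0, "model": 1.0},
    "d3": {"cat": 1.0},
}

ranking = rank(query, docs)
```

Passing `sim=inner_product` to `rank` swaps in the other measure without changing the rest of the code, which mirrors the point that the choice of similarity measure is a separate decision from the vector representation itself.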
Matching queries against documents