The same inter-document similarity information captured in the $X = J
J^{T}$ matrix can be used for other purposes, too. For example, Section
§7.5.1 will discuss nearest neighbor, one approach to the problem of
CLASSIFYING documents.
The $J$ matrix captures patterns of keyword usage across a corpus of documents. The
preceding sections have held the corpus constant and used this data to
analyze transformations of the keyword dimensions, but the converse is
also possible. For example, Section §6.3
will discuss the representation of inter-keyword relationships
known as THESAURI. One simple baseline for keywords is their
pairwise similarities, as captured by $J$:
$$ Y = J^{T} J $$
This produces a $V \times V$ symmetric, square matrix capturing all
$V \choose 2$ inter-keyword similarities, exactly analogous to the
inter-document similarities of (FOAref).
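The two products can be compared side by side on a toy example. The sketch below assumes $J$ is stored with one row per document and one column per keyword (the orientation implied by $Y$ being $V \times V$). The counts and the helper names `transpose` and `matmul` are invented for illustration, and the entries are raw inner products rather than normalized similarities.

```python
# Toy document-by-keyword matrix J (assumed layout: rows = documents,
# columns = keywords). The counts are made up for illustration only.
J = [
    [2, 1, 0, 0],  # document 1
    [1, 1, 0, 0],  # document 2
    [0, 0, 3, 1],  # document 3
]


def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    b_cols = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in b_cols]
            for row in a]


Jt = transpose(J)
X = matmul(J, Jt)  # D x D inter-document inner products
Y = matmul(Jt, J)  # V x V inter-keyword inner products

# Both products are symmetric, so only the entries above the diagonal
# are distinct pairwise values: V choose 2 of them for Y.
V = len(J[0])
distinct_keyword_pairs = V * (V - 1) // 2

for row in Y:
    print(row)
```

Here keywords 1 and 2 co-occur in documents 1 and 2, so their entry in `Y` is nonzero, while keywords 1 and 3 never share a document and get a zero. In practice the rows or columns of $J$ would usually be weighted or length-normalized before taking these products; that step is omitted here.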
Littman has also considered an
interesting application of LSI towards the problem of searching across
multi-lingual corpora [Littman98].
Other uses of vector space