If we had worked hard on a particular test corpus of documents to
identify (always with respect to some particular query) which documents
were relevant and which were not, it would then be possible to carefully study
which features $x_{i}$ were reliably found in relevant documents and
which were not. Collecting such statistics for each feature would then
allow us to estimate
\[ \Pr({\bf x} \mid \mathrm{Rel}), \]
the probability of any particular set of features ${\bf x}$, given that
we know the document is relevant. (Just which statistics we collect, and
how, is discussed in more detail in Section 7.4 as part of a more
general classification task.) The retrieval question requires that we
ask the converse: the probability that the document we are considering,
given its features ${\bf x}$, should be considered relevant. This
inversion is accomplished via the familiar Bayes' Rule:
\[ \Pr(\mathrm{Rel} \mid {\bf x}) = {\Pr({\bf x} \mid \mathrm{Rel}) \Pr(\mathrm{Rel}) \over \Pr({\bf x})} \]
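The inversion can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy `corpus` of judged documents, the two-element binary feature vectors, and the helper name `posterior_relevance` are hypothetical and not part of the text. The three quantities on the right-hand side of Bayes' Rule are estimated by simple counting over whole feature vectors, which is only workable for a tiny feature space.

```python
from collections import Counter
from fractions import Fraction

# Hypothetical judged corpus: (binary feature vector x, relevance judgment).
# The vectors and judgments are invented purely for illustration.
corpus = [
    ((1, 1), True),
    ((1, 1), True),
    ((1, 0), True),
    ((1, 1), False),
    ((0, 1), False),
    ((0, 0), False),
    ((0, 0), False),
    ((1, 0), False),
]


def posterior_relevance(x, corpus):
    """Estimate Pr(Rel | x) via Bayes' Rule from raw corpus counts."""
    n = len(corpus)
    rel_vectors = [v for v, rel in corpus if rel]
    n_rel = len(rel_vectors)
    if n_rel == 0:
        raise ValueError("corpus contains no relevant documents")

    # Prior Pr(Rel): fraction of judged documents that are relevant.
    p_rel = Fraction(n_rel, n)
    # Likelihood Pr(x | Rel): fraction of relevant documents showing x.
    p_x_given_rel = Fraction(Counter(rel_vectors)[x], n_rel)
    # Evidence Pr(x): fraction of all documents showing x.
    p_x = Fraction(sum(1 for v, _ in corpus if v == x), n)
    if p_x == 0:
        raise ValueError("feature vector never observed in the corpus")

    return p_x_given_rel * p_rel / p_x


print(posterior_relevance((1, 1), corpus))  # prints 2/3
print(posterior_relevance((1, 0), corpus))  # prints 1/2
```

In this toy corpus the result for `(1, 1)` agrees with a direct count: three documents show that vector, and two of them are judged relevant. With realistic feature spaces most vectors are never observed, which is why statistics are collected per feature rather than per vector.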
Bayesian inversion