It would be most useful if, for every query, the relevance of every
document could be assessed. However, collecting this many
assessments, for a corpus large enough to provide a real retrieval test,
quickly becomes much too expensive. But if the evaluation goal is
relaxed to a relative comparison of one retrieval
system with one or more alternative systems, assessments can be
constrained to only those documents retrieved by at least one of the systems.
We therefore follow the POOLING procedure used by many other
evaluators, viz., using the proposed retrieval methods themselves as
procedures for identifying documents worth assessing.
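The pooling idea can be illustrated with a minimal sketch. The function name `pool_documents`, the `depth` cutoff, and the sample document identifiers are assumptions made for illustration; the text does not specify how deep into each ranked list the pool should reach.

```python
from typing import Dict, List, Set


def pool_documents(rankings: Dict[str, List[str]], depth: int) -> Set[str]:
    """Return the union of the top-`depth` documents from each method.

    `rankings` maps a (hypothetical) method name to its ranked list of
    document identifiers for a single query, best first.
    """
    pool: Set[str] = set()
    for ranked in rankings.values():
        # Only documents some method actually retrieved enter the pool.
        pool.update(ranked[:depth])
    return pool


# Illustrative data only; not taken from any real experiment.
rankings = {
    "method_a": ["d1", "d2", "d3", "d4"],
    "method_b": ["d3", "d5", "d1", "d6"],
}
pool = pool_documents(rankings, depth=3)
```

Only the documents in `pool` would then be shown to assessors, so the assessment cost grows with the number of methods and the chosen depth rather than with the size of the corpus.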
The first step in constructing a RAVe experiment is to combine the ranked retrieval lists
of the two or more retrieval methods, creating a single list of
documents ordered according to how interested we are in having them
assessed by a subject. We call this function
First, the assessment of a document whose ranked order is highly
correlated across retrieval methods provides little information about
differences between the methods. Put another way, we can potentially
learn most from those documents whose rank order is most different, and
hence a measure of the difference in ranked orders of a particular
document might be used to favor "controversial" documents. This factor
has the unfortunate consequence, however, of being sensitive to what we
would expect to be the least germane documents: those ranked
low by any of the methods under consideration. A second factor that
could be considered is a "sanity check": including a random sample of
documents near the top of our list. While we might learn a great deal if
users agree that these randomly selected documents are in fact relevant,
we expect that in general the retrieval performance of the systems
should not depend on random documents.
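One way to make the "controversy" factor concrete is sketched below. This is an assumed measure, not one the text defines: the spread between a document's best and worst rank across methods. The name `rank_spread` is hypothetical, as is the convention that a document missing from a list is ranked one past the end of that list.

```python
from typing import Dict, List


def rank_spread(rankings: Dict[str, List[str]]) -> Dict[str, int]:
    """Map each document to (worst rank - best rank) across all methods.

    Ranks are 1-based. A document a method did not retrieve is assigned
    rank len(list) + 1 for that method (an assumption of this sketch).
    """
    docs = {doc for ranked in rankings.values() for doc in ranked}
    spread: Dict[str, int] = {}
    for doc in docs:
        ranks = [
            ranked.index(doc) + 1 if doc in ranked else len(ranked) + 1
            for ranked in rankings.values()
        ]
        spread[doc] = max(ranks) - min(ranks)
    return spread


# Illustrative data only.
spread = rank_spread({
    "method_a": ["d1", "d2", "d3", "d4"],
    "method_b": ["d1", "d4", "d2", "d3"],
})
most_controversial = max(spread, key=spread.get)
```

In this toy example the document with the largest spread is one that a method ranked last, which is the sensitivity to low-ranked documents noted above.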
RAVeUnion
RAVeUnion produces
the most straightforward "zipper" merge of the lists, beginning with
the most highly ranked documents and alternating among the lists. The output of
RAVeUnion is a file of (query, document) pairs along with a
field that indicates whether the pair was suggested by only one of
the methods. This last information can be used to compare the average
relevance scores of documents suggested by one method alone to those
retrieved by more than one.
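A minimal sketch of such a zipper merge follows. It is not the RAVeUnion program itself: the function name `zipper_union`, the tuple layout, and the choice to keep only the first occurrence of a document that several methods retrieve are assumptions made for illustration.

```python
from typing import Dict, List, Set, Tuple


def zipper_union(
    query: str, ranked_lists: List[List[str]]
) -> List[Tuple[str, str, bool]]:
    """Interleave ranked lists rank by rank, dropping repeated documents.

    Returns (query, document, unique) triples, where `unique` is True
    when exactly one of the methods retrieved the document.
    """
    # Count how many methods retrieved each document.
    counts: Dict[str, int] = {}
    for ranked in ranked_lists:
        for doc in set(ranked):
            counts[doc] = counts.get(doc, 0) + 1

    merged: List[Tuple[str, str, bool]] = []
    seen: Set[str] = set()
    longest = max((len(ranked) for ranked in ranked_lists), default=0)
    for rank in range(longest):
        # Take the document at this rank from each list in turn.
        for ranked in ranked_lists:
            if rank < len(ranked) and ranked[rank] not in seen:
                doc = ranked[rank]
                seen.add(doc)
                merged.append((query, doc, counts[doc] == 1))
    return merged


# Illustrative data only.
pairs = zipper_union("q1", [["d1", "d2", "d3"], ["d2", "d4"]])
```

Here `pairs` lists `d1`, `d2`, `d4`, `d3` in that order, with only `d2` flagged as retrieved by more than one method. Grouping assessors' relevance scores by the `unique` flag would support the comparison described above.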