FOA Home | UP: RAVe: A Relevance Assessment VEhicle


RAVeUnion

It would be most useful if, for every query, the relevance of every document could be assessed. However, the collection of this many assesments, for a corpus large enough to provide a real retrieval test, quickly becomes much too expensive. But if the evaluation goal is relaxed to being the relative comparison of one retrieval system with one or more alternative systems, assesments can be constrained to only those documents retrieved by one of the systems.

We therefore follow the POOLING procedure used by many other evaluators, viz., using the proposed retrieval methods themselves as procedures for identifying documents worth assessing.

The first step in constructing a RAVe experiment is to combine the ranked retrieval lists of the two or more retrieval methods, creating a single list of documents ordered according to how interested we are in having them assessed by a subject. We call this function RAVeUnion.

{First, the assessment of a document whose ranked order is highly correlated across retrieval methods provides little information about differences between the methods. Said another way, we can potentially learn most from those documents whose rank order is most different, and hence a measure of the difference in ranked orders of a particular document might be used to favor ``controversial'' documents. This factor has the unfortunate consequence, however, of being sensitive to what we would expect to be the least germane documents, those documents ranked low by any of the methods under consideration. A second factor that could be considered is a ``sanity check,'' including near the top of our list a random sample. While we might learn a great deal from these if users agree that these randomly selected documents are in fact relevant, we expect that in general the retrieval performance of the systems should not depend on random documents.}

RAVeUnion produces the most straight-forward ``zipper'' merge of the lists, beginning with the most highly ranked and alternating. The output of RAVeUnion is a file of (query, document) pairs along with a field which indicates if the pair was uniquely suggested by only one of the methods. This last information can be used to compare the average relevance scores of documents suggested by one method alone to those retrieved by more than one.


Top of Page | UP: RAVe: A Relevance Assessment VEhicle | ,FOA Home


FOA © R. K. Belew - 00-09-21