

Combining Classifiers

The preceding chapters have described a wide range of potentially useful retrieval techniques. A very reasonable response is to ask whether the best possible retrieval system might not be some sort of mixture of these various techniques. For example, Bartell [REF1093] has considered simple linear combinations of experts like those shown in Fig. (figure). His experiments considered two experts, one which used a set of simple words as features and a second which did more elaborate phrase extraction. Holding the sum of the two experts' contributions constant, their relative contribution describes a circle of fixed radius. Fig. (figure) shows a set of 228 test queries and the optimal weighting of phrase and term experts for each. That is, for each particular query, the optimal contribution of ranking information from the phrase and term experts is determined. Also shown is a line corresponding to the optimal balance between phrase and term experts across all queries. In general, the phrase expert was not very useful. On some queries, however, it was able to improve performance significantly. The way in which individual queries make special demands of the retrieval system is perhaps the most striking feature of these results.
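A linear combination of two experts can be sketched in a few lines of Python. This is only an illustration, not Bartell's implementation: the function names, the toy scores, and the choice to parameterize the two weights by an angle `theta` (so that the weight pair lies on a unit circle) are all assumptions made here for the example.

```python
import math

def combine_experts(term_scores, phrase_scores, theta):
    """Linearly combine two experts' document scores.

    The weights (cos(theta), sin(theta)) lie on the unit circle, so theta
    alone sets the balance: 0 is the term expert only, pi/2 the phrase
    expert only. (Hypothetical parameterization, chosen for illustration.)
    """
    w_term, w_phrase = math.cos(theta), math.sin(theta)
    docs = set(term_scores) | set(phrase_scores)
    # A document missing from one expert's output contributes a score of 0.
    return {
        d: w_term * term_scores.get(d, 0.0) + w_phrase * phrase_scores.get(d, 0.0)
        for d in docs
    }

def rank(scores):
    """Order documents by descending combined score, ties broken by id."""
    return sorted(scores, key=lambda d: (-scores[d], d))

# Toy scores, invented for this example.
term_scores = {"d1": 0.9, "d2": 0.4, "d3": 0.1}
phrase_scores = {"d2": 0.8, "d3": 0.7}

term_only = rank(combine_experts(term_scores, phrase_scores, 0.0))
phrase_only = rank(combine_experts(term_scores, phrase_scores, math.pi / 2))
balanced = rank(combine_experts(term_scores, phrase_scores, math.pi / 4))
```

Sweeping `theta` for a single query and scoring each resulting ranking against relevance judgments is one way to look for that query's best balance between the two experts.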

As search engine technologies have developed, building hybrid systems out of multiple search engines has required a more ``black box'' style of composition. That is, rather than manipulating a single feature of the retrieval system (e.g., term vs. phrase features), the combination has been of each system's net ranking [Thompson90a] [Thompson90b]. COLLECTION FUSION refers to the problem of combining results coming from disjoint corpora [Towell95] [Yager98]. (In the emerging environment of combined corpora and primarily publisher-driven search engines, corpora have become confounded with the search engines allowed to search them.) Diamond (personal communication) has hypothesized several effects we might imagine from FUSING multiple search engines:

SKIMMING EFFECT ... [when] retrieval approaches that represent [documents] differently may retrieve different relevant items, so that a combination method that takes the top-ranked items from each of the retrieval approaches will "push" non-relevant items down in the ranking.

CHORUS EFFECT ... when several retrieval approaches (each representing the query differently) suggest that an item is relevant to a query, this tends to be stronger evidence for relevance than a single approach doing so. Thus, allowing several independent retrieval approaches to "vote" on the relevance of an item should enable a sharper distinction between relevant and non-relevant items.

Note how the first of these focuses on differences in the systems' treatment of documents and the second on their treatment of queries (cf. Section §3.3.3). (Diamond also mentions a ``Dark Horse effect ... [which] refers to the situation in which, for the query at hand, a retrieval approach may produce unusually accurate (or inaccurate) estimates of relevance for at least some items, relative to the other retrieval approaches.'')
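The skimming and chorus effects suggest two different ways of merging ranked lists, sketched below. This is one possible reading of Diamond's description, not a method taken from the cited work: `skim` takes the top-ranked items from each engine in turn, while `chorus` lets the engines vote. The function names, the tie-breaking rules, and the toy rankings are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import zip_longest

def skim(rankings, k):
    """Round-robin merge: take each engine's next-best item in turn,
    skipping documents that were already taken."""
    merged, seen = [], set()
    for tier in zip_longest(*rankings):   # i-th item from every ranking
        for doc in tier:
            if doc is not None and doc not in seen:
                seen.add(doc)
                merged.append(doc)
    return merged[:k]

def chorus(rankings):
    """Vote-based merge: order documents by how many engines retrieved them,
    breaking ties by the best position any engine gave them, then by id."""
    votes = defaultdict(int)
    best = {}
    for ranking in rankings:
        for pos, doc in enumerate(ranking):
            votes[doc] += 1
            best[doc] = min(best.get(doc, pos), pos)
    return sorted(votes, key=lambda d: (-votes[d], best[d], d))

# Toy rankings from two hypothetical engines.
engine_a = ["d1", "d2", "d3"]
engine_b = ["d4", "d2", "d5"]

skimmed = skim([engine_a, engine_b], k=4)   # ['d1', 'd4', 'd2', 'd3']
voted = chorus([engine_a, engine_b])        # ['d2', 'd1', 'd4', 'd3', 'd5']
```

In the toy data, `skim` keeps each engine's first choice at the top, whereas `chorus` promotes `d2` because both engines retrieved it.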

In some ways, this combination of classifiers is reminiscent of earlier work using multiple query representations [Belkin93]. Vogt has recently performed an exhaustive analysis of linear combinations for all 61 search engines submitted to the TREC5 competition [Vogt98]. His primary conclusion, following Lee [Lee97], is that two systems are best combined linearly when there is a great deal of overlap in the sets of relevant documents they identify, while the sets of non-relevant documents they retrieve are nearly disjoint. In terms of Diamond's qualitative expectations, linear combinations are most able to support the ``chorus'' effect. Schutze et al. and Larkey have considered combining various types of special-function classifiers [Schutze95b] [Larkey96].
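The overlap condition can be made concrete with a small sketch. It assumes a Dice-style overlap ratio (twice the number of shared items, divided by the total size of the two sets); that particular measure, the function names, and the toy data are assumptions chosen for illustration and are not necessarily those used in the cited studies.

```python
def overlap(a, b):
    """Dice-style overlap of two collections, between 0.0 and 1.0."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))

def overlap_profile(retrieved_a, retrieved_b, relevant):
    """Return (relevant overlap, non-relevant overlap) for two systems.

    A pair with high relevant overlap and low non-relevant overlap matches
    the condition described above for a good linear combination.
    """
    rel = set(relevant)
    rel_a, rel_b = set(retrieved_a) & rel, set(retrieved_b) & rel
    non_a, non_b = set(retrieved_a) - rel, set(retrieved_b) - rel
    return overlap(rel_a, rel_b), overlap(non_a, non_b)

# Toy data: both systems find the same relevant documents (d1, d2)
# but make entirely different mistakes.
system_a = ["d1", "d2", "d3", "d4"]
system_b = ["d1", "d2", "d5", "d6"]
relevant = {"d1", "d2"}

profile = overlap_profile(system_a, system_b, relevant)  # (1.0, 0.0)
```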




FOA © R. K. Belew - 00-09-21