One interesting feature of the training set produced by the routing
task is its odd distribution of positive and negative examples.
Initially we can imagine that the filter is very inaccurate, so we
are likely to see many negative examples. Later, when we hope
it is well trained, the filter performs nearly perfectly and the
system receives very few negative examples.
Further, no user's interests
remain static. As discussed in the next chapter, one common purpose for
the FOA activity is to become educated (cf. Section §8.3.4), and this is an elusive,
ever-changing goal. The world changes, and what users read changes
their opinion of what needs to be done and what the new questions are.
In brief, documents they used to find relevant are relevant no longer. This
has been called CONCEPT DRIFT [Klinkenberg98]. When the world
changes, the documents and the news they contain change too. This
side of the dynamic is called TOPIC TRACKING
[Allan98] [Baker99]. The approach of Jaime Carbonell (of CMU)
is first to identify that a concept change has occurred, and
then to adjust a time window on the stream of incoming training data over
which a new invariant is identified (personal communication).
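The two steps just described (notice a change, then narrow the training window) can be illustrated with a minimal sketch. This is not Carbonell's actual method, which the text reports only as a personal communication; it swaps in a simple accuracy-drop heuristic. The function names and the `recent` and `drop` parameters are hypothetical, chosen only for illustration.

```python
def recent_accuracy_drop(correct, recent=20, drop=0.25):
    """Return True if the filter's accuracy over the last `recent`
    judgments is at least `drop` below its accuracy on all earlier ones.

    `correct` is a list of 0/1 flags, oldest first, recording whether the
    filter's prediction agreed with the user's relevance feedback.
    (Hypothetical heuristic: an assumption, not taken from the text.)
    """
    if len(correct) < 2 * recent:
        return False  # too little history to compare two periods
    older = correct[:-recent]
    newer = correct[-recent:]
    older_acc = sum(older) / len(older)
    newer_acc = sum(newer) / len(newer)
    return older_acc - newer_acc >= drop


def training_window(examples, correct, recent=20, drop=0.25):
    """Choose the examples to retrain on.

    If a drop in accuracy suggests the concept has changed, keep only the
    most recent examples; otherwise keep the full history.
    """
    if recent_accuracy_drop(correct, recent, drop):
        return examples[-recent:]
    return examples
```

A fixed `recent` window is the crudest possible choice; an adaptive scheme would vary the window length, but that goes beyond what the text specifies.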
The
distribution of RelFbk generated by the filtering task, where a
standing query is allowed to adapt to a stream of RelFbk generated
by users who receive and evaluate routed documents (cf. Section §4.3.9), provides an especially
interesting form of learning task because of its TEMPORAL
DIMENSION. Initially, the set of documents routed to users must
depend on the same fundamental matching function shared by other search
engine tasks. But as RelFbk in response to the first retrievals
comes to affect the users' characterizations of interest, only a skewed
sample (relative to the initial distribution) of potential documents is
shown to the users, and only these can be the basis of subsequent
RelFbk. This tension between EXPLORATION of the universe of
potentially relevant documents and EXPLOITATION of those that
prior RelFbk suggests are most likely to be perceived as
relevant by the users is familiar from other REINFORCEMENT LEARNING
situations [Sutton98].
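One standard way to balance this trade-off in reinforcement learning is an epsilon-greedy rule, sketched below for document routing. The text does not prescribe this rule; it is offered only as an illustration. The `route` function, its parameters, and the score dictionary are all hypothetical. With probability `epsilon` a slot is filled with a randomly chosen document (exploration); otherwise it is filled with the highest-scoring remaining document (exploitation).

```python
import random


def route(scores, k, epsilon, rng):
    """Pick up to k document ids to show the user.

    scores  -- dict mapping document id to the filter's current score
    epsilon -- probability of exploring with a random document per slot
    rng     -- a random.Random instance, passed in for reproducibility
    """
    # Remaining candidates, best score first.
    remaining = sorted(scores, key=scores.get, reverse=True)
    chosen = []
    while remaining and len(chosen) < k:
        if rng.random() < epsilon:
            pick = rng.choice(remaining)   # explore: any remaining document
        else:
            pick = remaining[0]            # exploit: best remaining document
        remaining.remove(pick)
        chosen.append(pick)
    return chosen
```

With `epsilon` set to 0 the filter only ever shows its current favorites, which is exactly the skewed sample described above; a small positive `epsilon` keeps some feedback arriving about documents the filter would otherwise never route.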
User drift and event tracking