Two distinct classes of machine learning techniques can be applied to
the FOA problem. These can be distinguished on the basis of the type of
training feedback given to the learning system. The most
powerful and well-understood are SUPERVISED learning methods,
where each and every training instance given to the learning system
comes with an explicit label as to what the learning system should do.
Consider the Email example: we want talk
announcements to consistently go into one folder, mail from our family
to go into another, and spam to be deleted. In terms of supervised learning,
this regime requires that we first provide a TRAINING SET (cf.
Section §7.4). In our case the training
set is a set of Email messages together with the $C$ mail categories into which we have
classified them in the past. After training on this data set, we hope
that our classifier generalizes to new, previously unseen messages and
classifies them correctly as well.
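
To make the supervised regime concrete, here is a minimal sketch in Python. It is not the method developed in this text: it swaps in a simple multinomial naive Bayes classifier with add-one smoothing, chosen only because it is short. The messages, the three folder names (so $C = 3$), and the function names `train` and `classify` are all hypothetical, invented for illustration.

```python
import math
from collections import Counter, defaultdict

# Hypothetical training set: (message text, folder label) pairs.
# Every instance carries an explicit label, which is what makes this supervised.
TRAINING_SET = [
    ("talk announcement seminar on retrieval friday", "talks"),
    ("colloquium talk on machine learning today", "talks"),
    ("dinner with mom and dad on sunday", "family"),
    ("your sister sent photos from the family trip", "family"),
    ("win money now click here free offer", "spam"),
    ("free offer click now to claim your prize", "spam"),
]


def train(examples):
    """Count words per category and messages per category."""
    word_counts = defaultdict(Counter)
    class_counts = Counter()
    for text, label in examples:
        class_counts[label] += 1
        word_counts[label].update(text.split())
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, class_counts, vocab


def classify(text, model):
    """Return the category with the highest smoothed log-probability."""
    word_counts, class_counts, vocab = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in class_counts:
        # Log prior: fraction of training messages filed under this label.
        score = math.log(class_counts[label] / total)
        # Add-one smoothing so unseen (word, label) pairs get nonzero mass.
        denom = sum(word_counts[label].values()) + len(vocab)
        for w in text.split():
            if w in vocab:  # ignore words never seen in training
                score += math.log((word_counts[label][w] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(TRAINING_SET)
```

Calling `classify("free prize click here", model)` on a message that is not in the training set is the generalization step described above. With only six training messages this is a toy; the point is the shape of the data, not the quality of the classifier.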
A second class of machine learning
techniques makes weaker assumptions concerning the availability of
training feedback. REINFORCEMENT learning assumes only that a
positive/negative signal tells the learning system when it is doing a
good/bad job. In the FOA process, for example, relevance feedback (RelFbk)
generates a reinforcement signal (saying whether it was a good or bad
thing that a document was retrieved).
Note that RelFbk does not
count as supervised learning: in general we do not know all of the
documents which should have been retrieved with respect to a particular
query. Supervised training provides more information in the sense that
each and every aspect of the learner's action (retrieval) can be
contrasted with corresponding features of the correct action.
Reinforcement information, on the other hand, aggregates all of these
features into a single measure of performance.
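
The contrast between the two kinds of information can be sketched in a few lines of Python. This is only an illustration under assumed conventions: the document identifiers, the +1/-1 encoding of user reactions, and the function names `supervised_feedback` and `reinforcement_feedback` are hypothetical, and summing the reactions is just one simple way of aggregating them into a single measure.

```python
def supervised_feedback(retrieved, relevant, corpus):
    """Per-document error signal; requires knowing every relevant document.

    +1: the document should have been retrieved but was not
    -1: the document was retrieved but should not have been
     0: the learner's action on this document was correct
    """
    return {
        doc: int(doc in relevant) - int(doc in retrieved)
        for doc in corpus
    }


def reinforcement_feedback(retrieved, reactions):
    """Single scalar: the sum of the user's +1/-1 reactions.

    Only documents that were actually retrieved are ever judged.
    """
    return sum(reactions[doc] for doc in retrieved)


# Hypothetical four-document corpus and one retrieval.
corpus = ["d1", "d2", "d3", "d4"]
retrieved = ["d1", "d2", "d4"]

# Supervised: someone has judged the whole corpus.
relevant = {"d2", "d3", "d4"}
per_document = supervised_feedback(retrieved, relevant, corpus)

# Reinforcement: the user reacts only to what was shown.
reactions = {"d1": -1, "d2": +1, "d4": +1}
scalar = reinforcement_feedback(retrieved, reactions)
```

Here `per_document` tells the learner about the missed document `d3`, something the scalar signal cannot reveal because the user never saw `d3`.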
The difference between
these two kinds of learning is especially stark in the FOA context. To
provide reinforcement information, the user need only react to each
document and say whether they are happy or sad it was retrieved. In
order to do supervised training, the user would need to identify the
perfect retrieval, which would require the user to evaluate each and every document
in the corpus! Clearly, having each user evaluate whether every document
should be retrieved for every query is excessive. What approximations to this
notion of ``correct'' answer might be useful?
The distinction between
supervised retrieval and retrieval shaped by RelFbk highlights the
need to be explicit about which kinds of feedback are hard for the user to provide
and which are easier. The discussion of RAVE (cf. Section §4.4) made some of our
assumptions concerning cognitive overhead clear, but this is another important area
for further study. What other feedback might we reliably and easily be
able to elicit? Could users respond that a document is too general or too specific? Too
theoretical or too applied? How could such information be exploited by a learning
system? Here we continue to assume that RelFbk is easy to acquire.