Hierarchic classification

In many areas of careful scholarship, classification labels are not merely members of one large, flat set but are organized hierarchically into systems of BT/NT hypernymy (cf. Section §6.3). For example, the U.S. Patent Office has 400 top-level classifications with 135,000 sub-classes [Larkey98]. These classes form a hierarchic tree that extends 15 levels deep.

A simple example, suggested by the use of the UseNet newsgroup hierarchy by Mitchell and others [Mitchell97] [McCallum98a], is shown in the accompanying figure.

Let $c_{h}$ be a HIERARCHIC CLASSIFICATION, meaning that it is part of a taxonomy rooted at $c_{0}$ and connected to that root via a path of ANCESTOR classifications $\bigoplus c_{h}$:

$$\bigoplus c_{h} \equiv \{ c_{0}, c_{a}, c_{a.b}, c_{a.b.c}, \ldots, c_{a.b.c \ldots h} \}$$

This notation is meant to capture the relationship shown in the figure "Ancestors of a class".
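The ancestor path can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that class labels are dotted strings in the style of UseNet newsgroup names and that the root $c_{0}$ is written as the placeholder `"ROOT"`; the function name `ancestors` and the example label are hypothetical and not part of the original notation.

```python
def ancestors(label, root="ROOT"):
    """Return the path of classes from the root down to `label`, inclusive.

    Assumption: `label` is a dotted name such as "comp.os.linux", where each
    dot-separated prefix names one ancestor class. `root` stands in for c_0.
    """
    parts = label.split(".") if label else []
    path = [root]
    # Each successively longer prefix is one step further down the taxonomy.
    for depth in range(1, len(parts) + 1):
        path.append(".".join(parts[:depth]))
    return path


if __name__ == "__main__":
    print(ancestors("comp.os.linux"))
```

Under these assumptions, the list returned for a label plays the role of $\bigoplus c_{h}$: it starts at the root and ends at the class itself.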

McCallum et al. creatively apply the statistical technique known as shrinkage to the problem of text classification [McCallum98a]. Parameter estimates for child classes, which will have very few data instances, can be "shrunk" towards those of their data-rich ancestors, and the contributions of each ancestor classification are then linearly combined:

$$\theta_{kc_{h}} = \sum_{i \in \bigoplus c_{h}} w_{i} \Pr \left( k | c_{i} \right)$$
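As a rough numerical illustration of this linear combination, the following Python sketch computes a shrunken estimate for one keyword along one ancestor path. Everything concrete in it is an assumption made for the example: the class names, the per-class probabilities $\Pr(k | c_{i})$, and the hand-picked weights $w_{i}$ (which are simply required to sum to 1 here). How the weights would actually be chosen is outside the scope of this sketch.

```python
def shrinkage_estimate(keyword, path, keyword_probs, weights):
    """Linearly combine per-class estimates Pr(keyword | c_i) along `path`.

    path          -- classes from the root down to the leaf (the ancestor set)
    keyword_probs -- hypothetical mapping: class -> {keyword: probability}
    weights       -- hypothetical mapping: class -> mixing weight w_i
    """
    total_weight = sum(weights[c] for c in path)
    if abs(total_weight - 1.0) > 1e-9:
        raise ValueError("weights along the path are assumed to sum to 1")
    # A keyword never seen in a class contributes probability 0 for that class.
    return sum(weights[c] * keyword_probs[c].get(keyword, 0.0) for c in path)


# Made-up numbers: the leaf has little data, its ancestors have more.
path = ["ROOT", "comp", "comp.os", "comp.os.linux"]
keyword_probs = {
    "ROOT": {"kernel": 0.01},
    "comp": {"kernel": 0.03},
    "comp.os": {"kernel": 0.08},
    "comp.os.linux": {"kernel": 0.20},
}
weights = {"ROOT": 0.1, "comp": 0.2, "comp.os": 0.3, "comp.os.linux": 0.4}

theta = shrinkage_estimate("kernel", path, keyword_probs, weights)

if __name__ == "__main__":
    print(round(theta, 3))
```

With these invented values the leaf's own estimate of 0.20 is pulled down to roughly 0.111, reflecting the lower probabilities observed in its ancestors.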


FOA © R. K. Belew - 00-09-21