An Analysis of the Role of Grammatical Categories in a Statistical Information Retrieval System*

Nina Wacholder, Judith L. Klavans and David Kirk Evans
{nina, klavans, devans}@cs.columbia.edu

Abstract

The hypothesis of this research is that words of certain grammatical categories, such as nouns, contribute more to the effectiveness of information retrieval (IR) systems than words from other categories. We performed an experiment that clearly shows that nouns and, surprisingly, adjectives play an important role in raising precision. Our results strongly indicate that even in statistical IR systems in which all grammatical categories are weighted equally, different parts of speech in the documents have measurably different effects on system performance.
Dave Evans
Last modified: Mon Mar 01 19:08:41 Eastern Standard Time 1999