Martha Palmer, University of Pennsylvania

mpalmer@linc.cis.upenn


Title: Putting Meaning into Your Trees

Time: Thursday, January 29, 11:30 - 12:30

Place: CS Conference Room in MUDD

Abstract:
The current success of applications of machine learning techniques to tasks such as part-of-speech tagging and parsing has kindled the hope that these same techniques might have equal or greater success in other areas such as lexical semantics. Advances in automated and semi-automated methods of acquiring lexical semantics would release the field from its dependence on well-defined sub-domains and enable broad-coverage natural language processing. However, supervised machine learning requires large amounts of publicly available training data, and a prerequisite for this training data is general agreement on which elements should be tagged and with what tags. With respect to lexical semantics, this type of general agreement has been strikingly elusive.

Consensus has recently been reached on a task-oriented level of semantic representation to be layered on top of the existing Penn Treebank syntactic structures. This level, known as the Proposition Bank, or PropBank, consists of argument labels for the semantic roles of individual verbs and similar predicating expressions such as participial modifiers and nominalizations. This talk will describe the PropBank verb semantic role annotation being done at Penn for both English and Chinese. It will also discuss the annotation process and the use of existing lexical resources such as WordNet, Levin classes, and VerbNet. Similar projects include the FrameNet Project at Berkeley and the Prague Tectogrammatics project. PropBank annotation is shallower than that of the Prague Tectogrammatics project and broader in coverage than FrameNet, in that every verb instance in the corpus has to be annotated.
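
As a rough illustration of what "argument labels for the semantic roles of individual verbs" can look like, the sketch below attaches PropBank-style labels to one verb instance. It is not taken from the talk or from the PropBank tools: the sentence, the class names (`Argument`, `Proposition`), and the helper `token_labels` are all hypothetical, and token spans stand in for the Treebank constituents that the actual annotation is layered on.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Argument:
    """One labeled argument of a predicate (hypothetical structure)."""
    label: str  # PropBank-style label such as "Arg0", "Arg1", "ArgM-TMP"
    start: int  # index of the first token in the span (inclusive)
    end: int    # index one past the last token in the span (exclusive)


@dataclass(frozen=True)
class Proposition:
    """A single verb instance together with its labeled arguments."""
    predicate_index: int
    arguments: Tuple[Argument, ...]


def token_labels(tokens: List[str], prop: Proposition) -> List[str]:
    """Project the proposition onto the tokens, one label per token."""
    labels = ["O"] * len(tokens)          # "O" marks tokens outside any argument
    labels[prop.predicate_index] = "rel"  # the predicate itself
    for arg in prop.arguments:
        for i in range(arg.start, arg.end):
            labels[i] = arg.label
    return labels


# Invented example sentence; simplified to token spans (assumption).
tokens = ["John", "broke", "the", "window", "yesterday"]
prop = Proposition(
    predicate_index=1,
    arguments=(
        Argument("Arg0", 0, 1),      # the breaker
        Argument("Arg1", 2, 4),      # the thing broken
        Argument("ArgM-TMP", 4, 5),  # temporal modifier
    ),
)

for token, label in zip(tokens, token_labels(tokens, prop)):
    print(f"{token}\t{label}")
```

Because every verb instance in the corpus is annotated, a sentence with several verbs would carry one such proposition per verb.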

The talk will also briefly describe progress in developing automatic semantic role labelers based on this training data and investigations into the role of sense distinctions in improving performance.

About the speaker: Martha Palmer is an Associate Professor in the Computer and Information Sciences Department of the University of Pennsylvania. She has been a member of the Advisory Committee for the DARPA TIDES program, the Chair of SIGLEX, the Chair of SIGHAN, and is now Vice-President-elect of the Association for Computational Linguistics. Her early work on lexically based semantic interpretation formed the basis of the successful DARPA-funded message processing system, Pundit, and fostered a continuing interest in Information Extraction (ACE) and Machine Translation (TIDES). Her interest in lexical semantics and verb classes led to her involvement in SENSEVAL and the development of the English, Chinese and Korean Proposition Banks.