Underspecified semantics, and sentence generation as planning
=============================================================
a talk by Alexander Koller, October 5, CS conf. room

The purpose of this talk is to introduce myself to the Columbia NLP
group. It consists of two parts. First, I will present the research on
underspecification-based processing of scope ambiguities, which has been
my main line of work in the past few years. Then I will talk about the
research that I plan to do while at Columbia, which is about efficient
sentence generation using AI planning algorithms.

Underspecification is currently the standard approach to dealing with
scope ambiguities in computational semantics. Scope ambiguities have a
reputation for being rare in practice, but it turns out that the median
number of scope readings per sentence in HPSG-annotated corpora is about
60, with the most ambiguous sentences reaching the billions. The idea in
underspecification is to represent this potentially large set of
readings with a single compact underspecified description, from which
readings can be enumerated on demand. I will give a non-technical
overview of our underspecification formalism (dominance graphs),
efficient algorithms for solving them, translations between different
underspecification formalisms, and methods for eliminating readings that
were predicted as possible by the grammar but not meant by the speaker.

In the second part of the talk, I will turn to the problem of generating
sentences from a communicative goal, a semantic knowledge base, and a
TAG grammar. This problem has been studied in the past (e.g. in the
context of Matthew Stone's SPUD system), but efficient algorithms have
received comparatively little attention. I want to speed up generation by
translating it into a problem of AI planning and applying or adapting
recent planning algorithms, which have become much faster in the past
decade or so. I will then speculate on ways in which information about
the quality of generated sentences (e.g. from statistical models) can be
incorporated and used to search for optimal sentences.