Pablo Duboue

Graduate Student, NLP Group; pablo@cs.columbia.edu

Pattern Induction Algorithms And Learning Of Content Planning Schemas

Time: Thursday February 26th, 11:30-12:30

Abstract:

Schemas, as introduced by McKeown (1985), provide a means of structuring text to achieve a given discourse goal. My work focuses on learning such schemas from examples that pair the data to be structured with a human-written text exhibiting one such structure. From the schema-learning perspective, each text is a sequence of occurrences of the data to be structured (i.e., a sequence of messages).

In previous work, I employed techniques from computational genomics to find patterns over the sequences of messages. These patterns capture structure that is conserved across texts and should therefore be part of the final schema.

In this talk, I will introduce my thesis problem in detail, explain the requirements of my pattern induction problem, and present a brief survey of pattern induction methods, organized by pattern language and algorithm.