The generation module consists of two subsystems: a sentence reduction module and a sentence combination module. The reduction module removes non-essential information from a sentence. The combination module merges sentences from a document, using either their original forms or the reduced forms produced by sentence reduction.
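The relationship between the two subsystems can be illustrated with a minimal sketch. It is a toy under stated assumptions, not the system's implementation: the `Phrase` type, the hand-set `essential` flag, and the `reduce_sentence` and `combine_sentences` functions are hypothetical names introduced here, and a real reduction module would have to decide which phrases are essential rather than read a flag.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Phrase:
    text: str
    essential: bool  # hypothetical hand-set flag; a real system must infer this


Sentence = List[Phrase]


def reduce_sentence(sentence: Sentence) -> Sentence:
    """Toy stand-in for sentence reduction: drop phrases marked non-essential."""
    return [p for p in sentence if p.essential]


def combine_sentences(sentences: List[Sentence], connective: str = "and") -> str:
    """Toy stand-in for sentence combination: join sentences with a connective.

    Each input may be an original sentence or the output of reduce_sentence.
    """
    clauses = [" ".join(p.text for p in s) for s in sentences if s]
    return f" {connective} ".join(clauses)


s1 = [Phrase("the committee approved the budget", True),
      Phrase("after a lengthy debate", False)]
s2 = [Phrase("the mayor signed it", True),
      Phrase("on Friday", False)]

# Combine two reduced sentences.
summary = combine_sentences([reduce_sentence(s1), reduce_sentence(s2)])

# Combine one original sentence with one reduced sentence.
mixed = combine_sentences([s1, reduce_sentence(s2)])

print(summary)
print(mixed)
```

The second call shows the point made above: the combination step accepts sentences in either their original or their reduced form.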
We collected a corpus consisting of documents and their human-written summaries. We then developed a program that automatically analyzes how humans construct summary sentences by cutting and pasting phrases from the original documents. Other resources used in the generation module include a large-scale lexicon that we combined from multiple sources, a syntactic parser for English, and a co-reference resolution system.
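To make the cut-and-paste analysis concrete, the sketch below maps each span of a summary sentence back to the document sentence it may have been copied from. The text above does not specify how the analysis program works, so the greedy longest exact-match strategy used here is an assumption chosen for brevity, and `find_pasted_spans` and its `min_len` parameter are hypothetical names.

```python
from typing import List, Tuple


def find_pasted_spans(summary: str,
                      doc_sentences: List[str],
                      min_len: int = 2) -> List[Tuple[str, int]]:
    """Greedily map summary word spans to document sentences.

    Returns (span_text, sentence_index) pairs. Matching is exact and
    case-insensitive on whitespace tokens; spans shorter than min_len
    words are treated as not copied.
    """
    summary_words = summary.lower().split()
    doc_words = [s.lower().split() for s in doc_sentences]
    spans = []
    i = 0
    while i < len(summary_words):
        best_len, best_sent = 0, -1
        # Find the longest run starting at summary position i in any sentence.
        for sent_idx, words in enumerate(doc_words):
            for j in range(len(words)):
                k = 0
                while (i + k < len(summary_words)
                       and j + k < len(words)
                       and summary_words[i + k] == words[j + k]):
                    k += 1
                if k > best_len:
                    best_len, best_sent = k, sent_idx
        if best_len >= min_len:
            spans.append((" ".join(summary_words[i:i + best_len]), best_sent))
            i += best_len
        else:
            i += 1  # word not traced to the document; skip it
    return spans


document = [
    "The committee approved the budget after a lengthy debate",
    "The mayor signed it on Friday",
]
summary_sentence = "The committee approved the budget and the mayor signed it"

pasted = find_pasted_spans(summary_sentence, document)
print(pasted)
```

In this example the summary sentence decomposes into two spans, one from each document sentence, joined by a connective that appears in neither. That pattern is an instance of reduction followed by combination.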