Human summarizers often produce summaries by cutting and pasting text from the full document. Decomposing a human-written summary sentence requires determining: (1) whether it was constructed by cutting and pasting, (2) which components of the sentence come from the original document, and (3) where in the document those components come from.
Summary sentence: arthur b sackler vice president for law and public policy of time warner inc and a member of the direct marketing association told the communications subcommittee of the senate commerce committee that legislation to protect children's privacy online could destroy the spontaneous nature that makes the internet unique
Source document sentences identified by the program: (the matching phrases identified by the program are marked in color)
Sentence 1: a proposed new law that would require web publishers to obtain parental consent before collecting personal information from children could destroy the spontaneous nature that makes the internet unique a member of the direct marketing association told a senate panel thursday
Sentence 2: arthur b sackler vice president for law and public policy of time warner inc said the association supported efforts to protect children online but he urged lawmakers to find some middle ground that also allows for interactivity on the internet
Sentence 3: for example a child's e-mail address is necessary in order to respond to inquiries such as updates on mark mcguire's and sammy sosa's home run figures this year or updates of an online magazine sackler said in testimony to the communications subcommittee of the senate commerce committee
Sentence 5: the subcommittee is considering the children's online privacy protection act which was drafted on the recommendation of the federal trade commission
We reduce the decomposition problem to finding the most likely document origin for each word in a summary sentence, as in the example above, and then solve it using a Hidden Markov Model (HMM).
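The reduction above can be illustrated with a minimal sketch, under several assumptions that do not come from the source. Each occurrence of a summary word in the document is treated as a candidate state, identified by a (sentence index, word index) pair, and the most likely sequence of origins is found with a Viterbi search, the standard dynamic program for decoding an HMM. The function names `transition_prob` and `decompose`, the transition weights, and the toy document are all hypothetical; the weights only encode the intuition that cut-and-paste tends to keep adjacent words together and to draw on nearby sentences.

```python
import math
from typing import Dict, List, Optional, Tuple

Position = Tuple[int, int]  # (sentence index, word index) in the document


def transition_prob(prev: Position, cur: Position) -> float:
    """Illustrative weights only (assumed, not taken from the source)."""
    (s1, w1), (s2, w2) = prev, cur
    if s1 == s2:
        if w2 == w1 + 1:
            return 1.0  # next word in the same sentence
        return 0.9 if w2 > w1 else 0.8  # same sentence, later / earlier
    if abs(s2 - s1) <= 2:
        return 0.7 if s2 > s1 else 0.6  # nearby sentence, after / before
    return 0.5  # distant sentence


def decompose(summary: List[str],
              document: List[List[str]]) -> List[Optional[Position]]:
    """Return the most likely document position for each summary word.

    Words that never occur in the document are mapped to None and
    skipped when chaining transitions.
    """
    # Index every document position by the word found there.
    index: Dict[str, List[Position]] = {}
    for s, sentence in enumerate(document):
        for w, word in enumerate(sentence):
            index.setdefault(word, []).append((s, w))

    known = [(i, index[word]) for i, word in enumerate(summary) if word in index]
    origins: List[Optional[Position]] = [None] * len(summary)
    if not known:
        return origins

    # Viterbi in log space: scores[pos] is the best log-score of any path
    # ending at pos for the current summary word.
    scores = {pos: 0.0 for pos in known[0][1]}
    backpointers: List[Dict[Position, Position]] = []
    for _, candidates in known[1:]:
        new_scores: Dict[Position, float] = {}
        pointers: Dict[Position, Position] = {}
        for cur in candidates:
            best_prev = max(
                scores,
                key=lambda p: scores[p] + math.log(transition_prob(p, cur)),
            )
            new_scores[cur] = scores[best_prev] + math.log(
                transition_prob(best_prev, cur)
            )
            pointers[cur] = best_prev
        scores = new_scores
        backpointers.append(pointers)

    # Follow the backpointers from the best final state.
    pos = max(scores, key=scores.get)
    for t in range(len(known) - 1, -1, -1):
        origins[known[t][0]] = pos
        if t > 0:
            pos = backpointers[t - 1][pos]
    return origins


# Toy data (hypothetical): a summary sentence pasted together from two
# document sentences, in reverse document order.
document = [
    ["the", "law", "could", "destroy", "the", "internet"],
    ["sackler", "told", "the", "senate", "panel"],
]
summary = ["sackler", "told", "the", "panel",
           "the", "law", "could", "destroy", "the", "internet"]
origins = decompose(summary, document)
```

On this toy input, the first four summary words are traced to sentence 1 and the remaining six to sentence 0, even though "the" occurs three times in the document: the transition weights resolve each occurrence by its neighbours. A fuller model would also need to handle summary words that are paraphrased rather than copied, which this sketch simply leaves as `None`.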