NLP for the Web

Spring 2008

Prof. Kathy McKeown


Syllabus

 

 

Date

Topic and Slides

Reading

 

 

 

Jan 24th

Introduction and Summarization

Automatic Summarising: Factors and Directions (K. Sparck Jones), Advances in Automatic Text Summarization 2000.

Automatic Evaluation of Summaries Using N-gram Co-Occurrence Statistics (C.Y. Lin and E. Hovy), HLT-NAACL 2003.

 

 

 

Jan 31st

Single Document Summarization and Evaluation

Cut and paste based text summarization (H. Jing and K.R. McKeown), NAACL 2000.

Statistics-Based Summarization --- Step One: Sentence Compression (K. Knight and D. Marcu), AAAI 2000.

Evaluating content selection in summarization: the Pyramid method (A. Nenkova and R. Passonneau), NAACL-HLT 2004.

Multi-Candidate Reduction: Sentence Compression as a Tool for Document Summarization Tasks (D. Zajic et al), Information Processing and Management 2007.

 

 

 

Feb 7th

Multi-document Summarization

Sentence Fusion for Multidocument News Summarization (R. Barzilay and K.R. McKeown), CL 2005. NOTE: DROPPED FROM READING ASSIGNMENT

Topic-Focused Multi-document Summarization Using an Approximate Oracle Score (J.M. Conroy et al), COLING/ACL 2006.

Bayesian Query-Focused Summarization (H. Daume III and D. Marcu), ACL 2006.

Do Summaries Help? A Task-Based Evaluation of Multi-Document Summarization (K. McKeown et al), SIGIR 2005.

Automatic Text Summarization of Newswire: Lessons Learned from the Document Understanding Conference (A. Nenkova), AAAI 05.

 

 

 

Feb 14th

Question Answering

Open-Domain Question-Answering (J. Prager), Foundations and Trends in Information Retrieval 2006. NOTE: Sections 2 and 3.

Natural language based reformulation resource and web exploitation for question answering (U. Hermjakob et al), TREC 2002.

IBM's Statistical Question-answering System - TREC 10 (A. Ittycheriah et al), TREC 2001.

The Structure and Performance of an Open Domain Question-answering System (D. Moldovan et al), ACL00.

 

 

 

Feb 21st

Question Answering at IBM

Invited Speaker: John Prager, IBM, Abstract

Improving QA Accuracy by Question Inversion (J. Prager et al), ACL 2006.

Question-Answering by Predictive Annotation (J. Prager et al), SIGIR 2000.

 

 

 

Feb 28th

Paraphrasing

Learning to Paraphrase: An Unsupervised Approach Using Multiple-Sequence Alignment (R. Barzilay and L. Lee), NAACL-HLT03.

Extracting Paraphrases from a Parallel Corpus (R. Barzilay and K. McKeown), ACL/EACL01.

Search in Constraint-Based Paraphrasing (M. Dras), NLPIA98.

Syntax-based alignment of multiple translations: extracting paraphrases and generating new sentences (B. Pang et al), NAACL03.

Paraphrase Acquisition for Information Extraction (Y. Shinyama et al), 2nd International Workshop on Paraphrasing 03.

 

 

 

Mar 6th

Entailment

Invited Speaker: Bill MacCartney, Stanford University

Learning to recognize features of valid textual entailments (B. MacCartney et al), NAACL06.

Containment, Exclusion, and Implicativity: A Model of Natural Logic for Textual Inference (B. MacCartney and C. Manning), Stanford TR08.

An Inference Model for Semantic Entailment in Natural Language (R. de Salvo Braz et al), AAAI05.

Semantic Inference at the Lexical-Syntactic Level for Textual Entailment Recognition (R. Bar-Haim et al), ACL PASCAL-RTE Workshop 07.

 

 

 

Mar 13th

Opinions

Get out the vote: Determining support or opposition from Congressional floor-debate transcripts (M. Thomas et al), EMNLP06.

Annotating Expressions of Opinions and Emotions in Language (J. Wiebe et al), Language Resources and Evaluation 05.

Identifying expressions of opinion in context (E. Breck et al), IJCAI07.

Just How Mad are You? Finding Strong and Weak Opinion Clauses (T. Wilson et al), AAAI04.

 

 

 

Mar 20th

Spring Break

 

 

 

 

Mar 27th

Sentiment Analysis for the Web

Invited Speaker: Regina Barzilay, MIT

Multiple Aspect Ranking using the Good Grief Algorithm (B. Snyder and R. Barzilay), NAACL07.

Learning Document-Level Semantic Properties from Free-text Annotations (S.R.K. Branavan et al), ACL08.

Show me the money! Deriving the Pricing Power of Product Features by Mining Consumer Reviews (N. Archak et al), ACM SIGKDD07.

Opinion Mining Using Econometrics: A Case Study on Reputation Systems (A. Ghose et al), ACL07.

 

 

 

Apr 3rd

Multilingual Tasks and Approaches

Translating Named Entities Using Monolingual and Bilingual Resources (Y. Al-Onaizan and K. Knight), ACL02.

Named Entity Transliteration with Comparable Corpora (R. Sproat et al), COLING-ACL06.

Web as Corpus (A. Kilgarriff and G. Grefenstette), CL Journal03.

The Web as a Parallel Corpus (P. Resnik and N.A. Smith), CL Journal03.

 

 

 

Apr 10th

Domain Adaptation

Web-based Models for Natural Language Processing (M. Lapata and F. Keller), TSLP05.

Domain Adaptation for Statistical Classifiers (H. Daume III and D. Marcu), AI Journal 06.

A Backoff Model for Bootstrapping Resources for Non-English Languages (C. Xi and R. Hwa), EMNLP05.

 

 

 

Apr 17th

NLP for IR

Invited Speaker: Ronald Kaplan, Powerset

Meeting of the MINDS: An Information Retrieval Agenda (J. Callan et al).

 

 

 

Apr 24th

IE and Semantics on the Web

Strategies for Lifelong Knowledge Extraction from the Web (M. Banko and O. Etzioni).

Machine Reading (O. Etzioni et al).

Unsupervised Resolution of Objects and Relations on the Web (A. Yates and O. Etzioni).

 

 

 

 

May 1st

Final Project presentations

 

 

 

 

Exam

More final project presentations