NLP for the Web

Spring 2010

Prof. Kathy McKeown


Syllabus

Date

Topic and Slides

Reading




Jan 21st

Introduction and Summarization
(Slides: intro)

Automatic Summarising: Factors and Directions (K. Sparck Jones), Advances in Automatic Text Summarization 2000.

Automatic Evaluation of Summaries Using N-gram Co-Occurrence Statistics (C.Y. Lin and E. Hovy), HLT-NAACL 2003.




Jan 28th

Generation for Summarization
(Slides: spark-jones-00)

Cut and paste based text summarization (H. Jing and K.R. McKeown), NAACL 2000.

Supervised and Unsupervised Learning for Sentence Compression (J. Turner and E. Charniak), ACL 2005.

Sentence Fusion for Multidocument News Summarization (R. Barzilay and K.R. McKeown), CL 2005.

Sentence Compression beyond Word Deletion, Cohn and Lapata, COLING 2008.




Feb 4th

Summarization on the Web
(Slides: sun-05 / buyukkokten-02 / tools)

Efficient web browsing on handheld devices using page and form summarization. Orkut Buyukkokten, Oliver Kaljuvee, Hector Garcia-Molina, Andreas Paepcke, and Terry Winograd. In ACM Transactions on Information Systems, Vol. 20, No. 1, pages 82-115, January 2002.


Web-page summarization using clickthrough data. Jian-Tao Sun, Dou Shen, Hua-Jun Zeng, Qiang Yang, Yuchang Lu, and Zheng Chen. In SIGIR 2005, pages 194-201, 2005.


On the Summarization of Dynamically Introduced Information: Online Discussions and Blogs, L. Zhou and E. Hovy, in AAAI Spring Symposium 2006.


Ocelot: a system for summarizing web pages. A. Berger and V. Mittal. In Proceedings of the 23rd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR'00), pages 144-151, 2000.






Feb 11th

Social Networking
(Slides: gruzd-2008)

Automated discovery and analysis of social networks from threaded discussions, International Network of Social Network Analysts (Gruzd and Haythornthwaite), 2008.

From social bookmarking to social summarization: an experiment in community-based summary generation. Oisin Boydell and Barry Smyth. In IUI '07: Proceedings of the 12th international conference on Intelligent user interfaces, pages 42-51, New York, NY

Joint Group and Topic Discovery from Relations and Text, Andrew McCallum, Xuerui Wang, and Natasha Mohanty, Statistical Network Analysis: Models, Issues and New Directions, Lecture Notes in Computer Science 4503, pages 28-44, 2007.

Discovering authorities in question answer communities by using link analysis, Pavel Jurczyk and Eugene Agichtein. In CIKM '07: Proceedings of the Sixteenth ACM Conference on Information and Knowledge Management, 2007.




Feb 18th

Question Answering
(Slides: prager-06 / hermjakob-02 / prager-acl-06 / hong-09)

Open-Domain Question-Answering (J. Prager), Foundations and Trends in Information Retrieval 2006. NOTE: Sections 2 and 3.


Natural language based reformulation resource and web exploitation for question answering (U. Hermjakob et al), TREC 2002.


Improving QA Accuracy by Question Inversion (J. Prager et al), ACL 2006.


A classification-based approach to question answering in discussion boards, Liangjie Hong and Brian D Davison, SIGIR 09.






Feb 25th

Question Answering

Invited Speaker: Sanda Harabagiu, Language Computer Corporation


**Answering Questions with Authority (A. Hickl), CIKM 2008.

**Answering Complex Questions with Random Walk Models (S. Harabagiu, F. Lacatusu, A. Hickl), SIGIR 2006.

Using Discourse Commitments to Recognize Textual Entailment (A. Hickl), COLING 2008.

Negation, Contrast and Contradiction in Text Processing (S. Harabagiu, A. Hickl, F. Lacatusu), AAAI 2006.





Mar 4th

Entailment
(Slides: mccartney-06 / mccartney-08)

Learning to recognize features of valid textual entailments (B. MacCartney et al), NAACL06.


Containment, Exclusion, and Implicativity: A Model of Natural Logic for Textual Inference (B. MacCartney and C. Manning), Stanford TR08.


An Inference Model for Semantic Entailment in Natural Language (R. de Salvo Braz et al), AAAI05.


Semantic Inference at the Lexical-Syntactic Level for Textual Entailment Recognition (R. Bar-Haim et al), ACL PASCAL-RTE Workshop 07.






Mar 11th

Generating online content

Invited Speaker: Regina Barzilay, MIT


Automatically Generating Wikipedia Articles: A Structure-Aware Approach, Christina Sauper and Regina Barzilay, Proceedings of ACL, 2009.










Mar 18th

Spring Break







Mar 25th

Opinions
(Slides: wiebe-05 / breck-07 / thomas-06)

Get out the vote: Determining support or opposition from Congressional floor-debate transcripts (M. Thomas et al), EMNLP 06.


Annotating Expressions of Opinions and Emotions in Language (J. Wiebe et al), Language Resources and Evaluation 05.


Identifying expressions of opinion in context (E. Breck et al), IJCAI07.


Just How Mad are You? Finding Strong and Weak Opinion Clauses (T. Wilson et al), AAAI04.






Apr 1st

Sentiment Analysis for the Web
(Slides: snyder-07 / branavan-08 / archak-07 / lerman-09)

Multiple Aspect Ranking using the Good Grief Algorithm (B. Snyder and R. Barzilay), NAACL07.

Learning Document-Level Semantic Properties from Free-text Annotations, S. R. K. Branavan, Harr Chen, Jacob Eisenstein, and Regina Barzilay, ACL08.


Show me the money! Deriving the Pricing Power of Product Features by Mining Consumer Reviews (N. Archak et al), ACM SIGKDD07.


Sentiment Summarization: Evaluating and Learning User Preferences, K. Lerman, S. Blair-Goldensohn, and R. McDonald, EACL 09.






Apr 8th

Multilingual Tasks and Approaches
(Slides: lapata-05 / resnik-03)

Web-based Models for Natural Language Processing (M. Lapata and F. Keller), TSLP05.


The Web as a Parallel Corpus (P. Resnik and N.A. Smith), CL Journal03.


Translating Named Entities Using Monolingual and Bilingual Resources (R. Sproat et al), COLING-ACL06.


Answering English Questions using Foreign-Language, Semi-Structured Sources (Boris Katz, Gary Borchardt, Sue Felshin, Yuan Shen and Gabriel Zaccak), ICSC 2007.

Web as Corpus (A. Kilgarriff and G. Grefenstette), CL Journal03.





Apr 15th

Search

Invited Speaker: Wisam Dakka, Google


Learning query-biased web page summarization. Wang, C., Jing, F., Zhang, L., and Zhang, H. 2007. In CIKM '07.


Fast generation of result snippets in web search. Turpin, A., Tsegay, Y., Hawking, D., and Williams, H. E. 2007. In SIGIR '07.





Apr 22nd

IE and Semantics on the Web
(Slides: etzioni-06 / nothman-09)

Strategies for Lifelong Knowledge Extraction from the Web, Michele Banko and Oren Etzioni

Machine Reading, Oren Etzioni et al.

Analysing Wikipedia and Gold-Standard Corpora for NER Training, Joel Nothman, Tara Murphy and James R. Curran, ACL 09.

Deriving Generalized Knowledge from Corpora Using WordNet Abstraction, Benjamin Van Durme, Phillip Michalak and Lenhart Schubert, ACL 09.





Apr 29th

Final Project presentations







Exam

More final project presentations