Summarization Resources


On-line Proceedings

author=B. Endres-Niggemeyer, J. Hobbs, and Karen Sparck Jones
title=Summarizing text for intelligent communication
source=Technical Report, Dagstuhl Seminar Report 79, 13.12.-19.12.93 (9350), IBFI, Dagstuhl, Germany, 1993. (Short and full versions; the latter is available only in electronic form.)
notes=A number of interesting papers and ideas. A good starting point for getting to know some early work in the field.


conference=ACL/EACL'97 Workshop on Intelligent Scalable Text Summarization
time&address=July, 1997, Madrid, Spain

conference=AAAI Spring Symposium on Intelligent Text Summarization
submission deadline=Oct. 24, 1997
time&address=March, 1998, Stanford


General papers

author=C. D. Paice
title=Constructing literature abstracts by computer: techniques and prospects
source=Information Processing and Management, 26(1):171--186, 1990.
notes=very good overview, widely cited

author=Karen Sparck Jones
title=What might be in a summary?
source=In G. Knorz, J. Krause, and C. Womser-Hacker, editors, Information Retrieval '93: Von der Modellierung zur Anwendung, pages 9--26. Konstanz: Universitätsverlag Konstanz, 1993

author=J. Hutchins
title=Summarization: Some problems and methods (abstracting)
source=Meaning: The Frontier of Informatics 9, March 1987, pages 151-173. Karen Sparck Jones, editor
notes=Review of then-current research projects on summarization.

author=Tomek Strzalkowski
title=Robust Natural Language Processing and user-guided concept discovery for Information retrieval, extraction and summarization
source=Tipster Phase III. In Tipster Text Phase III Kickoff Workshop, Columbia, Maryland, October 1996

author=Ruth Garner
title=Efficient text summarization: costs and benefits
source=Journal of Educational Research, 75:275-279, 1982

author=Winograd, Peter N.
title=Strategic difficulties in summarizing texts
source=Reading Research Quarterly, 19(4):404-425, Summer 1984

author=A. J. Warner
title=Natural language processing in information retrieval
source=Bulletin of the American Society for Information Science 14(6):18-19, August-September 1988
notes=Very general and non-specific paper

author=K.W. Church and Lisa F. Rau
title=Commercial Applications of Natural Language Processing
source=Communications of the ACM, 38(11):71-79, 1995

author=Endres-Niggemeyer B. and Neugebauer E.
title=Professional Summarising: No Cognitive Simulation without Observation
source=Proceedings of the International Conference in Cognitive Science 1995, May 2-6, San Sebastian. 1995.
notes=Presents a "grounded" (i.e., "naturalistic") cognitive model of expert summarising. A computerised simulation is provided.

author=Russell P.
title=Investigating Summary Typology: Considerations for Classification
source=Technostyle, Vol 11 (3/4), Spring/Fall Issue, pp 37-47. 1994

author=Chou Hare, Victoria and Kathleen M. Borchardt
title=Direct instruction of summarization skills.
source=Reading Research Quarterly, 20(1):62--78, Fall 1984.

author=Endres-Niggemeyer, B., Maier, E., & Sigel, A.
title=How to implement a naturalistic model of abstracting: four core working steps of an expert abstractor
source=Information Processing & Management 31(5), 631-674. 1995.

Methods: Statistical

author=Gerard Salton, James Allan, Chris Buckley, and Amit Singhal
title=Automatic analysis, theme generation, and summarization of machine readable texts
source=Science 264(5164):1421-6, June 1994.
notes=Good overview.

author=G. Salton, A. Singhal, M. Mitra, and C. Buckley
title=Automatic Text Structuring and Summarization
source=Information Processing and Management, 33(2), 193-208, 1997

author=G. Salton, and A. Singhal
title=Automatic Text theme generation and the analysis of text structure
source=Cornell TR94-1438, 1994

author=G. Salton, A. Singhal, C. Buckley and M. Mitra
title=Automatic text decomposition using text segments and text themes
source=In Proceedings of the Seventh ACM Conference on Hypertext, Washington, D.C., 1996
notes=Describes two methods for text decomposition: segments and themes. Applications include IR, text traversal, and text summarization. The method is mathematical. The summary consists of selected paragraph excerpts plus transition materials.

author=G. Salton, J. Allan, and A. Singhal
title=Automatic Text Decomposition and structuring
source=Information Processing and Management, 32(2), 127-138, 1996

author=Luhn, H.P.
title=The automatic creation of literature abstracts
source=IBM Journal of Research and Development, 2(2), 1958
notes=The earliest work in summarization.

author=J. Kupiec, J. Pedersen, and F. Chen
title=A trainable document summarizer
source=ACM SIGIR '95, pages 68-73
notes=Talks about "document extracts", which are formed by deleting 80% of the original document's text.

author=K. Mahesh
title=Hypertext summary extraction for fast document browsing
source=In Proceedings of the AAAI Spring Symposium: NLP for WWW, pp. 95-104, Stanford, CA, 1997

author=Fum, D., Guida, G., and Tasso, C.
title=Evaluating importance: a step towards text summarization
source=In IJCAI Proceedings, pp. 840-844. Los Altos, CA: Kaufmann, 1985

author=Chin-Yew Lin and Eduard Hovy
title=Identifying topics by position

author=Chin-Yew Lin
title=Knowledge-based automatic topic identification
source=In Proceedings of the 33rd Annual Meeting of the Association for Computational Linguistics (ACL--95), pages 308--310, Cambridge, Massachusetts, June 26--30.

author=Fumiyo Fukumoto, Yoshimi Suzuki, and Junichi Fukumoto
title=An automatic extraction of key paragraphs based on context dependency

author=Alterman R.
title=Summarisation in the small
source=In N. Sharkey, editor, Advances in Cognitive Science. Chichester, England: Ellis Horwood, 1986.

author=Alterman R.
title=Text summarisation
source=in Artificial Intelligence Review. 1990.

author=Rath G.J., Resnick A. and Savage R.
title=The formation of abstracts by the selection of sentences: Part 1: sentence selection by man and machines
source=American Documentation 12 (2) pp 139-141. 1961.

author=R. Brandow, K. Mitze, and L.F. Rau
title=Automatic Condensation of Electronic Publications by Sentence Selection
source=Information Processing and Management, 31(5), 675-685, 1995.

author=Inderjeet Mani, David House, Mark Maybury, and Morgan Green
title=Towards Content-Based Browsing of Broadcast News Video
source=In Mark Maybury, editor, Intelligent Multimedia Information Retrieval, AAAI/MIT Press, 1997.

Methods: linguistics-based

author=Lisa F. Rau, P.S. Jacobs, and U. Zernik
title=Information extraction and text summarization using linguistic knowledge acquisition
source=Information Processing and Management 25(4):419-28, 1989.

author=L. F. Rau
title=Conceptual information extraction and information retrieval from natural language input
source=Proceedings of the Conference on User-Oriented, Content-Based, Text and Image Handling, pages 424--437, Cambridge, Massachusetts, 1988.

author=C. Aone, M.E. Okurowski, J. Gorlinsky, and B. Larsen
title=A scalable summarization system using robust NLP
source=ACL/EACL-97 summarization workshop

author=Inderjeet Mani and Eric Bloedorn
title=Multi-document Summarization by Graph Search and Matching
source=In Proceedings of AAAI-97, Providence, Rhode Island, 1997

author=Inderjeet Mani and Eric Bloedorn
title=Summarizing similarities and differences among related documents
source=Proceedings of RIAO-97, Montreal, Canada, June 25-27, 1997

author=Regina Barzilay and Michael Elhadad
title=Using lexical chains for text summarization
source=ACL/EACL-97 summarization workshop.

author=Ruqaiya Hasan
title=Coherence and Cohesive harmony

author=Daniel Marcu
title=From Discourse Structures to text summaries
source=ACL/EACL-97 summarization workshop, pp. 82-88

author=Daniel Marcu
title=Building up Rhetorical Structure Trees

author=Mann, W.C., and Thompson, S.A.
title=Rhetorical structure theory: Toward a functional theory of text organization
source=Text, 8(3):243-281, 1988

author=Aretoulaki M.
title=Towards a Hybrid Abstract Generation System
source=in Proceedings of the International Conference on New Methods in Language Processing pp 220-227. Manchester, England. 1994.

author=Aretoulaki M.
title=COSY-MATS: A Hybrid Connectionist-Symbolic Approach To The Pragmatic Analysis Of Texts For Their Automatic Summarisation
source=PhD Thesis. Centre for Computational Linguistics, Dept. of Language Engineering, University of Manchester Institute of Science and Technology (U.M.I.S.T.), Manchester, England. 1996.

author=Fontana N.M.
title=Summarising Strategies in L1 and L2
source=MA Dissertation. University College of North Wales, Bangor. 1989.

author=Gladwin P. and Pulman S. and Sparck Jones K.
title=Shallow processing and automatic summarising: a first study
source=Technical Report No. 223, University of Cambridge, Computer Laboratory. 1991.

author=Jordan M.P.
title=The Linguistic Genre of Abstracts
source=In A. Della Volpe, editor, The Seventeenth LACUS Forum. Linguistic Association of Canada and the United States, pp 507-527. 1991.

author=Ono K. and Sumita K. and Miike S.
title=Abstract Generation based on Rhetorical Structure Extraction
source=in Proceedings of the 15th International Conference on Computational Linguistics (COLING-94), Vol 1 pp 344-348, Kyoto, Japan. 1994

author=Rino L.H.M. and Scott D.
title=Automatic generation of draft summaries: heuristics for content selection
source=presented at XI Simposio Brasileiro de Inteligencia Artificial, Fortaleza, Brazil. October 1994.

author=Rino L.H.M. and Scott D.
title=Content selection in Summary Generation
source=presented at Third International Conference on the Cognitive Science of Natural Language Processing, Dublin City University, Ireland. July

author=Rush J.E., et al
title=Automatic abstracting and indexing. II. Production of abstracts by application of contextual inference and syntactic coherence criteria
source=Journal of the American Society for Information Science 22 (4) pp 260-274. 1971.

author=Sparck Jones K.
title=Discourse modelling for automatic summarising
source=Technical Report No. 290, University of Cambridge Computer Laboratory. 1993

author=Fum, D., Guida, G., & Tasso, C.
title=Forward and backward reasoning in automatic abstracting
source=In COLING. Proceedings of the 9th International Conference on Computational Linguistics (pp. 83-88). Prague. 1982.

author=F. C. Johnson, C. D. Paice, W. J. Black, and A. P. Neal
title=The application of linguistic processing to automatic abstract generation
source=Journal of Document and Text Management, 1(3): 215-241, 1993

Methods: Knowledge-based

author=K. McKeown and D.R. Radev
title=Generating summaries of multiple news articles
source=ACM SIGIR 1995, pages 74-82
notes=Summarizes a group of news articles all about the same event. (Corpus based, summarization operators.)

author=P.S. Jacobs and L.F. Rau
title=SCISOR: Extracting Information from on-line news
source=Communications of the ACM, 33(11), pages 88-97, 1990

author=Young, S. R., Hayes, P.J.
title=Automatic classification and summarization of banking telexes
source=Proceedings of the 2nd Conference on Artificial Intelligence Applications (CAIA), pp. 402-408. Miami Beach, FL, Dec. 11-13, 1985

author=U. Reimer and U. Hahn
title=Text condensation as knowledge base abstraction
source=Proceedings of the Fourth Conference on Artificial Intelligence Applications, March 1988, pages 338-44
notes=Parses the text into a knowledge base, which is filtered and abstracted to produce a summary graph.

author=Paice, C.D., and Jones, A. P.
title=The identification of important concepts in highly structured technical papers
source=In Proceedings of the Sixteenth Annual International ACM SIGIR Conference on Research and Development in IR. 1993.
notes=Uses stylistic clues and constructs to fill in a semantic frame for automatic abstracting. Aims to combine indexing for IR with abstracting tasks.

author=Lehnert W.G.
title=Plot Units and Narrative Summarization
source=Cognitive Science, 5(4), pp 293-331. 1981.

author=Tait J.I.
title=Automatic summarising of English texts
source=Technical Report No. 47, University of Cambridge Computer Laboratory. 1982.

author=Tait J.I.
title=Automatic summarising of English texts
source=PhD thesis, University of Cambridge, Cambridge, England.

author=Tait J.I.
title=Generating summaries using a script based language analyser
source=In L. Steels and J.A. Campbell, editors, Progress in Artificial Intelligence. Chichester: Ellis Horwood, 1985

author=C. D. Paice
title=Automatic Generation of Literature Abstracts - An Approach Based on the Identification of Self-Indicating Phrases
source=In R. N. Oddy, S. E. Robertson, C. J. van Rijsbergen, and P. W. Williams, editors, Information Retrieval Research, pages 172--191. Butterworths, London, U.K., 1981.

Methods: Generation from Data

author=K. McKeown, J. Robin, and K. Kukich
title=Generating concise natural language summaries
source=Information Processing and Management 31(5):703-33, 1995
notes=Discusses how to pack as much information as possible into a summary sentence, with an application to producing summaries of baseball games.

author=K. McKeown
title=Text generation
source=Cambridge University Press, 1985
notes=This book is based on her Ph.D. thesis, and discusses using discourse strategies (text schemas) to generate natural language text. Although not directly related to text summarization, schemas can be useful for template-based systems.

author=Robin, J. and McKeown, K.R.
title=Empirically designing and evaluating a new revision-based model for summary generation
source=Artificial Intelligence Journal, 1995

Relevant: Information Extraction

author=Douglas E. Appelt and David Israel
title=Building Information Extraction Systems
source=ANLP-97 Tutorial

author=J. Hobbs, D. Appelt, J. Bear, D. Israel, M. Kameyama, M. Stickel, and M. Tyson.
title=FASTUS: a cascaded finite-state transducer for extracting information from natural-language text
source=cmp-lg/9705013, 1997

Relevant: Text Classification

author=Hang Li and Kenji Yamanishi
title=Document classification using a finite mixture model

author=Ido Dagan, Yael Karov, and Dan Roth
title=Mistake-Driven learning in text categorization

author=Brett Kessler, Geoffrey Nunberg, and Hinrich Schütze
title=Automatic Detection of Text Genre

author=Thomas Bayer, Ingrid Renz, Michael Stein, Ulrich Kressel
title=Domain and language independent feature extraction for statistical text categorization


Evaluation

author=Therese F. Hand
title=A proposal for Task-based Evaluation of Text Summarization Systems
source=ACL/EACL-97 summarization workshop, pp. 31-38
notes=Discusses DARPA's TIPSTER III project on the evaluation of summarization systems.

author=Jean-Luc Minel, Sylvaine Nugier, and Gerald Piat
title=How to appreciate the quality of automatic text summarization
source=ACL/EACL-97 summarization workshop, pp. 25-30

author=Lehnert, W. G. and Sundheim, B
title=A Performance Evaluation of Text Analysis Technology
source=AI Magazine 12(3):81-94, 1991

author=Karen Sparck Jones and J.R. Galliers
title=Evaluating natural language processing systems: an analysis and review
source=New York: Springer, 1996

author=J.R. Galliers and Karen Sparck Jones
title=Evaluating Natural Language Processing Systems
source=Technical Report No. 291, University of Cambridge Computer Laboratory, 1993
notes=The book listed above is based on this report.

author=A. H. Morris, G. M. Kasper, and D. A. Adams
title=The effects and limitations of Automated Text Condensing on Reading Comprehension Performance
source=Information Systems Research 3(1):17-35, 1992

author=Black W.J., and Johnson F.C.
title=A practical evaluation of two rule-based automatic abstracting techniques
source=Expert Systems for Information Management 1, pp 159-177. 1988.

author=Reder L.M. and Anderson J.R.
title=A Comparison of Texts and Their Summaries: Memorial Consequences
source=in Journal of Verbal Learning and Verbal Behavior 19, pp 121-134. 1980


project=Summarization Evaluation Conference
organization=TIPSTER Summarization Evaluation Conference
notes=Participants in the evaluation conference include:
Carnegie Group Inc. and Carnegie Mellon University (CGI/CMU),
Cornell University and SabIR Research, Inc. (Cornell/SabIR),
GE Research and Development (GE),
New Mexico State University (NMSU),
the University of Pennsylvania (Penn),
the University of Southern California Information Sciences Institute (ISI),
the University of Surrey (Surrey, Britain),
IBM Thomas J. Watson Research (IBM),
TextWise LLC,
SRA International,
British Telecommunications (BT, Britain),
Intelligent Algorithms (IA),
the Center for Intelligent Information Retrieval at the University of Massachusetts (UMass),
the Center for Information Research (CIR, Russia),
the National Taiwan University (NTU, Taiwan)

project=The Sheffield University Summarisation project
organization=University of Sheffield, UK

project=The Text Summarization Project
organization=University of Ottawa

Company=British Telecom

Company=Inxight, a Xerox New Enterprise company

system=Extractor
Company=Tetranet Software

system=Summarization Project
Company=IBM Japan

system=Summarizer for MS Word

Last updated: 1997 (I no longer maintain this page -- papers published after 1997 are not listed).