Applied Natural Language Processing Research at GE Labs

Tomek Strzalkowski
Natural Language Group
GE Labs

 

Abstract

This talk will discuss several ongoing research projects in the NL Group. (1) The natural language information retrieval project uses advanced text processing techniques to improve statistical information retrieval. The full-text query expansion technique has been particularly promising in TREC. I will present the stream-based architecture of our system and the results of last year's TREC evaluations, including some preliminary reflections on the role of NLP in IR. (2) The robust text summarization project is part of the Tipster Phase III contract. The summarizer, which exploits the concept of discourse macro-structure, produces brief, perfectly readable summaries, either generic or tailored to a supplied topic. (3) The Automated Transcription project is developing text-based methods for improving the output quality of automated continuous speech recognition. The application domain for this project is clinical dictation (radiology, pathology, ER). The SRS output-correction method will be discussed along with some preliminary results.



Luis Gravano
gravano@cs.columbia.edu