Applied Natural Language Processing Research at GE Labs
Abstract
This talk will discuss several ongoing research projects by the NL Group.
(1) The natural language information retrieval project uses advanced text
processing techniques to improve statistical information retrieval. The
full-text query expansion technique has been particularly promising in
TREC. I will present the stream-based architecture of our system and the
results of last year's TREC evaluations, including some preliminary reflections
on the role of NLP in IR. (2) The robust text summarization project is part
of the Tipster Phase III contract. The summarizer, which exploits the concept
of discourse macro-structure, derives perfectly readable, brief summaries,
both generically and with respect to a supplied topic. (3) The Automated
Transcription project is developing text-based methods for improving the
output quality of automated continuous speech recognition. The application
domain for this project is clinical dictation (radiology, pathology,
ER). The SRS output-correction method will be discussed along with some
preliminary results.
Luis Gravano
gravano@cs.columbia.edu