Former PhD student in the Natural Language Processing Group, Computer Science Department at Columbia University.
My new web page is here.
I have moved to the University of Maryland Institute for Advanced Computer Studies (UMIACS) as a Postdoctoral Research Associate. I am working with Philip Resnik and the Machine Translation group.
In my thesis, I designed, implemented and evaluated a relational learning framework for inducing constraint-based grammars using a domain ontology as background knowledge. This grammar induction framework is general: I have shown that it can cover large fragments of natural language and that it is useful for acquiring domain knowledge from text. In particular, I have focused on acquiring terminological knowledge in the medical domain.
Understanding and sharing terminology, by both systems and humans, is an important aspect of communication. Many domains, including the medical domain, evolve rapidly, with new concepts being defined in textual resources such as online articles and web documents. Relying on static dictionaries and glossaries is therefore not enough to keep the information up to date. I designed, implemented and evaluated DEFINDER, a system for extracting definitions from online medical articles. I applied my grammar induction tool and inference mechanisms to the definitional corpus extracted by DEFINDER to build a terminological knowledge base. For multiple definitions of the same term extracted from different sources, I implemented a merging algorithm that identifies similarities, differences and contradictions. Contradictions might be used as an indicator of potentially unreliable source documents.
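As a rough illustration of what merging several definitions of one term can look like, here is a minimal Python sketch. It is not the algorithm from the thesis. It assumes, purely for illustration, that each definition has already been reduced to a flat set of attribute/value pairs; the function name `merge_definitions`, the attribute names and the source names are all hypothetical.

```python
from collections import defaultdict


def merge_definitions(definitions):
    """Compare definitions of one term coming from several sources.

    `definitions` maps a source name to a dict of attribute -> value.
    This flat representation is an assumption made for the example.
    Returns three dicts: similarities, differences, contradictions.
    """
    # Group the values seen for each attribute by the source that gave them.
    values_by_attr = defaultdict(dict)
    for source, attrs in definitions.items():
        for attr, value in attrs.items():
            values_by_attr[attr][source] = value

    similarities, differences, contradictions = {}, {}, {}
    for attr, by_source in values_by_attr.items():
        distinct_values = set(by_source.values())
        if len(distinct_values) > 1:
            # Sources disagree on the value of this attribute.
            contradictions[attr] = dict(by_source)
        elif len(by_source) == len(definitions):
            # Every source states the attribute with the same value.
            similarities[attr] = next(iter(distinct_values))
        else:
            # Only some sources mention the attribute; no disagreement.
            differences[attr] = dict(by_source)
    return similarities, differences, contradictions


# Hypothetical definitions of a single term from two sources.
example = {
    "source_a": {"is_a": "disease", "location": "heart", "cause": "virus"},
    "source_b": {"is_a": "disease", "cause": "bacteria"},
}
similarities, differences, contradictions = merge_definitions(example)
```

In this toy example, `is_a` is a similarity, `location` is a difference (only one source mentions it), and `cause` is a contradiction, which is the kind of signal that could flag a potentially unreliable source.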