Research Interests

My expertise lies in natural language processing (NLP), specifically in the areas of formal and computational models of syntax and other levels of linguistic representation, and in modeling how people use language to achieve communicative goals. I have an active interest in both theoretical issues and practical applications.

I am currently working on modeling how humans interact in written communication. Written communication is usually asynchronous and typically not face-to-face, so it differs considerably from a prototypical face-to-face spoken conversation. With the advent of the internet and mobile computing, more and more of our communication is shifting to written form: emails, tweets, text messages, discussion forums, responses to media contributions and blog posts, interactions on websites such as Facebook, and so on. In these written interactions, we use language to achieve various types of communicative goals, many of which revolve around creating and perpetuating social relations: for humans, language is the prime medium for maintaining social networks. My research interests lie at the intersection of social relations (including power relations) and language use in written dialogs. For more detail, see the home page of the WISR (Written Interaction and Social Relations) research group.

In addition, I am interested in modeling the linguistic "nuts and bolts" (morphology, syntax, semantics) of languages. One project aims at finding a good linguistic representation for the lexicon, morphology, and syntax of Arabic dialects. Arabic dialects pose a problem for natural language processing because the spoken dialects are not written, and the written language is not natively spoken. As a result, standard corpus-based approaches do not work. This project has resulted in several widely used tools. For more information, see the page of the CADIM Group.

One application area that I have been working on is computational social science. Much of the data analyzed by social scientists is language, and natural language processing holds the promise of automatically processing enormous amounts of such data. This will allow social scientists to draw conclusions that are empirically well founded. One example of such work that I have been involved in is the analysis of the Enron corpus in order to relate power, gender, and language use. Another example is a current project that aims at enabling historians to study large amounts of source material on American foreign relations (such as diplomatic cables), which requires pre-processing the sources to detect and classify named entities. A third project studies tweets by Chicago gang members; the NLP contribution is to automatically analyze the tweets and to relate them to offline violence. Computational social science projects are inherently interdisciplinary, and for me much of the excitement of this work comes from the cross-disciplinary interactions that it requires.

Back to Owen Rambow's Home Page