Alpa Jain
Ph.D. Student
Advisor: Luis Gravano
Database Research Group
Computer Science Department
Columbia University
Contact Information
450 Computer Science
1214 Amsterdam Avenue
New York, NY 10027
USA
Telephone: 212-939-7117
Fax: 212-666-0140

  • Research interests: Information extraction, Information retrieval, Text databases and query processing.
  • CV

Projects

  • SQOUT: Structured Query Processing over Text Documents: Developing efficient strategies for "structured" relational query processing over plain text documents by relying on information extraction and information retrieval techniques.

  • Past Projects
    • DISCUS: Decentralised Information Spaces for Composition and Unification of Services: A prototype framework that enables secure, ad-hoc communication between heterogeneous software components that may span organisational boundaries, to rapidly deal with a unique and temporary problem.
    • WGC: Workgroup Cache: A system that enables collaboration within and among workgroups by providing a shared repository for information, thereby reducing distribution latency and costs.

Publications

Papers in Refereed Journals

  1. A Quality-Aware Optimizer for Information Extraction,
    Alpa Jain and Panagiotis Ipeirotis, to appear in ACM Transactions on Database Systems (TODS).

Papers in Refereed Conferences

  1. Exploring a Few Good Tuples From a Text Database,
    Alpa Jain and Divesh Srivastava, ICDE, 2009.
  2. Join Optimization of Information Extraction Output: Quality Matters!,
    Alpa Jain, Panagiotis Ipeirotis, AnHai Doan, and Luis Gravano, ICDE, 2009.
  3. Optimizing SQL Queries over Text Databases,
    Alpa Jain, AnHai Doan, and Luis Gravano, ICDE, 2008.
  4. Acronym-Expansion Recognition and Ranking on the Web,
    Alpa Jain, Silviu Cucerzan, and Saliha Azzam, IEEE-IRI, 2007.
  5. SQL Queries Over Unstructured Text Databases,
    Alpa Jain, AnHai Doan, and Luis Gravano, ICDE, 2007 (short poster paper).
  6. Names and Similarities on the Web: Fact Extraction in the Fast Lane,
    Marius Pasca, Dekang Lin, Jeffrey Bigham, Andrei Lifchits, and Alpa Jain, COLING-ACL, 2006.
  7. Organizing the World Wide Web of Facts - Step One: the One-Million Fact Extraction Challenge,
    Marius Pasca, Dekang Lin, Jeffrey Bigham, Andrei Lifchits, and Alpa Jain, AAAI, 2006.
  8. Decentralized Information Spaces for Composition and Unification of Services,
    Alpa Jain and Gail Kaiser, Position paper in the Object-Oriented Web Services (OOWS) Workshop, OOPSLA Conference, 2002.

Papers in Poster and Demonstration Sessions

  1. Relational Query Processing Over Text Documents, Alpa Jain and Luis Gravano, in New York DB/IR Day (April 2005: Best Technical Presentation Award, November 2005: Honorable Mention Award)

Invited Papers

  1. Building Query Optimizers for Information Extraction: The SQoUT Project,
    Alpa Jain, Panagiotis Ipeirotis, and Luis Gravano, to appear in SIGMOD Record, Special Issue on "Managing Information Extraction," vol. 37, no. 4, December 2008.

Teaching

  • Database Management Systems (Fall 2004). Teaching Assistant for Prof. Gail Kaiser (Extraordinary Teaching Assistant Award).
  • Advanced Web Applications (Spring 2000). Teaching Assistant for Dr. Alfred Spector, IBM.
  • Internet Communication Programming (Spring 2000). Teaching Assistant for Dr. Doree Seligmann, Lucent Technologies.

Tutorials