Luis Gravano's
Publications and Patents
This material is presented to ensure timely dissemination of scholarly
and technical work.
Copyright and all rights therein are retained by authors or by
other copyright holders. All persons copying this information are
expected to adhere to the terms and constraints invoked by each
author's copyright. In most cases, these works may not be
reposted without the explicit permission of the copyright holder.
Patents
- String Predicate Selectivity Estimation, S. Chaudhuri, V.
Ganti, and L. Gravano, United States Patent 7,149,735, issued December
12, 2006
- Systems and Methods for Using Anchor Text as Parallel Corpora
for Cross-Language Information Retrieval, L. Gravano and M.
Henzinger, United States Patent 7,146,358, issued December 5, 2006
- Method of Building Multidimensional Workload-Aware Histograms,
S. Chaudhuri, N. Bruno, and L. Gravano, United States Patent 7,007,039,
issued February 28, 2006
- Method for Cost-Based Optimization over Multimedia
Repositories, S. Chaudhuri and L. Gravano, United States Patent
5,806,061, issued September 8, 1998
- Method of Packet Routing in Torus Networks with Two Buffers
per Edge, R. Cypher and L. Gravano, United States Patent
5,444,701, issued August 22, 1995
Papers in Refereed Journals
- Classification-Aware
Hidden-Web Text Database Selection, P. Ipeirotis and L.
Gravano, in ACM Transactions on Information Systems, vol. 26, no. 2,
art. 6, Mar. 2008.
- Towards a Query Optimizer
for Text-Centric Tasks, P. Ipeirotis, E. Agichtein, P. Jain,
and L. Gravano, in ACM Transactions on Database Systems, vol. 32, no.
4, Nov. 2007.
- Modeling and Managing
Changes in Text Databases, P. Ipeirotis, A. Ntoulas, J. Cho,
and L. Gravano, in ACM Transactions on Database Systems, vol. 32, no.
3, Aug. 2007.
- Optimizing Top-k Selection
Queries over Multimedia Repositories, S. Chaudhuri, L.
Gravano, and A. Marian, in IEEE Transactions on Knowledge and Data
Engineering, vol. 16, no. 8, Aug. 2004.
- Evaluating Top-k Queries over
Web-Accessible Databases, A. Marian, N. Bruno, and L. Gravano,
in ACM Transactions on Database Systems, vol. 29, no. 2, Jun. 2004.
- Learning to Find Answers to
Questions on the Web, E. Agichtein, S. Lawrence, and L.
Gravano, in ACM Transactions on Internet Technology, vol. 4, no. 2, May
2004.
- QProber: A System for
Automatic Classification of Hidden-Web Databases, L. Gravano,
P. Ipeirotis, and M. Sahami, in ACM Transactions on Information
Systems, vol. 21, no. 1, Jan. 2003.
- Top-k Selection Queries over
Relational Databases: Mapping Strategies and Performance Evaluation,
N. Bruno, S. Chaudhuri, and L. Gravano, in ACM Transactions on Database
Systems, vol. 27, no. 2, Jun. 2002.
- GlOSS: Text-Source
Discovery over the Internet, L. Gravano, H.
Garcia-Molina, and A. Tomasic, in ACM Transactions on Database Systems,
vol. 24, no. 2, Jun. 1999.
- The Stanford Digital Library
Metadata Architecture, M. Baldonado, C.-C. K. Chang, L.
Gravano, and A. Paepcke, in International Journal on Digital Libraries,
vol. 1, no. 2, Sep. 1997.
- Data Structures for Efficient
Broker Implementation, A. Tomasic, L. Gravano, C. Lue, P.
Schwarz, and L. Haas, in ACM Transactions on Information Systems, vol.
15, no. 3, Jul. 1997.
- Storage-Efficient,
Deadlock-Free Packet Routing Algorithms for Torus Networks, R.
Cypher and L. Gravano, in IEEE Transactions on Computers, vol. 43, no.
12, Dec. 1994.
- Requirements for
Deadlock-Free, Adaptive Packet Routing, R. Cypher and L.
Gravano, in SIAM Journal on Computing, vol. 23, no. 6, Dec. 1994.
- Adaptive Deadlock- and
Livelock-Free Routing with All Minimal Paths in Torus Networks,
L. Gravano, G. Pifarre, P. Berman, and J. Sanz, in IEEE Transactions on
Parallel and Distributed Systems, vol. 5, no. 12, Dec. 1994.
- Adaptive Deadlock- and
Livelock-Free Routing in the Hypercube Network, G. Pifarre, L.
Gravano, G. Denicolay, and J. Sanz, in IEEE Transactions on Parallel and
Distributed Systems, vol. 5, no. 11, Nov. 1994.
- Fully Adaptive Minimal
Deadlock-Free Packet Routing in Hypercubes, Meshes, and Other Networks:
Algorithms and Simulations, G. Pifarre, L. Gravano, S.
Felperin, and J. Sanz, in IEEE Transactions on Parallel and Distributed
Systems, vol. 5, no. 3, Mar. 1994.
Book Chapter
- XML & Data Streams, N. Bruno, L. Gravano, N. Koudas,
and D. Srivastava. Chapter 4 in "Stream Data Management,"
edited by N. Chaudhry, K. Shaw, and M. Abdelguerfi, Series: Advances in
Database Systems, Volume 30, pages 59-81, Springer, 2005.
Papers in Refereed Conferences
- Learning
Similarity Metrics for Event Identification in Social Media,
H. Becker, M. Naaman, and L. Gravano, to appear in Proc. of the 2010
ACM International Conference on Web Search and Data Mining (WSDM 2010),
2010.
- Join Optimization of
Information Extraction Output: Quality Matters!, A. Jain, P.
Ipeirotis, A. Doan, and L. Gravano, in Proc. of the 25th IEEE
International Conference on Data Engineering (ICDE 2009), 2009.
- Answering General
Time-Sensitive Queries, W. Dakka, L. Gravano, and P.
Ipeirotis, in Proc. of the 17th ACM Conference on Information and
Knowledge Management (CIKM 2008), 2008 (short 2-page "poster" paper).
- Optimizing SQL Queries over
Text Databases, A. Jain, A. Doan, and L. Gravano, in Proc. of
the 24th IEEE International Conference on Data Engineering (ICDE 2008),
2008.
- Efficient Summarization-Aware
Search for Online News Articles, W. Dakka and L. Gravano, in
Proc. of the 2007 ACM+IEEE Joint Conference on Digital Libraries (JCDL
2007), 2007.
- Efficient Keyword Search
Across Heterogeneous Relational Databases, M. Sayyadian, H.
LeKhac, A. Doan, and L. Gravano, in Proc. of the 23rd IEEE
International Conference on Data Engineering (ICDE 2007), 2007.
- SQL Queries Over
Unstructured Text Databases, A. Jain, A. Doan, and L. Gravano,
in Proc. of the 23rd IEEE International Conference on Data Engineering
(ICDE 2007), 2007 (short 3-page "poster" paper).
- To Search or to Crawl?
Towards a Query Optimizer for Text-Centric Tasks ("Best
Paper" Award), P. Ipeirotis, E. Agichtein, P. Jain, and L.
Gravano, in Proc. of the 2006 ACM SIGMOD International Conference on
Management of Data, 2006.
- Modeling and Managing
Content Changes in Text Databases ("Best Paper" Award), P.
Ipeirotis, A. Ntoulas, J. Cho, and L. Gravano, in Proc. of the 21st
IEEE International Conference on Data Engineering (ICDE 2005), 2005.
- When one Sample is not
Enough: Improving Text Database Selection Using Shrinkage, P.
Ipeirotis and L. Gravano, in Proc. of the 2004 ACM SIGMOD International
Conference on Management of Data, 2004.
- Selectivity Estimation for
String Predicates: Overcoming the Underestimation Problem, S.
Chaudhuri, V. Ganti, and L. Gravano, in Proc. of the 20th IEEE
International Conference on Data Engineering (ICDE 2004), 2004.
- Categorizing Web Queries
According to Geographical Locality, L. Gravano, V.
Hatzivassiloglou, and R. Lichtenstein, in Proc. of the 12th ACM
Conference on Information and Knowledge Management (CIKM 2003), 2003.
- Efficient IR-Style Keyword
Search over Relational Databases, V. Hristidis, L. Gravano, and
Y. Papakonstantinou, in Proc. of the 29th International Conference on
Very Large Data Bases (VLDB 2003), 2003.
- Text Joins in an RDBMS for Web
Data Integration, L. Gravano, P. Ipeirotis, N. Koudas, and D.
Srivastava, in Proc. of the 12th International World-Wide Web
Conference (WWW 2003), 2003.
- Querying Text Databases for
Efficient Information Extraction ("Best Student Paper" Award),
E. Agichtein and L. Gravano, in Proc. of the 19th IEEE International
Conference on Data Engineering (ICDE 2003), 2003 [errata].
- Navigation- vs. Index-Based
XML Multi-Query Processing, N. Bruno, L. Gravano, N. Koudas,
and D. Srivastava, in Proc. of the 19th IEEE International Conference
on Data Engineering (ICDE 2003), 2003.
- Text Joins for Data
Cleansing and Integration in an RDBMS, L. Gravano, P.
Ipeirotis, N. Koudas, and D. Srivastava, in Proc. of the 19th IEEE
International Conference on Data Engineering (ICDE 2003), 2003 (short
3-page "poster" paper).
- Distributed Search over the
Hidden-Web: Hierarchical Database Sampling and Selection, P.
Ipeirotis and L. Gravano, in Proc. of the 28th International Conference
on Very Large Data Bases (VLDB 2002), 2002.
- Evaluating Top-k Queries over
Web-Accessible Databases, N. Bruno, L. Gravano, and A.
Marian, in Proc. of the 18th IEEE International Conference on Data
Engineering (ICDE 2002), 2002.
- Extending SDARTS:
Extracting Metadata from Web Databases and Interfacing with the Open
Archives Initiative, P. Ipeirotis, T. Barry, and L. Gravano,
in Proc. of the Second ACM+IEEE Joint Conference on Digital Libraries
(JCDL 2002), 2002.
- Approximate String Joins in a
Database (Almost) for Free, L. Gravano, P. Ipeirotis, H. V.
Jagadish, N. Koudas, S. Muthukrishnan, and D. Srivastava, in Proc. of
the 27th International Conference on Very Large Data Bases (VLDB 2001),
2001 [errata].
- Probe, Count, and
Classify: Categorizing Hidden Web Databases, P. Ipeirotis, L.
Gravano, and M. Sahami, in Proc. of the 2001 ACM SIGMOD International
Conference on Management of Data, 2001.
- STHoles: A
Multidimensional Workload-Aware Histogram, N. Bruno, S.
Chaudhuri, and L. Gravano, in Proc. of the 2001 ACM SIGMOD
International Conference on Management of Data, 2001.
- SDLIP + STARTS = SDARTS: A
Protocol and Toolkit for Metasearching, N. Green, P. Ipeirotis,
and L. Gravano, in Proc. of the First ACM+IEEE Joint Conference on
Digital Libraries (JCDL 2001), 2001.
- PERSIVAL, a System for
Personalized Search and Summarization over Multimedia Healthcare
Information, K. McKeown, S.-F. Chang, J. Cimino, S. Feiner, C.
Friedman, L. Gravano, V. Hatzivassiloglou, S. Johnson, D. Jordan, J.
Klavans, A. Kushniruk, V. Patel, and S. Teufel, in Proc. of the First
ACM+IEEE Joint Conference on Digital Libraries (JCDL 2001), 2001.
- Learning Search Engine
Specific Query Transformations for Question Answering, E.
Agichtein, S. Lawrence, and L. Gravano, in Proc. of the 10th
International World-Wide Web Conference (WWW10), 2001.
- Computing Geographical
Scopes of Web Resources, J. Ding, L. Gravano, and N.
Shivakumar, in Proc. of the 26th International Conference on Very Large
Data Bases (VLDB'00), 2000. (PDF version)
- An Investigation of
Linguistic Features and Clustering Algorithms for Topical Document
Clustering, V. Hatzivassiloglou, L. Gravano, and A. Maganti, in
Proc. of the 23rd ACM SIGIR Conference on Research and Development in
Information Retrieval (SIGIR'00), 2000. (PDF version)
- Snowball: Extracting Relations
from Large Plain-Text Collections, E. Agichtein and L. Gravano,
in Proc. of the 5th ACM International Conference on Digital Libraries
(DL'00), 2000. (PDF version)
- Evaluating Top-k Selection
Queries, S. Chaudhuri and L. Gravano, in Proc. of the 25th
International Conference on Very Large Data Bases (VLDB'99), 1999. (PDF version)
- Merging Ranks from
Heterogeneous Internet Sources, L. Gravano and H.
Garcia-Molina, in Proc. of the 23rd International Conference on Very
Large Data Bases (VLDB'97), 1997. (PDF version)
- Metadata for Digital Libraries:
Architecture and Design Rationale, M. Baldonado, C.-C.
K. Chang, L. Gravano, and A. Paepcke, in Proc. of the 2nd ACM
International Conference on Digital Libraries (DL'97), 1997.
- STARTS: Stanford Proposal
for Internet Meta-Searching, L. Gravano, C.-C. K. Chang, H.
Garcia-Molina, and A. Paepcke, in Proc. of the 1997 ACM SIGMOD
International Conference on Management of Data, 1997.
- dSCAM: Finding Document Copies
across Multiple Databases, H. Garcia-Molina, L.
Gravano, and N. Shivakumar, in Proc. of the 4th International
Conference on Parallel and Distributed Information Systems (PDIS'96),
1996.
- Optimizing Queries over
Multimedia Repositories, S. Chaudhuri and L. Gravano, in Proc.
of the 1996 ACM SIGMOD International Conference on Management of Data,
1996.
- Generalizing GlOSS to
Vector-Space Databases and Broker Hierarchies, L. Gravano and
H. Garcia-Molina, in Proc. of the 21st International Conference on Very
Large Data Bases (VLDB'95), 1995.
- Precision and Recall of GlOSS
Estimators for Database Discovery, L. Gravano, H.
Garcia-Molina, and A. Tomasic, in Proc. of the 3rd International
Conference on Parallel and Distributed Information Systems (PDIS'94),
1994 (short paper).
- The Effectiveness of GlOSS
for the Text-Database Discovery Problem, L. Gravano, H.
Garcia-Molina, and A. Tomasic, in Proc. of the 1994 ACM SIGMOD
International Conference on Management of Data, 1994.
- Requirements for
Deadlock-Free, Adaptive Packet Routing, R. Cypher and L.
Gravano, in Proc. of the 11th ACM Symposium on Principles of
Distributed Computing (PODC '92), 1992. (PDF version)
- Adaptive, Deadlock-Free Packet Routing in Torus Networks with
Minimal Storage, R. Cypher and L. Gravano, in Proc. of the 1992
International Conference on Parallel Processing (ICPP '92), 1992.
- Adaptive Deadlock- and
Livelock-Free Routing with All Minimal Paths in Torus Networks,
P. Berman, L. Gravano, G. Pifarre, and J. Sanz, in Proc. of the 4th
Annual ACM Symposium on Parallel Algorithms and Architectures (SPAA
'92), 1992. (PDF version)
- Adaptive Deadlock-Free
Worm-Hole Routing in Hypercubes, L. Gravano, G. Pifarre, G.
Denicolay, and J. Sanz, in Proc. of the 6th International Parallel
Processing Symposium (IPPS '92), 1992 (short paper).
- Fully-Adaptive Routing: Packet
Switching Performance and Worm-Hole Algorithms, S. Felperin, L.
Gravano, G. Pifarre, and J. Sanz, in Proc. of Supercomputing '91, 1991.
(PDF version)
- Fully-Adaptive Minimal
Deadlock-Free Packet Routing in Hypercubes, Meshes, and Other Networks,
G. Pifarre, L. Gravano, S. Felperin, and J. Sanz, in Proc. of the 3rd
Annual ACM Symposium on Parallel Algorithms and Architectures (SPAA
'91), 1991. (PDF version)
Papers in Refereed Workshops and Demonstration Sessions
- Event Identification in
Social Media, H. Becker, M. Naaman, and L. Gravano, in Proc.
of the ACM SIGMOD Workshop on the Web and Databases (WebDB 2009), 2009.
- Modeling Query-Based Access
to Text Databases, E. Agichtein, P. Ipeirotis, and L. Gravano,
in Proc. of the ACM SIGMOD Workshop on the Web and Databases (WebDB
2003), 2003.
- QXtract: A Building
Block for Efficient Information Extraction from Text Databases
(demonstration), E. Agichtein and L. Gravano, in Proc. of the
2003 ACM SIGMOD International Conference on Management of Data, 2003.
- Snowball: A Prototype
System for Extracting Relations from Large Text Collections
(demonstration), E. Agichtein, L. Gravano, J. Pavel, V.
Sokolova, and A. Voskoboynik, in Proc. of the 2001 ACM SIGMOD
International Conference on Management of Data, 2001.
- PERSIVAL Demo: Categorizing
Hidden-Web Resources (demonstration), P. Ipeirotis, L. Gravano,
and M. Sahami, in Proc. of the First ACM+IEEE Joint Conference on
Digital Libraries (JCDL 2001), 2001.
- Automatic Classification of
Text Databases through Query Probing, P. Ipeirotis, L. Gravano,
and M. Sahami, in Proc. of the ACM SIGMOD Workshop on the Web and
Databases (WebDB'00), 2000. (PDF version) Also in LNCS Series no. 1997,
Springer, 2001.
- Combining Strategies for
Extracting Relations from Text Collections, E.
Agichtein, E. Eskin, and L. Gravano, in Proc. of the ACM SIGMOD
Workshop on Research Issues in Data Mining and Knowledge Discovery
(DMKD 2000), 2000. (PDF version)
- Exploiting Geographical
Location Information of Web Pages, O. Buyukkokten, J. Cho, H.
Garcia-Molina, L. Gravano, and N. Shivakumar, in Proc. of the ACM
SIGMOD Workshop on the Web and Databases (WebDB'99), 1999. (PDF version)
Invited Papers
- Building Query
Optimizers for Information Extraction: The SQoUT Project, A.
Jain, P. Ipeirotis, and L. Gravano, in SIGMOD Record, Special Issue on
"Managing Information Extraction," vol. 37, no. 4, December 2008.
- Query- vs. Crawling-based
Classification of Searchable Web Databases, L. Gravano, P.
Ipeirotis, and M. Sahami, in IEEE Data Engineering Bulletin, vol. 25,
no. 1, March 2002.
- Using q-grams in a DBMS for
Approximate String Processing, L. Gravano, P.
Ipeirotis, H. V. Jagadish, N. Koudas, S. Muthukrishnan, L. Pietarinen,
and D. Srivastava, in IEEE Data Engineering Bulletin, vol. 24, no. 4,
December 2001 [errata].
- Simplifying Data Access:
The Energy Data Collection Project, J. L. Ambite, Y.
Arens, E. Hovy, A. Philpot, L. Gravano, V. Hatzivassiloglou, and J.
Klavans, in IEEE Computer, vol. 34, no. 2, February 2001.
- Database Research at
Columbia University, S.-F. Chang, L. Gravano, G.
Kaiser, K. Ross, and S. Stolfo, in SIGMOD Record, vol. 27,
no. 3, September 1998.
- Mediating and Metasearching
on the Internet, L. Gravano and Y. Papakonstantinou, in IEEE
Data Engineering Bulletin, vol. 21, no. 2, June 1998.
- The Stanford InfoBus and
Its Service Layers: Augmenting the Internet with Higher-Level
Information Management Protocols, M. Roscheisen, M.
Baldonado, C.-C. K. Chang, L. Gravano, S. Ketchpel, and A. Paepcke, in Digital
Libraries in Computer Science: The MeDoc Approach, LNCS Series no.
1392, Springer, 1998.
- Optimizing Queries over
Multimedia Repositories, S. Chaudhuri and L. Gravano, in IEEE
Data Engineering Bulletin, vol. 19, no. 4, December 1996.
- Routing Techniques for
Massively Parallel Communication, S. Felperin, L. Gravano, G.
Pifarre, and J. Sanz, in Proceedings of the IEEE, vol. 79, no. 4, April
1991.
Position Papers, Meeting Reports, and Miscellaneous Publications
- Characterizing Web
Resources for Improved Search, L. Gravano. Position paper for the
First NSF-DELOS Workshop on Information Seeking, Searching, and
Querying in Digital Libraries, Zurich, Switzerland, December 2000. (PDF version)
- Resource Indexing and
Discovery In a Globally Distributed Digital Library, L.
Gravano. Position paper for the NSF-EU Digital Library Collaboratory
Working Group, Budapest, Hungary, November 1997.
- Querying Multiple Document
Collections across the Internet, L. Gravano. Ph.D.
Dissertation, Stanford University (advisor: H. Garcia-Molina), August
1997. (PDF version)
- Informal Internet Standards at
Stanford, L. Gravano, C.-C. K. Chang, H. Garcia-Molina, and A.
Paepcke. Position paper for the 1996
World-Wide Web Consortium (W3C) Distributed Indexing/Searching Workshop,
May 1996.
Luis Gravano
gravano@cs.columbia.edu