New York Area Theory Day

Columbia University Department of Computer Science

New York Area Theory Day

Organized by: IBM/NYU/Columbia
Friday, May 7, 2010
Davis auditorium, 412 Schapiro (CEPSR) building, Columbia University, New York.

The New York Area Theory Day is a semi-annual conference, aimed to bring together people in the New York Metropolitan area for one day of interaction and discussion. The Theory Day features several (usually 4-5) hour-long presentations by leading theoretical computer scientists about state-of-the-art advances in various areas. Some presentations give a survey of the latest advances in some area, while others may concentrate on a particular result.
The meeting is free and open to everyone; in particular, students are encouraged to attend.

Program:

9:30 - 10:00 Coffee and bagels

10:00 - 10:55 Dr. Shai Halevi (IBM Research)
On Homomorphic Encryption and Secure Computation

10:55 - 11:05 Short break

11:05 - 12:00 Prof. Claire Mathieu (Brown University)
Recognizing Well-Parenthesized Expressions in the Streaming Model

12:00 - 2:00 Lunch break

2:00 - 2:55 Dr. Alex Andoni (Princeton)
Polylogarithmic Approximation to Edit Distance
(or, the Asymmetric Query Complexity)

2:55 - 3:15 Coffee break

3:15 - 4:10 Dr. John Langford (Yahoo Research)
The Foundations of Learning from Exploration Data

For directions, please see
http://www.cs.columbia.edu/theory/directions.html

To subscribe to our mailing list, follow instructions at
http://www.cs.nyu.edu/mailman/listinfo/theory-ny

Organizers:
Yevgeniy Dodis dodis@cs.nyu.edu
Cliff Stein stein.cliff@gmail.com
Tal Rabin talr@us.ibm.com
Baruch Schieber sbar@us.ibm.com

==================================================================

Abstracts

On Homomorphic Encryption and Secure Computation

Dr. Shai Halevi
(IBM Research)

A homomorphic encryption scheme is one that allows computing on
encrypted data, such that the result of the computation can still be
decrypted. I will talk about recent developments in constructing (fully)
homomorphic encryption schemes, and relations with protocols for secure
function evaluation. Specifically, I will:

* Describe the approach that underlies Gentry's recent homomorphic
encryption scheme and all its variants, and illustrate a few different
instantiations of this approach.

* Survey some easy (but often ignored) connections between homomorphic
encryption schemes and protocols for two-party secure function
evaluation.

* Show how to extend homomorphic encryption schemes to a "multi hop"
setting: In this setting, several parties sequentially compute on the
same encrypted data, and we want to allow each party to use not only the
original encrypted data but also the results of prior computations.

==================================================================

Recognizing Well-Parenthesized Expressions in the Streaming Model

Dr. Claire Mathieu
Brown University)

Motivated by a concrete problem and with the goal of understanding the
sense in
which the complexity of streaming algorithms is related to the complexity
of
formal languages, we investigate the problem Dyck(s) of checking matching
parentheses, with s different types of parenthesis. We present a
one-pass
randomized streaming algorithm for Dyck(2) with space O( sqrt{n} log n),
time per letter polylog(n), and one-sided error. We prove that
this one-pass algorithm is optimal, up to a polylog(n) factor, even
when two-sided error is allowed. For the lower bound, we prove a direct
sum
result on hard instances by following the "information cost" approach,
but
with a few twists. Indeed, we play a subtle game between public and
private
coins. This mixture between public and private coins results from a
balancing
act between the direct sum result and a combinatorial lower bound for the
base
case. ÂSurprisingly, the space requirement shrinks drastically if we have
access to the input stream in reverse. We present a two-pass randomized
streaming algorithm for Dyck(2) with space O((log n)^2), time polylog(n)
and one-sided error, where the second pass is in the reverse
direction. Both algorithms can be extended to Dyck(s) since this problem
is
reducible to Dyck(2) for a suitable notion of reduction in the streaming
model.

This is joint work with Frederic Magniez and Ashwin Nayak.

==================================================================

Polylogarithmic Approximation to Edit Distance
or, the Asymmetric Query Complexity)

Dr. Alex Andoni
(Princeton)

We present a near-linear time algorithm that approximates the edit
distance
between two strings within a polylogarithmic factor. This is an
exponential improvement over the previously known bounds.

Our new algorithm emerges from the investigation of the edit distance
within a new framework, namely a model of asymmetric queries. Within
this framework, we are able to maintain a parallel view of the
upper and lower bounds, leading to near-tight query complexity bounds.
Our investigation also yields the first
rigorous separation between the edit distance and the Ulam distance
(edit distance on permutations), thus uncovering further hardness
phenomena inherent to the edit distance,
beyond what the previous analyses have revealed.

I will talk about some arising open questions.

Joint work with Robert Krauthgamer and Krzysztof Onak.

==================================================================

The Foundations of Learning from Exploration Data

Dr. John Langford
(Yahoo Research)

It is very natural to wish to apply machine learning to the
large amounts of data generated by user interactions, but the process
turns out to be delicate. For example, if a news story snippet is shown
to a user, and the user clicks on (and reads) it, this event is
fundamentally _not_ equivalent to a multiclass label for several
reasons. For example, the user might easily have been interested in
other (unseen) stories as well. To deal with these issues, we propose
using the contextual bandit setting, where on each round information is
used to choose an action, and feedback about just this action is
observed. This setting has the great virtue that it's tractable, with
algorithms enjoying regret and sample complexity guarantees entirely
comparable to what's possible in a more familiar supervised learning
setting.