For Spring 2011, the usual meeting time will be Mondays, 13:00-14:30, in the CS conference room, CSB 453. Abstracts for the talks are given below the talk schedule.

Moshe Lewenstein

Bar-Ilan University

**Abstract:**
We present an approximate distance oracle for a point set S with n points and doubling dimension [;\lambda;]. For every [;\epsilon > 0;], the oracle supports [;(1 + \epsilon);]-approximate distance queries in (universal) constant
time, occupies space [;[\epsilon^{-O(\lambda)} + 2^{O(\lambda \log \lambda)}]n;], and can be constructed in [;[2^{O(\lambda)}\log^3 n + \epsilon^{-O(\lambda)} + 2^{O(\lambda \log \lambda)}]n;] expected time. This improves upon
the best previously known constructions, presented by Har-Peled and Mendel. Furthermore, the oracle can be made fully dynamic with expected [;O(1);] query time and only
[;2^{O(\lambda)} \log n + \epsilon^{-O(\lambda)} + 2^{O(\lambda \log \lambda)};] update time. This is the first fully dynamic [;(1+\epsilon);]-distance oracle.

Coauthors: Yair Bartal, Lee-Ad Gottlieb, Tsvi Kopelowitz, Liam Roditty

Dov Gordon

Columbia University

**Abstract:**
In 1982, Yao introduced the field of "secure computation", in which [;n;] parties, holding private inputs [;x_1,...,x_n;], engage in a protocol to compute some function [;f(x_1,...,x_n);], while revealing nothing more than the
output. In the decade that followed, the topic of secure computation was thoroughly explored, and almost every theoretical question was answered. In the past decade, the focus has shifted towards improving efficiency, and building
implementations. Today, secure computation is poised to become an extremely powerful tool with wide-ranging application. Both bodies of research were essential in bringing this about.

In this talk, we review two recent results. The first provides insight into one of the few remaining theoretical questions in secure computation; the second demonstrates improved efficiency for secure computation in a particular setting of interest. On the theoretical side, we discuss the problem of "fairness" in secure computation, a security property guaranteeing simultaneous output delivery to both parties. Until recently, this was widely believed to be impossible to achieve. We will discuss a new relaxed notion of fairness and show broad feasibility results. We will also touch on a very exciting result demonstrating (for the first time) that some interesting functions can in fact be computed with complete fairness.

In the second half of the talk, we will focus on a particular setting where a server owns a large database and repeatedly interacts with a client in a sequence of secure computations over that data. We demonstrate great improvement over the existing general solutions by amortizing the cost of the computation. In presenting these two results, we hope to demonstrate the importance of both foundational and applied cryptographic theory.

Sergei Vassilvitskii

Yahoo! Research

**Abstract:**
A popular practical method of obtaining a good estimate of the error rate
of a learning algorithm is [;k;]-fold cross-validation. Here, the
set of examples is first partitioned into [;k;] equal-sized folds.
Each fold acts as a test set for evaluating the hypothesis learned
on the other [;k-1;] folds. The average error across the [;k;]
hypotheses is used as an estimate of the error rate. Although
widely used, especially with small values of [;k;] (such as 10), the
cross-validation method has heretofore resisted theoretical
analysis because the [;k;] distinct estimates are inherently
correlated. With only sanity-check bounds known, there is no
compelling reason to use the [;k;]-fold cross-validation estimate over
a simpler holdout estimate.

Conventional wisdom is that the averaging in cross-validation leads to a tighter concentration of the estimate of the error around its mean. We show that the conventional wisdom is essentially correct. We analyze the reduction in variance of the gap between the cross-validation estimate and the true error rate, and show that for a large family of stable algorithms cross-validation achieves a near optimal variance reduction factor of [;\frac{1 + o(1)}{k};]. In these cases, the [;k;] different estimates are essentially independent of each other.

To proceed with the analysis, we define a new measure of algorithm stability, called mean-square stability. This measure is weaker than most stability notions described in the literature, and encompasses a large class of algorithms including bounded SVM regression and regularized least-squares regression. For slightly less stable algorithms such as t-nearest-neighbor, we show that cross-validation leads to an [;O\left(\frac{1}{\sqrt{k}}\right);] reduction in the variance of the generalization error.
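
The estimation procedure described above can be sketched in a few lines of Python; `train` and `error` are hypothetical placeholders for the learning algorithm and the loss, not code from the talk:

```python
import random

def k_fold_cv_error(examples, train, error, k=10, seed=0):
    """Estimate a learner's error rate by k-fold cross-validation.

    `train` maps a list of examples to a hypothesis; `error` maps a
    hypothesis and a held-out fold to an error rate in [0, 1].
    """
    rng = random.Random(seed)
    examples = examples[:]
    rng.shuffle(examples)
    # Partition into k (near-)equal-sized folds.
    folds = [examples[i::k] for i in range(k)]
    estimates = []
    for i in range(k):
        held_out = folds[i]
        training = [x for j, f in enumerate(folds) if j != i for x in f]
        # Each fold acts as a test set for the hypothesis learned on the rest.
        h = train(training)
        estimates.append(error(h, held_out))
    return sum(estimates) / k
```

The [;k;] estimates share training examples pairwise, which is exactly the correlation that makes the analysis in the talk nontrivial.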

Nova Fandina

Ben-Gurion University

**Abstract:**
Galperin and Wigderson proposed a succinct representation for graphs that uses a number of bits logarithmic in
the number of nodes. They proved complexity results for various decision problems on graph properties when the graph
is given in this succinct representation. Later, Papadimitriou and Yannakakis showed that, under the same succinct
encoding method, a certain class of decision problems on graph properties becomes exponentially harder. We consider the
complexity of the Permanent problem when the graph/matrix is given in a restricted succinct representation, and we
present an optical architecture, based on the holographic concept, for solving the balanced succinct permanent problem.
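
For concreteness, the permanent itself (with the matrix given explicitly, not succinctly) can be evaluated by Ryser's inclusion-exclusion formula; this sketch is purely illustrative and unrelated to the optical architecture of the talk:

```python
from itertools import combinations

def permanent(a):
    """Permanent of a square matrix via Ryser's formula:
    perm(A) = (-1)^n * sum over nonempty column sets S of
              (-1)^{|S|} * prod_i sum_{j in S} a[i][j].
    Runs in O(2^n * n^2) time -- exponential, as expected for a
    #P-hard problem."""
    n = len(a)
    total = 0
    for r in range(1, n + 1):
        for cols in combinations(range(n), r):
            prod = 1
            for row in a:
                prod *= sum(row[j] for j in cols)
            total += (-1) ** r * prod
    return (-1) ** n * total
```

For the all-ones [;n \times n;] matrix this returns [;n!;], the number of perfect matchings of [;K_{n,n};].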

Joint work with S. Dolev, J. Rosen.

David Xiao

Université Paris-Sud

**Abstract:**
Privacy has become an ever more pressing concern as we
conduct more and more of our lives in public forums such as the
Internet. One privacy question that has received much study is how a
database curator may output "sanitized" data that does not reveal too
much information about any particular individual. This criterion has
been formalized as differential privacy, proposed originally by Dwork
et al. (TCC '06 and ICALP '06), which captures the idea that "the
presence or absence of any individual's data does not change the
distribution of the sanitized data by much". This guarantee has been
interpreted to mean that individuals should be comfortable revealing
their information, since their participation barely changes the output.
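
As a standard illustration of the definition (not drawn from the talk), the Laplace mechanism answers a counting query with [;\epsilon;]-differential privacy; the function names here are our own:

```python
import random

def laplace_count(data, predicate, epsilon, rng=random):
    """Release a counting query under epsilon-differential privacy.

    A count has sensitivity 1: adding or removing one individual's record
    changes it by at most 1.  Adding Laplace noise of scale 1/epsilon
    therefore makes the output distribution change by at most a factor
    e^epsilon when any single record changes."""
    true_count = sum(1 for x in data if predicate(x))
    # The difference of two i.i.d. Exp(epsilon) draws is Laplace(0, 1/epsilon).
    noise = rng.expovariate(epsilon) - rng.expovariate(epsilon)
    return true_count + noise
```

Smaller [;\epsilon;] means more noise and stronger privacy; the talk's point is that this distributional guarantee alone says nothing about whether participating is in an individual's interest.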

In this talk, we argue that this interpretation is incomplete: unless participation in the database somehow explicitly benefits the individuals, they will always refuse to participate truthfully, regardless of whether the sanitization mechanism is differentially private or not.

We therefore advocate studying differential privacy in conjunction with the notion of truthfulness from game theory, which says that a mechanism should be designed so that it is in the individuals' own interest to give their true information. We show that there exist games for which differentially private mechanisms, in particular the exponential mechanism of McSherry and Talwar (FOCS '07), do not motivate the individuals to participate truthfully. On the positive side, we show that a wide class of games do admit differentially private, truthful, and efficient mechanisms.

Finally, we explore the possibility of tradeoffs between utility and privacy. This is because individuals may be willing to give up some privacy if they receive enough utility from a game, and vice versa. We show that, under a natural measure of information cost, even the release of a differentially private histogram may reveal so much information that individuals would rather suffer the consequences of lying than have their information published.

Shubhangi Saraf

MIT

**Abstract:**
Locally decodable codes are error-correcting codes that admit efficient
decoding
algorithms: They give a method to encode [;k;]-bit messages into [;n;]-bit codewords
such that even after a constant fraction of the bits of the codeword get
corrupted, any bit of the original message can be recovered by looking at
only [;q(k);] bits of the corrupted codeword. The tradeoff between the rate of a code
(i.e., the ratio [;k/n;]) and the locality/efficiency (the function [;q(k);]) of its
decoding algorithms has been studied extensively in the past decade. However,
most prior work has focused on codes with extremely small [;q;] (e.g., constant
functions), and the resulting constructions suffer from extremely poor rate
([;k/n;] goes to zero; in fact [;n;] grows nearly exponentially in [;k;]). Indeed, it was
widely believed that such behavior is inherent: Known codes with locality
[;k^{\frac{1}{c}};] had rate [;2^{-O(c)};], and no code with non-trivial locality ([;q = o(k);])
and rate > 1/2 (the practical regime) was known.

In this talk we overcome the rate 1/2 barrier and give a new class of codes with very high rates (arbitrarily close to 1) with strong local decoding properties ([;q(k) = k^{\epsilon};] for arbitrarily small [;\epsilon;]), thereby giving new performance tradeoffs between the rate and locality of decoding. These codes, which we call multiplicity codes, are based on evaluating high degree multivariate polynomials and their derivatives. Multiplicity codes extend traditional multivariate polynomial based codes; they inherit the local-decodability of these codes, and at the same time achieve better tradeoffs and flexibility in the rate and decodability. These codes give respectable parameters even in concrete settings (and not just in asymptotic behavior), giving hope that local decoding techniques may have practical implications.
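
For contrast with the high-rate regime above, the classical Hadamard code shows the locality/rate tension in its extreme form: any message bit can be decoded with q = 2 queries, but the rate [;k/2^k;] is exponentially poor. A minimal sketch (illustrative only; these are not the talk's multiplicity codes):

```python
import random

def hadamard_encode(msg_bits):
    """Codeword of length 2^k: the inner product <m, x> mod 2 for every
    x in {0,1}^k (x represented as an integer)."""
    k = len(msg_bits)
    return [sum(m & ((x >> j) & 1) for j, m in enumerate(msg_bits)) % 2
            for x in range(2 ** k)]

def local_decode_bit(codeword, k, i, rng=random):
    """Recover message bit i with q = 2 queries: pick a random x and query
    positions x and x XOR e_i.  Their XOR equals <m, e_i> = m_i whenever
    both queried positions are uncorrupted, which happens with probability
    at least 1 - 2*delta if a delta fraction of the codeword is corrupted."""
    x = rng.randrange(2 ** k)
    return codeword[x] ^ codeword[x ^ (1 << i)]
```

The multiplicity codes of the talk sit at the other end of the tradeoff: rate close to 1 with locality [;k^{\epsilon};].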

Based on joint work with Swastik Kopparty (IAS) and Sergey Yekhanin (MSR, Silicon Valley)

Madhur Tulsiani

Princeton University and IAS

**Abstract:**
The term "decomposition theorems" is often used to describe results in
combinatorics which analyze how an object of study behaves with
respect to a family of tests. Such results decompose an object into a
part which has a simple structure and a part whose structure is
completely invisible to the tests in question. I will discuss the
following results proved in somewhat different contexts, and show how
they can be cast in this framework:

(a) the Weak Regularity Lemma of Frieze and Kannan, a result in graph theory,

(b) the "Dense Model Theorem" of Green, Tao and Ziegler, a result in additive combinatorics, that helps transfer (for example) results about dense sets of integers to results about the primes, and

(c) the Impagliazzo Hard-Core Set lemma, a result from computational complexity theory, which is a structure theorem for functions that are hard to compute.

I will show that these results are very strongly related: not only do they imply each other, but the proof techniques used to prove each of them can be used to prove a general result from which all three follow quite easily. I will also discuss some open questions raised by the above connections and mention some recent results.

Based on joint works with Omer Reingold, Luca Trevisan and Salil Vadhan.

Yevgeniy Vahlis

Columbia University

**Abstract:**
We study the problem of computing on large datasets that are stored on an untrusted server. We follow the approach of amortized verifiable computation introduced by Gennaro, Gentry, and Parno in CRYPTO 2010. We present the first
practical verifiable computation scheme for high degree polynomial functions. Such functions can be used, for example, to make predictions based on polynomials fitted to a large number of sample points in an experiment. In
addition to the many non-cryptographic applications of delegating high degree polynomials, we use our verifiable computation scheme to obtain new solutions for verifiable keyword search, and proofs of retrievability. Our
constructions are based on the DDH assumption and its variants, and achieve adaptive security, which was left as an open problem by Gennaro et al. (albeit for general functionalities).

Joint work with Siavosh Benabbas and Rosario Gennaro.

Xi Chen

Columbia University

**Abstract:**
We prove a complexity dichotomy theorem for all non-negatively weighted
counting Constraint Satisfaction Problems (#CSP). This caps a long
series of important results on counting problems including unweighted
and weighted graph homomorphisms and the celebrated dichotomy theorem
for unweighted #CSP. Our dichotomy theorem gives a succinct criterion
for tractability. If a set F of constraint functions satisfies the
criterion, then the #CSP problem defined by F is solvable in polynomial
time; if F does not satisfy the criterion, then the problem is #P-hard.
We furthermore show that the question of whether F satisfies the
criterion is decidable in NP.

Surprisingly, our tractability criterion is simpler than the previous criteria for the more restricted classes of problems, although when specialized to those cases, they are logically equivalent. Our proof mainly uses Linear Algebra, and represents a departure from Universal Algebra, the dominant methodology in recent years.
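
To make the object of study concrete: a weighted #CSP instance asks for the sum, over all assignments to the variables, of the product of the applied constraint-function values. A brute-force sketch (exponential time, for illustration only; the names are ours):

```python
from itertools import product

def weighted_csp_value(num_vars, domain, constraints):
    """Value of a weighted #CSP instance: sum over all assignments of the
    product of the constraint-function values.  `constraints` is a list of
    (function, variable-scope) pairs drawn from a fixed set F."""
    total = 0
    for assignment in product(domain, repeat=num_vars):
        weight = 1
        for f, scope in constraints:
            weight *= f(*(assignment[v] for v in scope))
        total += weight
    return total
```

With the 0/1 constraint "not both endpoints equal 1" placed on each edge of a graph, this counts independent sets, a standard #P-hard special case; the dichotomy theorem classifies exactly which sets F of non-negative constraint functions avoid such hardness.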

Joint work with Jin-Yi Cai and Pinyan Lu.

Mike Rosulek

University of Montana

**Abstract:**
The universal composition (UC) framework is the most realistic model for protocol security in complex settings like the Internet. However, it is also the most demanding; in fact, UC-secure protocols are impossible for many
important tasks of interest. One way around these impossibility results is to allow parties to access a trusted setup functionality, such as a fair coin-tossing functionality. If all tasks have a UC-secure protocol
when allowed access to a particular trusted setup (as is the case for coin-tossing), we say that the setup is *complete*. An important question is to identify which setups are complete, or, more generally: given some description
of a functionality, how can we tell how helpful it is as a trusted setup in the UC framework?

In this talk I'll discuss a very simple but powerful property called "splittability" that can be easily determined for arbitrary (even randomized and reactive) 2-party functionalities. Splittability can then be used to derive two important characterizations for the power of a functionality: (1) a complete characterization for which tasks have UC-secure protocols without trusted setups (such tasks are therefore useless themselves as a trusted setup), and (2) an almost-complete characterization for which tasks are complete. Comparing these two characterizations, we can see that nearly every functionality is either trivial or complete for UC security! The functionalities which are neither trivial nor complete appear to all be quite unnatural.

Some results discussed in this talk are joint work with Manoj Prabhakaran.

Kevin Matulef

Tsinghua University

**Abstract:**
We develop a new technique for proving lower bounds in property testing,
by showing a strong connection between testing and communication
complexity. We give a simple scheme for reducing communication problems to
testing problems, thus allowing us to use known lower bounds in
communication complexity to prove lower bounds in testing. This scheme is
general and implies a number of new testing bounds, as well as simpler
proofs of several known bounds. In this talk we will illustrate our
technique by proving lower bounds for testing juntas, k-linearity,
monotonicity, and more.
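
As a hedged illustration of what a property tester looks like (the classical Blum-Luby-Rubinfeld linearity test, given here only for background, not the new lower-bound machinery of the talk):

```python
import random

def blr_linearity_test(f, n, trials=100, rng=random):
    """BLR test for f: {0,1}^n -> {0,1}, with bit-vectors encoded as ints.
    f is linear (a parity of input bits) iff f(x) ^ f(y) == f(x ^ y) for
    all x, y.  Sampling random pairs accepts every linear function and
    rejects functions far from linear with high probability."""
    for _ in range(trials):
        x = rng.randrange(2 ** n)
        y = rng.randrange(2 ** n)
        if f(x) ^ f(y) != f(x ^ y):
            return False
    return True
```

The query complexity here is independent of [;n;]; the talk's reductions from communication complexity give lower bounds showing how many queries problems like k-linearity testing must use.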

Based on joint work with Eric Blais and Joshua Brody.

Andrew Wan

Tsinghua University

**Abstract:**
We give an explicit construction of a pseudorandom generator for read-once
formulas whose inputs can be read in arbitrary order. For formulas in n
inputs and arbitrary gates of fan-in at most [;d = O(n/\log n);], the
pseudorandom generator uses [;(1 - \Omega(1))n;] bits of randomness and
produces an output that is [;2^{-\Omega(n)};]-pseudorandom against all such
formulas.

The construction of explicit pseudorandom distributions for various models of computation is a central research program in computational complexity. Celebrated examples include generators that fool circuits of constant depth [AB84, AW85, Nis91, Bra10] and logarithmic space machines that read their input once and in a fixed, left-to-right order [Nis92, INW94]. Read-once boolean formulas are one of the simplest natural classes of computations for which no efficient construction of pseudorandom generators was previously known.

Joint work with Andrej Bogdanov and Periklis Papakonstantinou

Ilias Diakonikolas

UC Berkeley

**Abstract:**
When agents with independent priors bid for a single item, Myerson’s optimal auction maximizes expected revenue, whereas Vickrey’s second-price auction optimizes social welfare. We address the natural question of trade-offs:
auctions that, say, optimize revenue without losing too much welfare. If one allows for randomized mechanisms, it is easy to see that there are polynomial-time mechanisms that achieve any point in the trade-off (the Pareto curve)
between revenue and welfare. We ask the question of whether one can achieve the same guarantees using *deterministic* mechanisms. We provide a negative answer to this question by showing that this is an NP-hard problem. On the
positive side, we provide polynomial-time deterministic mechanisms that approximate any point of the trade-off between these two fundamental objectives.
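
As a numerical illustration of the two endpoints (our own sketch, not part of the talk's results): for two bidders with i.i.d. uniform[0,1] values, Vickrey's auction is second-price with no reserve, while Myerson's optimal auction is known to be second-price with reserve price 1/2. A short simulation exhibits the revenue/welfare trade-off:

```python
import random

def simulate(n_bidders=2, rounds=100000, reserve=0.0, rng=None):
    """Average revenue and welfare of a second-price auction with a reserve
    price, bidder values i.i.d. uniform on [0, 1].  reserve=0.0 is Vickrey;
    reserve=0.5 is Myerson's optimal auction for this distribution."""
    rng = rng or random.Random(0)
    revenue = welfare = 0.0
    for _ in range(rounds):
        bids = sorted(rng.random() for _ in range(n_bidders))
        if bids[-1] >= reserve:  # item sold to the highest bidder
            revenue += max(bids[-2], reserve)
            welfare += bids[-1]  # welfare = winner's value (if sold)
    return revenue / rounds, welfare / rounds
```

With enough rounds the averages approach the known closed forms: Vickrey gives revenue 1/3 and welfare 2/3, while the reserve-1/2 auction gives revenue 5/12 and welfare 7/12, so neither mechanism dominates on both objectives.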

(Joint work with Christos Papadimitriou, George Pierrakos and Yaron Singer.)

Anindya De

UC Berkeley

**Abstract:**
The talk deals with the following seemingly different problems:

- Extracting randomness from high entropy distribution sampled in NC0

- Constructing explicit distributions not samplable in NC0

Using (as a black box) a long line of work on affine extractors, we show that it is possible to extract approximately [;k^2/n;] bits, with inverse exponential error, from any distribution sampled by an NC0 circuit that has min-entropy [;k;], as long as [;k \gg n^{2/3};]. Next, using this result, we construct an explicit function [;f : \{0,1\}^n \to \{0,1\};] such that it is not possible for NC0 circuits to sample [;(U_n, f(U_n));] up to inverse exponential error. The previous best result, by Viola (FOCS 2010), could only exhibit a function [;f : \{0,1\}^n \to \{0,1\};] such that it is not possible to sample [;(U_n, f(U_n));] up to inverse logarithmic error.

Joint work with Thomas Watson (Berkeley) and to appear in RANDOM 2011.

Contact

There is a mailing list for the reading group. General information about the mailing list (including how to subscribe yourself to it) is available here. If you want to unsubscribe or change your options, send email to theoryread-request@lists.cs.columbia.edu with the word `help' in the subject or body (don't include the quotes), and you will get back a message with instructions.

Comments on this page are welcome; please send them to
` tal-at-cs.columbia.edu
`

Last updated 02/24/2011.