## Columbia Theory Seminar, Spring 2016

For Spring 2016 the usual time for the meetings will be Fridays at 12pm in 417 Mudd (DSI). Abstracts for talks are given below the talk schedule.

• Friday, January 29, 12:00, 417 Mudd: Sepehr Asadi: Tight Bounds for Linear Sketches of Approximate Matchings (Abstract)
• Friday, February 5, 12:00, 417 Mudd: Li-Yang Tan: Circuit lower bounds via random projections (Abstract)
• Friday, February 12, 12:00, 417 Mudd: Ilias Diakonikolas: A New Approach for Distribution Testing (Abstract)
• Friday, February 19, 12:00, 417 Mudd: Mark Bun: Differentially private release and learning of threshold functions (Abstract)
• Friday, February 26, 12:00, 417 Mudd: Yaron Singer: Some limitations and possibilities of data-driven optimization (Abstract)
• Friday, March 4, 12:00, 417 Mudd: Aaron Bernstein: An improved deterministic algorithm for dynamic single source shortest paths (Abstract)
• Friday, March 11, 12:00, 417 Mudd: Dimitris Paparas: On the complexity of Revenue Maximizing Bayesian Auctions (Abstract)
• Friday, March 25, 12:00, 417 Mudd: Jing Chen: Rational Proofs with Multiple Provers (Abstract)
• Friday, April 1, 12:00, 417 Mudd: Periklis Papakonstantinou: Efficient depth reduction for composites is possible (Abstract)
• Friday, April 15: New York Area Theory Day at Columbia
• Friday, April 22, 12:00, 417 Mudd: Xiaorui Sun: A near-optimal algorithm for testing isomorphism of two unknown graphs (Abstract)
• Friday, April 29, 12:00, 417 Mudd: Greg Valiant: When your big data seems too small: accurate inferences beyond the empirical distribution (Abstract)
• Friday, May 13, 12:00, 417 Mudd: Noga Ron-Zewi: Towards Optimal Deterministic Coding for Interactive Communication (Abstract)

• ### Talk Abstracts

Friday, January 29:
Tight Bounds for Linear Sketches of Approximate Matchings
University of Pennsylvania

Abstract: We consider the problem of approximating a maximum matching in dynamic graph streams where the stream may include both edge insertions and deletions. Our main result is a resolution of the space complexity of linear sketches for approximating the maximum matching in this model.

Specifically, we show that for any $\eps > 0$, there exists a single-pass streaming algorithm, which only maintains a linear sketch of size roughly $n^{2-3\eps}$ bits and recovers an $n^\epsilon$-approximate maximum matching in dynamic graph streams, where $n$ is the number of vertices in the graph. We complement this result with the following lower bound result: any linear sketch for approximating the maximum matching to within a factor of $n^\eps$ has to be of size at least $n^{2-3\eps -o(1)}$ bits.

This is based on joint work with Sanjeev Khanna, Yang Li, and Grigory Yaroslavtsev.

Friday, February 5:
Circuit lower bounds via random projections
Li-Yang Tan
Toyota Technological Institute at Chicago

Abstract: The well-known method of random restrictions (Subbotovskaya, Furst--Saxe--Sipser, Yao, Håstad) lies at the heart of numerous landmark lower bounds in Boolean circuit complexity. I will present three recent circuit lower bounds that are obtained via random projections, a generalization of random restrictions:

1. We prove an average-case depth hierarchy theorem for Boolean circuits, an average-case extension of the classical worst-case depth hierarchy theorems of Sipser, Yao, and Håstad. This resolves a main open problem in Håstad’s 1986 Ph.D. thesis. Via a classical connection between Boolean circuits and structural complexity, our circuit lower bound implies that the polynomial hierarchy is infinite relative to a random oracle, resolving a similarly longstanding conjecture of Håstad, Cai, and Babai from the 1980s.

2. We prove the first super-polynomial lower bounds against the "depth-d Frege" propositional proof system for d = \Theta(\sqrt{\log n}). Previous work of Pitassi--Beame--Impagliazzo (1993) and Krajicek--Pudlak--Woods (1995) gave super-polynomial lower bounds against depth-d Frege for d = \Theta(\log log n).

3. We obtain near-optimal small-depth circuit lower bounds for the distance-k connectivity problem: given an n-node graph G and vertices s and t, is there an s-t path of length at most k in G? Our n^{k^{1/d}/d} lower bound almost matches the folklore n^{k^{2/d}} upper bound based on Savitch's "repeated squaring" algorithm; improving our lower bound even to n^{k^{\Omega(1/d)}} would separate P from NC1.

Based on joint works with Xi Chen, Igor Oliveira, Toniann Pitassi, Benjamin Rossman, and Rocco Servedio.

Friday, February 12:
A New Approach for Distribution Testing
Ilias Diakonikolas
University of Southern California

Abstract: We study problems in distribution property testing: Given sample access to one or more unknown discrete distributions, we want to determine whether they have some global property or are $\eps$-far from having the property in $\ell_1$ distance. We provide a simple and general approach to obtain upper bounds in this setting, by reducing $\ell_1$-testing to $\ell_2$-testing. Our reduction yields optimal $\ell_1$-testers, by using a standard $\ell_2$-tester as a black-box.

Using our framework, we obtain sample-optimal and computationally efficient estimators for a wide variety of $\ell_1$ distribution testing problems, including the following: identity testing to a fixed distribution, closeness testing between two unknown distributions (with equal/unequal sample sizes), independence testing (in any number of dimensions), closeness testing for collections of distributions, and testing $k$-flatness. For several of these problems, we give the first optimal tester in the literature. Moreover, our estimators are significantly simpler to state and analyze compared to previous approaches.

As our second main contribution, we provide a direct general approach for proving distribution testing lower bounds, by bounding the mutual information. Our lower bound approach is not restricted to symmetric properties, and we use it to prove tight lower bounds for all the aforementioned problems.

Joint work with Daniel Kane.

Friday, February 19:
Differentially private release and learning of threshold functions
Mark Bun
Harvard University

Abstract: The line of work on differential privacy is aimed at enabling rich statistical analyses on data while provably protecting individual-level privacy. The last decade of research has shown that, at least in principle, a number of fundamental statistical tasks are compatible with differential privacy. However, privacy-preserving analyses often require additional complexity over their non-private counterparts: for instance, in terms of the number of data samples one needs to collect in order to get accurate results.

In this talk, we will examine the price of privacy for several basic tasks involving one-dimensional threshold functions. We give the first non-trivial lower bounds on the sample complexity of performing these tasks with (approximate) differential privacy. Surprisingly, our techniques seem closely related to classic results in distributed computing, and bolster connections between privacy and combinatorics.

Joint work with Kobbi Nissim, Uri Stemmer, and Salil Vadhan.

Friday, February 26:
Some limitations and possibilities of data-driven optimization
Yaron Singer
Harvard University

Abstract: As we grow highly dependent on data for making predictions, we translate these predictions into models that help us make informed decisions. But how do the guarantees we have on predictions translate to guarantees on decisions? In many cases, we learn models from sampled data and then aim to use these models to make decisions. This intuitive approach turns out to have non-trivial limitations. In some cases, despite having access to large data sets, the current frameworks we have for learnability do not suffice to guarantee desirable outcomes. In other cases, the learning techniques we have introduce estimation errors which can result in poor outcomes and stark impossibility results. In this talk we will formalize some of these ideas using convex and combinatorial optimization and discuss some possibility and impossibility results of this agenda.

Friday, March 4:
An improved deterministic algorithm for dynamic single source shortest paths
Aaron Bernstein
Columbia University

Abstract: Computing shortest paths in a graph is one of the fundamental problems of graph algorithms. The goal of dynamic single source shortest paths (SSSP) is to maintain a shortest path tree from a fixed source s as the edges of the graph change over time. The most general case is the fully dynamic one, where each adversarial update inserts or deletes an arbitrary edge. The trivial algorithm is to recompute SSSP after ever update in O(m) time. For the fully dynamic case, no non-trivial algorithm is known.

We can, however, improve upon the trivial algorithm by restricting the update sequence to be partially dynamic: only insertions, or only deletions. For example, in the only-insertions case, we start with an empty graph and the adversary inserts one edge at at a time until we end with some final graph with m edges; the goal is to maintain SSSP throughout this entire sequence of insertions. The trivial algorithm takes O(m) time per update, but as far back as 1981 Even and Shiloach showed how to achieve O(n) time per update. There are conditional lower bounds showing that this O(n) is optimal.

In 2011 Bernstein and Roditty showed that with a (1+eps) approximation we can break through this lower bound: they achieved ~O(n^2/m) update time. This was successively improved, with the state of the art achieving ~O(m^{1/\sqrt{log(n)}}) = m^o(1) update time. All of these algorithms, however, are randomized. Moreover, the randomization forces them to adopt a weaker model with a non-adaptive adversary, which makes these algorithms unsuitable to many settings.

We present a (1+eps)-approximate deterministic algorithm with update time ~O(n^2/m). This does not match the best randomized algorithm, but it is the first deterministic algorithm to beyond O(n). It is also much simpler than all of the above randomized algorithms. Our main tool is an entirely new deterministic sparsification technique, which we believe could be applicable to many other deterministic shortest path algorithms.

Accepted to STOC 2016. Joint work with Shiri Chechik.

Friday, March 11:
On the complexity of Revenue Maximizing Bayesian Auctions
Dimitris Paparas
Columbia University

Abstract: We consider the problem of designing Revenue Maximizing Bayesian Auctions in both the Deterministic and the Randomized setting.

In the main part of the talk, we will focus on a Unit-demand buyer. For the Deterministic setting, we obtain a polynomial time algorithm for the case that the valuations of the buyer are drawn from a discrete product distribution of support size 2. On the negative side, we prove that the problem is NP-complete in the general case. For the Randomized setting, we study the menu-size complexity of the problem, presenting cases were a small menu suffices as well as cases that require an exponentially large menu. Regarding the computational complexity of the problem, we prove that the problem is #P-hard.

In the remaining part, we will focus on an Additive buyer and briefly discuss some recent results in this setting.

Based on joint work with Xi Chen, Ilias Diakonikolas, George Matikas, Anthi Orfanou, Xiaorui Sun, and Mihalis Yannakakis.

Friday, March 25:
Rational Proofs with Multiple Provers
Jing Chen
Stony Brook University

Abstract: Interactive proofs model a world where a verifier delegates computation to an untrustworthy prover, verifying the prover's claims before accepting them. These proofs have applications to delegation of computation, probabilistically checkable proofs, crowdsourcing, and more.

In some of these applications, the verifier may pay the prover based on the quality of his work. Rational proofs, introduced by Azar and Micali (2012), are an interactive proof model in which the prover is rational rather than untrustworthy---he may lie, but only to increase his payment. This allows the verifier to leverage the greed of the prover to obtain better protocols: while rational proofs are no more powerful than interactive proofs, the protocols are simpler and more efficient. Azar and Micali posed as an open problem whether multiple provers are more powerful than one for rational proofs.

We provide a model that extends rational proofs to allow multiple provers. In this model, a verifier can cross-check the answers received by asking several provers. The verifier can pay the provers according to the quality of their work, incentivizing them to provide correct information.

We analyze rational proofs with multiple provers from a complexity-theoretic point of view. We fully characterize this model by giving tight upper and lower bounds on its power. On the way, we resolve Azar and Micali's open problem in the affirmative, showing that multiple rational provers are strictly more powerful than one (under standard complexity-theoretic assumptions). We further show that the full power of rational proofs with multiple provers can be achieved using only two provers and five rounds of interaction. Finally, we consider more demanding models where the verifier wants the provers' payment to decrease significantly when they are lying, and fully characterize the power of the model when the payment gap must be noticeable (i.e., at least 1/p where p is a polynomial).

Friday, April 1:
Efficient depth reduction for composites is possible
Periklis Papakonstantinou
Rutgers University

Abstract: In 1989 it was shown by Allender and Hertrampf that every circuit of depth d and gates AND,OR,NOT, and MODp, for prime p, can be reduced to a depth 3 circuit of size 2^( (logn)^O(d) ). The question about MODm, for composite m, was handled a year later by Yao (randomized circuits) and by Beigel and Tarui (derandomizing Yao's), but with triple-exponential size construction, i.e. 2^((logn)^2^O(d)).

We obtain for composites the same asymptotic result as for primes.

I will outline this exponentially-improved depth reduction and discuss the most significant among its implications. For example, one consequence is an (almost) exponentially better depth lower bound (from loglogn to logn/loglogn) in William's program for NEXP. If time permits I may also discuss various consequences in the interactive compression settings.

This is joint work with Shiteng Chen.

Friday, April 22:
A near-optimal algorithm for testing isomorphism of two unknown graphs
Xiaorui Sun
Columbia University

Abstract: We study the graph isomorphism problem in property testing framework: Given edge queries to two unknown graphs, how many edge queries are required to distinguish between the case that two graphs are isomorphic, and the case that they are \epsilon-far from being isomorphic.

Fischer and Matsliah (SICOMP 2008) showed that the number of queries necessary to (approximately) test isomorphism is at least Omega(n) and at most O(n^(5/4)). We close this gap by showing an algorithm that makes n*2^O(sqrt(log n)) queries.

Joint work with Krzysztof Onak (IBM TJ Watson Research Center).

Friday, April 29:
When your big data seems too small: accurate inferences beyond the empirical distribution
Greg Valiant
Stanford University

Abstract: We discuss two problems related to the general challenge of making accurate inferences about a complex distribution, in the regime in which the amount of data (i.e the sample size) is too small for the empirical distribution of the samples to be an accurate representation of the underlying distribution. The first problem we consider is the following basic learning task: given independent draws from an unknown distribution over a discrete support, output an approximation of the distribution that is as accurate as possible in L1 distance (ie total variation distance). Perhaps surprisingly, it is often possible to "de-noise" the empirical distribution of the samples to return an approximation of the true distribution that is significantly more accurate than the empirical distribution, without relying on any prior assumptions on the distribution. We present an instance optimal learning algorithm which optimally performs this de-noising for every distribution for which such a de-noising is possible. One curious implication of our techniques is an algorithm for accurately estimating the number of new domain elements that would be seen given a new larger sample, of size up to n*log n. (Extrapolation beyond this sample size is provable information theoretically impossible, without additional assumptions on the distribution.) While these results are applicable generally, we highlight an adaptation of this general approach to some problems in genomics (e.g. quantifying the number of unobserved protein coding variants).

The second problem we discuss is the problem of recovering a low-rank approximation to a matrix of probabilities P, given access to an observed matrix of "counts" obtained via independent samples from the distribution defined by P. This problem can be viewed as a generalization of "community detection", and is relevant to several recent machine learning efforts, including the work on constructing "word embeddings", and recommendation systems derived from sparse data.

This talk is based on three papers, which are joint works with Paul Valiant, with Paul Valiant and James Zou, and with Qingqing Huang, Sham Kakade, and Weihao Kong.

Friday, May 13:
Towards Optimal Deterministic Coding for Interactive Communication
Noga Ron-Zewi
Princeton University and Rutgers University

Abstract: We show an efficient, deterministic interactive coding scheme that simulates any interactive protocol under random errors with nearly optimal parameters.

Specifically, our coding scheme achieves a communication rate of 1 – õ(\sqrt{epsilon}) and a failure probability of exp(-n/ \log n), where n is the protocol length and each bit is flipped independently with constant probability epsilon. Prior to our work, all nontrivial deterministic schemes (either efficient or not) had a rate bounded away from 1. Furthermore, the best failure probability achievable by an efficient deterministic scheme with constant rate was only quasi-polynomial, i.e., of the form exp(-polylog(n)) (Braverman, ITCS 2012). The rate of our scheme is essentially optimal (up to poly-logarithmic factors) by a result of Kol and Raz (STOC 2013).

Joint work with Ran Gelles, Bernhard Haeupler, Gillat Kol and Avi Wigderson.

Contact Alexandr Andoni if you want to volunteer to give a talk (especially students!). The talk can be about your or others' work. It can be anything from a polished presentation of a completed result, to an informal black-board presentation of an interesting topic. It should be accessible to a general theory audience. I will be happy to help students choose papers to talk about.
There is a mailing list for the reading group. General information about the mailing list (including how to subscribe yourself to it) is available here.