For Fall 2016 the usual time for the meetings will be Mondays at 12pm in 417 Mudd (DSI). Abstracts for talks are given below the talk schedule.
Abstract: We survey recent developments in randomness extractors, in particular non-malleable extractors and their applications to two-source extractors, Ramsey graphs, and privacy amplification. We present the two new pseudo-random objects that are at the center of recent progress - correlation breakers and independence-preserving mergers.
Abstract: In the bin packing problem we are given an instance consisting of a sequence of items with sizes between 0 and 1. The objective is to pack these items into the smallest possible number of bins of unit size. FirstFit and BestFit algorithms are simple online algorithms introduced in early seventies, when it was also shown that their asymptotic approximation ratio is equal to $1.7$. We present a simple proof of this bound and sketch a proof that also the absolute approximation ratio of these algorithms is exactly $1.7$. We also discuss a recent online algorithm that achieves an absolute ratio of $5/3$, which is optimal.
Based on joint work with Gy. Dosa, J. Balogh, J. Bekesi, and Rob van Stee.
Abstract: Suppose we would like to estimate (learn) the mean of a high-dimensional Gaussian distribution. It is easy to see that the sample mean is an accurate estimator of the mean. Now suppose that an adversary corrupts a small fraction of our samples. Can we still recover the true mean, up to a small error? A moment's thought reveals that the sample mean is not a good estimator in this noisy setting.
It turns out that the concept of the Tukey median is a robust estimator of the mean in the noisy setting. Unfortunately, it is NP-hard to compute, even approximately. This prompts the following question: Can we reconcile robustness and computational efficiency in high-dimensional distribution estimation?
In this talk, I will describe a general approach for efficiently detecting and correcting corruptions in high dimensions. This technique yields the first computationally efficient algorithms for robustly learning fundamental families of high-dimensional structured distributions, including Gaussians, discrete product distributions, and mixtures thereof (under some natural conditions). Time permitting, I will mention generalizations of the approach to robust estimation of Bayesian networks, and other families of graphical models.
The talk will be based on joint works with (different subsets of) G. Kamath, D. Kane, J. Li, A. Moitra, and A. Stewart.
Abstract: Matrix completion is a basic machine learning problem that has wide applications, especially in collaborative filtering and recommender systems. Simple non-convex optimization algorithms are popular and effective in practice. Despite recent progress in proving various non-convex algorithms converge from a good initial point, it remains unclear why random or arbitrary initialization suffices in practice.
We prove that the commonly used non-convex objective function for matrix completion has no spurious local minima --- all local minima must also be global. Therefore, many popular optimization algorithms such as (stochastic) gradient descent can provably solve positive semidefinite matrix completion with arbitrary initialization in polynomial time. The result can be generalized to the setting when the observed entries contain noise. We believe that our main proof strategy can be useful for understanding geometric properties of other statistical problems involving partial or noisy observations.
Abstract: We present a new methodology for proving distribution testing lower bounds, by establishing a connection between distribution testing and the simultaneous message passing (SMP) communication model. Extending the framework of Blais, Brody, and Matulef [BBM12], we show a simple methodology of reducing lower bounds on (private-coin) SMP problems to distribution testing problems. This reduction allows us to prove several new distribution testing lower bounds, as well as to provide simpler proofs of known lower bounds.
Our main result is concerned with testing identity to a specific distribution p, given as a massive parameter. Valiant and Valiant [VV14] showed that the sample complexity of the foregoing question is closely related to the 2/3-pseudonorm of $p$. We obtain alternative, nearly tight bounds on the complexity of this problem, in terms of an arguably more intuitive measure and using simpler proofs. Specifically, we show that the sample complexity is essentially determined by the size of the effective support of p, which loosely speaking is the number of supported elements that constitute the vast majority of the mass of p. This result, in turn, stems from an unexpected connection to the theory of interpolation spaces, specifically the K-functional between L_1 and L_2 spaces.
Abstract: A classical question in the geometry of numbers asks how many lattice points lie in some ball around the origin. Minkowski's celebrated theorem gives us a tight lower bound for this number that depends only on the determinant of the lattice. (One can think of the determinant as the limit of the density of lattice points inside large balls. So, Minkowski's theorem gives a tight lower bound on a lattice's “local density” based on its “global density.”) We resolve a conjecture due to Dadush that gives a nearly tight converse to Minkowski's theorem—an upper bound on the number of lattice points in a ball that depends only on the determinants of sublattices. This theorem has numerous applications, from complexity theory, to the geometry of numbers, to the behavior of Brownian motion on flat tori.
Based on joint work with Oded Regev.
Abstract: We present functions that are hard to compute on average for algorithms running in some fixed polynomial time, assuming widely-conjectured worst-case hardness of certain problems from the study of fine-grained complexity (Orthogonal Vectors, 3SUM, etc). Examples of such functions are known from before (Goldmann et al., Inf. Process. Lett. 1994), but these have been canonical functions that have not found further use than as a proof of this concept, while our functions are closely related to well-studied problems and have considerable algebraic structure.
Based on the average-case hardness of our functions, we outline the construction of a Proof of Work scheme and discuss possible approaches to constructing fine-grained One-Way Functions. We also show how our reductions make conjectures regarding the worst-case hardness of the problems we reduce from (and consequently the Strong Exponential Time Hypothesis) heuristically falsifiable in a sense similar to that of (Naor, CRYPTO 2003).
We prove our hardness results in each case by showing fine-grained reductions from solving one of three problems – namely, Orthogonal Vectors (OV), 3SUM, and All-Pairs Shortest Paths (APSP) – in the worst case to computing our function correctly on a uniformly random input.
Based on joint work with Alon Rosen, Manuel Sabin, Prashant Vasudevan.
Abstract: In the recent years linear sketching has emerged as a powerful tool for approximate computing in settings with limited resources including distributed computation and streaming. It has been used in breakthrough results on graph and matrix processing, dimensionality reduction, etc. Strikingly, linear sketches have been shown to be optimal for dynamic stream processing under fairly mild assumptions.
In this talk I will describe a new study of linear sketching that focuses on understanding the power of linear sketches based on parities (i.e. over GF_2, the field over two elements, as compared to the previous work that uses real arithmetic). I will illustrate various properties of such sketches using Fourier-analytic methods and tools from communication complexity. In particular, linear sketching over GF_2 turns out to be closely related to Fourier sparsity with respect to Lp-norms. Moreover, it can be shown to be (almost) optimal in streaming and distributed settings for data generated according to the uniform distribution.
Joint work with Sampath Kannan (UPenn) and Elchanan Mossel (MIT).
Abstract: Given corrupted 1-bit measurements of the form sgn(w* . x), recover a vector w that is a good approximation to w*. This problem has been studied in both the learning theory and signal processing communities. In learning theory, this is known as the problem of learning half-spaces with noise, and in signal processing, as 1-bit compressed sensing, in which there is an additional assumption that w* is sparse. The challenge in both cases is to design computationally efficient algorithms tolerant to large amounts of noise. In this talk, I will describe recent work where we give algorithms achieving nearly optimal guarantees for both problems under two well-studied noise models, bounded (Massart) noise and adversarial noise.
Joint work with Nina Balcan, Nika Haghtalab and Hongyang Zhang.
Abstract: Power Grid Islanding is an effective method to mitigate cascading failures in power grids. The challenge is to partition the power grid into smaller connected components, called islands, so that each island can operate independently for a short period of time. To address this problem, I introduce and present our recent results on the Doubly Balanced Connected graph Partitioning (DBCP) problem. The DBCP problem is the problem of partitioning a graph into two parts such that both parts are connected and comparable in size, and supply is almost equal to demand in each part. I show that by using the convex embedding of 3-connected graphs, we can find a solution to the DBCP problem for 2-connected graphs in polynomial time.
Abstract: I show an efficient approximate nearest neighbor search (ANN) algorithm over an arbitrary high-dimensional *symmetric* norm. Traditionally, the ANN problem in high dimensions has been studied over the Manhattan and Euclidean distances with a few exceptions. Thus, this new result is a (modest) step towards a unified theory of similarity search.
At a very high level, the algorithm proceeds by embedding a symmetric norm into a carefully crafted product space, which we can handle via a combination of classical and new tools. The resulting data structure achieves doubly-logarithmic approximation using sub-polynomial query time and near-linear space.
Joint work with Alexandr Andoni, Aleksandar Nikolov, and Ilya Razenshteyn.
Back to Theory of Computation at Columbia.