Columbia Theory Seminar, Spring 2014

For Spring 2014, the usual time for the meetings will be Friday at 12:00 - 13:30 in the open meeting area, CSB 477. Abstracts for talks are given below the talk schedule.

Friday, January 24, 12:00, CSB 477: Zhenming Liu: Statistically-secure ORAM with [;\tilde{O}(\log^2 n);] Overhead (Abstract)

Friday, January 31, 12:00, CSB 477: Ilya Volkovich: On Learning, Lower Bounds and (un)Keeping Promises (Abstract)

Friday, February 7, 12:00, CSB 477: Clément Canonne: Testing probability distributions with more oracles (Abstract)

Friday, February 14, 12:00, CSB 477: Tengyu Ma: Provable bounds for learning some deep representations (Abstract)

Friday, February 21, 12:00, CSB 477: Ilias Diakonikolas: Optimal Algorithms for Testing Closeness of Discrete Distributions (Abstract)

Friday, February 28, 12:00, CSB 477: Andrew Goldberg: Recent Results on Hub Labeling (Abstract)

Thursday, March 13, 12:00, CSB 453 (note unusual date and place): Maria Florina Balcan: Foundations For Learning in the Age of Big Data (Abstract)

Friday, March 28, 12:00, CSB 477: Li-Yang Tan: A polynomial lower bound for monotonicity testing of Boolean functions (Abstract)

Friday, April 18, 12:00, CSB 477: Alina Ene: Routing in Directed Graphs with Symmetric Demands (Abstract)

Friday, May 9, 12:00, CSB 477: Anindya De: Central limit theorem for Gaussian chaos, Polynomial decomposition and deterministic approximate counting for low-degree polynomial threshold functions (Abstract)

Talk Abstracts

Friday, January 24:
Statistically-secure ORAM with [;\tilde{O}(\log^2 n);] Overhead
Zhenming Liu
Princeton University

Abstract: We demonstrate a simple, statistically secure, ORAM with computational overhead [;\tilde{O}(\log^2 n);]; previous ORAM protocols achieve only computational security (under computational assumptions) or require [;\tilde{\Omega}(\log^3 n);] overheard. An additional benefit of our ORAM is its conceptual simplicity, which makes it easy to implement in both software and (commercially available) hardware.

Our construction is based on recent ORAM constructions due to Shi, Chan, Stefanov, and Li (Asiacrypt 2011) and Stefanov and Shi (ArXiv 2012), but with some crucial modifications in the algorithm that simplifies the ORAM and enable our analysis. A central component in our analysis is reducing the analysis of our algorithm to a "supermarket" problem; of independent interest (and of importance to our analysis,) we provide an upper bound on the rate of "upset" customers in the "supermarket" problem.

Friday, January 31:
On Learning, Lower Bounds and (un)Keeping Promises
Ilya Volkovich
Princeton University

Abstract: We extend the line of research initiated by Fortnow and Klivans \cite{FortnowKlivans09} that studies the relationship between efficient learning algorithms and circuit lower bounds. In \cite{FortnowKlivans09}, it was shown that if a Boolean circuit class C has an efficient deterministic exact learning algorithm, (i.e. an algorithm that uses membership and equivalence queries) then [;EXP^{NP} \not \subseteq ppoly[C];] -the class of all languages that can be computed by polynomial-size circuits from the class C.}. Recently, in \cite{KKO13} [;EXP^{NP};] was replaced by [;DTIME(n^{\omega(1)});]. Yet for the models of randomized exact learning or Valiant's PAC learning, the best result so far is a lower bound against BPEXP (the exponential-time analogue of BPP. In this paper, we derive stronger lower bounds as well as some other consequences from randomized exact learning and PAC learning algorithms, answering an open question posed in \cite{FortnowKlivans09} and \cite{KKO13}. In particular, we show that :

If a Boolean circuit class C has an efficient randomized exact learning algorithm or an efficient PAC learning algorithm (In fact, our result hold even for a more general model of PAC learning with membership queries.) then [;BPTIME(n^{\omega(1)})/1 \not \subseteq ppoly[C];].

If a Boolean circuit class C has an efficient randomized exact learning algorithm then no strong pseudo-random generators exist in [;ppoly[C];].

We note that in both cases the learning algorithms need not be proper. The extra bit of advice comes to accommodate the need to keep the promise of bounded away probabilities of acceptance and rejection. The exact same problem arises when trying to prove lower bounds for BPTIME or MA \cite{Barak02, FS04, Santhanam09}. It has been an open problem to remove this bit. We suggest an approach to settle this problem in our case.

Friday, February 7:
Testing probability distributions with more oracles
Clément Canonne
Columbia University

Abstract: We analyze and study a hybrid model for testing and learning probability distributions, in which the testing algorithm is provided with two different types of oracles to the unknown distribution D over [n]. More precisely, we define both the Dual and Cumulative Dual access models, in which the algorithm A can both sample from D and respectively, for any i in [n],

query the probability mass D(i) (query access); or

get the total mass of [i], i.e. \sum_{j=1}^i D(j) (cumulative access)

These two models, by generalizing the previous sampling and query oracles, allow us to bypass the strong lower bounds established for a number of problems in these settings, while capturing several interesting properties of theirs -- and providing new insight on their limitations. Finally, we show that while the testing algorithms can be in most cases strictly more efficient, some tasks remain hard even with this additional power.

Joint work with Ronitt Rubinfeld (MIT).

Friday, February 14:
Provable bounds for learning some deep representations
Tengyu Ma
Princeton University

Abstract: I will mainly talk about our recent paper on deep learning: Provable bound for learning some deep representations. Deep learning, a modern version of artificial neural net, has been a very powerful tool in recent years for many machine learning tasks such as image recognition, speech recognition and natural language processing. However, theoretically, it still remains mysterious why these modern heuristics and tricks together with millions of data result in big improvements for so many important questions. With the motivation to resolve these mysteries, this paper proposes an efficient algorithm that provably learns a class of deep neural networks under a generative model.

In more detail, the network we study has constant number of layers, each consecutive two layers are connected via a random degree-d bipartite graph with random weights in [-1,1]. The observed data at bottom is generated as follows: First a sparse hidden vector at the top layer is chosen, then the top layer activates some of the second layer nodes by multiplying the connection matrix and applying another point-wise threshold function, and then adding some small noise. Then the second layer activates the third using the same rules with the new connection matrix, so on and so forth. Our algorithms use layer-wise learning and for each layer we exploit the old Hebbian rule: "Things fire together wire together", that is, if two nodes in the same layer are correlated, then they are likely to share a neighbor in the layer above them. After observing the common neighbor information, we could apply a graph recovery procedure and get the support of the network, from which we could get the hidden variables and the weights. The running time is polynomial and the sample complexity is quadratic or cubic in the number of nodes, depending on the details of the model.

The talk will be self contained and this is the joint work with Sanjeev Arora, Aditya Bhaskara and Rong Ge.

Friday, February 21:
Optimal Algorithms for Testing Closeness of Discrete Distributions
Ilias Diakonikolas
University of Edinburgh

Abstract: We study the question of closeness testing for two discrete distributions. More precisely, given samples from two distributions p and q over an n-element set, we wish to distinguish whether p = q versus p is at least epsilon-far from q, in either L1 or L2 distance. Batu et al (FOCS 2000/JACM 2013) gave the first sub-linear time algorithms for these problems, albeit with suboptimal dependence on n and epsilon.

In this talk, I will present simple (and new) testers for both the L1 and L2 settings, with sample complexity that is information-theoretically optimal, to constant factors, both in the dependence on n, and the dependence on epsilon.

The talk will be based on joint work with S. Chan (MSR), G. Valiant (Stanford) and P. Valiant (Brown).

Friday, February 28:
Recent Results on Hub Labeling
Andrew Goldberg
Microsoft Research Silicon Valley

Abstract: Given a weighted graph, a distance oracle takes as an input a pair of vertices and returns the distance between them. The labeling approach to distance oracle design is to precompute a label for every vertex so that the distances can be computed from the corresponding labels, without looking at the graph. The hub labeling (HL) algorithm received a lot of attention recently in both theoretical and experimental contexts. In particular, Cohen et al. (2003) developed an O(log n) approximation algorithm for finding the smallest labeling. Abraham at al. (2012) introduced a special kind of labeling -- hierarchical labeling (HHL) -- and showed that it is determined by an ordering of vertices. They give a practical algorithm to compute small HHLs on some classes of graphs, including road networks. Akiba et al. (2013) give a faster algorithm to compute the HHL from an order and apply it to social networks.

We introduce a variant of the Cohen at al. algorithm that is faster both in theory and in practice. We also introduce on-line tie-breaking that reduces the label size on networks with multiple shortest paths, such as social networks. We introduce an algorithm to efficiently compute good orders. This algorithm scales to large networks and is more robust than the previous ones. Finally, we introduce a label compression technique that significantly reduces the size of the labels.

Our experimental results show that the new HHL algorithm is practical on a much broader class of graphs. We also show that it usually produces labels that are not much bigger than those for the HL algorithm with the guaranteed approximation ratio.

Joint work with Daniel Delling, Thomas Pajor, Ruslan Savchenko and Renato Werneck.

Thursday, March 13:
Foundations For Learning in the Age of Big Data
Maria Florina Balca
Georgia Institute of Technology

Abstract: With the variety of applications of machine learning across science, engineering, and computing in the age of Big Data, re-examining the underlying foundations of the field has become imperative. In this talk, I will describe new models and algorithms for important emerging paradigms, specifically, interactive learning and distributed learning.

For active learning, where the algorithm itself can ask for labels of carefully chosen examples from a large pool of unannotated data with the goal of minimizing human labeling effort, I will present results giving computationally efficient, optimal label complexity algorithms. I will also discuss learning with more general forms of interaction, as well as unexpected implications of these results for classic supervised learning paradigms.

For distributed learning, I will discuss a model that for the first time addresses the core question of what are the fundamental communication requirements for achieving accurate learning. Broadly, we consider a framework where massive amounts of data is distributed among several locations, and our goal is to learn a low-error predictor with respect to the overall distribution of data using as little communication, and as few rounds of interaction, as possible. We provide general upper and lower bounds on the amount of communication needed to learn a given class, as well as broadly-applicable techniques for achieving communication-efficient learning.

Friday, March 28:
A polynomial lower bound for monotonicity testing of Boolean functions
Li-Yang Tan
Columbia University

Abstract: We prove a [;\tilde{\Omega}(n^{1/5});] lower bound on the query complexity of any non-adaptive two-sided error algorithm for testing whether an unknown n-variable Boolean function is monotone versus constant-far from monotone. This gives an exponential improvement on the previous lower bound of [;\Omega(log n);] due to Fischer et al from 2002. Our approach extends to give a similar lower bound for monotonicity testing of Boolean-valued functions over certain hypergrid domains [;\{1,2,...,m\}^n;].

Joint work with Rocco Servedio.

Friday, April 18:
Routing in Directed Graphs with Symmetric Demands
Alina Ene
Princeton University

Abstract: In this talk, we consider some fundamental maximum throughput routing problems in directed graphs. In this setting, we are given a capacitated directed graph. We are also given source-destination pairs of nodes (s_1, t_1), (s_2, t_2), ..., (s_k, t_k). The goal is to select a largest subset of the pairs that are simultaneously routable subject to the capacities; a set of pairs is routable if there is a multicommodity flow for the pairs satisfying certain constraints that vary from problem to problem (e.g., integrality, unsplittability, edge or node capacities). Two well-studied optimization problems in this context are the Maximum Edge Disjoint Paths (MEDP) and the All-or-Nothing Flow (ANF) problem. In MEDP, a set of pairs is routable if the pairs can be connected using edge-disjoint paths. In ANF, a set of pairs is routable if there is a feasible multicommodity flow that fractionally routes one unit of flow from s_i to t_i for each routed pair (s_i, t_i).

MEDP and ANF are both NP-hard and their approximability has attracted substantial attention over the years. Over the last decade, several breakthrough results on both upper bounds and lower bounds have led to a much better understanding of these problems. At a high level, one can summarize this progress as follows. MEDP and ANF admit poly-logarithmic approximations in undirected graphs if one allows constant congestion, i.e., the routing violates the capacities by a constant factor. Moreover, these problems are hard to approximate within a poly-logarithmic factor in undirected graphs even if one allows constant congestion. In sharp contrast, both problems are hard to approximate to within a polynomial factor in directed graphs even if a constant congestion is allowed and the graph is acyclic.

In this talk, we focus on routing problems in directed graphs in the setting in which the demand pairs are symmetric: the input pairs are unordered and a pair s_i t_i is routed only if both the ordered pairs (s_i,t_i) and (t_i,s_i) are routed. Perhaps surprisingly, the symmetric setting can be much more tractable than the asymmetric setting. As we will see in this talk, when the demand pairs are symmetric, ANF admits a poly-logarithmic approximation with constant congestion. We will also touch upon some open questions related to MEDP in directed graphs with symmetric pairs.

This talk is based on joint work with Chandra Chekuri.

Friday, May 9:
Central limit theorem for Gaussian chaos, Polynomial decomposition and deterministic approximate counting for low-degree polynomial threshold functions
Anindya De
IAS

Abstract: The class of polynomial threshold functions (PTFs) is one of the most well-studied classes of Boolean functions in complexity theory and learning theory. The problem of deterministic approximate counting for polynomial threshold functions (over the Boolean hypercube) is an important problem in the area of unconditional derandomization and has been the subject of a lot of research in the past decade. Despite this, the problem of getting an algorithm with polynomial running time and sub-constant error has remained open. In this talk, we offer a positive resolution to this problem.

The novel ingredients in this work are:

A new central limit theorem for low-degree polynomials of Gaussians. Roughly speaking, it states that a necessary and sufficient condition for the distribution of a low-degree Gaussian polynomial to be close to a Gaussian with the same mean and variance is that its (appropriately defined) eigenvalues are small. The proof uses the so-called Stein's method from probability theory and Malliavin calculus.

A new decomposition procedure which can express a given low-degree polynomial in terms of a few polynomials all of which have "small" eigenvalues. A novel feature of this lemma is that the number of polynomials and the bound on the eigenvalue can be essentially chosen to be free parameters at the cost of slightly perturbing the polynomial. This lemma is likely to be of independent interest and useful in other applications.

Joint work with Rocco Servedio.

Contact xichen-at-cs-dot-columbia-dot-edu if you want to volunteer to give a talk (especially for students!). The talk can be about your or others' work. It can be anything from a polished presentation of a completed result, to an informal black-board presentation of an interesting topic where you are stuck on some open problem. It should be accessible to a general theory audience. I will be happy to help students choose papers to talk about.

There is a mailing list for the reading group. General information about the mailing list (including how to subscribe yourself to it) is available here. If you want to unsubscribe or change your options, send email to theoryread-request@lists.cs.columbia.edu with the word `help' in the subject or body (don't include the quotes), and you will get back a message with instructions.

Comments on this page are welcome; please send them to xichen-at-cs.columbia.edu

Back to Theory of Computation at Columbia main page