For Spring 2014, meetings will usually be held on Fridays, 12:00-13:30, in the open meeting area, CSB 477. Abstracts for the talks are given below the talk schedule.

Zhenming Liu

Princeton University

**Abstract:**
We demonstrate a simple, statistically secure ORAM with
computational overhead [;\tilde{O}(\log^2 n);]; previous ORAM
protocols achieve only computational security (under computational
assumptions) or require [;\tilde{\Omega}(\log^3 n);] overhead. An
additional benefit of our ORAM is its conceptual simplicity, which
makes it easy to implement in both software and (commercially
available) hardware.

Our construction is based on recent ORAM constructions due to Shi, Chan, Stefanov, and Li (Asiacrypt 2011) and Stefanov and Shi (ArXiv 2012), but with some crucial modifications to the algorithm that simplify the ORAM and enable our analysis. A central component of our analysis is a reduction to a "supermarket" problem; of independent interest (and of importance to our analysis), we provide an upper bound on the rate of "upset" customers in the "supermarket" problem.
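The "supermarket" process referred to here can be explored with a short simulation. The sketch below is a toy illustration only: the service model, the threshold for calling a customer "upset", and all parameter names are our own simplifications, not taken from the paper.

```python
import random

def supermarket_sim(n_customers, n_queues, d=2, threshold=3, seed=0):
    """Simulate a d-choice 'supermarket' process: each arriving
    customer inspects d queues chosen uniformly at random and joins
    the shortest of them. We count a customer as 'upset' if the queue
    it joins already holds at least `threshold` waiting customers.
    Returns the empirical fraction of upset customers."""
    rng = random.Random(seed)
    queues = [0] * n_queues
    upset = 0
    for _ in range(n_customers):
        # the customer's d uniformly random choices
        choices = [rng.randrange(n_queues) for _ in range(d)]
        best = min(choices, key=lambda i: queues[i])
        if queues[best] >= threshold:
            upset += 1
        queues[best] += 1
        # crude service model: one random queue attempts a departure
        j = rng.randrange(n_queues)
        if queues[j] > 0:
            queues[j] -= 1
    return upset / n_customers
```

Running `supermarket_sim(10000, 100)` gives a quick empirical handle on how the upset rate behaves as `d` or the threshold varies.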

Ilya Volkovich

Princeton University

**Abstract:**
We extend the line of research initiated by Fortnow and Klivans
\cite{FortnowKlivans09} that studies the relationship between
efficient learning algorithms and circuit lower bounds. In
\cite{FortnowKlivans09}, it was shown that if a Boolean circuit class
C has an efficient *deterministic* exact learning algorithm
(i.e., an algorithm that uses membership and equivalence queries), then
[;EXP^{NP} \not\subseteq P/poly[C];] -- the class of all languages that
can be computed by polynomial-size circuits from the class C.
Recently, in \cite{KKO13}, [;EXP^{NP};] was replaced by
[;DTIME(n^{\omega(1)});]. Yet for the models of *randomized* exact
learning or Valiant's PAC learning, the best result so far is a
lower bound against BPEXP (the exponential-time analogue of
BPP). In this paper, we derive stronger lower bounds, as well as
some other consequences, from *randomized* exact learning and
PAC learning algorithms, answering an open question posed in
\cite{FortnowKlivans09} and \cite{KKO13}. In particular, we show that:

We note that in both cases the learning algorithms need not be
*proper*. The extra bit of advice is needed to accommodate the
promise that the probabilities of acceptance and rejection are
bounded away from each other. The exact same problem arises when
trying to prove lower bounds for BPTIME or MA \cite{Barak02, FS04,
Santhanam09}. Removing this bit has been an open problem; we suggest
an approach to settle it in our case.

Clément Canonne

Columbia University

**Abstract:**
We analyze and study a hybrid model for testing and learning probability
distributions, in which the testing algorithm is provided with two different
types of oracles to the unknown distribution D over [n]. More precisely,
we define both the Dual and Cumulative Dual access models, in which the
algorithm can both sample from D and, respectively, for any i in [n],
query the probability weight D(i) or the cumulative probability
D(1) + ... + D(i).
These two models, by generalizing the previously studied sampling and query oracles, allow us to bypass the strong lower bounds established for a number of problems in those settings, while capturing several of their interesting properties -- and providing new insight into their limitations. Finally, we show that while the testing algorithms can in most cases be strictly more efficient, some tasks remain hard even with this additional power.
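The two access models can be made concrete with a small sketch. This is our own illustration of the interface (class and method names are ours): a Dual oracle supports sampling and point-mass queries, and a Cumulative Dual oracle additionally supports CDF queries.

```python
import bisect
import random

class DualOracle:
    """Dual access to a distribution D over [n] (1-indexed):
    samp() draws a sample from D, and eval(i) returns D(i).
    A hypothetical sketch of the access model, not code from the paper."""
    def __init__(self, probs, seed=0):
        assert abs(sum(probs) - 1.0) < 1e-9
        self.probs = list(probs)
        self.cdf = []            # running prefix sums D(1)+...+D(i)
        acc = 0.0
        for p in probs:
            acc += p
            self.cdf.append(acc)
        self.rng = random.Random(seed)

    def samp(self):
        # inverse-transform sampling via the precomputed CDF
        u = self.rng.random()
        return bisect.bisect_left(self.cdf, u) + 1

    def eval(self, i):
        return self.probs[i - 1]

class CumulativeDualOracle(DualOracle):
    """Cumulative Dual access: ceval(i) returns D(1) + ... + D(i)."""
    def ceval(self, i):
        return self.cdf[i - 1]
```

A testing algorithm in these models would be given such an object and charged one query per `samp`, `eval`, or `ceval` call.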

Joint work with Ronitt Rubinfeld (MIT).

Tengyu Ma

Princeton University

**Abstract:**
I will mainly talk about our recent paper on deep learning: "Provable
Bounds for Learning Some Deep Representations." Deep learning, a modern
version of artificial neural networks, has been a very powerful tool in
recent years for many machine learning tasks such as image
recognition, speech recognition, and natural language processing.
However, it remains theoretically mysterious why these modern
heuristics and tricks, together with millions of data points, result
in big improvements on so many important problems. Motivated by these
mysteries, this paper proposes an efficient algorithm that provably
learns a class of deep neural networks under a generative model.

In more detail, the network we study has a constant number of layers; each pair of consecutive layers is connected via a random degree-d bipartite graph with random weights in [-1,1]. The observed data at the bottom is generated as follows: first a sparse hidden vector at the top layer is chosen; then the top layer activates some of the second-layer nodes by multiplying by the connection matrix, applying a point-wise threshold function, and adding some small noise. The second layer then activates the third using the same rules with the new connection matrix, and so on. Our algorithm uses layer-wise learning, and for each layer we exploit the old Hebbian rule, "things that fire together wire together": if two nodes in the same layer are correlated, then they are likely to share a neighbor in the layer above them. From this common-neighbor information, we apply a graph recovery procedure to obtain the support of the network, from which we can recover the hidden variables and the weights. The running time is polynomial, and the sample complexity is quadratic or cubic in the number of nodes, depending on the details of the model.
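The Hebbian "fire together, wire together" step can be illustrated on a one-layer toy model. The sketch below is ours, not the paper's: we replace the threshold gates with a simple noiseless OR-network, activate hidden units independently, and flag visible pairs whose co-activation frequency is large as likely siblings (sharing a parent above).

```python
import itertools
import random

def toy_layer_samples(n_hidden, n_visible, edges, n_samples, p=0.1, seed=0):
    """Generate samples from a toy one-layer OR-network (a
    simplification of the paper's threshold gates): each hidden unit
    fires i.i.d. with probability p, and a visible unit fires if any
    of its hidden neighbors fired. `edges` is a list of
    (hidden, visible) pairs."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n_samples):
        h = [rng.random() < p for _ in range(n_hidden)]
        v = tuple(any(h[a] for (a, b) in edges if b == j)
                  for j in range(n_visible))
        samples.append(v)
    return samples

def correlated_pairs(samples, threshold):
    """Return pairs of visible nodes whose empirical co-activation
    frequency exceeds `threshold` -- the 'fire together, wire
    together' signal that two nodes likely share a parent in the
    layer above."""
    n = len(samples[0])
    m = len(samples)
    pairs = []
    for i, j in itertools.combinations(range(n), 2):
        co = sum(1 for s in samples if s[i] and s[j]) / m
        if co > threshold:
            pairs.append((i, j))
    return pairs
```

In the full algorithm these sibling pairs feed a graph recovery procedure that reconstructs the bipartite support; here they only show why correlation reveals common neighbors.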

The talk will be self-contained. This is joint work with Sanjeev Arora, Aditya Bhaskara, and Rong Ge.

Ilias Diakonikolas

University of Edinburgh

**Abstract:**
We study the question of closeness testing for two discrete
distributions. More precisely, given samples from two distributions p
and q over an n-element set, we wish to distinguish between the case
p = q and the case that p is at least epsilon-far from q, in either L1
or L2 distance. Batu et al. (FOCS 2000/JACM 2013) gave the first
sub-linear time algorithms for these problems, albeit with suboptimal
dependence on n and epsilon.

In this talk, I will present simple (and new) testers for both the L1 and L2 settings, with sample complexity that is information-theoretically optimal, up to constant factors, both in the dependence on n and in the dependence on epsilon.
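Collision-based statistics are the standard tool for this kind of closeness testing. The following sketch shows one such statistic (under Poissonized sampling it is an unbiased estimator of m² · ||p − q||₂²); it illustrates the flavor of L2 testers in general and is not the exact tester from the talk.

```python
import random

def closeness_statistic(x_counts, y_counts):
    """Collision-based statistic for L2 closeness testing: with
    sample counts X_i, Y_i drawn from p and q (m samples each,
    Poissonized), E[sum_i (X_i - Y_i)^2 - X_i - Y_i] equals
    m^2 * ||p - q||_2^2.  Large values indicate p != q."""
    return sum((x - y) ** 2 - x - y for x, y in zip(x_counts, y_counts))

def sample_counts(probs, m, rng):
    """Draw m samples from the distribution `probs` and return the
    per-element histogram of counts."""
    counts = [0] * len(probs)
    for _ in range(m):
        u = rng.random()
        acc = 0.0
        for i, p in enumerate(probs):
            acc += p
            if u < acc:
                counts[i] += 1
                break
    return counts
```

A tester thresholds this statistic: near zero when p = q, and around m²·epsilon² (in L2) when the distributions are far.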

The talk will be based on joint work with S. Chan (MSR), G. Valiant (Stanford) and P. Valiant (Brown).

Andrew Goldberg

Microsoft Research Silicon Valley

**Abstract:**
Given a weighted graph, a distance oracle takes as an input a pair of
vertices and returns the distance between them. The labeling approach
to distance oracle design is to precompute a label for every vertex so
that the distances can be computed from the corresponding labels,
without looking at the graph. The hub labeling (HL) algorithm received
a lot of attention recently in both theoretical and experimental
contexts. In particular, Cohen et al. (2003) developed an O(log n)
approximation algorithm for finding the smallest labeling. Abraham et
al. (2012) introduced a special kind of labeling -- hierarchical
labeling (HHL) -- and showed that it is determined by an ordering of
vertices. They give a practical algorithm to compute small HHLs on
some classes of graphs, including road networks. Akiba et al. (2013)
give a faster algorithm to compute the HHL from an order and apply it
to social networks.
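The query side of hub labeling is simple enough to sketch directly. This is our own minimal illustration (the dictionary representation is ours): each vertex stores a label mapping hubs to distances, and a distance query scans the common hubs of the two labels.

```python
def hub_query(label_u, label_v):
    """Distance query from hub labels. Each label maps hub -> distance
    from that vertex to the hub. For any labeling this returns an
    upper bound on dist(u, v); for a valid hub labeling (every
    shortest path u-v is 'covered' by a common hub on it), it returns
    the exact distance: min over common hubs h of d(u,h) + d(h,v)."""
    best = float("inf")
    for h, du in label_u.items():
        dv = label_v.get(h)
        if dv is not None:
            best = min(best, du + dv)
    return best
```

For example, on the path a - b - c, the labels {a: 0, b: 1} for a and {c: 0, b: 1} for c cover every shortest path, and the query meets at the common hub b. The algorithmic work discussed in the talk is about computing such labels so that they are small.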

We introduce a variant of the Cohen et al. algorithm that is faster both in theory and in practice. We also introduce on-line tie-breaking, which reduces the label size on networks with multiple shortest paths, such as social networks. We introduce an algorithm to efficiently compute good orders; this algorithm scales to large networks and is more robust than previous ones. Finally, we introduce a label compression technique that significantly reduces the size of the labels.

Our experimental results show that the new HHL algorithm is practical on a much broader class of graphs. We also show that it usually produces labels that are not much bigger than those for the HL algorithm with the guaranteed approximation ratio.

Joint work with Daniel Delling, Thomas Pajor, Ruslan Savchenko and Renato Werneck.

Maria Florina Balcan

Georgia Institute of Technology

**Abstract:**
With the variety of applications of machine learning across science,
engineering, and computing in the age of Big Data, re-examining the
underlying foundations of the field has become imperative. In this
talk, I will describe new models and algorithms for important emerging
paradigms, specifically, interactive learning and distributed
learning.

For active learning, where the algorithm itself can ask for labels of carefully chosen examples from a large pool of unannotated data with the goal of minimizing human labeling effort, I will present results giving computationally efficient, optimal label complexity algorithms. I will also discuss learning with more general forms of interaction, as well as unexpected implications of these results for classic supervised learning paradigms.

For distributed learning, I will discuss a model that for the first time addresses the core question of the fundamental communication requirements for achieving accurate learning. Broadly, we consider a framework where massive amounts of data are distributed among several locations, and our goal is to learn a low-error predictor with respect to the overall distribution of the data using as little communication, and as few rounds of interaction, as possible. We provide general upper and lower bounds on the amount of communication needed to learn a given class, as well as broadly-applicable techniques for achieving communication-efficient learning.

Li-Yang Tan

Columbia University

**Abstract:**
We prove a [;\tilde{\Omega}(n^{1/5});] lower bound on the query complexity
of any non-adaptive two-sided error algorithm for testing whether an
unknown n-variable Boolean function is monotone versus constant-far
from monotone. This gives an exponential improvement on the previous
lower bound of [;\Omega(\log n);] due to Fischer et al. (2002). Our
approach extends to give a similar lower bound for monotonicity
testing of Boolean-valued functions over certain hypergrid domains
[;\{1,2,...,m\}^n;].
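For context, the classic non-adaptive tester that such lower bounds are measured against is the edge tester: sample random edges of the hypercube and reject on any violation. The sketch below is that baseline (our own illustration, not a construction from the talk), with Boolean inputs as tuples in {0,1}^n.

```python
import random

def edge_tester(f, n, queries, seed=0):
    """Non-adaptive edge tester for monotonicity of f: {0,1}^n -> {0,1}.
    Sample random edges (x with bit i set to 0, x with bit i set to 1)
    of the hypercube and reject iff a violated edge is found, i.e.
    f(lower endpoint) > f(upper endpoint).  One-sided error: monotone
    functions are always accepted."""
    rng = random.Random(seed)
    for _ in range(queries):
        x = [rng.randrange(2) for _ in range(n)]
        i = rng.randrange(n)
        lo, hi = list(x), list(x)
        lo[i], hi[i] = 0, 1
        if f(tuple(lo)) > f(tuple(hi)):
            return False  # witnessed a monotonicity violation
    return True  # no violation found; accept
```

The lower bound in the talk says that no non-adaptive two-sided tester, however clever its query distribution, can do with [;\tilde{O}(n^{1/5});]-type query counts what this style of tester attempts.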

Joint work with Rocco Servedio.

Alina Ene

Princeton University

**Abstract:**
In this talk, we consider some fundamental maximum throughput routing
problems in directed graphs. In this setting, we are given a
capacitated directed graph. We are also given source-destination pairs
of nodes (s_1, t_1), (s_2, t_2), ..., (s_k, t_k). The goal is to select
a largest subset of the pairs that are simultaneously routable subject
to the capacities; a set of pairs is routable if there is a
multicommodity flow for the pairs satisfying certain constraints that
vary from problem to problem (e.g., integrality, unsplittability, edge
or node capacities). Two well-studied optimization problems in this
context are the Maximum Edge Disjoint Paths (MEDP) and the
All-or-Nothing Flow (ANF) problem. In MEDP, a set of pairs is routable
if the pairs can be connected using edge-disjoint paths. In ANF, a set
of pairs is routable if there is a feasible multicommodity flow that
fractionally routes one unit of flow from s_i to t_i for each routed
pair (s_i, t_i).
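A classic baseline for MEDP, useful for fixing intuition about the problem definitions above, is the greedy algorithm: route each pair along a shortest path in the remaining graph, then delete the used edges. This sketch is our own illustration of that well-known heuristic, not an algorithm from the talk.

```python
from collections import deque

def greedy_medp(edges, pairs):
    """Greedy heuristic for Maximum Edge-Disjoint Paths in a directed
    graph with unit edge capacities: for each pair (s, t) in order,
    find a shortest path over the still-unused edges via BFS; if one
    exists, route the pair and delete the path's edges. Returns the
    list of routed pairs."""
    adj = {}
    live = set()                      # edges not yet used by a path
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        live.add((u, v))
    routed = []
    for s, t in pairs:
        # BFS from s over currently live directed edges
        parent = {s: None}
        q = deque([s])
        while q and t not in parent:
            u = q.popleft()
            for v in adj.get(u, []):
                if (u, v) in live and v not in parent:
                    parent[v] = u
                    q.append(v)
        if t not in parent:
            continue                  # pair cannot be routed greedily
        # walk back from t, consuming the path's edges
        v = t
        while parent[v] is not None:
            live.discard((parent[v], v))
            v = parent[v]
        routed.append((s, t))
    return routed
```

Greedy routing of this kind only achieves polynomial approximation factors in general, which is consistent with the hardness picture for directed graphs sketched above.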

MEDP and ANF are both NP-hard and their approximability has attracted substantial attention over the years. Over the last decade, several breakthrough results on both upper bounds and lower bounds have led to a much better understanding of these problems. At a high level, one can summarize this progress as follows. MEDP and ANF admit poly-logarithmic approximations in undirected graphs if one allows constant congestion, i.e., the routing violates the capacities by a constant factor. Moreover, these problems are hard to approximate within a poly-logarithmic factor in undirected graphs even if one allows constant congestion. In sharp contrast, both problems are hard to approximate to within a polynomial factor in directed graphs even if a constant congestion is allowed and the graph is acyclic.

In this talk, we focus on routing problems in directed graphs in the setting in which the demand pairs are symmetric: the input pairs are unordered and a pair s_i t_i is routed only if both the ordered pairs (s_i,t_i) and (t_i,s_i) are routed. Perhaps surprisingly, the symmetric setting can be much more tractable than the asymmetric setting. As we will see in this talk, when the demand pairs are symmetric, ANF admits a poly-logarithmic approximation with constant congestion. We will also touch upon some open questions related to MEDP in directed graphs with symmetric pairs.

This talk is based on joint work with Chandra Chekuri.

Anindya De

IAS

**Abstract:**
The class of polynomial threshold functions (PTFs) is one of the most well-studied classes of Boolean functions in complexity theory and learning theory. The problem of deterministic approximate counting for polynomial threshold functions (over the Boolean hypercube) is an important problem in the area of unconditional derandomization and has been the subject of a lot of research in the past decade. Despite this, the problem of getting an algorithm with polynomial running time and sub-constant error has remained open. In this talk, we offer a positive resolution to this problem.
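To make the counting problem concrete: a PTF is sign(p(x)) for a multilinear polynomial p, and the task is to estimate the fraction of points of the hypercube where the sign is positive. The naive deterministic algorithm, shown below as our own illustration (representing p as a dict from monomials to coefficients, over inputs in {-1,1}^n), enumerates all 2^n points; the work discussed in the talk is about achieving sub-constant error in polynomial time instead.

```python
from itertools import product

def monomial_value(x, mono):
    """Evaluate a multilinear monomial (tuple of variable indices) at x."""
    r = 1
    for i in mono:
        r *= x[i]
    return r

def ptf_fraction(coeffs, n):
    """Exact brute-force count of the fraction of x in {-1,1}^n with
    p(x) > 0, where p is given as {monomial: coefficient}.
    Runs in time 2^n -- the exponential baseline that deterministic
    approximate counting algorithms improve on."""
    count = 0
    for x in product((-1, 1), repeat=n):
        val = sum(c * monomial_value(x, mono) for mono, c in coeffs.items())
        count += val > 0
    return count / 2 ** n
```

For instance, the majority PTF x0 + x1 + x2 and the degree-2 PTF x0·x1 are each positive on exactly half the cube.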

The novel ingredients in this work are:

Joint work with Rocco Servedio.

Contact

There is a mailing list for the reading group. General information about the mailing list (including how to subscribe yourself to it) is available here. If you want to unsubscribe or change your options, send email to theoryread-request@lists.cs.columbia.edu with the word `help' in the subject or body (don't include the quotes), and you will get back a message with instructions.

Comments on this page are welcome; please send them to
`xichen-at-cs.columbia.edu`.