For Spring 2013, the usual time for the meetings will be Friday at 11:30 - 13:00 in the CS conference room, CSB 453. Abstracts for talks are given below the talk schedule.

Delaram Kahrobaei

CUNY Graduate Center

**Abstract:**
Most common public key cryptosystems and public key exchange
protocols presently in use, such as the RSA algorithm, Diffie-Hellman,
and elliptic curve methods are based on number theory and depend, in
particular, on the structure of abelian groups. The strength of computing
machinery has made these techniques potentially susceptible to attack.
As a result, an active line of research has arisen to develop cryptosystems
and key exchange protocols using noncommutative cryptographic
platforms. This line of investigation has been given the broad title of
noncommutative algebraic cryptography. Research in this area was
initiated by two public key protocols that used the braid groups, one by
Ko, Lee et. al. and one by Anshel, Anshel and Goldfeld. The study of
these protocols and the group theory surrounding them has had a large
effect on research in infinite group theory. In this talk I survey a couple
of these noncommutative group based methods and discuss several
ideas in abstract group theory that have arisen from them. I will also
allude to my recent work in this area.

Jens Vygen

University of Bonn

**Abstract:**
We prove new results and obtain polynomial-time algorithms
with improved approximation guarantees for the graphic
traveling salesman problem and some related problems.

For the graphic TSP itself, we improve the approximation ratio to 7/5. For a generalization, the connected-T-join problem, we obtain the first nontrivial approximation algorithm, with ratio 3/2. This contains the graphic s-t-path-TSP as a special case. Our improved approximation guarantee for finding a smallest 2-edge-connected spanning subgraph is 4/3.

The key new ingredient of all our algorithms is a special kind of ear-decomposition optimized using forest representations of hypergraphs. The same methods also provide the lower bounds (arising from LP relaxations) that we use to deduce the approximation ratios.

This is joint work with Andras Sebo.

Igor Carboni Oliveira

Columbia University

**Abstract:**
Fortnow and Klivans proved the following relationship between
efficient learning algorithms and circuit lower bounds: if a class
of boolean circuits C contained in P/poly is exactly learnable with
membership and equivalence queries in polynomial-time, then
[;EXP^{NP};] is not contained in C (the class [;EXP^{NP};] was subsequently
improved to EXP by Hitchcock and Harkins). In this paper, we
improve on these results and show

- If C is exactly learnable with membership and equivalence queries in polynomial-time, then DTIME([;n^{\omega(1)};]) is not contained in C. We obtain even stronger consequences if C is learnable in the mistake-bounded model, in which case we prove an average-case hardness result against C.

- If C is learnable in polynomial time in the PAC model then PSPACE is not contained in C, unless PSPACE is contained in BPP. Removing this extra assumption from the statement of the theorem would provide an unconditional separation of PSPACE and BPP.

- If C is efficiently learnable in the Correlational Statistical Query (CSQ) model, we show that there exists an explicit function f that is average-case hard for circuits in C. This result provides stronger average-case hardness guarantees than those obtained by SQ-dimension arguments (Blum et al. 1993). We also obtain a non-constructive extension of this result to the stronger Statistical Query (SQ) model.

Similar results hold in the case where the learning algorithm runs in subexponential time.

Our proofs regarding exact and mistake-bounded learning are simple and self-contained, yield explicit hard functions, and show how to use mistake-bounded learners to "diagonalize"' over families of polynomial-size circuits. Our consequences for PAC learning lead to new proofs of Karp-Lipton-style collapse results, and the lower bounds from SQ learning make use of recent work relating combinatorial discrepancy to the existence of hard-on-average functions.

Joint work with Adam Klivans (UT Austin) and Pravesh Kothari (UT Austin).

Clément Canonne

Columbia University

**Abstract:**
One of the most fundamental problem paradigms in statistics is that
of inferring some information about an unknown probability distribution
D, given access to independent samples drawn from it. More than a
decade ago, Batu et al. initiated the study of problems of this type
from within the framework of /property testing/ - a setting where,
given an unknown "massive object" an algorithm can access
only by making a small (sublinear) number of "local inspections"
(queries) of the object, the goal is to determine whether the object has
a particular property. The algorithm must accept if the object has the
desired property, and reject if the object is far from every object with
the property. In distribution property testing, the "massive object" is
an unknown probability distribution D over an N-element set, and the
tester accesses the distribution by drawing independent samples from it.

We study a new framework for such a task, by considering distribution testing algorithms that have access to a /conditional sampling oracle/. This is an oracle that takes as input a subset S of the domain [N]={1,...,N} of the unknown probability distribution D and returns a draw from the conditional probability distribution D restricted to S. This new model allows considerable flexibility in the design of distribution testing algorithms; in particular, testing algorithms in this model can be adaptive.

We study a wide range of natural distribution testing problems in this new framework and some of its variants, giving both upper and lower bounds on query complexity. These problems include testing whether D is the uniform distribution U; testing whether D = D* for an explicitly provided D*; testing whether two unknown distributions D1 and D2 are equivalent; and estimating the variation distance between D and the uniform distribution.

At a high level our main finding is that the new "conditional sampling" framework we consider is a powerful one: while all the problems mentioned above have Omega(N^(1/2)) sample complexity in the standard model (and in some cases the complexity must be almost linear in N), we give poly(log N, 1/eps)-query algorithms (and in some cases poly(1/eps)-query algorithms independent of N) for all these problems in our conditional sampling setting.

Joint work with Dana Ron (Tel-Aviv University) and Rocco Servedio (Columbia University).

Xiaorui Sun

Columbia University

**Abstract:**
A standard process for testing graph isomorphism is to first assign
distinct labels to a small subset of vertices of an input graph, and
then propagate and create new labels across the graph, aiming to
assign distinct and isomorphism-invariant labels to all vertices in
the graph. This process is usually referred to as the individualization
based refinement for canonical labeling of graphs.

In this talk, we consider a framework for analyzing an extended version of this individualization-based refinement process over Steiner 2-systems. A Steiner 2-system consists of a set of points and a set of lines such that each line passes the same number of points and each pair of points uniquely determines a line. Applying this framework, we show that isomorphism of Steiner 2-systems with n lines can be tested in quasipolynomial time exp(polylog(n)).

An essentially identical result from a rather different perspective was obtained simultaneously by Laszlo Babai and John Wilmes. These results improved the previous best exp(O(n^{1/4}))-time algorithm of Spielman, and the quasipolynomial-time algorithm of Babai and Luks for the special case when the line size is polylogarithmic.

Join work with Xi Chen and Shang-Hua Teng.

Krzysztof Onak

IBM Research

**Abstract:**
I will present [;n^{1+\Omega(1/p)};] lower bounds for the space complexity of
p-pass streaming algorithms for the following problems on n-vertex graphs:

- computing maximum matching size,
- testing if two specific vertices are at distance at most 2(p+1),
- testing if there is a directed path between two specific vertices.

Our result follows from a communication complexity lower bound for a communication game in which the players hold two graphs on the same set of vertices. The task of the players is to find out whether the sets of vertices reachable from a specific vertex in exactly p+1 steps intersect. The game requires a significant amount of communication only if the players are forced to speak in a specific difficult order.

Howard Karloff

AT&T Labs--Research

**Abstract:**
Given a very large, node-weighted, rooted tree on, say, n nodes,
if one has only enough space to display a k-node summary of the tree,
what is the most informative way to draw the tree? We define a type of
weighted tree that we call a "summary tree" of the original tree, that
results from aggregating nodes of the original tree subject to certain
constraints. We suggest that the best choice of which summary tree to
use (among those with a fixed number of nodes) is the one that maximizes
the information-theoretic entropy of a natural probability distribution
associated with the summary tree, and we provide a (pseudopolynomial-time)
dynamic-programming algorithm to compute this maximum entropy summary
tree, when the weights are integral. The result is an automated way
to summarize large trees and retain as much information about them as
possible, while using (and displaying) only a fraction of the original
node set. We also provide an additive approximation algorithm and
a greedy heuristic that are faster than the optimal algorithm, and
generalize to trees with real-valued weights.

This is joint work with Ken Shirley.

Augustin Chaintreau

Columbia University

**Abstract:**
Algorithmic graph theory has thoroughly analyzed how, given a network
describing constraints between various nodes, groups can be formed
among these so that the resulting configuration optimizes a global
metric. In contrast, for various social and economic networks, groups
are formed de facto by the choices of selfish players. A fundamental
problem in this setting is the existence and convergence to a
self-enforcing configuration: assignment of players into groups such
that no player has an incentive to move into another group than hers.
Motivated by information sharing on social networks – and the
difficult tradeoff between its benefits and the associated privacy
risk – we study the possible emergence of such stable configurations
in a general selfish group formation game.

Our paper considers this general game for the first time, and it completes its analysis. We show that convergence critically depends on the level of collusions among the players – which allow multiple players to move simultaneously as long as all of them benefit. Solving a previously open problem we exactly show when, depending on collusions, convergence occurs within polynomial time, non-polynomial time, and when it never occurs. We also prove that previously known bounds on convergence time are all loose: by a novel combinatorial analysis of the evolution of this game we are able to provide the first asymptotically exact formula on its convergence. Moreover, we extend these results by providing a complete analysis when groups may overlap, and for general utility functions representing multi-modal interactions. Finally, we prove that collusions have a significant and positive effect on the efficiency of the equilibrium that is attained.

Joint work with Guillaume Ducoffe and Dorian Mazauric.

Daniel Kane

Stanford University

**Abstract:**
We study some problems relating to polynomials evaluated either at
random Gaussian or random Bernoulli inputs. We present some new work
on a structure theorem for degree-d polynomials with Gaussian inputs.
In particular, if p is a given degree-d polynomial, then p can be
written in terms of some bounded number of other polynomials
q_1,...,q_m so that the joint probability density function of
q_1(G),...,q_m(G) is close to being bounded. This says essentially
that any abnormalities in the distribution of p(G) can be explained by
the way in which p decomposes into the q_i. We then present some
applications of this result.

Esther Ezra

New York University

**Abstract:**
Motivated by **range counting** problems, **relative
approximations** have become a central tool in Computational Geometry.
In fact, they were introduced by Har-Peled and Sharir, who derived
them from the seminal work of Li, Long, and Srinivasan in the theory
of Machine Learning. In this talk I will discuss properties of
relative approximations, and present improved upper bounds for their
size under certain favorable assumptions. Our approach is
probabilistic, where we apply the constructive version of the
Asymmetric Lovasz Local Lemma.

Yoram Moses

Technion

**Abstract:**
Communicated messages play a central role in coordinating events in
distributed computer systems. In purely asynchronous systems,
Lamport's "happened-before" relation, based on message chains, has
been shown to govern causality and coordination. This talk will
consider the role that guarantees regarding message transmission times
play. The existence of clocks and such timing information
significantly affect the ways in which sites can coordinate their
actions. This talk will characterize communication structures that are
necessary and sufficient for coordinating basic patterns of actions in
the presence of clocks. In addition, it will draw parallels with
time/space diagrams and discuss causality. This talk is based on joint
work with Ido Ben Zvi.

Brendan Juba

Harvard University

**Abstract:**
Machine learning is often employed as one step in a larger
application, serving to perform information extraction or data mining
for example. The rules obtained by such learning are then used as
inputs to a further analysis. As a consequence of the inputs to this
analysis having been learned, however, its conclusions can only be
theoretically guaranteed to satisfy the same standard as the
inputs---that of "PAC Semantics" (Valiant, 2000). In this talk, we
explore the benefits of taking the problems of learning and logical
reasoning together, capturing a relatively general family of such
applications. Briefly, the benefitis that we can simultaneously (1)
handle incomplete information, (2) utilize rules that are a reasonable
but imperfect fit for the data, and (3) reach conclusions more
efficiently than is achieved by known algorithms for reasoning from
rules alone. Precisely, we consider a problem of testing the validity
of a "query" formula (hypothesis) against incomplete data. We present
algorithms for soundly verifying such queries that succeed whenever
there exist learnable rules that suffice to complete a simple proof of
the query, matching what can be achieved by such a two-stage learning
and reasoning process.

Angelina Vidali

Duke University

**Abstract:**
We study the problem where a task must be executed, there are multiple
machines/agents that can potentially perform the task, and our
objective is to minimize the expected sum of the agents' processing
times. Each agent does not know exactly how long it will take him to
finish the task; he only knows the distribution from which this time
is drawn. These times are independent across agents and the
distributions fulfill the monotone hazard rate condition. Agents are
selfish and will lie about their distributions if this increases their
expected utility.

We construct a mechanism that takes as input the agents' reported distributions and that outputs a schedule that minimizes the expected sum of processing times, as well as payments that make it an ex-post equilibrium for the agents to both truthfully report their distributions and exert full effort to complete the task.

Joint work with: Vincent Conitzer

Contact

There is a mailing list for the reading group. General information about the mailing list (including how to subscribe yourself to it) is available here. If you want to unsubscribe or change your options, send email to theoryread-request@lists.cs.columbia.edu with the word `help' in the subject or body (don't include the quotes), and you will get back a message with instructions.

Comments on this page are welcome; please send them to
` xichen-at-cs.columbia.edu
`