For Fall 2009, the usual time for the meetings will be Fridays at 12:00-1:00pm in the CS conference room, CSB 453. Abstracts for talks are given below the talk schedule.

Mariana Raykova

Columbia University

**Abstract:**
Computing Set Intersection privately and efficiently between two
mutually mistrusting parties is an important basic procedure in the area
of private data mining. Assuring robustness, namely, coping with
potentially arbitrarily misbehaving (i.e., malicious) parties, while
retaining protocol efficiency (rather than employing costly generic
techniques) is an open problem. In this work the first solution to this
problem is presented.

Joint work with Dana Dachman-Soled, Tal Malkin, and Moti Yung

Jon Feldman

Google NY

**Abstract:**
We study the online stochastic bipartite matching problem, in a form
motivated by display ad allocation on the Internet. In the online, but
adversarial case, the celebrated result of Karp, Vazirani and Vazirani
gives an approximation ratio of 1 - 1/e = 0.632, a very familiar bound
that holds for many online problems; further, the bound is tight in
this case. In the online, stochastic case when nodes are drawn
repeatedly from a known distribution, the greedy algorithm matches
this approximation ratio, but still, no algorithm is known that beats
the 1 - 1/e bound.

Our main result is a 0.67-approximation online algorithm for stochastic bipartite matching, breaking this 1 - 1/e barrier. Furthermore, we show that no online algorithm can produce a 1-epsilon approximation for an arbitrarily small epsilon for this problem.

Our algorithms are based on computing an optimal offline solution to the expected instance, and using this solution as a guideline in the process of online allocation. We employ a novel application of the idea of the power of two choices from load balancing: we compute two disjoint solutions to the expected instance, and use both of them in the online algorithm in a prescribed preference order. To identify these two disjoint solutions, we solve a max flow problem in a boosted flow graph, and then carefully decompose this maximum flow to two edge-disjoint (near-)matchings. In addition to guiding the online decision making, these two offline solutions are used to characterize an upper bound for the optimum in any scenario. This is done by identifying a cut whose value we can bound under the arrival distribution.

Joint work with Aranyak Mehta, Vahab Mirrokni and S. Muthukrishnan.

Ragesh Jaiswal

Columbia University

**Abstract:**
We extend recent results of Arthur and Vassilvitskii to provide a
pseudo-approximation algorithm
for the k-means clustering problem in the batch (non-streaming) setting.
We then use this algorithm in
an extended version of the divide-and-conquer strategy by Guha et al. to
obtain a streaming algorithm
for the k-means problem with good approximation/memory tradeoffs. More
specifically, we first extend
the k-means++ algorithm, due to Arthur and Vassilvitskii, which is a
randomized algorithm that gives
an O(log k) approximation to the k-means objective in the batch setting.
Our new algorithm, is shown
to give a constant approximation factor to the k-means objective, and
outputs O(k log k) centers. We
then provide a variant of Guha et al.'s algorithm for k-means in the
one-pass (streaming) setting, resulting
in an efficient algorithm which outputs exactly k centers and gives a
tradeoff of O(n^a) memory for an
O(c^(1/a) log(k)) approximation for any constant a<1 and some fixed
constant c>1. Empirical evaluations
on real and simulated data reveal the practical utility of our method.
I will also talk about a quantitatively improved version of our first
result obtained independently by
Aggarwal et al. and some interesting open questions.

This is a joint work with Claire Monteleoni (CCLS, Columbia) and Nir Ailon (Google).

Ariel Gabizon

Columbia University

**Abstract:**
Let F be the field of q elements. An (n,k)-affine extractor is a mapping D:Fn -> {0,1} such that for any k-dimensional affine subspace X ⊆ Fn, D(x) is an almost unbiased bit when x is chosen uniformly from X.

Loosely speaking, the problem of explicitly constructing affine extractors gets harder as q gets smaller and easier as k gets larger. This is reflected in previous results: When q is `large enough', specifically q=(n2), Gabizon and Raz construct affine extractors for any k ≥ 1. In the `hardest case', i.e. when q=2, Bourgain constructs affine extractors for k ≥ δ n for any constant (and even slightly sub-constant) δ >0. Our main result is the following: Fix any k ≥ 2 and let d=5n/k. Then whenever q> 2d^2 and p=char(F)>d, we give an explicit (n,k)-affine extractor. For example, when k= δn for constant δ>0, we get an extractor for a field of constant size Ω( (1/δ)^2).

Thus our result may be viewed as a `field-size/dimension' tradeoff for affine extractors. Although for large k we are not able to improve (or even match) the previous result of Bourgain, our construction and proof have the advantage of being very simple: Assume n is prime and d is odd, and fix any non-trivial linear map T:Fn |-> F. Define QR:F |-> {0,1} by QR(x)=1 if and only if x is a quadratic residue. Then, the function D:Fn |->{0,1} defined by D(x)=QR(T(xd)) is an (n,k)-affine extractor.

Our proof uses a result of Heur, Leung and Xiang giving a lower bound on the dimension of products of subspaces.

Joint Work with Matt DeVos.

Li-Yang Tan

Columbia University

**Abstract:**
We give the first non-trivial upper bounds on the average sensitivity
and noise sensitivity of degree-d polynomial threshold functions
(PTFs). These bounds hold both for PTFs over the Boolean hypercube
and for PTFs over R^n under the standard n-dimensional Gaussian
distribution. Our bound on the Boolean average sensitivity of PTFs
represents progress towards the resolution of a conjecture of Gotsman
and Linial, which states that the symmetric function slicing the
middle d layers of the Boolean hypercube has the highest average
sensitivity of all degree-d PTFs. Via the L_1 polynomial regression
algorithm of Kalai et al., our bounds on Gaussian and Boolean noise
sensitivity yield polynomial-time agnostic learning algorithms for
the broad class of constant-degree PTFs under these input distributions.

We also give a regularity lemma for degree-d PTFs over the Boolean cube {-1,1}^n. This result shows that every degree-d PTF can be decomposed into a constant number of subfunctions such that almost all of the subfunctions are close to being regular PTFs.

Joint work with Ilias Diakonikolas, Prasad Raghavendra, Rocco Servedio and Andrew Wan.

Seung Geol Choi

Columbia University

**Abstract:**
Multi-party secure computations are general important procedures to compute any
function while keeping the security of private inputs. In this work we ask
whether preprocessing can allow low latency (that is, small round) secure
multi-party protocols that are universally-composable (UC). In particular, we
allow any polynomial time preprocessing as long as it is independent of the
exact circuit and actual inputs of the specific instance problem to solve, with
only a bound k on the number of gates in the circuits known.

To address the question, we first define the model of "Multi-Party Computation on Encrypted Data"(MP-CED), implicitly described in [FH96,JJ00,CDN01,DN03]. In this model, computing parties establish a threshold public key in a preprocessing stage, and only then private data, encrypted under the shared public key, is revealed. The computing parties then get the computational circuit they agree upon and evaluate the circuit on the encrypted data. The MP-CED model is interesting since it is well suited for modern computing environments, where many repeated computations on overlapping data are performed.

We present two different round-efficient protocols in this model:

- The first protocol generates k garbled gates in the preprocessing stage and
requires only two (online) rounds.

- The second protocol generates a garbled universal circuit of size O(k log k)
in the preprocessing stage, and requires only one (online) round (i.e., an
obvious lower bound), and therefore it can run asynchronously.

Both protocols are secure against an active, static adversary controlling any number of parties. When the fraction of parties the adversary can corrupt is less than half, the adversary cannot force the protocols to abort. The MP-CED model is closely related to the general Multi-Party Computation (MPC) model and, in fact, both can be reduced to each other. The first (resp. second) protocol above naturally gives protocols for three-round (resp. two-round) universally composable MPC secure against active, static adversary controlling any number of parties (with preprocessing).

Joint work with Ariel Elbaz, Tal Malkin, and Moti Yung.

Andrew Wan

Columbia University

**Abstract:**
We give an efficient algorithm that takes as input any
(probabilistic) polynomial time algorithm A which purports to solve
SAT and finds, for infinitely many input lengths, SAT formulas phi
and witnesses w such that A claims phi is unsatisfiable, but w is a
satisfying assignment for phi (assuming NP notin RP). This solves an
open problem posed in the work of Gutfreund, Shaltiel, and Ta-Shma
(CCC 2005). Following Gutfreund et al., we also extend this to give an
efficient sampling algorithm (a "quasi-hard sampler") which generates
hard instance/witness pairs for all algorithms running in some fixed
polynomial time.

We ask how our sampling algorithm relates to various cryptographic notions. We show that our sampling algorithm gives a simple construction of quasi-one-way functions, a weakened notion of standard one-way functions. We also investigate the possibility of obtaining pseudorandom generators from our quasi-one-way functions and show that a large class of reductions that work in the standard setting must fail.

Joint work with Andrej Bogdanov and Kunal Talwar.

Michel Goemans

MIT

**Abstract:**
The traveling salesman problem (TSP) --- the problem of finding the
shortest tour through a set of cities --- is the prototypical
combinatorial optimization problem. It comes in two flavors, the
symmetric (STSP) and asymmetric (ATSP) versions; the latter
corresponds to the setting in which the cost of traveling from A to B
might differ from the cost from B to A. Whether one can efficiently
find solutions in the asymmetric case that are within a constant
factor of optimal has proved to be elusive despite considerable
efforts, while the same question in the symmetric case has been
settled for decades.

In this talk, I will describe a randomized algorithm for ATSP which
delivers a solution within a factor $O(\log n/ \log\log n)$ of the
optimum. This is the first approximation algorithm that breaks the
logarithmic glass ceiling of approximability. The algorithm is
somewhat reminiscent of Christofides' algorithm for the symmetric
case. It first solves the classical Held-Karp linear programming
relaxation of the problem, then generates a spanning tree T while
disregarding the orientations of the arcs, and finally augments it
into a directed Eulerian graph at minimum cost by solving a minimum
cost flow problem. The cost of the final step depends on the {\it
thinness} of the tree T. We show how to construct a suitable
probability distribution on spanning trees which are sufficiently thin
with high probability.

During the talk, our tour will bring us to visit polyhedral arguments,
negative correlation, random spanning trees, maximum entropy
distributions, approximate minimum cuts, Kirchhoff's matrix tree
theorem, and several other concepts.

This is joint work with Arash Asadpour, Aleksander Madry, Shayan Oveis
Gharan and Amin Saberi.

BIO

Michel Goemans is the Leighton Family Professor of Applied Mathematics
at the Massachusetts Institute of Technology, and a faculty member of
MIT Computer Science and Artificial Intelligence Laboratory (CSAIL)
and of MIT Operations Research Center (ORC). He is currently the
Chairman of Applied Mathematics at MIT. He has held an Adjunct
Professorship at the University of Waterloo, a Professorship at the
University of Louvain and a visiting Professorship at RIMS, Kyoto.

His research --- in the areas of discrete algorithms and combinatorial
optimization --- has been rewarded by several prizes, in particular
the 2000 AMS-MPS Fulkerson Prize and twice the SIAM Optimization
prize. He is an ACM Fellow, a Guggenheim Fellow and a Sloan Foundation
Fellow.

Contact

There is a mailing list for the reading group. General information about the mailing list (including how to subscribe yourself to it) is available here. If you want to unsubscribe or change your options, send email to theoryread-request@lists.cs.columbia.edu with the word `help' in the subject or body (don't include the quotes), and you will get back a message with instructions.

Comments on this page are welcome; please send them to
` tal-at-cs.columbia.edu
`

Last updated 11/10/2009.