For Fall 2011, the meetings will usually be held on Fridays, 12:00 - 13:30, in the CS conference room, CSB 453. Abstracts for talks are given below the talk schedule.
Abstract: Physarum is a slime mold. It was observed over the past 10 years that the mold is able to solve shortest path problems and to construct good Steiner networks (Nakagaki-Yamada-Toth, Tero-Takagi et al.). In a nutshell, the shortest path experiment is as follows: A maze is built and the mold is made to cover the entire maze. Food is then provided at two positions [;s;] and [;t;] and the evolution of the slime is observed. Over time, the slime retracts to the shortest [;s;]-[;t;]-path.
A mathematical model of the slime's dynamic behavior was proposed by Tero-Kobayashi-Nakagaki. The network is modelled as an undirected graph G = (V,E) with two special vertices [;s;] and [;t;]. Each edge [;e;] has a diameter [;D_e;] and a length [;L_e;]. The length is fixed and the diameter changes over time. A flow of one is sent from [;s;] to [;t;] according to the following rule. The network is interpreted as an electrical network where the edge [;e;] has resistance [;L_e/D_e;]. Let [;Q_e;] be the resulting flow over edge [;e;]. The edge diameters develop according to the rule [; \dot{D}_e = |Q_e| - D_e, ;] where [;\dot{D}_e;] is the derivative of [;D_e;] with respect to time. If the current flow is larger (smaller) than the current diameter, the diameter increases (decreases).
Extensive computer simulations confirm the experimental findings. For the edges on the shortest [;s;]-[;t;]-path, the diameter converges to one, and for the edges off the shortest path, the diameter converges to zero.
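To make the dynamics concrete, here is a minimal simulation sketch; the graph, step size, and initial diameters are illustrative choices, not from the talk.

```python
import numpy as np

def physarum_step(edges, lengths, diam, s, t, n, dt=0.05):
    """One Euler step of dD_e/dt = |Q_e| - D_e, where Q is the unit
    electrical s-t flow with edge resistances L_e / D_e."""
    lap = np.zeros((n, n))                      # weighted graph Laplacian
    for (u, v), L, D in zip(edges, lengths, diam):
        c = D / L                               # conductance of edge e
        lap[u, u] += c; lap[v, v] += c
        lap[u, v] -= c; lap[v, u] -= c
    b = np.zeros(n); b[s] += 1.0; b[t] -= 1.0   # inject one unit of flow
    p = np.zeros(n)                             # node potentials, p[0] pinned to 0
    p[1:] = np.linalg.solve(lap[1:, 1:], b[1:])
    Q = np.array([(p[u] - p[v]) * D / L
                  for (u, v), L, D in zip(edges, lengths, diam)])
    return diam + dt * (np.abs(Q) - diam)

# Two parallel s-t edges; the shorter one should end up with diameter 1.
edges = [(0, 1), (0, 1)]
lengths = np.array([1.0, 2.0])
diam = np.array([0.5, 0.5])
for _ in range(400):
    diam = physarum_step(edges, lengths, diam, s=0, t=1, n=2)
print(diam)   # approximately [1, 0]
```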
We prove that the dynamics converges to the shortest path for arbitrary initial conditions.
Abstract: We propose a framework for studying property testing of collections of distributions, where the number of distributions in the collection is a parameter of the problem. Previous work on property testing of distributions considered single distributions or pairs of distributions. We suggest two models that differ in the way the algorithm is given access to samples from the distributions. In one model the algorithm may ask for a sample from any distribution of its choice, and in the other the distributions from which it gets samples are selected randomly. Our main focus is on the basic problem of distinguishing between the case that all the distributions in the collection are the same (or very similar), and the case that it is necessary to modify the distributions in the collection in a non-negligible manner so as to obtain this property. We give almost tight upper and lower bounds for this testing problem, and we also study an extension to a clusterability property. One of our lower bounds directly implies a lower bound on testing independence of a joint distribution, a result which was left open by previous work.
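Concretely, the two access models might be rendered as follows; this is a hypothetical interface with illustrative names, under one natural reading of the models.

```python
import random

class Collection:
    """A collection D_1, ..., D_m; each D_i is a list of (value, prob)."""
    def __init__(self, dists):
        self.dists = dists

    def _draw(self, i):
        vals, probs = zip(*self.dists[i])
        return random.choices(vals, weights=probs)[0]

    def query_sample(self, i):
        """Query model: the tester chooses which distribution to sample."""
        return self._draw(i)

    def random_sample(self):
        """Sampling model: the index is chosen uniformly at random and
        revealed to the tester along with the sample."""
        i = random.randrange(len(self.dists))
        return i, self._draw(i)
```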
In a more recent work, we consider another basic problem for collections of distributions: testing whether they have similar means. Namely, the algorithm should accept collections of distributions in which all distributions have means that do not differ by more than some given parameter, and should reject collections that are relatively far from having this property. We provide upper and lower bounds in both models. In particular, in the query model, the complexity of the problem is polynomial in [;1/\epsilon;] (where [;\epsilon;] is the given distance parameter). On the other hand, in the sampling model, the complexity grows roughly as [;m^{1-\mathrm{poly}(\epsilon)};], where [;m;] is the number of distributions.
Joint work with Prof. Dana Ron and Prof. Ronitt Rubinfeld.
Abstract: A potential problem in outsourcing work to commercial cloud computing services is trust. If we store a large data set with a service provider, and ask them to perform a computation on that data set -- for example, to compute the eigenvalues of a large graph, or to compute a linear program on a large matrix derived from a database -- how can we know the computation was performed correctly? Obviously we don't want to compute the result ourselves, and we might not even be able to store all the data locally. This leads to new problems in the streaming paradigm: we consider a streaming algorithm (modelling a user with limited memory and computational resources) that can be assisted by a powerful helper (the service provider). The goal of the service provider is to not only provide the user with the answer, but to convince the user the answer is correct.
In this talk, I will give a unified overview of a recent line of work exploring the application of proof systems to problems that are streaming in nature. In all of these protocols, an honest service provider can always convince the data owner that the answer is correct, while a dishonest prover will be caught with high probability. The protocols I will discuss utilize and extend powerful ideas from communication complexity and the theory of interactive proofs, and I will argue that many are highly practical, achieving millions of updates per second and requiring little space and communication.
Joint work with Amit Chakrabarti, Graham Cormode, Andrew McGregor, Michael Mitzenmacher, and Ke Yi
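To give the flavor of these protocols, here is a sketch, under my reading of the Chakrabarti-Cormode-McGregor approach to the second frequency moment (F2): the verifier keeps one row of a low-degree extension of the frequency matrix, and the helper sends a single low-degree polynomial. The grid dimensions and field are illustrative choices.

```python
import random

P = 2**61 - 1            # prime modulus for the field F_P
U, V = 4, 4              # view the universe [U*V] as a U x V grid

def lagrange_coeffs(xs, x):
    """Coefficients c_k such that f(x) = sum_k c_k * f(xs[k]) for every
    polynomial f of degree < len(xs), working over F_P."""
    cs = []
    for k, xk in enumerate(xs):
        num = den = 1
        for m, xm in enumerate(xs):
            if m != k:
                num = num * (x - xm) % P
                den = den * (xk - xm) % P
        cs.append(num * pow(den, P - 2, P) % P)
    return cs

stream = [random.randrange(U * V) for _ in range(1000)]

# Verifier: one pass over the stream, O(V) field elements of space.
alpha = random.randrange(P)
row = lagrange_coeffs(range(U), alpha)    # L_i(alpha) for i in [U]
z = [0] * V                               # z[j] accumulates ~f(alpha, j)
for item in stream:
    i, j = divmod(item, V)
    z[j] = (z[j] + row[i]) % P

# Helper: stores the full frequency matrix f and sends h(x) = sum_j
# ~f(x, j)^2 (degree <= 2U-2) as its values at x = 0, ..., 2U-2.
f = [[0] * V for _ in range(U)]
for item in stream:
    i, j = divmod(item, V)
    f[i][j] += 1
msg = []
for x in range(2 * U - 1):
    cs = lagrange_coeffs(range(U), x)
    msg.append(sum(sum(c * f[i][j] for i, c in enumerate(cs)) ** 2
                   for j in range(V)) % P)

# Verifier: consistency check at the secret point alpha, then output.
cs = lagrange_coeffs(range(2 * U - 1), alpha)
h_alpha = sum(c * m for c, m in zip(cs, msg)) % P
assert h_alpha == sum(zj * zj for zj in z) % P, "helper caught cheating"
print("verified F2 =", sum(msg[:U]) % P)   # F2 = sum over grid rows of h(i)
print("true     F2 =", sum(f[i][j] ** 2 for i in range(U) for j in range(V)))
```

A cheating helper must send some polynomial other than h, which can agree with h at the random secret point alpha only with probability about 2U/P.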
Abstract: A survey of the many ways in which applying the tools of complexity theory to problems in Game Theory results in crucial insights into important aspects of this field and often informs complexity as well.
Abstract: Multiplicative weights mechanisms have recently emerged as a powerful tool in the context of privacy-preserving query release. In this setting, a trusted database curator maintains a data set of sensitive information about individual participants and releases approximate answers to a set of statistical queries given by an untrusted data analyst. The goal is to achieve high accuracy of the answers while achieving the privacy standard known as differential privacy.
In this talk we will see a differentially private multiplicative weights framework suitable for answering a massive number of queries even if the queries arrive online and may be adaptively chosen. The accuracy of our algorithm in terms of database size and number of queries matches the statistical sampling error that already arises in the absence of any privacy concerns. Our algorithm can also be used to produce a differentially private synthetic data set encoding the curator's answers. For this task the running time of our algorithm, which is linear in the size of the data universe, is essentially optimal due to a prior cryptographic hardness result.
In spite of this hardness result, we give a simple and practical variant of our algorithm in the case where all queries are known in advance. Our experiments demonstrate significant improvements over previous state of the art on various real-world data sets.
Based on joint work with Guy Rothblum, and joint work with Katrina Ligett and Frank McSherry
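A minimal sketch of the offline variant mentioned above, in the spirit of the multiplicative weights mechanism; the privacy-budget split and the exact update step here are illustrative simplifications, not the authors' parameters.

```python
import numpy as np

def mwem(hist, queries, rounds, eps):
    """hist: true histogram over the data universe (entries sum to n).
    queries: 0/1 matrix with one linear counting query per row.
    Returns a synthetic histogram answering the queries accurately."""
    n = hist.sum()
    synth = np.full(len(hist), n / len(hist))   # start from uniform
    eps_t = eps / (2 * rounds)                  # per-round privacy budget
    true_ans = queries @ hist
    for _ in range(rounds):
        # Exponential mechanism: privately select a badly answered query.
        err = np.abs(queries @ synth - true_ans)
        w = np.exp(eps_t * err / 2)
        q = np.random.choice(len(queries), p=w / w.sum())
        # Laplace mechanism: noisy measurement of the selected query.
        meas = true_ans[q] + np.random.laplace(scale=1 / eps_t)
        # Multiplicative weights: nudge synth toward the measurement.
        synth = synth * np.exp(queries[q] * (meas - queries[q] @ synth) / (2 * n))
        synth = synth * (n / synth.sum())       # keep total mass n
    return synth

hist = np.array([6., 2., 1., 1.])               # toy database over 4 items
queries = np.array([[1., 1, 0, 0], [0, 0, 1, 1], [1, 0, 1, 0]])
print(mwem(hist, queries, rounds=5, eps=1.0))
```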
Abstract: The wide variety of small, computationally weak devices, and the growing number of computationally intensive tasks makes the delegation of computation to large data centers a desirable solution. However, computation outsourcing is useful only when the result returned can be trusted, which makes verifiable computation (VC) a must for such scenarios. In this work we extend the definition of verifiable computation in two important directions: public delegation and public verifiability, which have important applications in many practical delegation scenarios. Yet, existing VC constructions based on standard cryptographic assumptions fail to achieve these properties. As the primary contribution of our work, we establish an important (and somewhat surprising) connection between verifiable computation and attribute-based encryption (ABE), a primitive that has been widely studied. Namely, we show how to construct a VC scheme with public delegation and public verifiability from any ABE scheme. The VC scheme verifies any function in the class of functions covered by the permissible ABE policies. This scheme enjoys a very efficient verification algorithm that depends only on the output size. We show a similar construction from ABE with outsourced decryption (Green-Hohenberger-Waters 2011), which gives us a multi-function VC scheme that allows the verifiable evaluation of multiple functions on the same preprocessed input. We also explore the opposite direction of the ABE-VC relationship and show an ABE construction from a modified VC scheme.
Joint work with Bryan Parno and Vinod Vaikuntanathan
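To illustrate the plumbing of the VC-from-ABE connection, here is my paraphrase of the idea as toy code, with a deliberately insecure stand-in for the ABE primitive: the worker holds keys for the policies F and not-F, and returning the right one-time token proves the value of F(x).

```python
import secrets

class ToyABE:
    """INSECURE stand-in: Dec(key_for_policy, ct_for_attrs) recovers the
    message iff policy(attrs) holds. Real key-policy ABE enforces this
    cryptographically; here it is enforced by an honest simulator."""
    def keygen(self, policy):
        return policy
    def encrypt(self, attrs, msg):
        return (attrs, msg)
    def decrypt(self, key, ct):
        attrs, msg = ct
        return msg if key(attrs) else None

def vc_probgen(abe, x):
    # Fresh one-time tokens; returning the right token proves F(x).
    m0, m1 = secrets.token_hex(16), secrets.token_hex(16)
    c0 = abe.encrypt(x, m0)   # decryptable with a key for "not F"
    c1 = abe.encrypt(x, m1)   # decryptable with a key for F
    return (c0, c1), (m0, m1)

def vc_compute(abe, sk_F, sk_notF, sigma):
    c0, c1 = sigma
    return abe.decrypt(sk_notF, c0), abe.decrypt(sk_F, c1)

def vc_verify(tokens, proof):
    m0, m1 = tokens
    p0, p1 = proof
    if p1 == m1: return 1     # worker proved F(x) = 1
    if p0 == m0: return 0     # worker proved F(x) = 0
    raise ValueError("worker's answer rejected")

F = lambda x: sum(x) % 2 == 1          # example function F: {0,1}^n -> {0,1}
abe = ToyABE()
sk_F, sk_notF = abe.keygen(F), abe.keygen(lambda x: not F(x))
sigma, tokens = vc_probgen(abe, (1, 0, 1, 1))
print(vc_verify(tokens, vc_compute(abe, sk_F, sk_notF, sigma)))  # -> 1
```

In the real construction the ABE scheme enforces the policies cryptographically, and publishing (a hash of) the tokens is what makes delegation and verification public.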
Abstract: In the classical shortest path problem, we want to find a path that minimizes a single weight parameter, e.g., cost or length. But there are many scenarios where one needs to juggle several independent parameters. This comes up especially often in QoS (quality of service) routing, where a company wants to find a path that is cheap but doesn't create too long a delay for the user.
The restricted shortest path problem, also known as the constrained or bi-criteria shortest path problem, models this scenario by assigning two weights to each edge in the graph: a cost-weight and a delay-weight. The goal is to find the s-t path that minimizes total cost while keeping total delay below some given threshold T.
Dealing with two separate parameters turns out to be much harder; in fact, the restricted shortest path problem is NP-complete. But a sequence of papers in the 90's culminated in an O(mn)-time (1 + [;\epsilon;])-approximation algorithm. We present the first algorithm to go beyond this barrier, achieving a close to linear time (1 + [;\epsilon;])-approximation -- the actual running time is O(m [;2^{\sqrt{\log n}\,\log\log n};]), which is o(m[;n^c;]) for any fixed c > 0. Our algorithm only works for undirected graphs.
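For background, here is the classical pseudo-polynomial dynamic program that the 90's approximation schemes scale and round; this is the textbook DP, not the near-linear algorithm from the talk.

```python
import math

def restricted_shortest_path(n, edges, s, t, T):
    """edges: (u, v, cost, delay) with integer delay >= 1, undirected.
    Returns the min cost of an s-t path with total delay <= T via the
    O(T * m) pseudo-polynomial DP."""
    INF = math.inf
    # best[d][v] = min cost of an s-v path with total delay <= d
    best = [[INF] * n for _ in range(T + 1)]
    best[0][s] = 0.0
    for d in range(1, T + 1):
        best[d] = best[d - 1][:]               # using less delay is allowed
        for u, v, c, w in edges:
            if w <= d:
                for a, b in ((u, v), (v, u)):  # undirected edge, both ways
                    cand = best[d - w][a] + c
                    if cand < best[d][b]:
                        best[d][b] = cand
    return best[T][t]

# Two 0-1 options: cost 1 / delay 3 versus cost 5 / delay 1, then on to t.
edges = [(0, 1, 1.0, 3), (0, 1, 5.0, 1), (1, 2, 1.0, 1)]
print(restricted_shortest_path(3, edges, s=0, t=2, T=3))  # -> 6.0
```

The cheap edge is too slow once the final hop's delay is counted, so the DP correctly pays cost 6.0 for the feasible route.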
Abstract: The genus distribution of a graph is a surface-wise distribution of the number of cellular embeddings of the graph on all orientable surfaces. I describe an O([;n^2;])-time algorithm for calculating the genus distribution of any 4-regular outerplanar graph. Unlike many of the graph families for which genus distributions have previously been calculated, this family of graphs is not contrived and is of interest to graph theorists working in other areas.
The algorithm makes heavy use of constructs known as productions that were first introduced by Gross, Khan and Poshni in 2010. These are schematic representations that capture the effect of performing different types of graph operations on graph embeddings by modeling the changes produced in the face-boundary walks incident on predesignated vertices and edges known as roots. The productions used in this algorithm are adapted from a previous paper by Gross published in 2011. The algorithm uses a divide-and-conquer approach. It breaks a given 4-regular outerplanar graph down into a multi-component auxiliary graph, and then computes the genus distribution of the original graph by working with these smaller components in a bottom-up fashion. The quadratic time-complexity of this algorithm is a significant improvement over the O([;6^n;]) time-complexity of the more general Heffter-Edmonds algorithm when restricted to 4-regular graphs.
Abstract: We give a new Satisfiability algorithm for constant-depth circuits. More generally, our algorithm describes the set of 1's of the circuit as a disjoint union of restrictions under which the circuit becomes constant. This allows us to count the number of solutions as well as decide Satisfiability. Let [;d;] denote the depth of the circuit and [;cn;] denote the number of gates. The algorithm runs in expected time [;|C| \cdot 2^{n(1-\mu_{c,d})};] where [;|C|;] is the size of the circuit and [;\mu_{c,d} \ge \frac{1}{O(\lg c + d \lg d)^{d-1}};].
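As a toy illustration of what such a description looks like (a naive decision-tree enumeration over a DNF, not the algorithm from the talk): each leaf below is a restriction under which the formula is constant, the leaves disjointly partition the cube, and counting the solutions is just summing 2^(number of free variables) over the 1-leaves.

```python
def dnf_leaves(terms, rho):
    """terms: a DNF as a list of {var: required_bit}. Yields pairs
    (restriction, constant) whose subcubes disjointly partition the
    cube and on which the DNF equals the constant."""
    undet = []
    for t in terms:
        if any(v in rho and rho[v] != b for v, b in t.items()):
            continue                        # term falsified by rho
        unset = [v for v in t if v not in rho]
        if not unset:
            yield dict(rho), 1              # some term satisfied: f == 1
            return
        undet.append(unset[0])
    if not undet:
        yield dict(rho), 0                  # all terms falsified: f == 0
        return
    v = undet[0]                            # branch on a relevant variable
    for b in (0, 1):
        rho[v] = b
        yield from dnf_leaves(terms, rho)
        del rho[v]

# f = (x0 AND x1) OR ((NOT x0) AND x2) over n = 3 variables
terms, n = [{0: 1, 1: 1}, {0: 0, 2: 1}], 3
count = 0
for rho, val in dnf_leaves(terms, {}):
    if val:
        count += 2 ** (n - len(rho))        # free variables in the subcube
        print("constant 1 under restriction", rho)
print("number of satisfying assignments:", count)   # -> 4
```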
As a corollary, we get a tight bound on the maximum correlation of such a circuit with the parity function. (Hastad has also recently shown a similar correlation bound.)
The tightness of this bound shows that the time our algorithm takes matches the size of the smallest description as above. By recent results of Ryan Williams, any substantial improvement in our running time, even for Satisfiability, would show that NEXP [;\not\subseteq;] NC^1. We use a new generalized switching lemma that applies to families of CNF's, rather than individual CNF's.
Joint work with William Matthews (UCSD, soon to be at Google) and Ramamohan Paturi (UCSD)
Comments on this page are welcome; please send them to xichen-at-cs.columbia.edu