The award recognizes Moti Yung's significant contributions to the field of cryptography, specifically the co-invention of malicious cryptography and pioneering contributions to distributed cryptosystems.
Research from the department was accepted to the 35th AAAI Conference on Artificial Intelligence. The conference promotes research in artificial intelligence (AI) and scientific exchange among AI researchers, practitioners, scientists, and engineers in affiliated disciplines.
One of the most exciting applications of modern artificial intelligence is to automatically discover scientific laws from experimental data. This is not a trivial problem as it involves searching for a complex mathematical relationship over a large set of explanatory variables and operators that can be combined in an infinite number of ways. Inspired by the incredible success of deep learning in computer vision, the authors tackle this problem by adapting various successful network architectures into the symbolic law discovery pipeline. The novelty of this new approach is in (1) encoding the input data as an image with super-resolution, (2) developing an appropriate deep network pipeline, and (3) predicting the importance of each mathematical operator from the relationship image. This allows the exponentially large search to be prioritized by the predicted importance of the symbolic operators, which can significantly accelerate the discovery process.
The model was then applied to a variety of plausible relationships—both simulated and from physics and mathematics domains—involving different dimensions and constituents. The authors show that their model is able to identify the underlying operators from data, achieving a high accuracy and AUC (91% and 0.96 on average, respectively) for systems with as many as ten independent variables. Their method significantly outperforms the current state of the art in terms of data fitting (R^2), discovery rate (recovering the true relationship), and succinctness (output formula complexity). The discovered equations can be seen as first drafts of scientific laws that can be helpful to scientists for (1) hypothesis building and (2) understanding the complex underlying structure of the studied phenomena. This novel approach holds real promise to help speed up the rate of scientific discovery.
One of the most common methods for policy learning used throughout the empirical sciences is randomization of the treatment assignment. This method is considered the gold standard within many disciplines and can be traced back, at least, to Fisher (Fisher 1935) and Neyman (Neyman 1923). Whenever human subjects are at the center of the experiment, unfortunately, issues of non-compliance arise: subjects do not necessarily follow the experimental protocol and end up doing what they want. It is well understood that under such conditions, unobserved confounding bias will emerge. For instance, subjects who did not comply with the treatment assignment may be precisely those who would have responded adversely to the treatment. Therefore, the actual causal effects of the treatment, when it is applied uniformly to the population, might be substantially weaker than the data suggest. Moreover, since one does not observe how subjects decide/respond to the realized treatment, the actual treatment effects are not uniquely computable from the collected data; they are said to be non-identifiable.
Robins (1989) and Manski (1990) derived the first informative bounds over the causal effects from studies with imperfect compliance under a set of non-parametric assumptions called instrumental variables (IV). In their seminal work, Balke and Pearl (1994a, 1997) improved earlier results by employing an algebraic method to derive analytic expressions of the causal bounds, which are provably optimal. However, this approach assumes the primary outcome to be discrete and finite. Solving such a program could be intractable when high-dimensional context variables are present.
This paper presents novel non-parametric methods to bound causal effects on a continuous outcome from studies with imperfect compliance. These methods can be generalized to settings with a high-dimensional context. Perhaps surprisingly, the paper introduces a latent data representation that characterizes all constraints on the observational and interventional distributions implied by the IV assumptions, even when the primary outcome is continuous. This representation allows one to reduce the original bounding problem to a series of linear programs. Solving these programs, therefore, leads to tight causal bounds.
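In the all-binary setting studied by Balke and Pearl (instrument Z, treatment X, outcome Y), the reduction to linear programs can be sketched concretely: the observed distribution constrains 16 latent "response type" probabilities, and the average treatment effect (ATE) is bounded by minimizing and maximizing a linear objective over them. The sketch below is illustrative of that discrete precursor, not the paper's continuous-outcome method; the function name and the use of scipy are assumptions.

```python
import numpy as np
from scipy.optimize import linprog

# Latent response types: XZ[i] = (X when Z=0, X when Z=1), YX[j] = (Y when X=0, Y when X=1)
XZ = [(0, 0), (0, 1), (1, 0), (1, 1)]   # never-taker, complier, defier, always-taker
YX = [(0, 0), (0, 1), (1, 0), (1, 1)]   # never-recover, helped, hurt, always-recover

def ate_bounds(p):
    """Tight bounds on ATE = E[Y(1)] - E[Y(0)] given observed p[z][(x, y)] = P(x, y | z)."""
    A_eq, b_eq = [], []
    for z in (0, 1):
        for x in (0, 1):
            for y in (0, 1):
                # P(x, y | z) is the total mass of response types consistent with (z, x, y)
                row = np.zeros(16)
                for i in range(4):
                    for j in range(4):
                        if XZ[i][z] == x and YX[j][x] == y:
                            row[4 * i + j] = 1.0
                A_eq.append(row)
                b_eq.append(p[z][(x, y)])
    c = np.zeros(16)
    for i in range(4):
        c[4 * i + 1], c[4 * i + 2] = 1.0, -1.0   # ATE = P(helped) - P(hurt)
    lo = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=(0, 1)).fun
    hi = -linprog(-c, A_eq=A_eq, b_eq=b_eq, bounds=(0, 1)).fun
    return lo, hi

# Perfect compliance (X = Z) and treatment always helps: the bounds collapse to a point
p = {0: {(0, 0): 1.0, (0, 1): 0.0, (1, 0): 0.0, (1, 1): 0.0},
     1: {(0, 0): 0.0, (0, 1): 0.0, (1, 0): 0.0, (1, 1): 1.0}}
lo, hi = ate_bounds(p)
```

Under non-compliance, the same two programs return a non-trivial interval rather than a point, which is exactly the non-identifiability discussed above.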
Learning causal effects from observational data is a pervasive challenge found throughout the data-intensive sciences. General methods for determining the identifiability of a causal effect from a combination of observational data and causal knowledge about the underlying system are well understood in theory. In practice, however, there are still challenges to estimating identifiable causal functionals from finite samples. Recently, a novel approach named double/debiased machine learning (DML) (Chernozhukov et al. 2018) has been proposed to learn parameters leveraging modern machine learning techniques, which is robust to both model misspecification ('doubly robust') and slow convergence ('debiased'). Still, DML has only been used for causal estimation in settings where the back-door condition (also known as conditional ignorability) holds.
This paper aims to bridge this gap by developing a general class of estimators, which the authors call 'DML-ID,' for any identifiable causal functional, exhibiting the robustness properties of DML estimators. In particular, they provide a complete procedure for deriving an essential ingredient of the DML estimator called an influence function (IF) and construct a general class of estimators based on the IF. This means that one can estimate any identifiable causal functional and enjoy the two robustness properties, double robustness and debiasedness.
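As a concrete illustration of the back-door DML recipe that DML-ID builds on, here is a minimal numpy sketch for a partially linear model on simulated data: cross-fit the nuisance models out-of-fold, residualize treatment and outcome on the confounder, then solve the orthogonal moment. The simulation, variable names, and the plain least-squares nuisance fit (a stand-in for the flexible ML learners DML normally uses) are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000
W = rng.normal(size=n)                       # observed confounder (back-door set)
X = 0.8 * W + rng.normal(size=n)             # treatment
Y = 2.0 * X + 1.5 * W + rng.normal(size=n)   # outcome; true causal effect is 2.0

def fit_predict(w_tr, t_tr, w_te):
    # Nuisance regression of a target on W; here simple least squares on [1, w]
    A = np.column_stack([np.ones_like(w_tr), w_tr])
    coef, *_ = np.linalg.lstsq(A, t_tr, rcond=None)
    return coef[0] + coef[1] * w_te

# Two-fold cross-fitting: residualize X and Y on W using models trained on the other fold
folds = np.array_split(rng.permutation(n), 2)
rx, ry = np.empty(n), np.empty(n)
for k, idx in enumerate(folds):
    tr = folds[1 - k]
    rx[idx] = X[idx] - fit_predict(W[tr], X[tr], W[idx])
    ry[idx] = Y[idx] - fit_predict(W[tr], Y[tr], W[idx])

# Neyman-orthogonal moment: regress outcome residuals on treatment residuals
theta = np.sum(rx * ry) / np.sum(rx ** 2)
```

The estimate `theta` recovers the true effect up to sampling noise; the cross-fitting step is what removes the bias that would otherwise come from fitting nuisances and the target parameter on the same data.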
The prevailing framework for solving referring expression grounding is based on a two-stage process: 1) detecting proposals with an object detector and 2) grounding the referent to one of the proposals. Existing two-stage solutions mostly focus on the grounding step, which aims to align the expressions with the proposals.
In this paper, the researchers argue that these methods overlook an obvious mismatch between the roles of proposals in the two stages: the first stage generates proposals solely based on detection confidence (i.e., expression-agnostic), while hoping that the proposals contain all the instances mentioned in the expression (i.e., expression-aware). Due to this mismatch, current two-stage methods suffer from a severe performance drop between detected and ground-truth proposals.
The paper proposes Ref-NMS, the first method to yield expression-aware proposals at the first stage. Ref-NMS regards all nouns in the expression as critical objects and introduces a lightweight module to predict a score for aligning each box with a critical object. These scores guide the NMS operation to filter out boxes irrelevant to the expression, which increases the recall of critical objects and significantly improves grounding performance.
Since Ref-NMS is agnostic to the grounding step, it can be easily integrated into any state-of-the-art two-stage method. Extensive ablation studies on several backbones, benchmarks, and tasks consistently demonstrate the superiority of Ref-NMS. Code is available at: https://github.com/ChopinSharp/ref-nms.
CS researchers are among the recipients of the inaugural awards from the Columbia Center of Artificial Intelligence Technology (CAIT). Amazon is providing $5 million in funding over five years to support research, education, and outreach programs.
Jihye Kwon, a computer engineering PhD student, talks about her research projects and what it took to win a Best Paper award.
Jihye Kwon
What drew you to computer engineering, specifically the application of machine learning to computer-aided design? What questions or issues do you hope to answer?
I was attracted to the concept of a computer: a machine that performs calculations. I found it very interesting how modern computers evolved from executing one instruction at a time to executing many instructions simultaneously by exploiting multiple levels of parallelism. Still, various challenges remained, or newly arose, so I dreamed about designing a brand-new computer system. That is what I had in mind when coming to Columbia.
At the beginning of my PhD, I experimented and learned how to design the core parts of special-purpose computers, using computer-aided design tools. I also explored machine learning from both theoretical and practical perspectives. These activities led me to work on my current research problems.
In advanced computer-aided design of computer systems, computers solve many complex optimization problems in steps to generate a final design. They do so guided by the designers via configurable 'knobs'. My focus is on the designers' work.
For a target system, designers run the computer-aided design tools repeatedly with many different knob configurations until the tools output final designs with optimal or desired properties, e.g., in timing, area, and power. I wondered whether machines can learn, from designers' previous work, how to configure the knobs to optimize a new target system. Can designers virtually collaborate across time and tasks through machine learning models? These are the main questions that I hope to answer.
Could you talk about your research and how you collaborated with other groups? Was this something you considered when applying to Columbia – that there are opportunities to do multi-disciplinary work?
When I was applying to Columbia, I wished I could have collaboration opportunities to study and work in the interdisciplinary research communities at the center of New York City. I wanted to explore applications of computer science in different areas to eventually gain insight and inspiration for my own research, which is centered on computer engineering.
Fortunately, these were realized as I worked with my advisor, Professor Luca Carloni. I was invited to join the project “Energy Efficient Computing with Chip-Based Photonics”, which is a part of a large initiative supported by the government and industry. In this project, I worked closely with Lightwave Research Laboratory in Electrical Engineering on a new optical computing system. We proposed the concept of a next-generation computing system that is co-designed with silicon photonics and electronic circuitry, in order to overcome the fundamental and physical limitations of today’s computers.
Another project on optical communication was initiated from a student project that I mentored in my advisor’s class, Embedded Scalable Platforms. This project investigated the use of photonic switches in optically-connected memory systems for deep learning applications.
Outside Columbia, I have also collaborated with researchers at IBM TJ Watson Research Center via my summer internships on the project of auto-tuning computer-aided design flows for commercial supercomputers. All these collaborations opened new horizons for me.
You won the MLCAD 2020 Best Paper award for your research, can you talk about your process – how did the research come about? How long did it take you to complete the work? What were the things you had to overcome?
In this work, I proposed a novel machine learning approach for computer-aided design optimization of hardware accelerators. I wanted to address this problem because it is computationally very expensive to explore the entire optimization space. It took me about one year to complete the work. One of the biggest difficulties I faced was the limited availability of the data for applying machine learning to the problem.
Then, I found out that transfer learning had recently been applied successfully in other areas with limited data. In transfer learning, a model trained for a related problem (e.g., natural image recognition) is transferred to aid the machine learning for the target problem (e.g., face recognition). Hence, I tried to apply transfer learning to my research problem. I trained a neural network model for a different accelerator design, and transferred the model to predict the design properties of a target accelerator.
However, the transferred model did not perform well in this case. I realized that due to the diverse characteristics of the accelerators, I needed to distinguish which piece of the source information should be transferred. Based on this intuition, I constructed a series of new models, and eventually, proposed one with promising performance. While it was a long process of building new models without knowing the answers, my advisor greatly encouraged me in our discussions to keep moving forward, and it was very rewarding in the end.
Looking back, how have you grown as a researcher and a person?
Besides expanding my problem-solving capabilities and technical skills, I have grown to be a better presenter and communicator. One of the tasks of a researcher is to explain one's work to various groups and different types of audiences. I had a number of opportunities to present my work at academic conferences, seminars at companies, lightning talks, and annual project reviews. Initially, I struggled to meet the interests of audiences whose expertise spans a diverse range of areas and levels. Through those opportunities, I received very helpful feedback; I have tried to ask myself questions from other people's perspectives and progressively learned to keep a good balance between abstraction and elaboration.
Also, by interacting with a lot of students with heterogeneous backgrounds in the classes I TA’ed, I have learned to understand what their questions mean and where they come from. Based on that, I tried to adjust my answers to have more relatable conversations. From those conversations, sometimes the students found the topics very interesting, and sometimes I learned something new from them. It was such a great pleasure to inspire others and to be inspired. I think those experiences have made me a better researcher and person.
In Fall 2017, I received an invitation from WiCS’ president, Julia Di, and was impressed by the passionate and caring board members working on the common goal of supporting the advancement of womxn in computer science. In my second year I launched the WiCS Lightning Talks for students with research experience to share their work and stories. The goal was for young students to get to know more about research and demystify the process.
I am one of the few women at Columbia in my research area of computer engineering and would like to contribute to inspiring the next generation to join us.
What was the highlight of your time at Columbia?
Every moment was special for me. Some of the highlights were during happy hour with members of the fishbowl. The fishbowl is a large office occupied by the majority of PhD students in computer engineering. We call it the fishbowl, because it is surrounded by large windows and students inside look like small fishes. Once, my colleagues talked about their memories of old computers that I had never seen. I enjoyed imagining the machines from their descriptions, and thinking about different types and generations of computers.
What is your advice to students on how to navigate their time at Columbia?
Explore, experience, and exploit. There are recommended lists of classes, activities, and companies, depending on your track and interests, but no one is exactly like you. There is such a great variety of opportunities and resources at Columbia and in New York City. I hope you can spend enough time exploring them and get involved in many ways before determining your academic and career goals.
Is there anything else that you think people should know?
Columbia is beautiful in the snow! It gets pretty windy in the winter, so please be aware if you are coming from warmer places. There are many places where you can study but Avery Library is my favorite library on campus. If you have any questions or opinions on this Q&A story, please feel free to drop me a line!
The University of San Diego has named Maritza Johnson (PhD ’12) as the director of the new center that will address issues concerning big data and artificial intelligence, and their social implications — ethics, privacy, and fairness.
The Computer Science Department Student Services Team is excited to continue The How She Made It Series, a new Columbia University Computer Science Department initiative, which features fabulous womxn-identifying industry professionals and prestigious speakers!
How She Made It interviews diverse, trailblazing professionals, and allows students to learn more about how the speakers launched into industry, overcame obstacles, and became the successful superstars they are today!
The computer science department welcomes two new faculty members. These new CS faculty are facilitating research and learning in natural language processing and quantum theory.
Zhou Yu
Assistant Professor
PhD Computer Science, Carnegie Mellon University, 2017
BS Computer Science and BA Linguistics, Chu Kochen Honors College at Zhejiang University, 2011
Zhou Yu designs algorithms for real-time intelligent interactive systems that coordinate with user actions beyond spoken language, including non-verbal behaviors, to achieve effective and natural communication. In particular, she optimizes human-machine communication via studies of multimodal sensing and analysis, speech and natural language processing, machine learning, and human-computer interaction.
Yu aims to bring together all the areas above to design, implement, and deploy end-to-end real-time interactive intelligent systems that can plan globally, considering interaction history and current user actions, to achieve better user experience and task performance.
Prior to Columbia, Yu was an Assistant Professor in the Computer Science Department at University of California, Davis, where she directed the Davis NLP Lab. She is the winner of a Google GCP Credit Award and has been named an Amazon Alexa Prize Champion and a 2018 Forbes 30 Under 30 in Science.
Yu is teaching Dialog Systems this Spring and is looking for PhD, master's, and undergraduate students with a background in natural language processing to join her research projects.
Henry Yuen
Assistant Professor
PhD Computer Science, Massachusetts Institute of Technology (MIT), 2016
BA Mathematics, University of Southern California, 2010
Henry Yuen is a theoretical computer scientist whose goal is to understand the fundamental principles of computation in a universe governed by quantum physics (such as ours). He studies questions at the interface of quantum information theory, computational complexity theory, and cryptography.
He has made a number of contributions to the theory of quantum multiprover interactive proofs, including the discovery that such interactive proofs can verify solutions to uncomputable problems. Yuen also works on quantum cryptography; some of his contributions include designing protocols for infinite randomness expansion using untrusted quantum hardware.
Yuen was a postdoctoral associate at UC Berkeley from 2016 to 2018 and an assistant professor in Computer Science and Mathematics at the University of Toronto from 2018 to 2020. He is the recipient of a Simons-Berkeley Research Fellowship and a Google Quantum Research Award.
This semester he is teaching Introduction to Quantum Computing, a course aimed at graduate students and advanced undergraduates who are looking to learn the basics of quantum computation. For his research projects, he is currently looking for talented students to work on fundamental questions that cut across mathematics, computer science, and physics.
Six papers from CS researchers were accepted to the ACM-SIAM Symposium on Discrete Algorithms (SODA ’21). The conference focuses on efficient algorithms and data structures for discrete problems.
The paper studies the approximate nearest neighbor problem in high-dimensional spaces. Namely, the algorithmic task is to produce a data structure for answering queries of the following form: what is the (approximately) closest point to a query q within a dataset P? This problem is a fundamental task in modern data analysis, and the paper gives new and improved approximations for one of the most commonly studied metric spaces, the ℓp spaces, as well as generalized versions of the Hamming metric.
The surprising aspect of this work is the data-dependent decomposition schemes for high dimensional vectors; while LSH partitions are well-known for ℓp spaces when p ∈ [1, 2], these fail for higher values of p. The present work shows how to build decompositions that are not LSH, but nevertheless solve the approximate nearest neighbor problem efficiently.
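For contrast with the data-dependent decompositions described here, a toy version of the classical, data-independent LSH approach (bit-sampling for the Hamming metric) can be sketched as follows. The parameters and the planted near-neighbor are illustrative assumptions: each table hashes a point by a random subset of its coordinates, so near points collide in some table with high probability, and exact distances are checked only among the few candidates retrieved.

```python
import numpy as np

rng = np.random.default_rng(1)
d, n = 32, 200
data = rng.integers(0, 2, size=(n, d))       # n random binary vectors
query = data[17].copy()
query[5] ^= 1                                # planted near-neighbor: Hamming distance 1

L, k = 16, 8                                 # L hash tables, each sampling k coordinates
tables = []
for _ in range(L):
    coords = rng.choice(d, size=k, replace=False)
    buckets = {}
    for i, v in enumerate(data):
        buckets.setdefault(tuple(v[coords]), []).append(i)
    tables.append((coords, buckets))

# Query: union the candidates from all tables, then rank by exact Hamming distance
cands = set()
for coords, buckets in tables:
    cands.update(buckets.get(tuple(query[coords]), []))
best = min(cands, key=lambda i: np.sum(data[i] != query))
```

A point at distance 1 collides with the query in a given table unless its flipped coordinate is sampled, so with 16 tables the planted neighbor is recovered except with vanishing probability; the LSH-free decompositions in the paper achieve such guarantees for ℓp with p > 2, where this style of partition fails.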
In the graph dynamic stream model, a set of n vertices is given in advance, and a stream of edge insertions and deletions is observed by a player, who can use only a very small amount of memory to store an impression of the stream. Once the stream is exhausted, the player is required to answer a query (known in advance) regarding the observed graph. Algorithms in this model usually use linear sketching techniques. Most notably, Ahn, Guha, and McGregor '12 showed how to compute a spanning tree using \tilde{O}(n) space, while Kapralov et al. '14 devised an algorithm computing a spectral sparsifier.
The question of computing a spanner, or, more generally, of estimating shortest-path distances, is poorly understood in this model. Previous multi-pass algorithms (which are allowed to observe the stream several times) were devised; however, no single-pass algorithm was known. This paper provides the first single-pass algorithm, which uses \tilde{O}(n) space while returning an \tilde{O}(n^{\frac23})-approximation of all distances. Even though this distortion is very large, the authors conjecture that it is close to optimal.
The Fréchet distance between two curves P and Q is often described by the man-dog analogy, in which a man walks along P holding a leash connected to his dog, who walks along Q, and the goal is to minimize the length of the leash that allows them to fully traverse their curves without backtracking. The Fréchet distance is well studied and has numerous applications. Given two curves with n points each, simple dynamic programming computes the Fréchet distance between them in O(n^2) time. However, under standard complexity assumptions (SETH), there is no strongly subquadratic algorithm computing the Fréchet distance, even if the solution may be approximated up to a factor of 3.
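The quadratic dynamic program mentioned above is easy to state for the discrete Fréchet distance, the standard variant defined over the curves' vertices; a minimal sketch:

```python
import math

def discrete_frechet(P, Q):
    """O(n*m) dynamic program for the discrete Frechet distance between point sequences."""
    n, m = len(P), len(Q)
    d = lambda i, j: math.dist(P[i], Q[j])
    ca = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            if i == 0 and j == 0:
                ca[i][j] = d(0, 0)
            elif i == 0:
                ca[i][j] = max(ca[0][j - 1], d(0, j))
            elif j == 0:
                ca[i][j] = max(ca[i - 1][0], d(i, 0))
            else:
                # The man advances, the dog advances, or both; the leash must cover d(i, j)
                ca[i][j] = max(min(ca[i - 1][j], ca[i - 1][j - 1], ca[i][j - 1]), d(i, j))
    return ca[n - 1][m - 1]

# Two parallel horizontal segments at vertical distance 1: the leash length is 1.0
print(discrete_frechet([(0, 0), (1, 0), (2, 0)], [(0, 1), (1, 1), (2, 1)]))  # 1.0
```

Each cell ca[i][j] stores the best leash length needed to reach the pair (P[i], Q[j]), which is exactly why the table has quadratic size and why breaking the quadratic barrier requires preprocessing, as the oracle approach below pursues.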
To overcome this quadratic barrier, this paper studies the distance oracle question: a curve P is preprocessed in advance such that, given a query curve Q, the Fréchet distance between P and Q can be approximated up to a 1+\epsilon factor in linear time. The authors construct a distance oracle with an optimal tradeoff between approximation factor, storage space, and query time.
Surprisingly, even when the curve P is extremely long and its points can be observed only once, in a streaming fashion, the authors construct a distance oracle with exactly the same parameters.
Think of a two-sided market with a bunch of “sellers,” such as Etsy sellers, Airbnb hosts, or employees, and a bunch of “buyers,” such as Etsy customers, Airbnb renters, or employers. A platform, such as Airbnb, sits in the middle matching buyers and sellers. In order to maximize the platform’s value to the market’s participants, the gains from trade, that is, the increased value in the market due to the matches, should be maximized.
Maximizing gains from trade using an algorithm that (1) aligns participants’ incentives, (2) ensures participants don’t regret participating, and (3) does not require the platform to lose money is provably impossible even for one buyer, one seller, and one item. Further, as with questions of revenue maximization, the complexity grows drastically as soon as a buyer is interested in more than one item.
This paper investigates the setting where a buyer is interested in many different items, each owned by a different seller, and gives the first guarantee for gains from trade in this setting. It provides an O(log n)-approximation to the optimal gains from trade subject to properties (1)-(3) using a combination of simple mechanisms: fixed posted pricings, buyer-offering mechanisms, and a new “seller-adjusted posted price” mechanism, which is surprisingly capable of earning far more gains from trade than the others in some instances.
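To make "gains from trade" and the fixed-posted-pricing building block concrete, here is a small Monte Carlo sketch for a single item under an illustrative assumption of independent uniform buyer values and seller costs (not the paper's multi-item setting):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
buyer = rng.random(n)    # buyer values ~ U[0, 1]
seller = rng.random(n)   # seller costs ~ U[0, 1]

# Fixed posted price p: trade happens iff the seller is willing to sell (cost <= p)
# and the buyer is willing to buy (value >= p); realized gains from trade = value - cost.
p = 0.5
trade = (seller <= p) & (buyer >= p)
gft = np.mean((buyer - seller) * trade)                 # analytically 0.5 * 0.25 = 0.125
first_best = np.mean(np.maximum(buyer - seller, 0.0))   # full-information benchmark, 1/6
```

Even this simple mechanism satisfies properties (1)-(3) by construction (take it or leave it, no subsidies), yet captures only a fraction of the first-best gains from trade; closing that gap with guarantees is what the paper's mechanism combination is about.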
In the trace reconstruction problem, an unknown source string is sent through a probabilistic deletion channel which independently deletes each bit with a certain deletion rate and concatenates the surviving bits, yielding a trace of the unknown string. The problem is to reconstruct the unknown string given access to its independent traces.
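The deletion channel itself is simple to simulate, which makes the problem easy to experiment with; a minimal sketch, where the helper name and parameters are illustrative:

```python
import numpy as np

def deletion_trace(s, delta, rng):
    """One trace: each bit of s is independently deleted with probability delta."""
    keep = rng.random(len(s)) > delta
    return ''.join(b for b, k in zip(s, keep) if k)

rng = np.random.default_rng(0)
source = ''.join(rng.choice(list('01'), size=100))
traces = [deletion_trace(source, 0.3, rng) for _ in range(2000)]

# Each trace is a subsequence of the source; lengths concentrate around n * (1 - delta) = 70
avg_len = sum(map(len, traces)) / len(traces)
```

The difficulty of reconstruction comes from the loss of position information: a surviving bit's index in the trace says little about its index in the source, which is why algorithms work with statistics of the traces rather than aligning them directly.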
The main result of the paper is a polynomial-time algorithm for the trace reconstruction problem in the smoothed analysis model, where any “worst-case” string is perturbed and the goal is to reconstruct its perturbed version with high probability.
The researchers’ approach is based on reconstructing a string from the multiset of its short subwords and is quite different from previous algorithms for either the worst-case or average-case versions of the problem. The heart of the work is a new efficient procedure for reconstructing the multiset of all O(log n)-length subwords of any source string using its traces.
The paper gives a nearly-optimal algorithm for testing the uniformity of distributions supported on hypercubes under the subcube conditioning model, where one can draw samples from the unknown distribution after fixing a subset of variables. The key technical component is a natural notion of random restrictions for distributions over hypercubes, and a quantitative analysis of how such a restriction affects the mean vector of the distribution. Along the way, the researchers also consider the problem of mean testing with independent samples and provide a nearly-optimal algorithm.
Dean Boyce's statement on amicus brief filed by President Bollinger
President Bollinger announced that Columbia University along with many other academic institutions (sixteen, including all Ivy League universities) filed an amicus brief in the U.S. District Court for the Eastern District of New York challenging the Executive Order regarding immigrants from seven designated countries and refugees. Among other things, the brief asserts that “safety and security concerns can be addressed in a manner that is consistent with the values America has always stood for, including the free flow of ideas and people across borders and the welcoming of immigrants to our universities.”
This recent action provides a moment for us to collectively reflect on our community within Columbia Engineering and the importance of our commitment to maintaining an open and welcoming community for all students, faculty, researchers and administrative staff. As a School of Engineering and Applied Science, we are fortunate to attract students and faculty from diverse backgrounds, from across the country, and from around the world. It is a great benefit to be able to gather engineers and scientists of so many different perspectives and talents – all with a commitment to learning, a focus on pushing the frontiers of knowledge and discovery, and with a passion for translating our work to impact humanity.
I am proud of our community, and wish to take this opportunity to reinforce our collective commitment to maintaining an open and collegial environment. We are fortunate to have the privilege to learn from one another, and to study, work, and live together in such a dynamic and vibrant place as Columbia.
Sincerely,
Mary C. Boyce
Dean of Engineering
Morris A. and Alma Schapiro Professor