Q&A: Jihye Kwon on PhD Research Projects

Jihye Kwon, a computer engineering PhD student, talks about her research projects and what it took to win a Best Paper award.

Jihye Kwon

What drew you to computer engineering, specifically the application of machine learning to computer-aided design? What questions or issues do you hope to answer?

I was attracted to the concept of a computer: a machine that performs calculations. I found it very interesting how modern computers evolved from executing one instruction at a time to executing many instructions simultaneously by exploiting multiple levels of parallelism. Still, various challenges remained, or newly arose, so I dreamed about designing a brand-new computer system. That is what I had in mind when coming to Columbia.

At the beginning of my PhD, I experimented and learned how to design the core parts of special-purpose computers, using computer-aided design tools. I also explored machine learning from both theoretical and practical perspectives. These activities led me to work on my current research problems.

In advanced computer-aided design of computer systems, computers solve many complex optimization problems in steps to generate a final design. They do so as guided by the designers via means of the configurable ‘knobs’. My focus is on the designers’ work.

For a target system, designers run the computer-aided design tools repeatedly with the many different knob configurations until the tools output final designs with optimal or desired properties, e.g., in timing, area, and power. I wondered if machines can learn, from designers’ previous work, how to configure the knobs to optimize a new target system. Can designers virtually collaborate across time and tasks through the machine learning models? These are the main questions that I hope to answer.

Could you talk about your research and how you collaborated with other groups? Was this something you considered when applying to Columbia – that there are opportunities to do multi-disciplinary work?

When I was applying to Columbia, I wished I could have collaboration opportunities to study and work in the interdisciplinary research communities at the center of New York City. I wanted to explore applications of computer science in different areas to eventually gain insight and inspiration for my own research, which is centered at computer engineering.

Fortunately, these were realized as I worked with my advisor, Professor Luca Carloni. I was invited to join the project “Energy Efficient Computing with Chip-Based Photonics”, which is a part of a large initiative supported by the government and industry. In this project, I worked closely with Lightwave Research Laboratory in Electrical Engineering on a new optical computing system. We proposed the concept of a next-generation computing system that is co-designed with silicon photonics and electronic circuitry, in order to overcome the fundamental and physical limitations of today’s computers.

Another project on optical communication was initiated from a student project that I mentored in my advisor’s class, Embedded Scalable Platforms. This project investigated the use of photonic switches in optically-connected memory systems for deep learning applications.

Outside Columbia, I have also collaborated with researchers at IBM TJ Watson Research Center via my summer internships on the project of auto-tuning computer-aided design flows for commercial supercomputers. All these collaborations opened new horizons for me.

 

You won the MLCAD 2020 Best Paper award for your research, can you talk about your process – how did the research come about? How long did it take you to complete the work? What were the things you had to overcome?

In this work, I proposed a novel machine learning approach for computer-aided design optimization of hardware accelerators. I wanted to address this problem because it is computationally very expensive to explore the entire optimization space. It took me about one year to complete the work. One of the biggest difficulties I faced was the limited availability of the data for applying machine learning to the problem.

Then, I found out that transfer learning has been recently successfully applied in other areas with limited data. In transfer learning, a model trained for a related problem (e.g., natural image recognition) is transferred to aid the machine learning for the target problem (e.g., face recognition). Hence, I tried to apply transfer learning to my research problem. I trained a neural network model for a different accelerator design, and transferred the model to predict the design properties of a target accelerator.

However, the transferred model did not perform well in this case. I realized that due to the diverse characteristics of the accelerators, I needed to distinguish which piece of the source information should be transferred. Based on this intuition, I constructed a series of new models, and eventually, proposed one with promising performance. While it was a long process of building new models without knowing the answers, my advisor greatly encouraged me in our discussions to keep moving forward, and it was very rewarding in the end.

The Machine Learning for Systems session from the 2nd ACM/IEEE Workshop on Machine Learning for CAD (MLCAD) can be viewed here and the Best Paper announcement here

 

Looking back, how have you grown as a researcher and a person?

Besides expanding my problem-solving capabilities and technical skills, I have grown to be a better presenter and communicator. One of the tasks of a researcher is to explain one’s work to various groups and different types of audiences. I had a number of opportunities to present my work at academic conferences, seminars at companies, lightning talks, and annual project reviews. Initially, I struggled to meet the audience’s interests whose expertise spans a diverse range of areas and levels. Through those opportunities, I have received very helpful feedback, I have tried to ask myself questions from other people’s perspectives and progressively learned to keep a good balance between abstraction and elaboration.

Also, by interacting with a lot of students with heterogeneous backgrounds in the classes I TA’ed, I have learned to understand what their questions mean and where they come from. Based on that, I tried to adjust my answers to have more relatable conversations. From those conversations, sometimes the students found the topics very interesting, and sometimes I learned something new from them. It was such a great pleasure to inspire others and to be inspired. I think those experiences have made me a better researcher and person.

 

There are many organizations on campus, why did you choose to join Womxn in Computer Science (WiCS)?

In Fall 2017, I received an invitation from WiCS’ president, Julia Di, and was impressed by the passionate and caring board members working on the common goal of supporting the advancement of womxn in computer science. In my second year I launched the WiCS Lightning Talks for students with research experience to share their work and stories. The goal was for young students to get to know more about research and demystify the process.

I am one of the few women at Columbia in my research area of computer engineering and would like to contribute to inspiring the next generation to join us.

 

What was the highlight of your time at Columbia?

Every moment was special for me. Some of the highlights were during happy hour with members of the fishbowl. The fishbowl is a large office occupied by the majority of PhD students in computer engineering. We call it the fishbowl, because it is surrounded by large windows and students inside look like small fishes. Once, my colleagues talked about their memories of old computers that I had never seen. I enjoyed imagining the machines from their descriptions, and thinking about different types and generations of computers.

 

What is your advice to students on how to navigate their time at Columbia?

Explore, experience, and exploit. There are recommended lists of classes, activities, and companies, depending on your track and interests, but no one is exactly like you. There is such a great variety of opportunities and resources at Columbia and in New York City. I hope you can spend enough time exploring them and get involved in many ways before determining your academic and career goals.

 

Is there anything else that you think people should know?

Columbia is beautiful in the snow! It gets pretty windy in the winter, so please be aware if you are coming from warmer places. There are many places where you can study but Avery Library is my favorite library on campus. If you have any questions or opinions on this Q&A story, please feel free to drop me a line!

How She Made It

The Computer Science Department Student Services Team is excited to continue The How She Made It Series, a new Columbia University Computer Science Department initiative, which features fabulous womxn-identifying industry professionals and prestigious speakers!

How She Made It interviews diverse, trailblazing professionals, and allows students to learn more about how the speakers launched into industry, overcame obstacles, and became the successful superstars they are today!

Click on the thumbnails below to view the video recording of the presentations.

Zhou Yu and Henry Yuen Join the Department

The computer science department welcomes two new faculty members. These new CS faculty are facilitating research and learning in natural language processing and quantum theory.

Zhou Yu

Zhou Yu
Assistant Professor 
PhD Computer Science, Carnegie Mellon University 2017
BS Computer Science and BA Linguistics, Chu Kochen Honors College at Zhejiang University 2011

Zhou Yu designs algorithms for real-time intelligent interactive systems that coordinate with user actions that are beyond spoken languages, including non-verbal behaviors to achieve effective and natural communications. In particular, she optimizes human-machine communication via studies of multimodal sensing and analysis, speech and natural language processing, machine learning and human-computer interaction. 

Yu aims to bring together all the areas above to design, implement and deploy end-to-end real-time interactive intelligent systems that are able to plan globally considering interaction history and current user actions to achieve better user experience and task performance. 

Prior to Columbia, Yu was an Assistant Professor in the Computer Science Department at University of California, Davis, where she directed the Davis NLP Lab. She is the winner of a Google GCP Credit Award and has been named an Amazon Alexa Prize Champion and a 2018 Forbes 30 Under 30 in Science.

Yu is teaching Dialog Systems this Spring and is looking for PhD, master, and undergrad students with a background in natural language processing to join her research projects.

 

Henry YuenHenry Yuen
Assistant Professor 
PhD Computer Science, Massachusetts Institute of Technology (MIT) 2016
BA Mathematics, University of Southern California 2010

Henry Yuen is a theoretical computer scientist whose goal is to understand the fundamental principles of computation in a universe governed by quantum physics (such as ours). He studies questions at the interface of quantum information theory, computational complexity theory, and cryptography. 

He has made a number of contributions to the theory of quantum multiprover interactive proofs, including the discovery that such interactive proofs can verify solutions to uncomputable problems. Yuen also works on quantum cryptography; some of his contributions include designing protocols for infinite randomness expansion using untrusted quantum hardware. 

Yuen was a postdoctoral associate at UC Berkeley from 2016-2018 and an assistant professor in Computer Science and Mathematics at the University of Toronto between 2018–2020. He is the recipient of a Simons-Berkeley Research Fellowship and a Google Quantum Research Award.

This semester he is teaching Introduction to Quantum Computing that is aimed at graduate students and advanced undergraduates who are looking to learn about the basics of quantum computation. For his research projects, he is currently looking for talented students to work on fundamental questions that cut across mathematics, computer science, and physics.

Papers from the Theory Group Accepted to SODA ’21

Six papers from CS researchers were accepted to the ACM-SIAM Symposium on Discrete Algorithms (SODA ’21). The conference focuses on efficient algorithms and data structures for discrete problems.

Approximate Nearest Neighbors Beyond Space Partitions
Alexandr Andoni Columbia University, Aleksandar Nikolov University of Toronto, Ilya Razenshteyn Microsoft Research, Erik Waingarten Columbia University

The paper studies the approximate nearest neighbor problem in high-dimensional spaces. Namely, the algorithmic task is to produce a data structure for answering queries of the following form: what is the (approximately) closest point to a query q within a dataset P. This problem is a fundamental task in modern data analysis, and the paper gives new and improved approximations for one of the most commonly studied metric spaces, the ℓp spaces, as well as generalized versions of the Hamming metric.

The surprising aspect of this work is the data-dependent decomposition schemes for high dimensional vectors; while LSH partitions are well-known for ℓp spaces when p ∈ [1, 2], these fail for higher values of p. The present work shows how to build decompositions that are not LSH, but nevertheless solve the approximate nearest neighbor problem efficiently.

 

Graph Spanners by Sketching in Dynamic Streams and the Simultaneous Communication Model
Arnold Filtser Columbia University, Michael Kapralov École polytechnique fédérale de Lausanne, Navid Nouri École polytechnique fédérale de Lausanne

In the graph dynamic stream model, a set of n vertices is given in advance, and a stream of edge insertions and deletions is observed by a player, which can use only a very small memory to store his impressions of the stream. Once the stream is exhausted, the player is required to answer a query (known in advance) regarding the observed graph. Algorithms in this model usually use linear sketching techniques. Most notably, Ahn, Guha, and McGregor ’12 showed how to compute a spanning tree using \tilde{O}(n) space, while Kapralov et al 14, devised an algorithm computing spectral sparsifier.

The question of computing a spanner, or even more generally shortest path distance estimation is poorly understood in this model. Previous multi-pass algorithms (which are allowed to observe the stream several times) were devised, however, no single-pass algorithm was known. This paper provides the first single-pass algorithm that uses \tilde{O}(n) space while returning \tilde{O}(n^{\frac23}) estimation of all distances. Even though this distortion is very large, the authors conjecture that it is close to optimal.

 

Static and Streaming Data Structures for Fréchet Distance Queries
Arnold Filtser Columbia University, Omrit Filtser Stony Brook University

The Fréchet distance between two curves P and Q is often described by the man-dog analogy, in which a man is walking along P, holding a leash connected to its dog who walks along Q, and the goal is to minimize the length of the leash that allows them to fully traverse their curves without backtracking. The Fréchet distance is well studied and has numerous applications. Given two curves with n points, a simple dynamic programming could be used to compute the Fréchet distance between them in O(n^2) time. However, under standard complexity assumptions (SETH), there is no strongly subquadratic algorithm computing the Fréchet distance, even if the solution may be approximated up to a factor of 3.

To overcome this quadratic barrier, this paper studies the question of distance oracle. Here a curve P is preprocessed in advance, such that given a query curve Q, the Fréchet distance between P and Q could be approximated up to 1+\epsilon factor in linear time. The authors constructed a distance oracle with an optimal tradeoff between approximation factor, storage space, and query time.

Surprisingly, when the length of the curve P is extensively large and its points can be observed only once in a streaming fashion, the authors constructed a distance oracle with the exact same parameters.

 

On Multi-Dimensional Gains from Trade Maximization
Yang Cai Yale University, Kira Goldner Columbia Univeristy, Steven Ma Yale University, Mingfei Zhao Yale University

Think of a two-sided market with a bunch of “sellers,” such as Etsy sellers, Airbnb hosts, or employees, and a bunch of “buyers,” such as Etsy customers, Airbnb renters, or employers. A platform sits in the middle matching buyers and sellers, such as Airbnb. In order to maximize the platform’s value to the market’s participants, the gains from trade should be maximized — or the increased value in the market due to the matches.

Maximizing gains from trade using an algorithm that (1) aligns participants incentives, (2) ensures participants don’t regret participating, and (3) does not require the platform to lose money is known to be provably impossible even for one buyer, one seller, and one item. Further, as with questions of revenue maximization, the complexity suffers drastically as soon as a buyer is interested in more than one item.

This paper investigates the setting where a buyer is interested in many different items, each owned by a different seller, and gives the first guarantee for gains from trade in this setting. It provides an O(log n)-approximation to the optimal gains from trade subject to properties (1-3) using a combination of simple mechanisms–fixed posted pricings, buyer offering mechanisms, and a new “seller-adjusted posted price” mechanism which is surprisingly capable of earning far more gains from trade and the others in some instances.

 

Polynomial-Time Trace Reconstruction in the Smoothed Complexity Model
Xi Chen Columbia University, Anindya De University of Pennsylvania, Chin Ho Lee Columbia University, Rocco A. Servedio Columbia University, Sandip Sinha Columbia University

In the trace reconstruction problem, an unknown source string is sent through a probabilistic deletion channel which independently deletes each bit with a certain deletion rate and concatenates the surviving bits, yielding a trace of the unknown string. The problem is to reconstruct the unknown string given access to its independent traces.

The main result of the paper is a polynomial-time algorithm for the trace reconstruction problem in the smoothed analysis model, where any “worst-case” string is perturbed and the goal is to reconstruct its perturbed version with high probability.

The researchers’ approach is based on reconstructing a string from the multiset of its short subwords and is quite different from previous algorithms for either the worst-case or average-case versions of the problem. The heart of the work is a new efficient procedure for reconstructing the multiset of all O(log n)-length subwords of any source string using its traces.

 

Random Restrictions of High-Dimensional Distributions and Uniformity Testing with Subcube Conditioning
Clement Canonne IBM Research, Xi Chen Columbia University, Gautam Kamath University of Waterloo, Amit Levi University of Waterloo, Erik Waingarten Columbia University

The paper gives a nearly-optimal algorithm for testing the uniformity of distributions supported on hypercubes under the subcube conditioning model, where one can draw samples from the unknown distribution after fixing a subset of variables. The key technical component is a natural notion of random restrictions for distributions over hypercubes, and a quantitative analysis of how such a restriction affects the mean vector of the distribution. Along the way, the researchers also considered the problem of mean testing with independent samples and provide a nearly-optimal algorithm.

Robot Displays a Glimmer of Empathy to a Partner Robot

A Columbia Engineering robot has learned to predict its partner robot’s future actions and goals based on just a few initial video frames. The study is part of a broader effort to endow robots with the ability to understand and anticipate the goals of other robots, purely from visual observations.