Top CS graduates awarded Jonathan Gross Prize

Julia Hirschberg, chair of the Computer Science Department, with the recipients of the Jonathan Gross Prize: Michael Tong, Jenny Ni, Andy Arditi, and Nishant Puri. Not pictured: Jeremy Ng.

The Jonathan Gross Prize honors students who graduate at the top of their class with a track record of promising innovative contributions to computer science. Five graduating computer science students—one senior from each of the four undergraduate schools and one student from the master’s program—were selected to receive the prize, which is named for Jonathan L. Gross, the popular professor emeritus who taught for 47 years at Columbia and cofounded the Computer Science Department. The prize is made possible by an endowment established by Yechiam (Y.Y.) Yemini, who was a professor of computer science at Columbia for 31 years until his retirement in 2011.

Andy Arditi, Columbia College

Andy Arditi

“Developing computational solutions to complex problems like language processing or facial recognition is amazingly interesting and gives us insight into the nature of intelligence.”

Andy Arditi’s plan to major in a math-related field—mathematics, economics, or physics—changed in his sophomore year when he took his first CS class. Exposed to programming for the first time, he knew immediately computer science was for him. Working on programming assignments, he wouldn’t notice the hours passing. “I would be so engaged and motivated to solve whatever problem I was working on that nothing else mattered.” As he took more CS classes, he became increasingly intrigued with how mathematical and statistical modeling can solve complex problems, including those requiring human-like intelligence.

He chose Columbia for a couple of reasons. “I wanted a place where I could both be challenged academically, and at the same time keep up my jazz saxophone playing. Columbia, having an amazing jazz performance program filled with super talented musicians and being located in the jazz capital of the world, was a perfect fit.”

In August, he starts work at Microsoft in Seattle as a software engineer with the idea of one day returning to academia to pursue a master’s or PhD in a field of artificial intelligence.

Jeremy Ng

Jeremy Ng

“I was drawn most immediately to the rigor of computer science—a method will typically return the same output on a given input.”

Jeremy Ng’s first two college years were spent in France studying politics, law, economics, and sociology as part of the Dual BA Program between Columbia University and Sciences Po. Not until his junior year did he take his first computer science class—Introduction to Computer Science with Professor Adam Cannon. He so enjoyed the class that he signed himself up the next semester for the 4000-level Introduction to Databases with Professor Luis Gravano.

As a self-described language nerd who reads grammar books for fun, Ng fondly remembers taking Natural Language Processing with Professor Kathy McKeown. He describes the class as “eye-opening” because it gave him a chance to see computational approaches applied to a subject he’d previously studied from a more historical and sociological perspective.

Currently Ng is interning at the United Nations Office for the Coordination of Humanitarian Affairs (OCHA). “Humanitarian actors in conflict zones or hard-to-reach areas generally lack the resources to report detailed data and statistics; at the same time, these data are used to secure billions of dollars in humanitarian financing from donors. The challenge is producing meaningful trends and insights from a set of very limited information, without sacrificing rigor or defensibility—a challenge I very much enjoy.”

In September, he will take up a macroeconomic research-related role at the investment management firm Bridgewater Associates; his long-term goal is to work in development finance, using data to evaluate the potential impact of humanitarian and development projects in under-served regions such as Central Asia, where he worked previously as a Padma Desai fellow.

Jenny Ni, Barnard College

Jenny Ni

“In my first CS class, I was immediately impressed by the logic of coding and how much a simple program could do.”

Enrolling in classes at both Barnard and Columbia, Jenny Ni took full advantage of the wide variety of resources and challenging academic curriculum offered by the two schools, and followed her curiosity. At Columbia she took her first CS class and discovered she enjoyed programming, and became curious to learn more about the field and its applications to other, especially non-STEM, fields. In particular, she enjoys thinking about the interesting and novel ways new developments in CS can be integrated with archaeology. “It’s amazing reading about how techniques like LIDAR and drone surveying are changing the ways archaeology is done.” She graduates with both a Computer Science major and an Anthropology major from Barnard’s Archaeology track.

Over the next year, she will be applying to Archaeology research positions and PhD programs. Her goal is to apply computer science technology to expand archaeological methods.

Nishant Puri

“Artificial intelligence—specifically deep learning—is poised to have a revolutionary impact on virtually every industry.”

Nishant Puri

Nishant Puri should know, having spent two years at the data analytics company Opera Solutions where he built personalized predictive models on vast streams of transactional data. The experience so spurred his interest in machine learning that he decided to pursue a second post-graduate degree. (His first was in applicable mathematics from the London School of Economics, where he received distinction in all his papers.)

He chose Columbia for a master’s degree in machine learning both for its faculty and the flexibility to specialize in his area of interest. “The program aligned with my personal objective of building on my knowledge of Computer Science fundamentals acquired from my previous work experience, with a specific focus on machine learning. I was able to take various courses that linked well with one another for a comprehensive understanding of the field.”

Puri received his Master’s last fall and is now working as a Deep Learning software engineer on Nvidia’s Drive IX team. “We are using data from car sensors to build a profile of the driver and the surrounding environment. With deep learning networks, we can track head movement and gaze, and converse with the driver using advanced speech recognition, lip reading, and natural language understanding. It’s a new and really exciting area and it’s great to be part of a team that’s helping drive progress.”

Michael Tong

“I was amazed by how powerful coding could be to solve all kinds of problems.”

Michael Tong

Michael Tong, Columbia Engineering’s valedictorian, came to Columbia intending to study mathematics. “You learn things from a very general perspective, but when you apply it to something specific these seemingly complicated definitions actually encode very simple and intuitive ideas.” Yet he declared for computer science. “When I took my first computer science class freshman year, Python with Professor Adam Cannon, it was like this whole new world opened up: I had no idea you could write code to do all these different things!”

In time, he gravitated toward theoretical computer science, which is not surprising given its close relationship to math. “It is cool when math techniques are used in computer science, to see, for instance, how basic algebraic geometry can define certain examples of error correcting codes. Studying theoretical computer science, especially complexity theory, makes me feel like I have not left the world of math.”

After graduation, Tong will work at Jane Street, a technology-focused quantitative trading firm, before applying to math doctoral programs.


Posted 05/16/18


Interview with Henning Schulzrinne on the coming COSMOS testbed

Henning Schulzrinne

Last month, the National Science Foundation announced that it will fund COSMOS, a large-scale, outdoor testbed for a new generation of wireless technologies and applications that will eventually replace 4G. Covering one square mile in densely populated West Harlem just north of Columbia’s Morningside Heights campus, COSMOS is a high-bandwidth and low-latency network that, coupled with edge computing, gives researchers a platform to experiment with the data-intensive applications—in robotics, immersive virtual reality, and traffic safety—that future wireless networks will enable.

Columbia is one of three New York-area universities involved (with Rutgers and New York University being the other two). Columbia’s COSMOS proposal was overseen by principal investigator Gil Zussman (Electrical Engineering) who worked with a number of Columbia professors, including Henning Schulzrinne.

A researcher in applied networking who is also investigating an overall architecture for the Internet of Things (IoT), Schulzrinne has previous experience in setting up an experimental wireless network. In 2011, he oversaw the installation of WiMAX, an early 4G contender, on the roof of Mudd. WiMAX helped demonstrate the feasibility of doing wireless experiments on campus while also setting a precedent for the Computer Science Department to work with Columbia facilities, which ran the fiber, provided power, and installed the antennas.

In this interview, Schulzrinne discusses the benefits of COSMOS to researchers and students within and outside the department.

What is your role?

I have three separate roles. In the area of licensing and public policy, I did the spectrum license applications so the campus could use frequencies in the millimeter-wave radio band—not previously used for telecommunications—and I’ll work to ensure we don’t interfere with licensees who will someday be assigned those frequencies.

The way licensing is done will change under 5G. Instead of specifically requesting licenses for certain frequencies and then waiting for permission, or maybe an auction, as is done now, spectrum will be shared dynamically as needed, a more efficient way to use a limited resource. Part of COSMOS is to better understand how sharing of spectrum will work.

In the application area, I will be looking at both stationary and mobile IoT devices, putting mobile nodes on Columbia vehicles—buses and public safety and utility vehicles—to measure noise pollution, for example. I am also working with Dan Rubenstein on testing safety precautions with connected vehicles that use wireless connectivity and looking for ways to protect pedestrians and bicyclists.

And thirdly, for many years I have collaborated with researchers in Finland on a series of wireless spectrum projects, including mobile edge computing. These projects are funded by the NSF and the Academy of Finland through the WiFiUS program (Wireless Innovation between Finland and US). COSMOS gives us the chance to deploy and study mobile edge computing on a high-bandwidth, low-latency network.

Who will be able to access the test bed?

COSMOS is unique among NSF projects in building an infrastructure available to anyone, both on- and off-campus. In principle, anyone who has an email address at a US educational institution can submit a proposal to use COSMOS for teaching and research.

While K-12 education plays an important role in this project—high school students, for example, might run experiments showing the propagation of radio waves—most educational interest and activity will likely be at the higher-education level, especially in networking and wireless classes, with both faculty and student researchers using the testbed to run experiments at a scale not possible in a lab.

What does COSMOS mean for Columbia researchers and students?

Columbia students, being physically close to the test bed, will have obvious advantages, like being able to easily enlist pedestrians or students for experiments on devices that are worn or carried.

I expect we’ll see many proposals coming out of the Computer Science Department in particular as we understand what becomes possible on these new networks where processing is done on edge servers rather than being sent up to cloud servers further away. Can we for instance control traffic lights to more efficiently move traffic along and increase safety of pedestrians and bicyclists? Can we use the network for augmented reality? Can public safety applications benefit from high-bandwidth video processing? Do the propagation characteristics force designers of network protocols to deal with short-lived connections with highly-variable transmission speed? The COSMOS testbed gives us a wireless infrastructure to find out.

Henning Schulzrinne is the Julian Clarence Levi Professor of Mathematical Methods and Computer Science at Columbia University. He is a researcher in applied networking and is particularly known for his contributions in developing the Session Initiation Protocol (SIP) and Real-Time Transport Protocol (RTP), the key protocols that enable Voice-over-IP (VoIP) and other multimedia applications. Active in public policy, he has twice served as Chief Technology Officer at the Federal Communications Commission (his most recent term ended October 2017). Currently he serves on the North American Numbering Council. Schulzrinne is a Fellow of the ACM and IEEE and a member of the Internet Hall of Fame. Among his awards, he has received the New York City Mayor’s Award for Excellence in Science and Technology, the VON Pioneer Award, the TCCC service award, the IEEE Internet Award, the IEEE Region 1 William Terry Award for Lifetime Distinguished Service to IEEE, and the UMass Computer Science Outstanding Alumni recognition.

Posted 05/14/18

In discrete choice modeling, estimating parameters from two or three—not all—options  

Daniel Hsu and his student Arushi Gupta show it is possible to build discrete choice models by estimating parameters from a small subset of options rather than having to analyze all possible options, which may easily number in the thousands or more. Using linear equations to estimate parameters in Markov chain choice models, the researchers were surprised to find they could estimate parameters from as few as two or three options. The results, theoretical for now, mean that in a world of choices, retailers and others can offer a small, meaningful set of well-selected items from which consumers are likely to find an acceptable choice even if their first choice is not available. For researchers, it means another tool to better model real-world behavior in ways not possible before.

Given several choices, what option will people choose? From the 1960s this simple question—of interest to urban planners, social scientists, and naturally, retailers—has been studied by economists and statisticians using discrete choice models, which statistically model the probabilities people will choose an option among a set of alternatives. (While an individual’s choice can’t be predicted, it is possible to study groups of people and infer patterns of behavior from sales, survey, and other data.)

In recent years, a type of discrete choice model called the multinomial logit model has been commonly used. These models, easy to fit and understand, work by associating regression coefficients with each option based on behavior observed from real-world data. If ridership data shows people choose the subway 50% of the time and the bus 50% of the time, the model’s coefficients are tweaked to assign equal probability to each option.

On the whole, multinomial logit models perform well but fail to easily accommodate the introduction of new options that substitute for existing ones. If a second bus line is added that is similar to the existing one—same route, same frequency, same cost—except that the new line has articulated buses, a multinomial logit model would treat all three alternatives independently, splitting the proportions equally and assigning each option a probability of 0.33.
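The equal-split behavior falls directly out of the model’s form. Below is a minimal sketch (illustrative only; the function name `mnl_probs` and the utility values are our own, not from the paper) of how a multinomial logit model converts option utilities into choice probabilities via a softmax, and how adding a near-duplicate bus dilutes every option equally:

```python
import math

def mnl_probs(utilities):
    """Multinomial logit: choice probabilities are a softmax of option utilities."""
    exps = [math.exp(u) for u in utilities]
    total = sum(exps)
    return [e / total for e in exps]

# Two equally attractive options (subway, bus): a 50/50 split.
print(mnl_probs([0.0, 0.0]))       # [0.5, 0.5]

# Add a near-identical second bus line: the model treats it as a third
# independent option and splits all three equally (each ~0.33), even though
# in reality the new bus should draw riders only from the old bus line.
print(mnl_probs([0.0, 0.0, 0.0]))
```

The dilution of the subway’s share from 0.5 to 0.33 is exactly the substitution failure described above.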

“But that is not what happens,” says Daniel Hsu, a computer science professor and a member of the Data Science Institute who explores new algorithmic approaches to statistical problems. “People don’t care whether a bus is articulated or not. People just want to get to work.” Train riders especially don’t care about a new bus option because a bus is not a substitute for a train, and for this reason, the coefficient for train riders should not change.

The bus-train example is small, limited to three choices. In the real world, especially in retailing, choices are so numerous as to be overwhelming. People can’t be shown all options so retailers and others must offer a small set of choices from which people will find a substitute choice they are happy with even if their most preferred choice is not included.

To better model substitution behavior, researchers in Columbia’s Industrial Engineering and Operations Research (IEOR) Department—Jose Blanchet (now at Stanford University), Guillermo Gallego, and Vineet Goyal—two years ago proposed the use of Markov chain models, which make predictions based on previous behaviors, or states. All options have probabilities as before, but the probability changes based on preceding options. What fraction of people will choose A when the previous states are B and C, or are C and D? Removing one option, the most preferred one for example, changes the chain of probabilities, providing a way to decide the next most preferred option among those remaining.

In a Markov chain choice model, the current state—here the item currently desired—determines the probabilities to assign to the next state (i.e., the next desired product).
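This substitution dynamic is easy to simulate. The sketch below is a toy three-option example with invented numbers (not taken from the paper): `lam` holds each option’s initial preference and `rho[i][j]` gives the probability of substituting option j when option i is unavailable; a consumer walks the chain until the desired option is actually offered.

```python
import random

# Toy example with three options: train, bus A, bus B (all numbers invented).
lam = [0.5, 0.3, 0.2]        # initial preference for each option
rho = [                      # rho[i][j]: substitute j when i is unavailable
    [0.0, 0.5, 0.5],         # train riders split evenly between the two buses
    [0.1, 0.0, 0.9],         # bus A riders mostly switch to bus B
    [0.1, 0.9, 0.0],         # bus B riders mostly switch to bus A
]

def simulate_choice(offered, rng):
    """Walk the chain until the currently desired option is actually offered."""
    item = rng.choices(range(len(lam)), weights=lam)[0]
    while item not in offered:
        item = rng.choices(range(len(lam)), weights=rho[item])[0]
    return item

# Remove the train: its riders spill over to the buses according to rho,
# so bus A's share becomes 0.3 + 0.5 * 0.5 = 0.55 — not an equal split.
rng = random.Random(0)
counts = [0, 0, 0]
for _ in range(100_000):
    counts[simulate_choice({1, 2}, rng)] += 1
```

Unlike the multinomial logit model, the chain routes displaced demand only to genuine substitutes.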


Switching from multinomial logit models to Markov chain choice models better captured substitution behavior, but at the expense of much more complexity; previously, n possible options meant n parameters, but in the Markov chain models, n parameters describe just the initial probabilities; each removal of an option requires recalculating all possible combinations, for a total of n squared parameters. Instead of 100 parameters, now it’s 10,000. Computers can handle this complexity; a consumer can’t and won’t.

The IEOR researchers showed it was possible to model substitution behavior by analyzing all possible options; left open was the possibility of doing so using less data, that is, some subset of the entire data set.

This was the starting point for Hsu and Arushi Gupta, a master’s student with an operations research background who had previously worked on estimating parameters for older discrete choice models; Gupta relished the opportunity to do the same for a more complex model.

“We looked at it as an algebraic puzzle,” says Hsu. “If you know two-thirds of people choose this option when these three are offered, how do you get there from the original Markov chain choice model? What are the parameters of the model that would give you this?”

Initially they set out to find an approximate solution but, changing tack, tried for the harder thing: getting a single exact solution. “Arushi discovered that all model parameters were related to the choice probabilities via linear equations, where the coefficients of the equations were given by yet other choice probabilities,” says Hsu.  “This is great because it means choice probabilities can be estimated from data. And this gives a way to potentially indirectly estimate all of the model parameters just by solving a system of equations.”

Says Gupta, “It is also quite lucky that it turned out to be just linear equations because even very big systems of linear equations are easy to solve on a computer. If the relationship had turned out to be other polynomial equations, then it would have been much more difficult.”

The final piece of the puzzle was to prove that the linear equations uniquely determined the parameters. “We crafted a new algebraic argument using the theory of a special class of matrices called M-matrices, which naturally arise in the study of Markov chains,” says Hsu. “Our proof suggested that assortments of as few as two or three items could yield linear equations that pin down the model parameters.”
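The practical upshot of linearity is that, once the system of equations is written down, recovering the parameters reduces to a single call to a linear solver. The snippet below is a schematic stand-in (random coefficients, not the paper’s actual equations) showing the shape of that computation: a known coefficient matrix built from observed choice probabilities, and one `solve` call to recover the unknowns.

```python
import numpy as np

# Schematic stand-in for the paper's system: each observed choice probability
# is a known linear combination of the unknown parameters x, giving A x = b.
rng = np.random.default_rng(0)
x_true = np.array([0.5, 0.3, 0.2])     # ground-truth model parameters

A = rng.random((3, 3)) + np.eye(3)     # coefficient matrix (random, for illustration)
b = A @ x_true                         # "observed" probabilities implied by the model

x_hat = np.linalg.solve(A, b)          # recovery is a single linear solve
print(np.allclose(x_hat, x_true))      # True: parameters recovered exactly
```

Had the relationship turned out to be a higher-degree polynomial, as Gupta notes, this one-line solve would be replaced by a far harder root-finding problem.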

Their results, detailed in Parameter identification in Markov chain choice models, showed that a small amount of data suffices to pinpoint consumer choice behavior. The paper, which was presented at last year’s Conference on Algorithmic Learning Theory, is now published in the Proceedings of Machine Learning Research.

The researchers hope their results will lead to better ways of fitting choice models to data while helping understand the strengths and weaknesses of the models used to predict consumer choice behavior.

About the Researchers

Arushi Gupta, who received her bachelor’s degree from Columbia in 2016 in computer science and operations research, will earn a master’s degree in machine learning this May. In the fall, she starts her PhD in computer science at Princeton, where she will study machine learning. In addition to Parameter identification in Markov chain choice models, she is coauthor of two additional papers, Do dark matter halos explain lensing peaks? and Non-Gaussian information from weak lensing data via deep learning.

Daniel Hsu is associate professor in the Computer Science Department and a member of the Data Science Institute, both at Columbia University, with research interests in algorithmic statistics, machine learning, and privacy. His work has produced the first computationally efficient algorithms for several statistical estimation tasks (including many involving latent variable models such as mixture models, hidden Markov models, and topic models), provided new algorithmic frameworks for solving interactive machine learning problems, and led to the creation of scalable tools for machine learning applications.

Before coming to Columbia in 2013, Hsu was a postdoc at Microsoft Research New England and in the statistics departments at Rutgers University and the University of Pennsylvania. He holds a Ph.D. in Computer Science from UC San Diego, and a B.S. in Computer Science and Engineering from UC Berkeley. Hsu has been recognized with a Yahoo ACE Award (2014), selected as one of “AI’s 10 to Watch” in 2015 by IEEE Intelligent Systems, and received a 2016 Sloan Research Fellowship. Just last year, he was made a 2017 Kavli Fellow.

Posted 05/09/18
Linda Crane

Mihalis Yannakakis elected to National Academy of Sciences

Mihalis Yannakakis

Mihalis Yannakakis, the Percy K. and Vida L. W. Hudson Professor of Computer Science at Columbia University, is among the 84 new members and 21 foreign associates just elected to the National Academy of Sciences (NAS) for distinguished and continuing achievements in original research. Established under President Abraham Lincoln in 1863, the NAS recognizes achievement in science, and NAS membership is considered one of the highest honors a scientist can receive.

Yannakakis is being recognized for fundamental contributions to theoretical computer science, particularly in algorithms and computational complexity, and applications to other areas. His research studies the inherent difficulty of computational problems, and the power and limitations of methods for solving them. For example, Yannakakis characterized the power of certain common approaches to optimization (based on linear programming), and proved that they cannot efficiently solve hard optimization problems like the famous traveling salesman problem. In the area of approximation algorithms, he defined with Christos Papadimitriou a complexity class that unified many important problems and helped explain why the research community had not been able to make progress in approximating a number of optimization problems.

“Complexity is the essence of computation and its most advanced research problem, the study of how hard computational problems can be,” says Papadimitriou. “Mihalis Yannakakis has changed in many ways, through his work during the past four decades, the way we think about complexity:  He classified several problems whose complexity had been a mystery for decades, and he identified new forms and modes of complexity, crucial for understanding the difficulty of some very fundamental problems.”

In the area of database theory, Yannakakis helped initiate the study of acyclic databases and of non-two-phase locking. In the area of computer-aided verification and testing, he laid the rigorous algorithmic and complexity-theoretic foundations of the field. His interests also extend to combinatorial optimization and algorithmic game theory.

For the significance, impact, and astonishing breadth of his contributions to theoretical computer science, Yannakakis was awarded the seventh Donald E. Knuth Prize in 2005. He is also a member of the National Academy of Engineering, of Academia Europaea, a Fellow of the ACM, and a Bell Labs Fellow.

Yannakakis received his PhD from Princeton University. Prior to joining Columbia in 2004, he was Head of the Computing Principles Research Department at Bell Labs and at Avaya Labs, and Professor of Computer Science at Stanford University. He has served on the editorial boards of several journals, including as the past editor-in-chief of the SIAM Journal on Computing, and has chaired various conferences, including the IEEE Symposium on Foundations of Computer Science, the ACM Symposium on Theory of Computing and the ACM Symposium on Principles of Database Systems.


Posted 05/02/18