Assistant professor Baishakhi Ray has won a VMware Early Career Faculty Award to develop machine learning tools that will improve software security. The grant program recognizes the next generation of exceptional faculty members. The gift is made in support of early-career faculty’s research and to promote excellence in teaching.
In today’s world, almost every aspect of our lives is controlled by software. Unfortunately, most software tends to be buggy, and bugs often threaten even the most safety- and security-critical systems. According to a recent report, 50% of software developers’ valuable time is spent finding and fixing bugs, costing the global economy around US$1.1 trillion in 2016 alone.
“The goal of my research is to address this problem and figure out how to automatically detect and fix bugs to improve software robustness, for both traditional and machine learning-based software,” said Ray, who joined the department in 2018.
In particular, her research will address two main challenges of software robustness: (i) traditional software has numerous implicit and explicit specifications, and it is often difficult to know all of them in advance; and (ii) with the advent of machine learning-based systems (e.g., self-driving cars), explicitly providing such specifications for natural inputs, like images and text, is very hard.
Ray’s plan is a two-pronged approach. First, she and her team will build novel machine learning models to learn implicit specifications/rules from traditional programs and leverage these rules to detect and fix bugs automatically. However, such techniques are not easily extendable to machine learning-based systems as they follow different software paradigms (e.g., finite state machine vs. neural network). To improve the robustness of such systems, they will also devise new analysis techniques to examine the internal states of the models for potential violations.
“A successful outcome of this project will produce new techniques to detect and remediate software errors and vulnerabilities with increased accuracy to make software more secure,” said Ray.
IBM has selected assistant professor Baishakhi Ray for an IBM Faculty Award. The highly selective award is given to professors in leading universities worldwide to foster collaboration with IBM researchers. Ray will use the funds to continue research on artificial intelligence-driven program analysis to understand software robustness.
Although much research has been done, countless vulnerabilities still leave systems brittle. Hidden vulnerabilities are discovered all the time, either through a system hack or by monitoring a system’s functionality. Ray is working to automatically detect system weaknesses using artificial intelligence (AI) with her project, “Improving code representation for enhanced deep learning to detect and remediate security vulnerabilities”.
One of the major challenges in AI-based security vulnerability detection is finding the best source code representation that can distinguish between vulnerable and benign code. Such a representation can then be used as input in supervised learning settings for automatic vulnerability detection and fixes. Ray is tackling this problem by building new machine-learning models for source code and applying machine learning techniques such as code embeddings. This approach could open new ways of encoding source code into feature vectors.
“It will provide new ways to make systems secure,” said Ray, who joined the department in 2018. “The goal is to reduce the hours of manual effort spent in automatically detecting vulnerabilities and fixing them.”
A successful outcome of this project will produce a new technique to encode source code with associated trained models that will be able to detect and remediate a software vulnerability with increased accuracy.
IBM researchers Jim Laredo and Alessandro Morari will collaborate closely with Ray and her team on opportunities around design, implementation, and evaluation of this research.
The department welcomes Baishakhi Ray, Ronghui Gu, Carl Vondrick, and Tony Dear.
Baishakhi Ray
Assistant Professor, Computer Science
PhD, University of Texas, Austin, 2013; MS, University of Colorado, Boulder, 2009; BTech, Calcutta University, India, 2004; BSc, Presidency College, India, 2001
Baishakhi Ray works on end-to-end software solutions and considers the entire software system: everything from debugging, patching, security, and performance to development methodology and even the user experience of developers and users.
At the moment her research is focused on machine learning bias. For example, some models see a picture of a baby and a man and identify it as a woman and child. Her team is developing ways to train such systems and to solve practical problems.
Ray previously taught at the University of Virginia and was a postdoctoral fellow in computer science at the University of California, Davis. In 2017, she received Best Paper Awards at the SIGSOFT Symposium on the Foundations of Software Engineering and the International Conference on Mining Software Repositories.
Ronghui Gu
Assistant Professor, Computer Science
PhD, Yale University, 2017; Tsinghua University, China, 2011
Ronghui Gu focuses on programming languages and operating systems, specifically language-based support for safety and security, certified system software, certified programming and compilation, formal methods, and concurrency reasoning. He seeks to build certified concurrent operating systems that can resist cyberattacks.
Gu previously worked at Google and co-founded CertiK, a formal verification platform for smart contracts and blockchain ecosystems. The startup grew out of his thesis, which proposed CertiKOS, a comprehensive verification framework. CertiKOS is used in the high-profile DARPA programs CRASH and HACMS, is a core component of the NSF Expeditions in Computing project DeepSpec, and has been widely considered “a real breakthrough” toward hacker-resistant systems.
Carl Vondrick
Assistant Professor, Computer Science
PhD, Massachusetts Institute of Technology, 2017; BS, University of California, Irvine, 2011
Carl Vondrick’s research focuses on computer vision and machine learning. His work often uses large amounts of unlabeled data to teach perception to machines. Other interests include interpretable models, high-level reasoning, and perception for robotics.
His past research developed computer systems that watch video in order to anticipate human actions, recognize ambient sounds, and visually track objects. Computer vision is enabling applications across health, security, and robotics, but these applications currently require large labeled datasets to work well, and such datasets are expensive to collect. Instead, Vondrick’s research develops systems that learn from unlabeled data, which will enable computer vision systems to efficiently scale up and tackle versatile tasks. His research has been featured on CNN and Wired and in a skit on the Late Show with Stephen Colbert, for training computer vision models through binge-watching TV shows.
Recently, three research papers he worked on were presented at the European Conference on Computer Vision (ECCV). Vondrick comes to Columbia from Google Research, where he was a research scientist.
Tony Dear
Lecturer in Discipline, Computer Science
PhD, Carnegie Mellon University, 2018; MS, Carnegie Mellon University, 2015; BS, University of California, Berkeley, 2012
Tony Dear’s research and pedagogical interests lie in bringing theory into practice. In his PhD research, this idea motivated the application of analytical tools to motion planning for “real” or physical locomoting robotic systems that violate certain ideal assumptions but still exhibit some structure: how to get unconventional robots to move around with the stealth of animals and other biological organisms. His work also asks how to simplify these tools and extend them to other systems, and how to generalize mathematical models so they can be used across multiple robots.
In his teaching, Dear strives to engage students with relatable examples and projects and with alternative ways of learning, such as an online curriculum with lecture videos. He completed the Future Faculty Program at the Eberly Center for Teaching Excellence at Carnegie Mellon and has been the recipient of a National Defense Science and Engineering Graduate Fellowship.
At Columbia, Dear is looking forward to teaching computer science, robotics, and AI. He hopes to continue small-scale research projects in robotic locomotion and conduct outreach to teach teens STEM and robotics courses.