Peter Allen and Eugene Wu of Columbia’s Computer Science Department are each recipients of a Google Faculty Research Award. This highly competitive award (15% acceptance rate) recognizes and financially supports university faculty working on research in fields of interest to Google. The award amount, given as an unrestricted gift, is designed to support the cost of one graduate student for one year. The intent is for projects funded through this award to be made openly available to the research community.
Peter Allen: Visual-Tactile Integration for Reinforcement Learning of Robotic Grasping
To incorporate a visual-tactile learning component into a robotic grasping and manipulation system, Peter Allen will receive $71K. A Professor of Computer Science at Columbia, Allen heads the Columbia University Robotics Group.
In grasping objects, robots for the most part rely on visual feedback to locate an object and guide the grasping movements. But visual information by itself is not sufficient to achieve a stable grasp: it cannot measure how much force or pressure to apply and is of limited value when an object is partially occluded or hidden at the bottom of a gym bag. Humans, by contrast, rely as much on touch as on vision, if not more, easily grasping objects sight unseen and feeling an object to deduce its shape, apply the right amount of pressure and force, and detect when it begins to slip from grasp.
With funding provided by Google, Allen will add simulated rich tactile sensor data (both capacitive and piezoelectric) to robotic hand simulators, capturing and transforming low-level sensor data into high-level information useful for grasping and manipulation. This high-level information can then be used to build reinforcement learning (RL) algorithms that enable an agent to find stable grasps of complex objects using multi-fingered hands. The RL system can learn a grasping control policy and an ability to reason about 3D geometry from both raw depth and tactile sensory observations.
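As a rough illustration of the kind of policy such an RL system would learn, the sketch below fuses a depth image with a tactile sensor array to produce finger-torque commands. All dimensions, layer shapes, and the random weights are assumptions made for illustration; they stand in for parameters that reinforcement learning would actually optimize, and the architecture is not the project's actual design.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(x, w):
    """One linear layer + ReLU, standing in for a learned encoder."""
    return np.maximum(w @ x, 0.0)

# Randomly initialized weights stand in for parameters learned via RL.
w_depth = rng.standard_normal((32, 64 * 64)) * 0.01   # depth-image encoder
w_tactile = rng.standard_normal((32, 24)) * 0.01      # tactile-array encoder
w_policy = rng.standard_normal((4, 64)) * 0.01        # 4 finger-torque outputs

def grasp_policy(depth_image, tactile_readings):
    """Map raw depth + tactile observations to a grasp action."""
    z = np.concatenate([
        encode(depth_image.ravel(), w_depth),     # visual features
        encode(tactile_readings, w_tactile),      # tactile features
    ])
    return np.tanh(w_policy @ z)                  # torques squashed to [-1, 1]

depth = rng.random((64, 64))   # simulated depth-camera frame
touch = rng.random(24)         # simulated capacitive/piezoelectric taxel array
action = grasp_policy(depth, touch)
print(action.shape)  # (4,)
```

The point of the fusion step is the one made in the text: neither sensing modality alone suffices, so the policy conditions on a joint embedding of both.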
This project builds on previous work by Allen’s group on shape completion (also funded by a Google research grant), where an occluded object’s complete 3D shape is inferred by comparing it to hundreds of thousands of models contained within a data set. The shape completion work trained a convolutional neural network (CNN) on simulated 3D vision data: hundreds of thousands of views of models of everyday objects. This allowed the CNN to generalize a full 3D model from a single limited view. With this project, the CNN will now be trained on both visual and tactile information to produce more realistic simulation environments that can be used by Google and others (the project code and datasets will be open source) for more accurate robot grasping.
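The shape-completion training setup can be illustrated with a toy occupancy-grid example. The sketch below is not the group's CNN pipeline; it only shows how a single depth view yields a "partial" voxel grid, which together with the full grid forms one (input, target) training pair for a completion network. The 8×8×8 grid and the cube-shaped object are hypothetical.

```python
import numpy as np

def single_view_voxels(full):
    """Keep only the first occupied voxel along each viewing ray (axis 0),
    simulating what a single depth-camera view reveals of the object."""
    seen = np.zeros_like(full)
    first = full.argmax(axis=0)    # index of first occupied voxel per ray
    hit = full.any(axis=0)         # rays that intersect the object at all
    ys, zs = np.nonzero(hit)
    seen[first[ys, zs], ys, zs] = True
    return seen

# A toy "object": an axis-aligned cube inside an 8x8x8 occupancy grid.
full = np.zeros((8, 8, 8), dtype=bool)
full[2:6, 2:6, 2:6] = True

partial = single_view_voxels(full)
# (partial, full) would be one training pair for a completion CNN:
# the network sees `partial` and must predict `full`.
print(int(partial.sum()), int(full.sum()))  # 16 64
```

The camera sees only the 16 voxels of the cube's front face; the network's job is to recover all 64, which is why training needs very many (view, model) pairs.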
Eugene Wu: NeuroFlash: System to Inspect, Check, and Monitor Deep Neural Networks
For NeuroFlash, a system to explain, check, and monitor deep neural networks, Eugene Wu will receive $69K. An Assistant Professor of Computer Science at Columbia, Wu heads the WuLab research group and is a member of the Data Science Institute, where he co-chairs the Center for Data, Media and Society.
Deep neural networks, consisting of many layers of interacting neurons, are famously difficult to debug. Unlike traditional software programs, which are structured into modular functions that can be separately interpreted and debugged, a deep neural network is more akin to a single, massive block of code, often described as a black box: its logic is smeared across the behavior of thousands of neurons, with no clear way to disentangle their interactions. Without an ability to reason about how neural networks make decisions, it is not possible to understand if and how they can fail, for example, when used for self-driving cars or for managing critical infrastructure.
To bring modularity to deep neural networks, NeuroFlash aims to identify functional logic components within the network. Using a method similar to MRI in medical diagnosis, which looks for brain neuron activations when humans perform specific tasks, NeuroFlash observes the behavior of neural network neurons to identify which ones are mimicking a user-provided high-level function (e.g., identify sentiment, detect a vertical line, parse a verb). To verify that the matched neurons indeed implement the function’s behavior, and that the match is not spurious, NeuroFlash alters the neural network in systematic ways and compares the resulting behavior with the function’s.
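The probing-and-ablation idea can be caricatured in a few lines. The code below is a drastically simplified, hypothetical stand-in for NeuroFlash, not its actual algorithm: it scores each hidden unit of a toy random network by how well its activations correlate with a user-provided high-level function, then ablates the best-matching unit to check whether the network's output genuinely depends on it. The network, the target function, and all sizes are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def target_fn(x):
    """User-provided high-level function, e.g. 'is the first feature positive'."""
    return (x[:, 0] > 0).astype(float)

# A tiny random two-layer network; weights are illustrative stand-ins.
W1 = rng.standard_normal((8, 4))
W2 = rng.standard_normal(8)

def hidden(x):
    return np.maximum(x @ W1.T, 0.0)   # ReLU activations of 8 hidden units

X = rng.standard_normal((200, 4))      # a batch of probe inputs
H = hidden(X)
t = target_fn(X)

# Score each unit: correlation between its activations and the target function.
scores = np.array([
    abs(np.corrcoef(H[:, j], t)[0, 1]) if H[:, j].std() > 0 else 0.0
    for j in range(H.shape[1])
])
best = int(scores.argmax())

# Ablation check: zero out the matched unit and measure how much the
# network's output changes, to confirm the match is not spurious.
out_full = H @ W2
H_ablated = H.copy()
H_ablated[:, best] = 0.0
out_ablated = H_ablated @ W2
print(best, float(np.mean(np.abs(out_full - out_ablated))))
```

A real system would probe far richer functions across many layers, but the two-phase structure (correlate to find candidate neurons, then perturb the network to validate them) is the idea described above.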
The ability to attribute high-level logic to portions of a deep neural network forms the basis for being able to introduce “modules” into neural networks; test and debug these modules in isolation in order to ensure they work when deployed; and develop monitoring tools for machine learning pipelines.
Work on the project is driven by Thibault Sellam, Kevin Lin, Ian Huang, and Carl Vondrick and complements recent work from Google on model attribution, where neuron output is attributed to specific input features. With a focus on scalability, NeuroFlash generalizes this idea by attributing logical functions.
Wu will make all code and data open source, hosting it on GitHub for other researchers to access.
– Linda Crane