Professor Kathy McKeown talks with DeepLearning.AI’s Andrew Ng about how she started in artificial intelligence (AI), her research projects, how her understanding of AI has changed through the decades, and AI career advice for learners of NLP.
Almost 400,000 babies were born prematurely—before 37 weeks gestation—in 2018 in the United States. One of the leading causes of newborn deaths and long-term disabilities, preterm birth (PTB) is considered a public health problem with deep emotional and challenging financial consequences to families and society. If doctors were able to use data and artificial intelligence (AI) to predict which pregnant women might be at risk, many of these premature births might be avoided.
The J.P. Morgan AI Research Awards 2019 partners with research thinkers across artificial intelligence. The program is structured as a gift that funds a year of study for a graduate student.
Prediction semantics and interpretations that are
grounded in real data
Principal Investigator: Daniel Hsu Computer Science Department & Data Science Institute
The importance of transparency in predictive technologies is by now well-understood by many machine learning practitioners and researchers, especially for applications in which predictions may have serious impacts on human lives (e.g., medicine, finance, criminal justice). One common approach to providing transparency is to ensure interpretability in the models and predictions produced by an application, or to accompany predictions with explanations. Interpretations and explanations may help individuals understand predictions that affect them, and also help developers reason about failure cases of their applications.
However, there are numerous possibilities for what constitutes a suitable interpretation or explanation, and the semantics of such provided by existing systems are not always clear.
Suppose, for example, that a bank uses a linear model to predict whether or not a loan applicant will forfeit on a loan. A natural strategy is to seek a sparse linear model, which are often touted as highly interpretable. However, attributing significance to variables with non-zero regression coefficients (e.g., zip-code) and not others (e.g., race, age) is suspect when variables may be correlated. Moreover, an explanation based on pointing to individual variables or other parameters of a model ignores the source of the model itself: the training data (e.g., a biased history of borrowers and forfeiture outcomes) and the model fitting procedure. Invalid or inappropriate explanations may create a “transparency fallacy” that creates more problems than are solved.
The researchers propose a general class of mechanisms that provide explanations based on training or validation examples, rather than any specific component or parameters of a predictive model. In this way, the explanation will satisfy two key features identified in successful human explanations: the explanation will be contrastive, allowing an end-user to compare the present data to the specific examples chosen from the training or validation data, and the explanation will be pertinent to the actual causal chain that results in the prediction in question. These features are missing in previous systems that seek to explain predictions based on machine learning methods.
“We expect this research to lead to new methods for interpretable machine learning,” said Daniel Hsu, the principal investigator of the project. Because the explanations will be based on actual training examples, the methods will be widely applicable, in essentially any domain where examples can be visualized or communicated to a human. He continued, “This stands in contrast to nearly all existing methods for explanatory machine learning, which either require strong assumptions like linearity or sparsity, or do not connect to the predictive model of interest or the actual causal chain leading to a given prediction of interest.”
Efficient Formal Safety Analysis of Neural Networks
Principal Investigators: Suman Jana Computer Science Department, Jeannette M. Wing Computer Science Department & Data Science Institute, Junfeng Yang Computer Science Department
Over the last few years, artificial intelligence (AI), in particular Deep Learning (DL) and Deep Neural Networks (DNNs), has made tremendous progress, achieving or surpassing human-level performance for a diverse set of tasks including image classification, speech recognition, and playing games such as Go. These advances have led to widespread adoption and deployment of DL in critical domains including finance, healthcare, autonomous driving, and security. In particular, the financial industry has embraced AI in applications ranging from portfolio management (“Robo-Advisor”), algorithmic trading, fraud detection, loan and insurance underwriting, sentiment and news analysis, customer service, to sales.
“Machine learning models are used in more and more safety and security-critical applications such as autonomous driving and medical diagnosis,” said Suman Jana, one of the principal investigators of the project. “Yet they are known to be fragile and frequently mispredicts on edge cases.“
In many critical domains including finance and autonomous driving, such incorrect behaviors can lead to disastrous consequences such as a gigantic loss in automated financial trading or a fatal collision of a self-driving car. For example, in 2016, a Google self-driving car crashed into a bus because it expected the bus to yield under a set of rare conditions but the bus did not. Also in 2016, a Tesla car in autopilot crashed into a trailer because the autopilot system failed to recognize the trailer as an obstacle due to its ‘white color against a brightly lit sky’ and the ‘high ride height.’
Before AI can become the next technological revolution, it must be robust against such corner-case inputs and does not cause disasters. The researchers believe AI robustness is one of the biggest challenges that needs to be solved in order to fully tame AI for good.
“Our research aims to create novel tools to verify that a machine learning model will not mispredict on certain important input ranges, ensuring safety and security,” said Junfeng Yang, one of the investigators of the research.
The proposed work enables rigorous analysis of autonomous AI systems and machine learning (ML) algorithms, enabling data scientists to (1) verify that their AI models function correctly within certain input regions and violate no critical properties they specify (e.g., bidding price is never higher than a given maximum) or (2) locate all sub-regions where their models misbehave and repair their model accordingly. This capability will also enable data scientists to explain and interpret the outputs from autonomous AI systems and ML algorithms by understanding how different input regions may lead to different output predictions. Said Yang,”If successful, our work will dramatically boost the robustness, explainability, and interpretability of today’s autonomous AI systems and ML algorithms, benefiting virtually every individual, business, and government that relies on AI and ML.”
They could have been at the beach enjoying the summer. Instead, high school students gathered from across New York City and New Jersey for the AI4All program hosted by the Columbia community. The students came to learn about artificial intelligence (AI) but this program had a special twist – computer science (CS) and social work concepts were combined for a deeper, more meaningful look at AI.
“We created a space for young people to think critically about the social implications of artificial intelligence for the communities that they live in,” said Desmond Patton, the program co-director and associate professor of the School of Social Work. “We wanted them to understand how things like race, power, privilege and oppression can be baked into algorithms and their adverse effects on communities.”
The program participants, composed of 9th, 10th and 11th graders, are from racial and ethnic groups underrepresented in AI: Black, Hispanic, and Asian. Girls as well as youth from lower-income backgrounds were particularly encouraged to apply. For three weeks the students attended lectures, went on field trips to visit local companies (LinkedIn and Samsung) involved in the program, and visited other AI4All programs, like at Princeton University. Their work culminated in a final project which they presented to their classmates, mentors, and industry professionals.
“I believe that it is important to bring more ethics to AI,” said Augustin Chaintreau, the program co-director and a CS assistant professor. He sees ethics integrated into technical concepts and taught at the same time. Instead of learning about the social consequences and fixing it after, to solve an issue. Shared Chaintreau, “It shouldn’t be thought about just in passing but as a central part of why this is a tool and its implications.”
An interdisciplinary approach to AI was even part of how the classes were structured. Technical CS concepts, such as machine learning and Python, were taught in the morning by professors and student volunteers. While in the afternoon, guest speakers came to talk about their perspective to the day’s lesson. So, on the same day, students learned about supervised and unsupervised learning, and in the afternoon, someone who was formerly incarcerated described how the criminal policing that survey people on social media had a role in making a case against them.
“We were learning college courses meant to be taught in a month but for us it was just a couple of weeks and that was really impressive,” said Genesis Lopez, who is part of the robotics team at her school. Lopez loves robotics but works more on the mechanical side. She goes back to the team knowing how to use Python and is confident she can step up and code if needed. Continued Lopez, “I learned a lot but my favorite part was the people, we became a family.”
Most people take for granted that when they speak, they will be heard and understood. But for the millions who live with speech impairments caused by physical or neurological conditions, trying to communicate with others can be difficult and lead to frustration. While there have been a great number of recent advances in automatic speech recognition (ASR; a.k.a. speech-to-text) technologies, these interfaces can be inaccessible for those with speech impairments. Further, applications that rely on speech recognition as input for text-to-speech synthesis (TTS) can exhibit word substitution, deletion, and insertion errors. Critically, in today’s technological environment, limited access to speech interfaces, such as digital assistants that depend on directly understanding one’s speech, means being excluded from state-of-the-art tools and experiences, widening the gap between what those with and without speech impairments can access.
Kara Schechtman (CC ‘19) has been selected as one of the recipients of the Senator George J. Mitchell Scholarship Program to pursue a one-year postgraduate degree in Ireland. The fellowship is awarded to 12 individuals between the ages of 18 and 30 by the US-Ireland Alliance. Schechtman is headed to Trinity College Dublin where she will study philosophy.
“Artificial intelligence is advancing and we are at a point where ethics has to be considered,” said Schechtman, who is majoring in English and computer science at Columbia. “I believe studying philosophy will help me prepare for further studies in computer science.”
A point of frustration for her has been finding an area in computer science where both the humanities and computing overlap in a way that fits her interests. In artificial intelligence (AI), computing and philosophical questions can overlap, an intersection she finds fulfilling.
The development of AI poses a whole gamut of challenges to humanity, ranging from legislation challenges, to AI bias, to ethical concerns about potential machine consciousness, and even to potential existential threat. But these challenges have remained while the technical developments of AI has grown leaps and bounds.
One thing Schechtman hopes to answer through her studies is how society can act responsibly despite all that is unknown. “I think it also demands technical expertise to suggest actionable paths for responsibility now,” continued Schechtman. “Which is why it is so important for computer scientists and philosophers to work together, and for some people to study both.”
The Ireland location of the fellowship is also ideal. Dublin hosts the European Union headquarters of a number of tech giants. And having double majored in English, she looks forward to “nerding out” over Samuel Beckett and James Joyce in their home country.
“More broadly, the circumstances couldn’t be better — the other fellowship recipients seem amazing and I can’t wait to get to know them better, Trinity College Dublin is a wonderful school, and I am sure I will have a lot of fun exploring Ireland. I’m excited to grow from the experience in ways I don’t yet even expect,” said Schechtman.
Kai-Fu Lee (B.S. ’83) included in WIRED’s anniversary issue for his work that brings humanity to artificial intelligence.