One Thing Leads to Another

Teaching AI to distinguish between causation and correlation would be a game changer— well, the game may be about to change.

A simple model of the brain provides new directions for AI research

Google Research held an online workshop on the conceptual understanding of deep learning. The workshop discussed how new findings in deep learning and neuroscience can help create better artificial intelligence systems. Christos Papadimitriou discussed how our growing understanding of information-processing mechanisms in the brain might help create algorithms that are more robust in understanding and engaging in conversations. Papadimitriou presented a simple and efficient model that explains how different areas of the brain inter-communicate to solve cognitive problems.

Heroes of Natural Language Processing

Professor Kathy McKeown talks with DeepLearning.AI’s Andrew Ng about how she started in artificial intelligence (AI), her research projects, how her understanding of AI has changed through the decades, and AI career advice for learners of NLP. 

Can AI Help Doctors Predict and Prevent Preterm Birth?

Almost 400,000 babies were born prematurely—before 37 weeks gestation—in 2018 in the United States. One of the leading causes of newborn deaths and long-term disabilities, preterm birth (PTB) is considered a public health problem with deep emotional and challenging financial consequences to families and society. If doctors were able to use data and artificial intelligence (AI) to predict which pregnant women might be at risk, many of these premature births might be avoided.

CS Professors Part of the Inaugural J.P. Morgan Faculty Research Awards

The J.P. Morgan AI Research Awards 2019 partners with research thinkers across artificial intelligence. The program is structured as a gift that funds a year of study for a graduate student.

Prediction semantics and interpretations that are grounded in real data
Principal Investigator: Daniel Hsu Computer Science Department & Data Science Institute

The importance of transparency in predictive technologies is by now well-understood by many machine learning practitioners and researchers, especially for applications in which predictions may have serious impacts on human lives (e.g., medicine, finance, criminal justice). One common approach to providing transparency is to ensure interpretability in the models and predictions produced by an application, or to accompany predictions with explanations. Interpretations and explanations may help individuals understand predictions that affect them, and also help developers reason about failure cases of their applications.

However, there are numerous possibilities for what constitutes a suitable interpretation or explanation, and the semantics of such provided by existing systems are not always clear. 

Suppose, for example, that a bank uses a linear model to predict whether or not a loan applicant will forfeit on a loan. A natural strategy is to seek a sparse linear model, which are often touted as highly interpretable. However, attributing significance to variables with non-zero regression coefficients (e.g., zip-code) and not others (e.g., race, age) is suspect when variables may be correlated. Moreover, an explanation based on pointing to individual variables or other parameters of a model ignores the source of the model itself: the training data (e.g., a biased history of borrowers and forfeiture outcomes) and the model fitting procedure. Invalid or inappropriate explanations may create a “transparency fallacy” that creates more problems than are solved.

The researchers propose a general class of mechanisms that provide explanations based on training or validation examples, rather than any specific component or parameters of a predictive model. In this way, the explanation will satisfy two key features identified in successful human explanations: the explanation will be contrastive, allowing an end-user to compare the present data to the specific examples chosen from the training or validation data, and the explanation will be pertinent to the actual causal chain that results in the prediction in question. These features are missing in previous systems that seek to explain predictions based on machine learning methods.

“We expect this research to lead to new methods for interpretable machine learning,” said Daniel Hsu, the principal investigator of the project. Because the explanations will be based on actual training examples, the methods will be widely applicable, in essentially any domain where examples can be visualized or communicated to a human. He continued, “This stands in contrast to nearly all existing methods for explanatory machine learning, which either require strong assumptions like linearity or sparsity, or do not connect to the predictive model of interest or the actual causal chain leading to a given prediction of interest.”

Efficient Formal Safety Analysis of Neural Networks
Principal Investigators: Suman Jana Computer Science Department, Jeannette M. Wing Computer Science Department & Data Science Institute, Junfeng Yang Computer Science Department

Over the last few years, artificial intelligence (AI), in particular Deep Learning (DL) and Deep Neural Networks (DNNs), has made tremendous progress, achieving or surpassing human-level performance for a diverse set of tasks including image classification, speech recognition, and playing games such as Go. These advances have led to widespread adoption and deployment of DL in critical domains including finance, healthcare, autonomous driving, and security. In particular, the financial industry has embraced AI in applications ranging from portfolio management (“Robo-Advisor”), algorithmic trading, fraud detection, loan and insurance underwriting, sentiment and news analysis, customer service, to sales.  

“Machine learning models are used in more and more safety and security-critical applications such as autonomous driving and medical diagnosis,” said Suman Jana, one of the principal investigators of the project. “Yet they are known to be fragile and frequently mispredicts on edge cases.“ 

In many critical domains including finance and autonomous driving, such incorrect behaviors can lead to disastrous consequences such as a gigantic loss in automated financial trading or a fatal collision of a self-driving car. For example, in 2016, a Google self-driving car crashed into a bus because it expected the bus to yield under a set of rare conditions but the bus did not. Also in 2016, a Tesla car in autopilot crashed into a trailer because the autopilot system failed to recognize the trailer as an obstacle due to its ‘white color against a brightly lit sky’ and the ‘high ride height.’

Before AI can become the next technological revolution, it must be robust against such corner-case inputs and does not cause disasters. The researchers believe AI robustness is one of the biggest challenges that needs to be solved in order to fully tame AI for good.

“Our research aims to create novel tools to verify that a machine learning model will not mispredict on certain important input ranges, ensuring safety and security,” said Junfeng Yang, one of the investigators of the research. 

The proposed work enables rigorous analysis of autonomous AI systems and machine learning (ML) algorithms, enabling data scientists to (1) verify that their AI models function correctly within certain input regions and violate no critical properties they specify (e.g., bidding price is never higher than a given maximum) or (2) locate all sub-regions where their models misbehave and repair their model accordingly. This capability will also enable data scientists to explain and interpret the outputs from autonomous AI systems and ML algorithms by understanding how different input regions may lead to different output predictions. Said Yang,”If successful, our work will dramatically boost the robustness, explainability, and interpretability of today’s autonomous AI systems and ML algorithms, benefiting virtually every individual, business, and government that relies on AI and ML.”