Columbia University NLP Seminar

The seminar is a series of talks by both invited faculty and student speakers. All are welcome to attend.

This semester (Spring 2023), the standard time is Monday, 4:00-5:00pm ET. However, several talks are scheduled on other days or at other times, so please check the calendar below carefully so you don't miss any talks!

The seminar will be hybrid this semester. In-person meetings will be held in the CS Conference room (unless noted otherwise). The Zoom link will be sent out to the NLP mailing list. If you are not on the mailing list but would like the link, please email us.

The seminar is co-organized by Emily Allaway and Fei-Tzin Lee. Please contact us with any questions.

Upcoming Talk: Tao Yu (Friday April 28 @ 3:30pm)

Building Natural Language Interfaces through Grounding Language Models into Executable Actions

Abstract: Executable language grounding focuses on mapping natural language instructions into code or actions executable within real-world contexts, including databases, web applications, and robotic environments. The field of natural language processing (NLP) has recently experienced significant advancements, particularly in language grounding, facilitated by large language models (LLMs) such as Codex, GPT-4, and ChatGPT-Plugins. This progress paves the way for the development of next-generation natural language interfaces. In this presentation, I will discuss our latest efforts to harness the capabilities of LLMs, predominantly Codex/GPT-4, to create natural language interfaces capable of addressing a broader spectrum of data analysis requirements. Firstly, I will explicate a neural-symbolic framework based on Codex, which enhances code generation to address a more diverse range of questions by incorporating API calls into LLMs within programming languages (e.g., SQL, Python). In the subsequent portion of the talk, I will introduce our recent endeavors in data science code generation using LLMs, which produce code solutions in response to StackOverflow inquiries about data science Python libraries, including NumPy and Pandas. Lastly, I will talk about ongoing and future research prospects in this domain.

Bio: Tao Yu is an Assistant Professor of Computer Science at The University of Hong Kong and serves as Co-Director of the HKU NLP group. His main research interest is in natural language processing. He completed his Ph.D. at Yale University and was a postdoctoral fellow in the UW NLP group at the University of Washington. His research aims to develop and design the next generation of natural language interfaces employing large language models to facilitate human interaction with data analysis, web applications, and robotic instruction through conversation. It involves executable language grounding, such as semantic parsing and code generation, efficient and generalizable large language models, and interactive systems. Tao is the recipient of the Google Research Scholar Award and the Amazon Research Award.

Spring 2023

Feb. 6: Weiyan Shi (PhD Student @ Columbia) -- Interactive AI Systems Specialized in Social Influence
Feb. 8 (Wed @ 10:30am): Isabelle Augenstein (Faculty @ University of Copenhagen) -- Beyond Fact Checking - Modeling Information Change in Scientific Communication
Feb. 16 (Thur @ 3pm): Preslav Nakov (Faculty @ Mohamed bin Zayed University of Artificial Intelligence) -- Fighting the Global Social Media Infodemic: from Fake News to Harmful Content
Feb. 20: Arya McCarthy (PhD Student @ JHU) -- Learning, Projection, and Translation of 1000+ Languages
Feb. 27: Benjamin Ruppik (Post-doc @ Heinrich-Heine University) -- Topological Data Analysis
Mar. 20: Shi Feng (Post-doc @ University of Chicago) -- Pragmatic Interpretability
Mar. 27: Kolya Malkin (Post-doc @ Mila) -- Probabilistic inference for reasoning with large language models
April 3: Max Chen (PhD Student @ Columbia) -- Prompting for Conversation Synthesis in Low-Resource Conversational Tasks
April 17: Tuhin Chakrabarty and Arkadiy Saakyan (PhD Students @ Columbia) -- I Spy a Metaphor: Large Language Models and Diffusion Models Co-Create Visual Metaphors
April 24: Wonjin Yoon (Researcher @ Harvard Medical School/Boston Children's Hospital) -- Characteristics, Methods and applications of Biomedical Natural Language Processing
April 28 (Friday @ 3:30pm): Tao Yu (Faculty @ The University of Hong Kong) -- Building Natural Language Interfaces through Grounding Language Models into Executable Actions
May 1: Pascale Fung (Faculty @ Hong Kong University of Science and Technology) -- Safer Generative ConvAI
May 15: Emily Allaway (PhD Student @ Columbia University)

Fall 2022

Sept. 6: Aline Villavicencio (Faculty @ University of Sheffield) -- Computational modelling of Multiword Expressions and idiomaticity
Sept. 13: Matthew Marge (Researcher @ DEVCOM Army Research Laboratory) -- Robot Concept Learning in Situated Dialogue
Sept. 21: He He (Faculty @ NYU) -- Methods and applications for controllable text generation
Sept. 29: Alex Tamkin (PhD Student @ Stanford) -- Self-Supervised Learning for the Real World
Oct. 5: Jason Weston (Researcher @ Meta AI) -- The quest to build open-source open-domain conversational agents
Oct. 12: Violet Peng (Faculty @ UCLA) -- Controllable Text Generation For Open-Domain Creativity
Oct. 13: Wei Xu (Faculty @ Georgia Tech) -- Capturing Human Language Diversity & (Mis-)Information Spreading Online
Oct. 17: Alan Ritter (Faculty @ Georgia Tech) -- Towards Economically Efficient Use of Pre-Trained Language Models
Oct. 19: Pamela Mishkin (Researcher @ OpenAI) -- Co-designing generative systems without users
Oct. 24: Ruiqi Zhong (PhD Student @ Berkeley) -- Supervising AI to Do What I Can't Do
Nov. 14: Thomas Scialom (Researcher @ FAIR - Meta AI) -- Large Language Models, Instruction Tuning and possible applications to Science

Spring 2022

Feb. 1: Dragomir Radev (Faculty @ Yale University) -- Closing the loop in natural language interfaces to databases: parsing, dialogue, and generation
Feb. 8: Ellie Pavlick (Faculty @ Brown University) -- Implementing Symbols and Rules with Neural Networks
Feb. 22: Faisal Ladhak (PhD Student @ Columbia University) -- Towards Better Practices for Evaluation in Text Generation
Mar. 8: Karthik Narasimhan (Faculty @ Princeton University) -- Language-guided machine learning
Mar. 22: Mohit Bansal (Faculty @ UNC Chapel Hill) -- Knowledgeable & Spatial-Temporal Vision+Language
Mar. 29: Kyunghyun Cho (Faculty @ NYU) -- Learned data augmentation in natural language processing
Apr. 19: Liang Huang (Faculty @ Oregon State University) -- Two Sides of the Same Coin: Parsing Algorithms for COVID-19
Apr. 26: Tariq Alhindi (PhD Student @ Columbia University)
May 9: Ray Mooney (Faculty @ UT Austin) -- Deep Learning for Automating Software Documentation Maintenance

Fall 2021

Oct. 26: Fei-Tzin Lee (PhD Candidate @ Columbia University) -- Controlling syntax in generation with BART using structured content plans
Nov. 2: Marianna Apidianaki (Researcher @ University of Helsinki) -- What do language models know about lexical polysemy and intensity?
Nov. 16: Kun Qian (PhD Student @ Columbia University) -- Disambiguation for Task-Oriented Dialogs
Nov. 23: Heng Ji (Faculty @ UIUC) -- Schema Induction for Event Prediction
Nov. 30: Weiyan Shi (PhD Student @ Columbia University) -- Selective Differential Privacy for Language Modeling
Dec. 7: Qingyang Wu (PhD Student @ Columbia University) -- Memformer: A Memory-Augmented Transformer for Sequence Modeling

Spring 2021

Feb. 12: Adam Poliak (Faculty @ Barnard College) -- Exploring Reasoning Capabilities of NLP systems [recording]
Feb. 19: Ido Dagan (Faculty @ Bar-Ilan University) -- Three inspirations towards (multi-) text comprehension: QA-based modeling, consolidation, and interaction [recording]
Feb. 26: Zhou Yu/Weiyan Shi (Faculty/PhD student @ Columbia University) -- Two Methods to Reduce Repetitions and Contradictions in Dialog Generation [recording]
Mar. 19: Emily Allaway (PhD Candidate @ Columbia University) -- Implicit Meaning and Domain Adaptation in Stance Detection [recording]
Mar. 26: Elsbeth Turcan (PhD Candidate @ Columbia University) -- Emotion-Infused Models for Psychological Stress Detection [recording]
Apr. 2: Hannah Rashkin (Research Scientist @ Google) -- Commonsense reasoning about social dynamics in text

Fall 2020

Oct. 9: Tuhin Chakrabarty (PhD student @ Columbia University) -- The role of CommonSense in Figurative Language Generation [slides]
Oct. 23: Violet Peng (Faculty @ UCLA) -- From Language Understanding to Creative Generation [recording]
Nov. 6: Hanna Hajishirzi (Faculty @ UW) -- Knowledge-Rich Neural Text Comprehension and Reasoning [recording]
Nov. 13: Diyi Yang (Faculty @ Georgia Tech) -- When Social Context Meets NLP: Learning with Less Data and More Structures [recording]
Dec. 4: Melanie Subbiah (PhD student @ Columbia University) -- GPT-3: Few-shot Learning with a Giant Language Model