Columbia University NLP Seminar

The seminar is a series of talks by both invited faculty and student speakers. All are welcome to attend.
The standard time is 2:00-3:00pm ET on Tuesdays (new time this semester).
The seminar is currently online only, though we hope to move to a hybrid format later in the semester. The Zoom link will be sent out to the NLP mailing list. If you are not on the mailing list but would like the link, please email us.
The seminar is organized by Emily Allaway. Please contact me with any questions.

Upcoming Talk: Ray Mooney (MONDAY May 9 @ 2pm ET)
Deep Learning for Automating Software Documentation Maintenance

Abstract: Applying deep learning to large open-source software repositories offers the potential to develop many useful tools for aiding software development, including automated program synthesis and documentation generation. Specifically, we have developed methods that learn to automatically update existing natural language comments based on changes to the body of code they accompany. Developers frequently forget to update comments when they change code, which is detrimental to the software development cycle, causing confusion and bugs. First, we use methods for "just in time" comment/code inconsistency detection, which learn to recognize when changes to code render it incompatible with its existing documentation. We then learn a model that appropriately updates a comment when it is judged to be inconsistent. Our approach learns to correlate changes across two distinct language representations, generating a sequence of edits that are applied to an existing comment to reflect source code modifications. We train and evaluate our model using a large dataset collected from commit histories of open-source Java software projects, with each example consisting of an update to a method and any concurrent edit to its corresponding comment. We compare our approach against multiple baselines using both automatic metrics and human evaluation. Results reflect the challenge of this task and show that our model outperforms many baselines with respect to detecting inconsistent comments and appropriately updating them.

Spring 2022

  • Feb. 1: Dragomir Radev (Faculty @ Yale University)
  • Feb. 8: Ellie Pavlick (Faculty @ Brown University)
  • Feb. 22: Faisal Ladhak (PhD Student @ Columbia University)
  • Mar. 8: Karthik Narasimhan (Faculty @ Princeton University)
  • Mar. 22: Mohit Bansal (Faculty @ UNC Chapel Hill)
  • Mar. 29: Kyunghyun Cho (Faculty @ NYU)
  • Apr. 19: Liang Huang (Faculty @ Oregon State University)
  • Apr. 26: Tariq Alhindi (PhD Student @ Columbia University)
  • May 9: Ray Mooney (Faculty @ UT Austin)

