Research Fair Fall 2023



The Fall 2023 Research Fair will be held on Thursday, September 7th, and Friday, September 8th, from 11:30 to 13:00 in the CS Lounge (CSB 452) and Joseph F. Traub Conference Room (CSB 453). This is an opportunity to meet faculty and Ph.D. students working in areas of interest to you and possibly to work on their projects.

Please read their requirements carefully! There will be a couple of Zoom sessions and recordings available – see below for all details.


Thursday, September 7th

Faculty/Lab: Prof. Rubenstein, Prof. Misra: DNA Lab

Brief Research Project Description: We have several projects that focus on a range of topics spanning computer networking, synthetic control, large language models, and quantum computing.

Required/preferred prerequisites and qualifications: Solid math skills preferred, and/or experience with computer networking and/or programming in Python or C/C++.


Faculty/Lab: Prof. Kender

Brief Research Project Description: My High-Level Vision lab (Prof. Kender, Schapiro CEPSR 6LW5) is continuing work on finding cultural differences in how news videos present the same event. For example, we have found that Chinese reports of international diseases tend to be more focused on national impact and historical precedents and tend to have a more optimistic tone of overcoming adversity, whereas American reports tend to focus on personal impact and immediate effects, and tend to have a more anxious tone in stressing danger. The work combines computer vision, natural language understanding, deep learning, and user interface and sentiment work, resulting in a novel cross-cultural news browser prototype that visualizes cross-cultural similarities and differences. We would like (a) to extend the work to analyze and visualize other cultures and events, (b) to better incorporate viewers’ comments and tweets, and (c) to do user studies to examine user experience with the browser and other tools. The long-term goal is to show the trans-cultural impact on user perceptions of events, given tools that make it easy to explore different cultural views.

Required/preferred prerequisites and qualifications: Some experience in one or more of the following: CV, NLP, ML, UI, Graphics. Multimedia experience, either in class or in industry, is preferred but not required. Mostly master’s students have been involved in this project, but upper-division undergrads (juniors, seniors) have also been part of the work and are encouraged to apply. Please note: there is no financial support available for students on this project this semester, although in the past some students have become co-authors on published papers. From three to eight students will be selected.


Faculty/Lab: Prof. Feiner, Computer Graphics and User Interfaces Lab: Research in AR, VR, 3DUI

Brief Research Project Description: The Computer Graphics and User Interfaces Lab (Prof. Feiner, PI) does research in the design of 3D and 2D user interfaces, including augmented reality and virtual reality, and mobile and wearable systems, for people interacting individually and together, indoors and outdoors. We use a range of displays and devices: head-worn, hand-held, and table-top, including Varjo XR-3, HP Reverb G2 Omnicept, Meta Quest Pro/2, Valve Index, HoloLens 2, Magic Leap 2, Nreal Light, Snap Next-Generation Spectacles, 3D Systems Touch, and phones. Multidisciplinary projects potentially involve working with faculty and students in other schools and departments, from medicine and dentistry to earth and environmental sciences and social work.

Required/preferred prerequisites and qualifications: We’re looking for students who have done excellent work in at least one of the following courses or their equivalents elsewhere: COMS W4160 (Computer graphics), COMS W4170 (User interface design), COMS W4172 (3D user interfaces and augmented reality), and COMS E6998 (Topics in VR & AR), and who have software design and development expertise. For those projects involving 3D user interfaces, we’re especially interested in students with Unity experience.

**** Bonus Zoom Session! 16:00 to 17:30 also on Thursday ****


Faculty/Lab: Prof. Lipson

Brief Research Project Description: Robotics, Machine Learning

Required/preferred prerequisites and qualifications: PyTorch, Transformers, deep learning


PhD Student/Lab: Luoyao Hao, Internet Real-Time Lab

Brief Research Project Description: We build management systems and networking functionalities to provide better programmability, security, and reliability for the Internet of Things (IoT). We will have two opportunities this semester: (1) Policy server. We will implement a policy server for smart building management systems that focuses on parsing relationships and properties of IoT devices and evaluating incoming requests against policies (e.g., security or energy-related policies). We will study how to present policies to the system and build a prototype.

Required/preferred prerequisites and qualifications: (1) One or more of the following: Proficient in Python (Flask framework is a plus). Familiar with databases (graph database is a plus). Experienced with dataset processing. (2) Only for the second project: knowledge of Web3 and comfort with reading research papers are required.
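
As a rough illustration of what "evaluating requests against policies" in the policy-server project could look like, here is a minimal Python sketch. It is not the lab's design: the `Request` and `Policy` classes, their fields, and the default-deny rule are all hypothetical choices made for this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Request:
    """A hypothetical request to an IoT device in a smart building."""
    device_type: str
    action: str
    hour: int  # hour of day, 0-23


@dataclass(frozen=True)
class Policy:
    """A hypothetical time-window policy for one device type and action."""
    name: str
    device_type: str
    action: str
    allowed_hours: range

    def permits(self, req: Request) -> Optional[bool]:
        # None means the policy does not apply to this request.
        if req.device_type != self.device_type or req.action != self.action:
            return None
        return req.hour in self.allowed_hours


def evaluate(req: Request, policies: List[Policy]) -> bool:
    """Allow only if at least one policy applies and every applicable policy allows.

    Requests that no policy covers are denied (an assumed default-deny rule).
    """
    verdicts = [p.permits(req) for p in policies]
    applicable = [v for v in verdicts if v is not None]
    return bool(applicable) and all(applicable)


policies = [Policy("energy-saving", "hvac", "turn_on", range(7, 19))]
print(evaluate(Request("hvac", "turn_on", 9), policies))
print(evaluate(Request("hvac", "turn_on", 23), policies))
```

A real prototype would presumably derive device relationships and properties from a database rather than from hard-coded fields.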


Faculty/Lab: Prof. Pe’er, Computational Genomics

Brief Research Project Description: The lab works with large amounts of high-throughput genomics data that require specialized methods and pipelines for processing. An example project involves processing genome sequencing data from an isolated population cohort to detect the association of copy number changes with schizophrenia.

Required/preferred prerequisites and qualifications: Being an independent programmer; interest in computational genomics; ability to learn new material and tools independently.


PhD Student/Lab: Chengzhi Mao

Brief Research Project Description: Large Language Models (LLMs), like GPT, are becoming the foundation for an expanding array of AI applications, but the models often fail to be trustworthy. Our research generally tackles how to build trustworthy large language models. The goal is to submit papers to top conferences in machine learning, NLP, and vision.

Required/preferred prerequisites and qualifications: 1. Experience in deep learning and machine learning. 2. Strong coding skills. 3. Ability to dedicate 20 hours per week to research. 4. Good communication skills.


Faculty/Lab: Prof. McKeown

Brief Research Project Description: In this project, we’re building systems that facilitate cross-cultural understanding. How can we make predictions about the relationships between participants in a text, audio, or video document? How can we do this in a way that is easy to adapt to new languages and cultures? This requires a working understanding of the state-of-the-art in NLP and the ability to think creatively when building on it (e.g. interesting data sources, novel architectures and learning approaches, etc.).

We are also looking for students to work on a project involving detection of emotions of distress and events triggering such emotions. We are working on this project in an interdisciplinary environment, including faculty from social work and linguistics. Our focus is on analysis of social media and other texts from the Black community and thus, one direction of our research is on development of large language models that can interpret African American Language.

Required/preferred prerequisites and qualifications: Required: NLP. Preferred: machine learning, deep learning.


Faculty/Lab: Albert Boulanger agb6@cumc.columbia.edu

Brief Research Project Descriptions: Neuroimaging data organization and analysis: The objective of this endeavor is to enhance the existing analysis pathways for neuroimaging data derived from two sources: 1) 3 Tesla and 2) Ultra-low field (64mT), Portable MRI data. These pathways play a vital role in advancing ongoing continuous studies centered around cognition, Multiple Sclerosis, and Alzheimer’s Disease. The initiative also encompasses responsibilities linked to the orderly transfer and arrangement of data.

Computer vision/radiomics project: The focus of this endeavor is to employ computer vision techniques, particularly radiomics, on medical imaging data encompassing MR images. The aim is to extract pertinent features with clinical significance related to cognitive impairment in Multiple Sclerosis (MS). The project entails the utilization of pre-existing MRI segmentation pipelines using software such as FreeSurfer, FSL, and SPM to generate masks for crucial brain regions (e.g., thalamus). Subsequently, radiomics analysis using the PyRadiomics package will be conducted on these regions of interest to extract relevant features. Following these steps, machine learning approaches will be applied for feature selection and the development of predictive models or for classification problems.

Data Engineering for Multiple Studies of Alzheimer’s Disease: Joint Cohort Explorer. This project, the Joint Cohort Explorer (JCE), is centered on data engineering for the Thompson Project, a large-scale initiative to use multiple datasets to move the needle on Alzheimer’s Disease, and is now being used for Multiple Sclerosis. The development currently involves the use of a graph database (Neo4J) along with Django CMS, Django, Plotly Dash, WebAssembly, Bootstrap, and React as underlying web technologies. NLP/text processing is used in integrating JCE with REDCap.

Fast Reinforcement Learning via Quantum Computing. We are investigating ways to use quantum computing to speed up dynamic programming/reinforcement learning. Looking for people, ideally with a physics background, knowledgeable in quantum computing and (deep) reinforcement learning/approximate dynamic programming.

Required/preferred prerequisites and qualifications: Depending on the project, Python, Linux, cluster computing, computer vision, image processing, NLP, machine learning, full stack development, quantum computing, and any of the packages mentioned in the descriptions.


Faculty/Lab: Prof. Ross

Brief Research Project Description: Support for In-DBMS machine learning using low-level optimization techniques. We wish to recruit students for the following projects. 1. Accelerating database queries that can be mapped to variations of matrix multiplication: We have previously developed a compiler that can optimize variations of matrix multiplications. We will explore how this compiler can speed up ML-related use cases in databases, such as gradient descent, as well as applications, such as query explanation. This project will be based on the DuckDB system. 2. Using Processing-In-Memory to speed up ML workloads: Processing-In-Memory (PIM) is a paradigm that aims to alleviate the bottleneck of memory access from the CPU by pushing down computation to memory. We would like to explore how PIM can accelerate ML workloads such as model trees. This project will be based on UPMEM PIM technology.

Required/preferred prerequisites and qualifications: 4111 or equivalent. Desirable: 4112, 6111, and/or 6113.
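
To illustrate why ML workloads such as gradient descent can be mapped to matrix multiplication, as the first project above describes, here is a small pure-Python sketch of least-squares gradient descent in which every step is two matrix products. It is only an illustration under stated assumptions: the helper names (`matmul`, `transpose`, `gradient_step`, `fit`) are hypothetical, and neither the lab's compiler nor DuckDB is involved.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]


def gradient_step(X, y, w, lr):
    """One least-squares step: w <- w - lr * X^T (X w - y) / n."""
    n = len(X)
    predictions = matmul(X, w)                     # first matrix product
    residual = [[p[0] - t[0]] for p, t in zip(predictions, y)]
    grad = matmul(transpose(X), residual)          # second matrix product
    return [[wi[0] - lr * g[0] / n] for wi, g in zip(w, grad)]


def fit(X, y, steps=2000, lr=0.1):
    """Run a fixed number of gradient steps starting from zero weights."""
    w = [[0.0] for _ in X[0]]
    for _ in range(steps):
        w = gradient_step(X, y, w, lr)
    return w


# Toy data generated from y = 1 + 2x; the first column of X is the intercept term.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [[1.0], [3.0], [5.0], [7.0]]
w = fit(X, y)
print(w)
```

Because the loop body consists almost entirely of the two products, a system that optimizes matrix multiplication can speed up the whole training loop.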


Faculty/Lab: Creative Machines Lab, Prof. Hod Lipson, working with PhD student Judah Goldfeder.

Brief Research Project Description: We are recruiting for several projects in the Deep Learning space. There are projects in Auxiliary Learning, Reinforcement Learning, Reverse Engineering Neural Networks, Crystallography Structure Prediction, and Robotics Navigation.

Required/preferred prerequisites and qualifications: We are looking for students with experience in machine learning and PyTorch, general knowledge of computer science and algorithms, and a strong coding background.

Friday, September 8th


Faculty/Lab: Prof. Connelly

Brief Research Project Description: Student researchers will apply machine learning and NLP techniques to large (>3 million) corpora of declassified documents. They will work on one of two ongoing projects: 1) the development of a document archive and database consisting of records on government responses to COVID-19 released through the Freedom of Information Act (FOIA) and 2) the application of NLP techniques (such as named entity recognition and topic modeling) to document collections from the State Department, CIA, etc. For both projects, we are particularly interested in evaluating the quality of OCR output.

Required/preferred prerequisites and qualifications: knowledge of Python, SQL; interest in NLP, image processing, and its application to research in international history and politics


PhD Student/Lab: Caspar Lant

Brief Research Project Description: LoRa geospatial coverage mapping project. The IRT lab recently installed a gateway on campus for LoRaWAN (a popular technology for IoT devices). We would like to generate a coverage map for the gateway.

Required/preferred prerequisites and qualifications: Willingness to be outside. This project involves fieldwork!! Access to wheels (car, motorcycle, bicycle, etc.) a plus. 

AND

Brief Research Project Description: IoT power profiling project. We will be characterizing a LoRa-based IoT device and its power consumption. We will compare power consumption for different operations against an off-the-shelf device and see whether our own firmware performs better.

Required/preferred prerequisites and qualifications: EE/circuits experience a plus


Faculty/Lab: Prof. Hirschberg

Brief Research Project Description: 1) Identifying code-switching in a Spanish-English corpus 2) Re-aligning the Switchboard Dialog Act corpus

Required/preferred prerequisites and qualifications: 1) Very good knowledge of Spanish and English 2) Some knowledge of speech and NLP if possible

AI Research Intern – NLP for Malicious Intent Detection on Social Media (SemaFor)

Job Description:

We are seeking an AI Research Intern to join our team, focused on Natural Language Processing (NLP). The primary project involves constructing a dataset for malicious intent detection in social media platforms. The intern will work closely with our research team to develop, implement, and analyze NLP algorithms for this specific application.

Job Requirements:

  • Familiarity with NLP techniques, evidenced by at least 2 related projects (these can be class projects or previous intern experience).
  • Experience with LLMs.
  • Strong academic paper-writing skills.
  • Demonstrable self-initiative and commitment to project goals.
  • Required technical skills: Python, GitHub, Overleaf.

This is an exciting opportunity for someone looking to deepen their expertise in NLP and contribute to a project with real-world implications. If you meet these qualifications and are excited about the potential impact of this work, we encourage you to apply.

AI Research Engineer – LLM-Powered Social App Development

Job Description:

We are on the hunt for a talented AI Research Engineer to join our dynamic startup team. The primary focus of this role will be the development of an LLM-powered social application. The engineer will work in collaboration with both our research and development teams to conceptualize, implement, and refine AI algorithms that power our application.

Job Requirements:

  • Strong coding skills, with at least 2 industry intern experiences.
  • Solid understanding of LLMs and NLP, as evidenced by at least 2 completed related class projects.
  • Self-initiative, agility, and strong commitment to project deliverables.
  • Excellent communication and collaboration skills, especially in a startup environment.
  • Required technical skills: Python, JavaScript, GitHub.
  • Preferred technical skills: Firebase.

This position offers a unique opportunity to be at the forefront of AI-powered social networking, all while being a part of a fast-paced and innovative startup environment. If you fit the criteria and are excited about pushing the boundaries of AI technologies, we’d love to hear from you.


Faculty/Lab: Prof. Yang, Prof. Cidon

Brief Research Project Description: Due to their low cost and the need to run computationally-intensive algorithms locally, satellites and spacecraft are increasingly employing off-the-shelf computing hardware. However, hardware in space is exposed to significantly higher amounts of radiation than on Earth, causing hardware to permanently fail or affecting the correct functioning of the application. We envision that solely using software fault tolerance techniques, commodity hardware operating in space can achieve fault tolerance equivalent or close to expensive and slow radiation-hardened hardware. To achieve this goal, we need to address the two main radiation fault scenarios: hardware latchups, which cause overheating, and silent data corruption. Students will develop a memory scrubber and a latch-up detection tool that will be tested on real-world spacecraft, including SmallSats and the Perseverance Mars rover.

Required/preferred prerequisites and qualifications:

Required:

  • Taken Advanced Programming (COMS W3157 or equivalent) or have systems programming experience
  • Able to dedicate ~15-20 hours a week over the semester
  • Will count for directed research credit

Preferred:

  • Taken an Operating Systems class (COMS W4118 or equivalent)
  • Taken a Computer Architecture class (CSEE W3827 or equivalent)
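
For readers unfamiliar with memory scrubbing, the sketch below shows one common software approach to silent data corruption: keep three copies of the data and periodically repair them by majority vote. This is an assumption made for illustration only. The description above does not say how the project's scrubber works, and a real one would be written at the systems level rather than in Python. The function name `scrub` is hypothetical.

```python
def scrub(replicas):
    """Repair three equal-length bytearrays in place by bitwise majority vote.

    Returns the number of bytes that had to be corrected.
    """
    a, b, c = replicas
    corrected = 0
    for i in range(len(a)):
        # Each bit of the result takes the value held by at least two replicas.
        voted = (a[i] & b[i]) | (a[i] & c[i]) | (b[i] & c[i])
        for replica in replicas:
            if replica[i] != voted:
                replica[i] = voted
                corrected += 1
    return corrected


data = b"telemetry"
replicas = [bytearray(data) for _ in range(3)]
replicas[1][2] ^= 0x10  # simulate a radiation-induced bit flip in one copy
print(scrub(replicas))
```

Majority voting repairs a flipped bit only while the other two copies still agree on that bit, which is why a scrubber has to run often enough that independent flips do not pile up in the same location.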


Faculty/Lab: Prof. Yang, Software System Lab

Brief Research Project Description: Improving the robustness of object detection models against adversarial attacks and out-of-distribution data via Stable Diffusion

Required/preferred prerequisites and qualifications: Students need a strong background in machine learning, computer vision, and AI security, including adversarial attack and defense. They should be familiar with Python and PyTorch and able to build model pipelines for both the training and testing phases.


Faculty/Lab: Matei Ciocarlie, Robotic Manipulation and Mobility Lab
 
Brief Research Project Description:
Multi-fingered Learning. The research focus is broadly on robotic manipulation and particularly on dexterous manipulation with multi-fingered robotic hands. Learning such manipulation skills is challenging as the policies need to generalize to arbitrary objects over a range of tasks for real-world application. To achieve such skills, it is necessary to develop data-efficient training methods. It is also important to leverage visual and language models for their semantic representations for generalization. We are pursuing these goals as a couple of complementary projects and are looking for highly motivated students to help us with them.
 
Modeling physical hand interactions with exoskeletons. We are studying ways to computationally design interfaces between human skin and contact surfaces of wearable devices. The goal is to build safer and more effective robotic hand orthoses to assist people in daily life after a stroke. We are currently developing first-order models to rapidly identify correlations between finger biomechanical properties and the design of comfortable robot attachment mechanisms. We are seeking motivated students to (a) simulate quasi-static force effects on hand and robot during assisted grasping and (b) compare with state-of-the-art biomechanics models. Students have the option to work with real hardware and humans.
 
Generative AI for assistive robotics. We are developing generative AI methods such as GPT, diffusion models, and GANs to expand the limited training data for assistive robotic applications. One of the major challenges in machine learning for assistive robotics, such as wearable devices, is the lack of training data from disabled participants. In order for us to realize the power of high-capacity ML models, a large amount of data is required. In this project, we want to model human biosignals as languages and use generative AI methods to increase the amount of data by adding synthetic samples.
 
Robotic manipulation with tactile sensors. Tactile sensing is the sense of touch and it is becoming more and more important in robotics. Humans are experts at using touch sensing. For example, we can manipulate wires under the desk and retrieve a particular object from our pockets in the absolute absence of vision. In this project, we aim to achieve similar tasks on robots with state-of-the-art tactile fingers.
 
Required/preferred prerequisites and qualifications: Strong coding experience with Python, strong communication skills (written and spoken), demonstrated commitment to project goals.
For Multi-Fingered Learning: additional requirements include a strong background in PyTorch and experience with reinforcement learning for robot learning. Preference for experience with simulators like MuJoCo / Isaac Gym and experience with large vision-language models.
For Exoskeleton Modeling: basic knowledge of rigid-body dynamics required, interest in physics simulation engines such as MuJoCo and hand biomechanics/rehabilitation a plus.
For Generative AI: preferences for expertise in machine learning, deep learning, large language models (LLMs) and generative AI, but all experience levels welcome.
For Tactile: interest in deep learning, reinforcement learning, and physical simulators such as MuJoCo, PyBullet, or Isaac Gym.

ZOOM Sessions

Wednesday, September 6th 10:00

Faculty/Lab: Albert Boulanger agb6@cumc.columbia.edu

Brief Research Project Descriptions: Neuroimaging data organization and analysis. The objective of this endeavor is to enhance the existing analysis pathways for neuroimaging data derived from two sources: 1) 3 Tesla and 2) Ultra-low field (64mT), Portable MRI data. These pathways play a vital role in advancing ongoing continuous studies centered around cognition, Multiple Sclerosis, and Alzheimer’s Disease. The initiative also encompasses responsibilities linked to the orderly transfer and arrangement of data.

Computer vision/radiomics project. The focus of this endeavor is to employ computer vision techniques, particularly radiomics, on medical imaging data encompassing MR images. The aim is to extract pertinent features with clinical significance related to cognitive impairment in Multiple Sclerosis (MS). The project entails the utilization of pre-existing MRI segmentation pipelines using software such as FreeSurfer, FSL, and SPM to generate masks for crucial brain regions (e.g., thalamus). Subsequently, radiomics analysis using the PyRadiomics package will be conducted on these regions of interest to extract relevant features. Following these steps, machine learning approaches will be applied for feature selection and the development of predictive models or for classification problems.

Required/preferred prerequisites and qualifications: Python, Linux, cluster computing, computer vision, machine learning, statistics, and any of the packages listed in the descriptions.

– Link to join Zoom Session! –


Friday, September 8th 11:00

Saturday, September 9th 11:00

PhD Student/Lab: Chengzhi Mao cm3797@columbia.edu

Brief Research Project Descriptions: Trustworthy Foundation Models; Robust Robotic Navigation/Manipulation; Adversarial Learning

Required/preferred prerequisites and qualifications: PyTorch; Python; Machine Learning; willingness to contribute 20 hours a week and deliver results at least biweekly.

– Link to join Zoom Session! –



Updated 9/01/2023