
Research Fair Fall 2023
The Fall 2023 Research Fair will be held on Thursday, September 7th, and Friday, September 8th, from 11:30 to 13:00 in the CS Lounge (CSB 452) and the Joseph F. Traub Conference Room (CSB 453). This is an opportunity to meet faculty and Ph.D. students working in areas of interest to you and possibly work on their projects.
Please read their requirements carefully! There will be a couple of Zoom sessions and recordings available – see below for all details.
Thursday, September 7th
Faculty/Lab: Prof. Rubenstein, Prof. Misra: DNA Lab
Brief Research Project Description: We have several projects that focus on a range of topics spanning computer networking, synthetic control, large language models, and quantum computing.
Required/preferred prerequisites and qualifications: Preferred: solid math skills, computer networking experience, and/or programming in Python or C/C++
Faculty/Lab: Prof. Kender
Brief Research Project Description: My High-Level Vision lab (Prof. Kender, Schapiro CEPSR 6LW5) is continuing work on finding cultural differences in how news videos present the same event. For example, we have found that Chinese reports of international diseases tend to be more focused on national impact and historical precedents and tend to have a more optimistic tone of overcoming adversity, whereas American reports tend to focus on personal impact and immediate effects, and tend to have a more anxious tone in stressing danger. The work combines computer vision, natural language understanding, deep learning, and user interface and sentiment work, resulting in a novel cross-cultural news browser prototype that visualizes cross-cultural similarities and differences. We would like (a) to extend the work to analyze and visualize other cultures and events, (b) to better incorporate viewers’ comments and tweets, and (c) to do user studies to examine user experience with the browser and other tools. The long-term goal is to show the trans-cultural impact on user perceptions of events, given tools that make it easy to explore different cultural views.
Required/preferred prerequisites and qualifications: Some experience in one or more of the following: CV, NLP, ML, UI, Graphics. Multimedia experience, either in class or in industry, is preferred but not required. Mostly master’s students have been involved in this project, but upper-division undergrads (juniors, seniors) have also been part of the work and are encouraged to apply. Please note: there is no financial support available for students on this project this semester, although in the past some students have become co-authors on published papers. From three to eight students will be selected.
Faculty/Lab: Prof. Feiner, Computer Graphics and User Interfaces Lab: Research in AR, VR, 3DUI
Brief Research Project Description: The Computer Graphics and User Interfaces Lab (Prof. Feiner, PI) does research in the design of 3D and 2D user interfaces, including augmented reality and virtual reality, and mobile and wearable systems, for people interacting individually and together, indoors and outdoors. We use a range of displays and devices: head-worn, hand-held, and table-top, including Varjo XR-3, HP Reverb G2 Omnicept, Meta Quest Pro/2, Valve Index, HoloLens 2, Magic Leap 2, Nreal Light, Snap Next-Generation Spectacles, 3D Systems Touch, and phones. Multidisciplinary projects potentially involve working with faculty and students in other schools and departments, from medicine and dentistry to earth and environmental sciences and social work.
Required/preferred prerequisites and qualifications: We’re looking for students who have done excellent work in at least one of the following courses or their equivalents elsewhere: COMS W4160 (Computer graphics), COMS W4170 (User interface design), COMS W4172 (3D user interfaces and augmented reality), and COMS E6998 (Topics in VR & AR), and who have software design and development expertise. For those projects involving 3D user interfaces, we’re especially interested in students with Unity experience.
**** Bonus Zoom Session! 16:00 to 17:30 also on Thursday ****
Faculty/Lab: Prof. Lipson
Brief Research Project Description: Robotics, Machine Learning
Required/preferred prerequisites and qualifications: PyTorch, Transformers, deep learning
PhD Student/Lab: Luoyao Hao, Internet Real-Time Lab
Brief Research Project Description: We build management systems and networking functionalities to provide better programmability, security, and reliability for the Internet of Things (IoT). We will have two opportunities this semester: (1) Policy server. We will implement a policy server for smart building management systems that focuses on parsing relationships and properties of IoT devices and evaluating upcoming requests against policies (e.g., security or energy-related policies). We will study how to present policies to the system and build a prototype.
Required/preferred prerequisites and qualifications: (1) One or more of the following: Proficient in Python (Flask framework is a plus). Familiar with databases (graph database is a plus). Experienced with dataset processing. (2) Only for the second project: knowledge of Web3 and comfort with reading research papers are required.
Faculty/Lab: Prof. Pe’er, Computational Genomics
Brief Research Project Description: The lab works with large amounts of high-throughput genomics data that require specialized methods and pipelines for processing. An example project includes processing genome sequencing data from an isolated population cohort to detect the association of copy number changes with schizophrenia.
Required/preferred prerequisites and qualifications: Being an independent programmer; interest in computational genomics; ability to learn new material and tools independently.
PhD Student/Lab: Chengzhi Mao
Brief Research Project Description: Large language models (LLMs), such as GPT, are becoming the foundation for an expanding array of AI applications, but the models often fail to be trustworthy. Our research generally tackles how to build trustworthy large language models. The goal is to submit papers to top conferences in machine learning, NLP, and vision.
Required/preferred prerequisites and qualifications: 1. Experience in deep learning and machine learning. 2. Strong coding skills. 3. Ability to dedicate 20 hours per week to research. 4. Good communication skills.
Faculty/Lab: Prof. McKeown
Brief Research Project Description: In this project, we’re building systems that facilitate cross-cultural understanding. How can we make predictions about the relationships between participants in a text, audio, or video document? How can we do this in a way that is easy to adapt to new languages and cultures? This requires a working understanding of the state-of-the-art in NLP and the ability to think creatively when building on it (e.g. interesting data sources, novel architectures and learning approaches, etc.).
We are also looking for students to work on a project involving the detection of emotions of distress and the events that trigger such emotions. We are working on this project in an interdisciplinary environment, including faculty from social work and linguistics. Our focus is on the analysis of social media and other texts from the Black community; thus, one direction of our research is the development of large language models that can interpret African American Language.
Required/preferred prerequisites and qualifications: required: NLP, preferred: machine learning, deep learning
Faculty/Lab: Albert Boulanger agb6@cumc.columbia.edu
Brief Research Project Descriptions: Neuroimaging data organization and analysis: The objective of this endeavor is to enhance the existing analysis pathways for neuroimaging data derived from two sources: 1) 3 Tesla and 2) Ultra-low field (64mT), Portable MRI data. These pathways play a vital role in advancing ongoing continuous studies centered around cognition, Multiple Sclerosis, and Alzheimer’s Disease. The initiative also encompasses responsibilities linked to the orderly transfer and arrangement of data.
Computer vision/radiomics project: The focus of this endeavor is to employ computer vision techniques, particularly radiomics, on medical imaging data encompassing MR images. The aim is to extract pertinent features with clinical significance related to cognitive impairment in Multiple Sclerosis (MS). The project entails the utilization of pre-existing MRI segmentation pipelines using software such as Freesurfer, FSL, SPM to generate masks for crucial brain regions (e.g., thalamus). Subsequently, radiomics analysis using the PyRadiomics package will be conducted on these regions of interest to extract relevant features. Following these steps, machine learning approaches will be applied for feature selection and the development of predictive models or for classification problems.
Data Engineering for Multiple Studies of Alzheimer’s Disease: Joint Cohort Explorer. This project, the Joint Cohort Explorer (JCE), is centered on data engineering for the Thompson Project, a large-scale initiative to use multiple datasets to move the needle on Alzheimer’s Disease, and is now also being used for Multiple Sclerosis. Development currently involves a graph database (Neo4j) along with Django CMS, Django, Plotly Dash, WebAssembly, Bootstrap, and React as underlying web technologies. NLP/text processing is used in integrating JCE with REDCap.
Fast Reinforcement Learning via Quantum Computing. We are investigating ways to use quantum computing to speed up dynamic programming/reinforcement learning. Looking for people, ideally with a physics background, knowledgeable in quantum computing and (deep) reinforcement learning/approximate dynamic programming.
Required/preferred prerequisites and qualifications: Depending on the project, Python, Linux, cluster computing, computer vision, image processing, NLP, machine learning, full stack development, quantum computing, and any of the packages mentioned in the descriptions.
Faculty/Lab: Prof. Ross
Brief Research Project Description: Support for In-DBMS machine learning using low-level optimization techniques. We wish to recruit students for the following projects. 1. Accelerating database queries that can be mapped to variations of matrix multiplication: We have previously developed a compiler that can optimize variations of matrix multiplications. We will explore how this compiler can speed up ML-related use cases in databases, such as gradient descent, as well as applications, such as query explanation. This project will be based on the DuckDB system. 2. Using Processing-In-Memory to speed up ML workloads: Processing-In-Memory (PIM) is a paradigm that aims to alleviate the bottleneck of memory access from the CPU by pushing down computation to memory. We would like to explore how PIM can accelerate ML workloads such as model trees. This project will be based on UPMEM PIM technology.
Required/preferred prerequisites and qualifications: 4111 or equivalent. Desirable: 4112, 6111, and/or 6113.
Faculty/Lab: Creative Machines Lab, Prof. Hod Lipson, working with PhD student Judah Goldfeder.
Brief Research Project Description: We are recruiting for several projects in the Deep Learning space. There are projects in Auxiliary Learning, Reinforcement Learning, Reverse Engineering Neural Networks, Crystallography Structure Prediction, and robotics navigation.
Required/preferred prerequisites and qualifications: We are looking for students with experience in machine learning and PyTorch, general knowledge of computer science and algorithms, and a strong coding background.
Friday, September 8th
Faculty/Lab: Prof. Connelly
Brief Research Project Description: Student researchers will apply machine learning and NLP techniques to large (>3 million) corpora of declassified documents. They will work on one of two ongoing projects: 1) the development of a document archive and database consisting of records on government responses to COVID-19 released through the Freedom of Information Act (FOIA) and 2) the application of NLP techniques (such as named entity recognition and topic modeling) to document collections from the State Department, CIA, etc. For both projects, we are particularly interested in evaluating the quality of OCR output.
Required/preferred prerequisites and qualifications: knowledge of Python and SQL; interest in NLP and image processing and their application to research in international history and politics
PhD Student/Lab: Caspar Lant
Brief Research Project Description: LoRa geospatial coverage mapping project. The IRT lab recently installed a LoRaWAN gateway (LoRaWAN is a popular technology for IoT devices) on campus. We would like to generate a coverage map for the gateway.
Required/preferred prerequisites and qualifications: Willingness to be outside. This project involves fieldwork!! Access to wheels (car, motorcycle, bicycle, etc.) a plus.
AND
Brief Research Project Description: IoT power profiling project. We will characterize a LoRa-based IoT device and its power consumption. We will compare its power consumption for different operations against that of an off-the-shelf device and see whether our own firmware performs better.
Required/preferred prerequisites and qualifications: EE/circuits experience a plus
Faculty/Lab: Prof. Hirschberg
Brief Research Project Description: 1) Identifying code-switching in Spanish-English corpus 2) Re-aligning the Switchboard Dialog Act corpus
Required/preferred prerequisites and qualifications: 1) Very good knowledge of Spanish and English 2) Some knowledge of speech and NLP if possible
AI Research Intern – NLP for Malicious Intent Detection on Social Media (SemaFor)
Job Description:
We are seeking an AI Research Intern to join our team, focused on Natural Language Processing (NLP). The primary project involves constructing a dataset for malicious intent detection in social media platforms. The intern will work closely with our research team to develop, implement, and analyze NLP algorithms for this specific application.
Job Requirements:
- Familiarity with NLP techniques, evidenced by at least 2 related projects (these can be class projects or previous intern experience).
- Experience with LLMs.
- Strong academic paper-writing skills.
- Demonstrable self-initiative and commitment to project goals.
- Required technical skills: Python, GitHub, Overleaf.
This is an exciting opportunity for someone looking to deepen their expertise in NLP and contribute to a project with real-world implications. If you meet these qualifications and are excited about the potential impact of this work, we encourage you to apply.
AI Research Engineer – LLM-Powered Social App Development
Job Description:
We are on the hunt for a talented AI Research Engineer to join our dynamic startup team. The primary focus of this role will be the development of an LLM-powered social application. The engineer will work in collaboration with both our research and development teams to conceptualize, implement, and refine AI algorithms that power our application.
Job Requirements:
- Strong coding skills, with at least 2 industry internships.
- Solid understanding of LLMs and NLP, as evidenced by at least 2 completed related class projects.
- Self-initiative, agility, and strong commitment to project deliverables.
- Excellent communication and collaboration skills, especially in a startup environment.
- Required technical skills: Python, JavaScript, GitHub.
- Preferred technical skills: Firebase.
This position offers a unique opportunity to be at the forefront of AI-powered social networking, all while being a part of a fast-paced and innovative startup environment. If you fit the criteria and are excited about pushing the boundaries of AI technologies, we’d love to hear from you.
Faculty/Lab: Prof. Yang, Prof. Cidon
Brief Research Project Description: Because off-the-shelf computing hardware is inexpensive and computationally intensive algorithms increasingly need to run locally, satellites and spacecraft are increasingly employing it. However, hardware in space is exposed to significantly higher amounts of radiation than on Earth, which can cause the hardware to fail permanently or the application to malfunction. We envision that, using software fault-tolerance techniques alone, commodity hardware operating in space can achieve fault tolerance equivalent or close to that of expensive and slow radiation-hardened hardware. To achieve this goal, we need to address the two main radiation fault scenarios: hardware latch-ups, which cause overheating, and silent data corruption. Students will develop a memory scrubber and a latch-up detection tool that will be tested on real-world spacecraft, including SmallSats and the Perseverance Mars rover.
Required/preferred prerequisites and qualifications:
Required:
- Taken Advanced Programming (COMS W3157 or equivalent) or have systems programming experience
- Able to dedicate ~15-20 hours a week over the semester
- Will count for directed research credit
Preferred:
- Taken an Operating Systems class (COMS W4118 or equivalent)
- Taken a Computer Architecture class (CSEE W3827 or equivalent)
Faculty/Lab: Prof. Yang, Software System Lab
Brief Research Project Description: Improving the robustness of object detection models to adversarial attacks and out-of-distribution data via Stable Diffusion
Required/preferred prerequisites and qualifications: Students need a strong background in machine learning, computer vision, and AI security, including adversarial attack and defense. They should be familiar with Python and PyTorch and able to build model pipelines for both the training and test phases.
For Multi-Fingered Learning: additional requirements include a strong background in PyTorch and experience with reinforcement learning for robot learning. Preference for experience with simulators like MuJoCo / Isaac Gym and with large vision-language models.
For Exoskeleton Modeling: basic knowledge of rigid-body dynamics required; interest in physics simulation engines such as MuJoCo and in hand biomechanics/rehabilitation a plus.
For Tactile: interest in deep learning, reinforcement learning, and physical simulators such as MuJoCo, PyBullet, or Isaac Gym.
ZOOM Sessions
Wednesday, September 6th 10:00
Faculty/Lab: Albert Boulanger agb6@cumc.columbia.edu
Brief Research Project Descriptions: Neuroimaging data organization and analysis. The objective of this endeavor is to enhance the existing analysis pathways for neuroimaging data derived from two sources: 1) 3 Tesla and 2) Ultra-low field (64mT), Portable MRI data. These pathways play a vital role in advancing ongoing continuous studies centered around cognition, Multiple Sclerosis, and Alzheimer’s Disease. The initiative also encompasses responsibilities linked to the orderly transfer and arrangement of data.
Computer vision/radiomics project. The focus of this endeavor is to employ computer vision techniques, particularly radiomics, on medical imaging data encompassing MR images. The aim is to extract pertinent features with clinical significance related to cognitive impairment in Multiple Sclerosis (MS). The project entails the utilization of pre-existing MRI segmentation pipelines using software such as Freesurfer, FSL, SPM to generate masks for crucial brain regions (e.g., thalamus). Subsequently, radiomics analysis using the PyRadiomics package will be conducted on these regions of interest to extract relevant features. Following these steps, machine learning approaches will be applied for feature selection and the development of predictive models or for classification problems.
Required/preferred prerequisites and qualifications: Python, Linux, cluster computing, computer vision, machine learning, statistics, and any of the packages listed in the descriptions.
– Link to join Zoom Session! –
Friday, September 8th 11:00
Saturday, September 9th 11:00
PhD Student/Lab: Chengzhi Mao cm3797@columbia.edu
Brief Research Project Descriptions: Trustworthy Foundation Models; Robust Robotic Navigation/Manipulation; Adversarial Learning
Required/preferred prerequisites and qualifications: PyTorch; Python; machine learning; willingness to contribute 20 hours a week and deliver results at least biweekly.