Giannis Karamanolakis, a natural language processing and machine learning PhD student, talks about his research projects and how he is developing machine learning techniques for natural language processing applications.
Can you talk about your background and why you decided to pursue a PhD?
At NTUA, taking part in machine learning (ML) research was not planned but rather a spontaneous outcome stemming from my love for music. The initial goal for my undergraduate thesis was to build an automatic music transcription system that converts polyphonic raw audio into music sheets. However, after realizing that such a system would not be possible to develop in a limited amount of time, I worked on the simpler task of automatically tagging audio clips with descriptive tags (e.g., “car horn” for audio clips in which a car horn sounds). Right after submitting a paper on a new algorithm to a conference, I realized that I love doing ML research.
After NTUA, I spent one and a half years working as an ML engineer at a startup called Behavioral Signals, where we trained statistical models for the recognition of core emotions from speech and text data. After a few months of ML engineering, I found myself spending more time reading research papers and evaluating new research ideas on ML and natural language processing (NLP). By then, I was confident about my decision to pursue a PhD in ML/NLP.
What about NLP did you like and when did you realize that you wanted to do research on it?
I am fascinated by the ability of humans to understand complex natural language. At the moment of writing this response, I submitted the following 10-word query to Google: “when did you realize that you wanted to do research”, keeping the quotation marks so that Google looks for exact matches only. Can you guess the number of documents returned by Google that contain this exact sequence of 10 words?
The answer that I got was 0 (zero) documents, no results! In other words, Google, a company with huge collections of documents, did not detect any document that contains this specific sequence of words. Sentences rarely recur but humans easily understand the semantics of such rare sentences.
I decided to do research on NLP when I realized that current NLP algorithms are far away from human-level language understanding. As an example back from my time at Behavioral Signals, emotion classifiers were misclassifying sentences that contained sarcasm, negation, and other complex linguistic phenomena. I could not directly fix those issues (which are prevalent beyond emotion classification), which initially felt both surprising and frustrating, but then evolved into my excitement for research on NLP.
Why did you apply to Columbia and how was that process?
The computer science department at Columbia was one of my top choices for several reasons, but I will discuss the main one.
I was excited to learn about the joint collaboration between Columbia University and the New York City Department of Health and Mental Hygiene (DOHMH), on a project that aims to understand user-generated textual content in social media (e.g., Yelp reviews, tweets) for critical public health applications, such as detecting and acting on foodborne illness outbreaks in restaurants. I could see that the project would offer the unique opportunity to do research in ML and NLP and at the same time contribute to this important public application in collaboration with epidemiologists at DOHMH. Fortunately, I have been able to work on the project, advised by Professor Luis Gravano and Associate Professor Daniel Hsu.
Applying to Columbia and other American universities was quite a stressful experience. For many months, my days were filled with working for Behavioral Signals, studying hard for high scores on the GRE and TOEFL exams (both of which were required at that time by all US universities), creating a short CV for the first time, and writing a distinct statement of purpose for each university. I am glad to see the recent promising changes in the PhD application procedure for our department, such as waiving the GRE requirement and offering the Pre-submission Application Review (PAR) program, in which current PhD students help applicants improve their applications (both of which I would have liked to take advantage of).
What sort of research questions or issues do you hope to answer?
My research in the past few years focuses on the following question: Can we effectively train ML classifiers for NLP applications with limited training data using alternative forms of human supervision?
An important limitation of current “supervised ML” techniques is that they require large amounts of training data, which is expensive and time-consuming to obtain manually. Thus, while supervised ML techniques (especially deep neural networks) thrive on standard benchmarks, they are too expensive to apply to emerging real-world applications with limited labeled data.
Our work attempts to address the expensive requirement of manually labeled data through novel frameworks that leverage alternative, less expensive forms of human supervision. In sentiment classification, for example, we allow domain experts to provide a small set of domain-specific rules (e.g., the keyword “happy” indicates positive sentiment; “diarrhea” is a symptom of food poisoning). Under low-resource settings with no labeled data, can we leverage expert-defined rules as supervision for training state-of-the-art neural networks?
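As a minimal sketch of what this kind of rule-based weak supervision looks like (the rules, labels, and reviews below are invented for illustration and are not taken from the actual research frameworks), the labeling step might be written as:

```python
# Illustrative sketch of weak supervision with expert-defined keyword rules.
# Rules and example reviews are made up for this example.

RULES = {
    "happy": "positive",
    "delicious": "positive",
    "diarrhea": "food_poisoning",
    "vomit": "food_poisoning",
}

def weak_label(text):
    """Return the label of the first matching rule, or None if no rule fires."""
    words = text.lower().split()
    for keyword, label in RULES.items():
        if keyword in words:
            return label
    return None

# Unlabeled reviews become (noisy) training examples for a classifier;
# reviews where no rule fires are simply left out.
reviews = [
    "So happy with the delicious pasta!",
    "Got diarrhea an hour after eating here",
    "The decor is nice.",
]
labeled = [(r, weak_label(r)) for r in reviews if weak_label(r) is not None]
```

In the research setting described above, the noisy labels produced this way would then be used to train a neural network that generalizes beyond the literal keywords.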
For your research papers, how did you decide to do research on those topics? How long did it take you to complete the work? Was it easy?
For my first research project at Columbia, my goal was to help epidemiologists in health departments with daily inspections of restaurant reviews that discuss food poisoning events. Restaurant reviews can be quite long, with many irrelevant sentences surrounding the truly important ones that discuss food poisoning or relevant symptoms. Thus, we developed a neural network that highlights only important sentences in potentially long reviews and deployed it for inspections in health departments, where epidemiologists could quickly focus on the relevant sentences and safely ignore the rest.
Each project took about 6 months to complete. None of them were easy; each required substantial effort in reading relevant papers, discussing potential solutions with my advisors, implementing executable code, evaluating hypotheses on real data, and repeating the same process until we were all satisfied with the solutions and evaluation results. The projects also involved meeting with epidemiologists at DOHMH, re-designing our system to satisfy several (strict) data transfer protocols imposed by health departments, and overcoming several issues related to missing data for training ML classifiers.
Your advisors are not part of the NLP group, how has that worked out for you and your projects?
It has worked great in my humble opinion. For the public health project, the expertise of Professor Gravano on information extraction, combined with the expertise of Professor Hsu on machine learning, and the technical needs of the project have contributed without any doubt to the current formulation of our NLP-related frameworks. My advisors’ feedback covers a broad spectrum of research, ranging from core technical challenges to more general research practices, such as problem formulation and paper writing.
Among other things, I appreciate the freedom I have been given to explore interesting new research questions, as well as the frequent and insightful feedback that helps me reframe questions and form solutions. At the same time, discussions with members of the NLP group, including professors and students, have been invaluable and have clearly influenced our projects.
What do you think is the most interesting thing about doing research?
I think it is the element of surprise it encompasses. For many research problems that I have tried to tackle, I started by shaping an initial solution in my mind but in the process discovered surprising findings that changed my way of thinking: my initial solution did not actually work, simpler approaches worked better than more sophisticated ones, data followed unexpected patterns, and so on. These instances of surprise turned research into an interesting experience, similar to solving riddles or listening to jazz music.
Please talk about your internships – the work you did, how was it, what did you learn?
In the summer of 2019, I worked at Amazon’s headquarters in Seattle with a team of more than 15 scientists and engineers. Our goal was to automatically extract and store knowledge about billions of products in a product knowledge graph. As part of my internship, we developed TXtract, a deep neural network that efficiently extracts information from product descriptions for thousands of product categories. TXtract has been a core component of Amazon’s AutoKnow, which provides the collected knowledge for Amazon search and product detail pages.
During the summer of 2020, I worked for Microsoft Research remotely from New York City (because of the pandemic). In collaboration with researchers at the Language and Information Technologies team, we developed a weak supervision framework that enables domain experts to express their knowledge in the form of rules and further integrates rules for training deep neural networks.
These two internships equipped me with invaluable experience. I learned new coding tools, ML techniques, and research practices. Through collaboration with different teams, I realized that even researchers who work in the same subfield may think in incredibly different ways, so to carry out a successful collaboration within a limited time, one needs to listen carefully, pre-define expected outcomes (with everyone on the team), and adapt fast.
Do you think your skills were improved by your time at Columbia? In which ways?
Besides having improved my problem-finding and -solving skills, I have expanded my presentation capabilities. In the beginning, I was frustrated when other people (even experienced researchers) could not follow my presentations and I was worried when I could not follow other presenters’ work. Later, I realized that if (at least part of) the audience is not able to follow a presentation, then the presentation is either flawed or has been designed for the wrong audience.
Over the past four years, I have presented my work at academic conferences and workshops, company symposiums, and student seminars, and after receiving constructive feedback from other researchers, I can say that my presentation skills have vastly improved. Without any doubt, I feel more confident and can explain my work to broader audiences with diverse expertise. That said, I’m still struggling to explain my PhD topic to my family. 🙂
What has been the highlight of your time at Columbia?
The first thing that comes to mind is the “Greek Happy Hour” that I co-organized in October 2019. More than 40 PhD students joined the happy hour, listened to Greek music (mostly “rempetika”), tasted Greek specialties (including spanakopita), and all toasted loudly by saying “Γειά μας” (ya mas; the Greek version of “cheers”).
Was there anything that was tough to handle while taking your PhD?
It has been hard to work from home during the pandemic. A core part of my PhD used to involve multi-person collaborations, drawing illustrations on the whiteboards of the Data Science Institute, random chats in hallways, happy hours, and other social events. All of these have been harder or impossible to maintain during the pandemic. I miss them and look forward to enjoying them again soon.
Looking back, what would you have done differently?
If I could, I would have engaged in more discussions and collaborations, taken more classes, played more music, and slept less. 🙂
What is your advice to students on how to navigate their time at Columbia? If they want to do NLP research what should they know or do to prepare?
They should register for diverse courses; Columbia offers the opportunity to attend courses in multiple departments. They should also reach out to as many people as possible and not hesitate to email graduate students and professors. I love receiving emails from people I haven’t met before; some of these emails have stimulated creative collaborations.
For those who want to do NLP research (which, subjectively speaking, I highly recommend), feel free to contact me or anyone in the NLP group.
What are your plans after Columbia?
I plan to continue working on research, either as a faculty member or in an industry research and development department.
Is there anything else that you think people should know?
Columbia offers free and discounted tickets to museums and performances around New York City, and even virtual art events. I personally consider New York the “state-of-the-art.”
Graduate students from the department have been selected to receive scholarships. The diverse group is a mix of those new to Columbia and students who have received fellowships for the year.
The Google PhD Fellowship Program was created to recognize outstanding graduate students doing exceptional and innovative research in areas relevant to computer science and related fields.
Yiru Chen
Yiru Chen is a fourth-year Ph.D. student who works with Associate Professor Eugene Wu. Her research interests are database systems, human-computer interaction, and data exploration. Her work focuses on improving database usability by automatically generating database interfaces for interactive data analysis.
Chen graduated from Peking University with a B.S. in computer science summa cum laude and a B.A. in Economics in 2018. She enjoys cycling and playing the violin whenever she has free time.
NSF Graduate Research Fellowship Program (GRFP)
The GRFP is a five-year fellowship that recognizes and supports outstanding graduate students in NSF-supported STEM disciplines who are pursuing research-based master’s and doctoral degrees.
Philippe Chlenski
Philippe Chlenski is interested in developing and applying computational techniques to biological problems, particularly machine learning for microbial dynamics. He is a second-year PhD student in the Pe’er lab. Prior to Columbia, he worked for two years at the Fellowship for Interpretation of Genomes at the Argonne National Lab.
Chlenski graduated in 2018 from Yale University with a Bachelor’s degree in mathematics and philosophy. He also holds an Associate’s degree in liberal arts from Deep Springs College.
Sam Fereidooni
Sam Fereidooni is interested in investigating semantic representations through the lens of both cognitive neuroscience and natural language processing. He particularly hopes that the eventual findings from his work will lead to ameliorated treatments for those who suffer from language processing and production disorders. He is a first-year PhD student in the Theory group, and he is advised by Professor Christos Papadimitriou.
Fereidooni graduated in 2021 from Yale University with a B.S. in Cognitive Science, and a B.S. in Statistics and Data Science. Sam’s undergraduate studies were supported by the Questbridge Foundation National College Match scholarship, the Richter Undergraduate Research fellowship, and the Yale Club of New York City Charles S. Guggenheimer scholarship.
Shashaank N
Shashaank N is a first-year PhD student who will be advised by Assistant Professor David Knowles. His research interests are in computational genomics and neuroscience, with a focus on auditory processing disorders in the brain.
Shashaank recently graduated with an MS in Computer Science from Columbia University in 2021. He completed a BS in Interdisciplinary Studies from Western Kentucky University (WKU) in 2019 and received the Scholar of the College academic award.
Meghna Pancholi
Meghna Pancholi is a second-year PhD student advised by Associate Professor Simha Sethumadhavan. She is interested in cloud computing, systems security, and microservices. Before Columbia, Meghna was an undergraduate researcher at Cornell University where she worked on improving the performance of microservices applications with machine learning techniques.
Meghna graduated from Cornell University in 2020 with a BS in Computer Science.
Clayton Sanford
Clayton Sanford is a third-year PhD student working with Professors Rocco Servedio and Daniel Hsu on machine learning theory. The motivating goal of his research is to understand mathematically why deep learning performs so well in practice. Clayton’s work on the approximation capabilities of neural networks has been published at the COLT 2021 conference. He is a member of the CS Theory Group.
Clayton received an ScB in Applied Math and Computer Science with honors from Brown University in 2018.
Sky Wang
Sky Wang is an incoming first-year PhD student set to work with Assistant Professors Zhou Yu and Smaranda Muresan. His work focuses on natural language processing, and he is interested in leveraging computational methods to understand social aspects of language and to use such insights in creating more effective and more equitable language technologies. He is particularly interested in the areas of situated dialogue systems, computational social science, and cultural analytics.
Wang graduated in 2020 from the University of Michigan with a B.S.E in Computer Science. He is a 2021 recipient of the University of Michigan’s EECS Undergraduate Outstanding Research Award and also received an honorable mention for the Computing Research Association Outstanding Undergraduate Research Award in 2021. He received a Best Poster award from the University of Michigan AI Symposium in 2018 and was recognized as a finalist in the NASA Goddard Space Flight Center Intern Research Fair in 2018.
Joseph Zuckerman
Joseph Zuckerman is a second-year PhD student in computer science at Columbia University, where he works in the System-Level Design group, advised by Professor Luca Carloni. His research interests include architectures, runtime management, and agile design methodologies for many-accelerator systems-on-chip.
Zuckerman contributes as one of the main developers to ESP, an open-source research platform for heterogeneous system-on-chip design. In 2019, he completed his S.B. in electrical engineering at Harvard University; during his undergraduate studies, he interned at NVIDIA and the NASA Jet Propulsion Lab.
Columbia School of Engineering and Applied Sciences established the Presidential and SEAS fellowships to recruit outstanding students from around the world to pursue graduate studies at the school.
Sebastian Salazar
Sebastian Salazar’s research interests include machine learning and ethical AI. At Columbia, his work will focus on counterfactual predictions and the actionability of machine learning models. He is a first-year PhD student who will be working under the guidance of Ansaf Salleb-Aouissi.
Sebastian graduated magna cum laude from Columbia University in 2021 with a B.S. in Applied Physics.
Huy Ha
Huy Ha is an incoming first-year PhD student interested in computer vision, natural language processing, and robot learning. His research studies how embodied intelligence could combine information from different modalities (vision, language, interaction) to understand its environment, solve tasks, and assist people. He is advised by Assistant Professor Shuran Song and is a member of the Columbia Artificial Intelligence and Robotics (CAIR) lab.
Ha graduated in 2021 with a BS in Computer Science from Columbia University. He was a Dean’s Fellow and received the Theodore Bashkow Award. He did summer research as a Bonomi Summer Scholar. In his free time, Ha likes to take photos, rock climb, bike, and train his two border collies for frisbee.
Yun-Yun Tsai
A first-year PhD student, Yun-Yun Tsai works with Professor Junfeng Yang. Her research interests are in security and artificial intelligence. In particular, she is interested in improving the robustness of neural networks and machine learning (ML) algorithms so that they make fewer mistakes on malicious samples. She will work on research related to making AI applications less fragile against unusual inputs.
Tsai received B.Sc. and M.Sc. degrees in computer science from National Tsing Hua University (NTHU), Taiwan, in 2014 and 2018, respectively. Previously, she was advised by Professor Tsung-Yi Ho and by Dr. Pin-Yu Chen of the Trusted AI group at the IBM Thomas J. Watson Research Center in New York.
Anjali Das
Anjali Das is a first-year PhD student who works with Professors Itsik Pe’er and David Knowles. Her research interest is in developing and applying machine learning methods to problems in genomics. Specifically, she is interested in the genetics of neurological diseases.
Das graduated from the University of Chicago in June of 2020 with a BS in statistics and a minor in computer science. After graduating, she worked as a data scientist at UChicago’s Research Computing Center before joining Columbia.
Jihye Kwon, a computer engineering PhD student, talks about her research projects and what it took to win a Best Paper award.
What drew you to computer engineering, specifically the application of machine learning to computer-aided design? What questions or issues do you hope to answer?
I was attracted to the concept of a computer: a machine that performs calculations. I found it very interesting how modern computers evolved from executing one instruction at a time to executing many instructions simultaneously by exploiting multiple levels of parallelism. Still, various challenges remained, or newly arose, so I dreamed about designing a brand-new computer system. That is what I had in mind when coming to Columbia.
At the beginning of my PhD, I experimented and learned how to design the core parts of special-purpose computers, using computer-aided design tools. I also explored machine learning from both theoretical and practical perspectives. These activities led me to work on my current research problems.
In advanced computer-aided design of computer systems, computers solve many complex optimization problems in steps to generate a final design. They do so as guided by the designers, by means of configurable ‘knobs’. My focus is on the designers’ work.
For a target system, designers run the computer-aided design tools repeatedly with the many different knob configurations until the tools output final designs with optimal or desired properties, e.g., in timing, area, and power. I wondered if machines can learn, from designers’ previous work, how to configure the knobs to optimize a new target system. Can designers virtually collaborate across time and tasks through the machine learning models? These are the main questions that I hope to answer.
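As a simplified illustration of this question (not the actual method from the research: here a nearest-neighbor lookup over past runs stands in for a real learned model, and all knob names and numbers are invented), learning from previous runs to suggest a knob configuration might look like:

```python
# Illustrative sketch: use designers' previous tool runs to pick knob settings
# for a new design. A nearest-neighbor lookup stands in for a learned model.

past_runs = [
    # (knob configuration, measured design quality: higher is better)
    ({"unroll": 1, "pipeline": 0}, 0.40),
    ({"unroll": 4, "pipeline": 1}, 0.85),
    ({"unroll": 8, "pipeline": 1}, 0.70),
]

candidate_configs = [
    {"unroll": 1, "pipeline": 0},
    {"unroll": 2, "pipeline": 1},
    {"unroll": 4, "pipeline": 1},
]

def predicted_quality(config):
    """Predict quality as the quality of the most similar past configuration."""
    def distance(a, b):
        return sum(abs(a[k] - b[k]) for k in a)
    nearest = min(past_runs, key=lambda run: distance(config, run[0]))
    return nearest[1]

# Suggest the candidate configuration with the best predicted quality,
# instead of running the expensive design tools on every candidate.
best = max(candidate_configs, key=predicted_quality)
```

The point of the sketch is only the workflow: a model trained on previous designers' runs screens candidate knob configurations cheaply, so the expensive tools are run only on the most promising ones.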
Could you talk about your research and how you collaborated with other groups? Was this something you considered when applying to Columbia – that there are opportunities to do multi-disciplinary work?
When I was applying to Columbia, I wished I could have collaboration opportunities to study and work in the interdisciplinary research communities at the center of New York City. I wanted to explore applications of computer science in different areas to eventually gain insight and inspiration for my own research, which is centered at computer engineering.
Fortunately, these wishes were realized as I worked with my advisor, Professor Luca Carloni. I was invited to join the project “Energy Efficient Computing with Chip-Based Photonics,” which is part of a large initiative supported by government and industry. In this project, I worked closely with the Lightwave Research Laboratory in Electrical Engineering on a new optical computing system. We proposed the concept of a next-generation computing system co-designed with silicon photonics and electronic circuitry, in order to overcome the fundamental physical limitations of today’s computers.
Another project on optical communication grew out of a student project that I mentored in my advisor’s class, Embedded Scalable Platforms. This project investigated the use of photonic switches in optically connected memory systems for deep learning applications.
Outside Columbia, I have also collaborated with researchers at IBM TJ Watson Research Center via my summer internships on the project of auto-tuning computer-aided design flows for commercial supercomputers. All these collaborations opened new horizons for me.
You won the MLCAD 2020 Best Paper award for your research, can you talk about your process – how did the research come about? How long did it take you to complete the work? What were the things you had to overcome?
In this work, I proposed a novel machine learning approach for computer-aided design optimization of hardware accelerators. I wanted to address this problem because it is computationally very expensive to explore the entire optimization space. It took me about one year to complete the work. One of the biggest difficulties I faced was the limited availability of the data for applying machine learning to the problem.
Then I found out that transfer learning had recently been applied successfully in other areas with limited data. In transfer learning, a model trained for a related problem (e.g., natural image recognition) is transferred to aid learning for the target problem (e.g., face recognition). Hence, I tried to apply transfer learning to my research problem: I trained a neural network model for a different accelerator design and transferred the model to predict the design properties of a target accelerator.
However, the transferred model did not perform well in this case. I realized that due to the diverse characteristics of the accelerators, I needed to distinguish which piece of the source information should be transferred. Based on this intuition, I constructed a series of new models, and eventually, proposed one with promising performance. While it was a long process of building new models without knowing the answers, my advisor greatly encouraged me in our discussions to keep moving forward, and it was very rewarding in the end.
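A toy sketch of this selective-transfer idea (plain Python, with dict-based “layers” standing in for real neural network weights; the layer names and values are invented for illustration, and this is not the actual model from the paper):

```python
import random

# A source model trained on one accelerator design, represented here as
# a dict of named "layers" (weight lists) purely for illustration.
source_model = {
    "shared_features": [0.8, -0.2, 0.5],  # captures structure common to designs
    "task_head": [1.3, 0.7],              # specific to the source design
}

def transfer(source, layers_to_transfer):
    """Build a target model: copy selected source layers, re-initialize the rest."""
    rng = random.Random(0)  # fixed seed so the sketch is reproducible
    target = {}
    for name, weights in source.items():
        if name in layers_to_transfer:
            # Transferred knowledge: start from the source weights.
            target[name] = list(weights)
        else:
            # Fresh start: small random weights, to be learned on the
            # (limited) data of the target accelerator.
            target[name] = [rng.uniform(-0.1, 0.1) for _ in weights]
    return target

# Transfer only the generic feature layers; the design-specific head is
# re-learned from scratch for the target accelerator.
target_model = transfer(source_model, layers_to_transfer={"shared_features"})
```

The crux Kwon describes is exactly the choice of what goes into `layers_to_transfer`: when source and target accelerators differ substantially, transferring everything can hurt, so the useful piece of the source information has to be identified.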
Looking back, how have you grown as a researcher and a person?
Besides expanding my problem-solving capabilities and technical skills, I have grown to be a better presenter and communicator. One of the tasks of a researcher is to explain one’s work to various groups and different types of audiences. I had a number of opportunities to present my work at academic conferences, seminars at companies, lightning talks, and annual project reviews. Initially, I struggled to meet the interests of audiences whose expertise spans diverse areas and levels. Through those opportunities, I received very helpful feedback, tried to ask myself questions from other people’s perspectives, and progressively learned to keep a good balance between abstraction and elaboration.
Also, by interacting with a lot of students with heterogeneous backgrounds in the classes I TA’ed, I have learned to understand what their questions mean and where they come from. Based on that, I tried to adjust my answers to have more relatable conversations. From those conversations, sometimes the students found the topics very interesting, and sometimes I learned something new from them. It was such a great pleasure to inspire others and to be inspired. I think those experiences have made me a better researcher and person.
In Fall 2017, I received an invitation from WiCS’ president, Julia Di, and was impressed by the passionate and caring board members working toward the common goal of supporting the advancement of womxn in computer science. In my second year, I launched the WiCS Lightning Talks, in which students with research experience share their work and stories. The goal was for young students to learn more about research and demystify the process.
I am one of the few women at Columbia in my research area of computer engineering and would like to contribute to inspiring the next generation to join us.
What was the highlight of your time at Columbia?
Every moment was special for me. Some of the highlights were during happy hour with members of the fishbowl. The fishbowl is a large office occupied by the majority of PhD students in computer engineering. We call it the fishbowl because it is surrounded by large windows, and the students inside look like small fish. Once, my colleagues talked about their memories of old computers that I had never seen. I enjoyed imagining the machines from their descriptions, and thinking about different types and generations of computers.
What is your advice to students on how to navigate their time at Columbia?
Explore, experience, and exploit. There are recommended lists of classes, activities, and companies, depending on your track and interests, but no one is exactly like you. There is such a great variety of opportunities and resources at Columbia and in New York City. I hope you can spend enough time exploring them and get involved in many ways before determining your academic and career goals.
Is there anything else that you think people should know?
Columbia is beautiful in the snow! It gets pretty windy in the winter, so please be aware if you are coming from warmer places. There are many places where you can study but Avery Library is my favorite library on campus. If you have any questions or opinions on this Q&A story, please feel free to drop me a line!