Voices of CS: Adam Lin

Adam Lin didn’t set out to become a computer scientist. His academic journey began in finance, driven by family expectations and a clear, conventional career path. But a chance encounter with programming during an internship, where he maintained a MATLAB-based fund analyzer, lit a spark. What started as curiosity about programming soon evolved into a deep commitment to using AI to solve real-world problems, especially in healthcare.

That commitment became personal the moment Lin stepped into a neonatal intensive care unit (NICU) to support a research project predicting Necrotizing Enterocolitis (NEC), a serious gastrointestinal disease that primarily affects preterm infants. Surrounded by fragile newborns and determined clinicians, he realized that his work could make a real impact on lives. Now a PhD student working on machine learning techniques such as using privileged information for better prediction in medical settings, Lin is focused on applying AI to improve maternal and neonatal health outcomes.

His story is one of curiosity, compassion, and the drive to turn complex algorithms into tools for real human benefit.

Q: Your academic journey started in finance. What led you to pivot to computer science and machine learning?
My plan was to build a career in finance, but during the last year of my undergraduate study, I interned at Fund of Funds, where I worked on maintaining their MATLAB-based fund analyzer. I found myself enjoying the programming aspect of the work more than the finance itself. 

That led me to explore computer science, and while taking an Introduction to AI course with Professor Ansaf Salleb-Aouissi, I became fascinated with the potential of AI to solve real-world problems. At the end of the semester, I asked Professor Salleb-Aouissi if there were any projects I could pursue. A few months later, she invited me to work on a research project that utilizes machine learning to predict Necrotizing Enterocolitis (NEC) in preterm infants.

Q: What was the turning point that made you realize you wanted to focus on AI for healthcare?
Initially, I was excited about the technical challenges of the NEC prediction project. But when I visited the neonatal intensive care unit (NICU) and saw those tiny, vulnerable babies, everything changed. It was no longer just an academic exercise—this research had real-life stakes. That moment solidified my motivation to use AI for something meaningful, and it ultimately guided my decision to pursue graduate studies in computer science with a focus on healthcare applications.

Q: When did you decide to pursue a PhD? What motivated you to continue beyond your master’s?
Even while completing my master’s, I knew I wanted to pursue a PhD at some point. I’ve always been someone who enjoys deep exploration of problems and truly understanding a field. Since I was already conducting research with Professor Ansaf Salleb-Aouissi and working closely with clinicians, the transition felt natural. When she invited me to officially join the PhD program, it just clicked—I wanted to keep working on meaningful AI research, particularly in healthcare.

Q: Your research covers a range of healthcare challenges. How did your projects evolve over time?
My first major project was predicting NEC in preterm infants, which introduced me to multiple instance learning: we had stool microbiome samples from different time points but no exact time of disease onset, so a label applied to a whole set of samples rather than to each sample. We used an attention mechanism to weigh each sample’s importance dynamically, improving prediction accuracy. This project ignited my interest in exploring how advanced machine learning techniques could be applied in healthcare.
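
The attention-weighted pooling mentioned above can be illustrated with a small sketch. It is not the project's actual model: the projection `V`, the scoring vector `w`, and the three-sample `bag` are hypothetical values chosen for illustration, whereas in a real model these parameters would be learned from data. The sketch only shows how per-sample scores become weights that sum to one and how those weights combine samples into one bag-level representation.

```python
import math

def softmax(scores):
    # Subtract the max for numerical stability before exponentiating.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_weights(bag, V, w):
    # Score each sample as w . tanh(V x), then normalize across the bag.
    scores = []
    for sample in bag:
        hidden = [math.tanh(sum(v * x for v, x in zip(row, sample))) for row in V]
        scores.append(sum(wi * hi for wi, hi in zip(w, hidden)))
    return softmax(scores)

def pool(bag, weights):
    # Weighted average of the samples: one vector that represents the bag.
    dim = len(bag[0])
    return [sum(a * sample[j] for a, sample in zip(weights, bag)) for j in range(dim)]

# Hypothetical bag: three samples from one patient, two features each.
bag = [[0.1, 0.2], [0.9, 0.8], [0.2, 0.1]]
# Hypothetical attention parameters (assumed fixed here, learned in practice).
V = [[1.0, 0.5], [-0.5, 1.0]]
w = [2.0, 1.0]

weights = attention_weights(bag, V, w)
bag_embedding = pool(bag, weights)
print("attention weights:", weights)
print("bag embedding:", bag_embedding)
```

A classifier would then take `bag_embedding` as input, so the model is trained with one label per bag while the weights indicate which samples mattered most.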

That research expanded into predicting preterm birth and preeclampsia, using data from the Nulliparous Pregnancy Outcomes Study: Monitoring Mothers-to-Be (nuMoM2b). We found that while we could build robust models for predicting indicated preterm birth (clinician-initiated due to complications), spontaneous preterm birth remained hard to predict from standard clinical data. This highlighted the need for additional predictive factors, such as microbiome analysis.

As we analyzed the dataset further, we realized many features were underutilized, especially information available only after delivery or after adverse pregnancy outcomes (APOs). Learning with privileged information lets us use these features during training even though they are unavailable at prediction time: they act as a “teacher” that guides the “learner” toward a better model. During this phase, we found that XGBoost was particularly robust against overfitting, so we extended it to incorporate privileged information, introducing XGBoost+.

This work became even more personal when my baby was born during my PhD. Experiencing the healthcare system firsthand and seeing how much uncertainty surrounds maternal and neonatal health deepened my commitment to this research. It reinforced the importance of developing AI-driven solutions to provide better clinical insights and improve outcomes for mothers and babies.

My thesis brings together all these experiences, focusing on combining transfer learning and privileged information to improve prediction models in maternal and neonatal health. We are actively exploring large datasets such as CDC records, clinical studies like the Maternal-Fetal Medicine Units (MFMU) Network preterm dataset, and alternative data modalities such as the vaginal microbiome to improve predictions for preeclampsia and indicated preterm birth. We also hope that by combining different data modalities and external data sources, we can develop a reliable predictive model for spontaneous preterm birth.

Q: Can you explain your research on predicting Proximal Junctional Kyphosis (PJK) and why it matters?
My paper, “A LUPI Distillation-Based Approach: Application to Predicting Proximal Junctional Kyphosis (PJK),” focuses on PJK, a post-operative complication in adult spinal deformity patients that occurs in about 17–46% of cases. It leads to abnormal spinal curvature and significant patient morbidity. Predicting PJK early is crucial for better prevention strategies and patient outcomes.

PJK is hard to predict because only a small amount of patient data and a few features are available. In our work, we propose XGBoost+, a novel extension of the widely used XGBoost algorithm that incorporates privileged information, meaning data available at training time but not at inference time. In the case of PJK, the privileged information is the post-operative data. By using Learning Using Privileged Information (LUPI) in a distillation framework, our model improves predictive performance over traditional methods like standard XGBoost and Support Vector Machines (SVM). Our results show that XGBoost+ significantly improves predictive accuracy, which is especially relevant in healthcare applications, where some data is commonly unavailable at inference time.

This research matters because it offers a practical way to handle prediction tasks with limited data, a significant challenge for machine learning in healthcare. In the case of PJK, the approach let us incorporate a third of the features, which would likely go unused in traditional models.
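
To make the idea of distillation with privileged information concrete, here is a minimal sketch on synthetic data. It is not XGBoost+ and does not reproduce the paper's method: plain logistic regression stands in for the boosted trees, and the data, the temperature `T`, and the mixing weight `lam` are all hypothetical choices. A teacher is trained on a feature that exists only at training time, and a student that sees only the regular feature is trained on a blend of the true labels and the teacher's softened predictions.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def logit(model, x):
    w, b = model
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def fit_logistic(X, targets, lr=0.5, epochs=300):
    # Full-batch gradient descent on cross-entropy; targets may be soft.
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for x, t in zip(X, targets):
            err = sigmoid(logit((w, b), x)) - t
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def accuracy(model, X, labels):
    hits = sum((sigmoid(logit(model, x)) > 0.5) == (yi == 1.0)
               for x, yi in zip(X, labels))
    return hits / len(X)

# Synthetic data (assumption): a clean signal plays the privileged feature,
# and a noisy copy of it plays the feature available at inference time.
random.seed(0)
latent = [random.uniform(-1, 1) for _ in range(200)]
y = [1.0 if z > 0 else 0.0 for z in latent]
X_priv = [[z] for z in latent]                        # training time only
X_reg = [[z + random.gauss(0, 0.5)] for z in latent]  # training and inference

T, lam = 2.0, 0.5  # hypothetical temperature and label/teacher mixing weight

teacher = fit_logistic(X_priv, y)                        # sees privileged data
soft = [sigmoid(logit(teacher, x) / T) for x in X_priv]  # softened teacher output
targets = [lam * yi + (1 - lam) * si for yi, si in zip(y, soft)]
student = fit_logistic(X_reg, targets)                   # never sees X_priv

print("teacher accuracy:", accuracy(teacher, X_priv, y))
print("student accuracy:", accuracy(student, X_reg, y))
```

At prediction time only `student` and the regular features are needed, so the privileged data shapes training without being required afterwards.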

Q: Your research focuses heavily on privileged information. What is it, and why is it important in healthcare AI?
In healthcare, many prediction models face a fundamental limitation—certain crucial information is available only after an outcome occurs. For example, in maternal health, some of the most predictive data comes after delivery or complications, making it unavailable for real-time decision-making.

Traditional approaches in healthcare machine learning often assume that if a condition appears unpredictable from known risk factors, the solution is to collect more data or search for entirely new attributes. However, in many clinical settings, data is limited, and certain features may only become predictive when considered in combination with others. This means we can’t dismiss a feature outright just because it doesn’t appear predictive on its own.
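
A toy example shows how features can be uninformative alone yet predictive together. The data below is synthetic and not clinical: two hypothetical binary features and a label that is positive only when exactly one of them is present. The helper `best_rule_accuracy` is introduced here for illustration; it reports the best accuracy any rule could reach on this data using only the chosen columns.

```python
from collections import Counter, defaultdict

def best_rule_accuracy(rows, labels, cols):
    # Group rows by their values in `cols`, then predict the majority label
    # within each group. This is the best any rule restricted to those
    # columns can do on this data.
    groups = defaultdict(Counter)
    for row, label in zip(rows, labels):
        key = tuple(row[c] for c in cols)
        groups[key][label] += 1
    correct = sum(max(counts.values()) for counts in groups.values())
    return correct / len(rows)

# Synthetic data: every combination of two binary features, repeated 25 times.
rows = [(a, b) for a in (0, 1) for b in (0, 1)] * 25
# The label is 1 only when exactly one of the two features is present.
labels = [a ^ b for a, b in rows]

acc_first = best_rule_accuracy(rows, labels, [0])
acc_second = best_rule_accuracy(rows, labels, [1])
acc_both = best_rule_accuracy(rows, labels, [0, 1])
print(acc_first, acc_second, acc_both)
```

Each feature alone does no better than a coin flip here, while the pair predicts the label exactly, which is why screening features one at a time can discard useful ones.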

Q: How has your personal life influenced your research direction?
During my PhD, my own child was born, and experiencing the healthcare system firsthand gave me a deeper appreciation for the challenges faced by clinicians and patients alike. Seeing the uncertainty surrounding maternal and neonatal health reinforced my commitment to developing AI-driven solutions that can provide better clinical insights.

Q: What’s next after your PhD? Do you plan to continue in this research field?
First, I plan to take a well-earned break—hopefully traveling to China with my son before he starts 3-K!

Research-wise, I’m particularly excited about the potential of large language models (LLMs) for risk factor selection and transfer learning in healthcare AI. LLMs could enhance clinical prediction models, making them both more accurate and interpretable. I don’t have a specific job lined up yet, but I’d love to join an industry healthcare research lab where I can continue working on AI for maternal and neonatal health, clinical decision-making, and diagnostics.

Q: Any advice for students who are considering a career in AI for healthcare?
Be curious and open to exploring different disciplines. My journey started in finance, but a single exposure to programming changed everything. The intersection of AI and healthcare is incredibly rewarding, but it requires collaboration with domain experts, patience in dealing with complex medical data, and a deep commitment to solving real-world problems. Most importantly, find work that feels meaningful to you—because that’s what will keep you going when challenges arise.