Research

I enjoy working on various aspects of machine learning and high-dimensional statistics. I am especially interested in understanding and exploiting the intrinsic structure of data (e.g., manifold or sparse structure) to design effective learning algorithms. See my research statement for more details.


Selected Publications

  • Contrastive Loss is All You Need to Recover Analogies as Parallel Lines. Narutatsu Ri, Fei-Tzin Lee, Nakul Verma. Association for Computational Linguistics (ACL) Workshop on Representation Learning, 2023. arxiv
  • Improving Model Training via Self-learned Label Representations. Xiao Yu, Nakul Verma. Computing Research Repository (CoRR) abs/2209.04528, 2022. arxiv
  • A neural network solves, explains, and generates university math problems by program synthesis and few-shot learning at human level. Iddo Drori, Sarah Zhang, Reece Shuttleworth, Leonard Tang, Albert Lu, Elizabeth Ke, Kevin Liu, Linda Chen, Sunny Tran, Newman Cheng, Roman Wang, Nikhil Singh, Taylor Patti, Jayson Lynch, Avi Shporer, Nakul Verma, Eugene Wu, Gilbert Strang. Proceedings of the National Academy of Sciences (PNAS), 2022. arxiv
  • Solving Probability and Statistics Problems by Program Synthesis at Human Level and Predicting Solvability. Leonard Tang, Elizabeth Ke, Nikhil Singh, Bo Feng, Derek Austin, Nakul Verma, Iddo Drori. International Conference on Artificial Intelligence in Education (AIED), 2022. arxiv
  • Solving Linear Algebra by Program Synthesis. Iddo Drori, Nakul Verma. Computing Research Repository (CoRR) abs/2111.08171, 2021. arxiv
  • Automated Symbolic Law Discovery: A Computer Vision Approach. Henry Xing, Ansaf Salleb-Aouissi, Nakul Verma. Association for the Advancement of Artificial Intelligence (AAAI), 2021.
  • Biased Programmers? Or Biased Data? A Field Experiment in Operationalizing AI Ethics. Bo Cowgill, Fabrizio Dell'Acqua, Augustin Chaintreau, Nakul Verma, and Samuel Deng. ACM Conference on Economics and Computation (EC), 2020.
  • Cortical pattern generation during dexterous movement is input-driven. Britton Sauerbrei, Jian-Zhong Guo, Jeremy Cohen, Matteo Mischiati, Wendy Guo, Mayank Kabra, Nakul Verma, Brett Mensh, Kristin Branson, Adam Hantman. Nature, 2019. pdf biorxiv
  • Noise-tolerant fair classification. Alex Lamy, Ziyuan Zhong, Aditya Menon, and Nakul Verma. Neural Information Processing Systems (NeurIPS), 2019. arxiv
  • Metric learning on manifolds. Max Aalto and Nakul Verma. Computing Research Repository (CoRR) abs/1902.01738, 2019. arxiv
  • Model-Agnostic Meta-Learning using Runge-Kutta Methods. Daniel Im, Yibo Jiang, and Nakul Verma. Computing Research Repository (CoRR) abs/1910.07368, 2019. arxiv
  • Meta-learning to cluster. Yibo Jiang and Nakul Verma. Computing Research Repository (CoRR) abs/1910.14134, 2019. arxiv
  • Stochastic neighbor embedding under f-divergences. Daniel Im, Nakul Verma, and Kristin Branson. Computing Research Repository (CoRR) abs/1811.01247, 2018. arxiv
  • Time-accuracy tradeoffs in kernel prediction: controlling prediction quality. Samory Kpotufe and Nakul Verma. Journal of Machine Learning Research (JMLR), 2017. pdf code
  • Sample complexity of learning Mahalanobis distance metrics. Nakul Verma and Kristin Branson. Neural Information Processing Systems (NIPS), 2015. pdf talk poster
  • Distance preserving embeddings for general n-dimensional manifolds (aka An algorithmic realization of Nash's embedding theorem). Nakul Verma. Journal of Machine Learning Research (JMLR), 2013. pdf oldpdf slides video poster
  • Efficient energy management and data recovery in sensor networks using latent variables based tensor factorization. Bojan Milosevic, Jinseok Yang, Nakul Verma, Sameer Tilak, Piero Zappi, Elisabetta Farella, Luca Benini, Tajana Rosing. Conference on Modelling, Analysis and Simulation of Wireless and Mobile Systems (MSWiM), 2013.
  • Learning from data with low intrinsic dimension. Nakul Verma. Ph.D. Thesis, Dept. of Computer Science and Engineering, UC San Diego, 2012. pdf
  • Distance preserving embeddings for general n-dimensional manifolds (aka An algorithmic realization of Nash's embedding theorem). Nakul Verma. Conference on Learning Theory (COLT), 2012. pdf oldpdf slides video poster
  • Learning hierarchical similarity metrics. Nakul Verma, Dhruv Mahajan, Sundararajan Sellamanickam, and Vinod Nair. Conference on Computer Vision and Pattern Recognition (CVPR), 2012. pdf poster
  • A note on random projections for preserving paths on a manifold. Nakul Verma. UC San Diego Tech. Report CS2011-0971, 2011. pdf
  • Latent variables based data estimation for sensing applications. Nakul Verma, Piero Zappi, Tajana Rosing. Conference on Intelligent Sensors, Sensor Networks, and Information Processing (ISSNIP), 2011.
  • Multiple instance learning with manifold bags. Boris Babenko, Nakul Verma, Piotr Dollar, and Serge Belongie. International Conference on Machine Learning (ICML), 2011. pdf slides poster
  • Which spatial partition trees are adaptive to intrinsic dimension? Nakul Verma, Samory Kpotufe, and Sanjoy Dasgupta. Conference on Uncertainty in Artificial Intelligence (UAI), 2009. pdf poster software
  • Mathematical advances in manifold learning. Nakul Verma. Survey, UC San Diego Tech. Report, 2008. pdf slides
  • Learning the structure of manifolds using random projections. Yoav Freund, Sanjoy Dasgupta, Mayank Kabra, and Nakul Verma. Neural Information Processing Systems (NIPS), 2007. pdf poster software
  • A concentration theorem for projections. Sanjoy Dasgupta, Daniel Hsu, and Nakul Verma. Conference on Uncertainty in Artificial Intelligence (UAI), 2006. pdf poster

Talks

  • The hidden potential of non-Euclidean representations for machine learning. slides
    • Bloomberg LP, Office of the CTO (Kai-Zhan Lee)
  • Distance preserving embeddings for Riemannian manifolds. slides
    • Carnegie Mellon University, Machine Learning Department (Aarti Singh)
    • IBM Research, Almaden (Ken Clarkson)
    • University of Washington, Math Department (Marina Meila)
    • Yahoo Labs, Bangalore (Dhruv Mahajan)
  • An introduction to statistical theory of learning. slides
    • Neurotheory seminar, Janelia Research Campus, HHMI (Shaul Druckmann)
  • A tutorial on metric learning with some recent advances. slides
    • Bay Area Machine Learning Group (Tony Tran)

Software

Spatial Trees are a recursive space-partitioning data structure that helps organize high-dimensional data. They can be used to analyze the underlying data density, perform fast nearest-neighbor searches, and produce high-quality vector quantizations. Here we implement several instantiations (KD-tree, RP-tree, PCA-tree) to study their relative strengths.
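
For concreteness, below is a minimal, illustrative sketch (in Python with NumPy) of the core idea behind these trees: recursively split the data at the median of its projection onto a chosen direction. The random-direction split shown here is in the spirit of an RP-tree; swapping in a coordinate axis or the top principal component would give KD-tree or PCA-tree style splits. The class name, parameters, and split rule are illustrative assumptions for exposition, not the released implementation.

    import numpy as np


    class SpatialTree:
        """A toy recursive space-partitioning tree with RP-tree style splits."""

        def __init__(self, points, max_leaf_size=20, rng=None):
            self.rng = np.random.default_rng() if rng is None else rng
            self.points = points
            self.direction = None      # None marks a leaf node
            self.left = self.right = None
            if len(points) <= max_leaf_size:
                return                 # cell is small enough, stop splitting
            # RP-tree style split: project the points onto a random unit
            # direction and divide them at the median projection.  A coordinate
            # axis here would give a KD-tree split; the top principal direction
            # would give a PCA-tree split.
            direction = self.rng.normal(size=points.shape[1])
            direction /= np.linalg.norm(direction)
            proj = points @ direction
            threshold = np.median(proj)
            mask = proj <= threshold
            if mask.all() or not mask.any():
                return                 # degenerate split (e.g. duplicates): stay a leaf
            self.direction, self.threshold = direction, threshold
            self.left = SpatialTree(points[mask], max_leaf_size, self.rng)
            self.right = SpatialTree(points[~mask], max_leaf_size, self.rng)

        def query_leaf(self, x):
            """Return the points in the leaf cell containing x, a small
            candidate set for approximate nearest-neighbor search."""
            if self.direction is None:
                return self.points
            child = self.left if x @ self.direction <= self.threshold else self.right
            return child.query_leaf(x)


    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        data = rng.normal(size=(1000, 50))
        tree = SpatialTree(data, rng=rng)
        cell = tree.query_leaf(data[0])
        print("leaf cell size:", len(cell))

The median split keeps the tree balanced, so a query descends only about log(n/leaf size) levels before returning a small candidate cell.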

Useful Links