Junfeng Yang
Associate Professor
Co-director of Software Systems Lab
Department of Computer Science
500 West 120 Street, 519 CSB
Mail Code 0401
New York, NY 10027
Lab: 487 CSB
Phone: (212) 939-7012
Fax: (212) 666-0140
My research centers on making reliable and secure systems. Some of my current research thrusts are (1) security and robustness of machine learning, (2) tools to better protect, verify, analyze, test, and debug software, and (3) programming and runtime systems for cloud applications. After getting a BS in Computer Science from Tsinghua University, I earned my PhD in Computer Science at Stanford, where I created eXplode, a general, lightweight system for effectively finding storage system errors. This work has led to an OSDI best paper award, numerous bug fixes to real systems such as the Linux kernel, and a featured article in Linux Weekly News. In 2008, I worked at Microsoft Research Silicon Valley, extending eXplode to check production distributed systems. MoDist, the resultant system, is being transferred to Microsoft product groups. At Columbia University, my recent work on reliable multithreading was featured on sites such as ACM Tech News. I won the Sloan Research Fellowship and the AFOSR YIP award, both in 2012, and the NSF CAREER award in 2011.
I'm looking for a few graduate students, postdocs, and undergraduate interns. If you know how to build systems/tools, we should talk.
Columbia undergraduate and master's students: the above applies to you, too. Also, take a look at some projects available for academic credit.
I was on sabbatical in 2016 co-founding a Columbia spin-off called NimbleDroid. NimbleDroid provides automated, comprehensive app performance analysis to help teams build performant apps. Its mission is to leverage research breakthroughs to automate mundane tasks in software engineering.
- SOSP best paper, 2017
- Sloan research fellowship, 2012
- Air Force Office of Scientific Research Young Investigator Program Award (AFOSR YIP), 2012
- NSF CAREER, 2011
- OSDI best paper, 2004
(If you're interested in a paper draft but it isn't available online, please email me for a copy.)
Select Publications (Complete list...)
DeepXplore: Automated Whitebox Testing of Deep Learning Systems
(Best paper award)
Proceedings of the 26th ACM Symposium on Operating Systems Principles (SOSP '17), October, 2017
We increasingly rely on deep learning and deep neural networks (DNNs) in safety- and security-critical applications such as self-driving, medical diagnosis, face-based identification, and malware detection, but it remains an open challenge to thoroughly test DNNs for robustness and security. We propose Neuron Coverage, the first testing coverage metric to empirically understand how much decision logic a testing input set has exercised in a DNN. We design and build DeepXplore, the first systematic testing framework for DNNs. Given a test input, DeepXplore applies physically realizable transformations (e.g., darkening an image) to it, as opposed to the noise used in prior adversarial ML work, to generate new inputs that maximize neuron coverage. It found thousands of flaws in state-of-the-art self-driving and malware detection DNNs and improved their neuron coverage by over 50%. (Also appeared in MLSec '17.)
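The neuron coverage idea above can be illustrated with a minimal sketch. It assumes each test input has already been run through the network and its per-neuron activations collected into a list. The function name `neuron_coverage`, the threshold value, and the toy activation numbers are illustrative assumptions and are not taken from the DeepXplore code.

```python
from typing import Iterable, List, Optional, Sequence


def neuron_coverage(
    activations_per_input: Iterable[Sequence[float]],
    threshold: float = 0.0,
) -> float:
    """Return the fraction of neurons activated above `threshold`
    by at least one input in the test set.

    Each element of `activations_per_input` is the list of neuron
    activations recorded for one test input (hypothetical format).
    """
    covered: Optional[List[bool]] = None
    for activations in activations_per_input:
        if covered is None:
            covered = [False] * len(activations)
        for index, value in enumerate(activations):
            if value > threshold:
                covered[index] = True
    if not covered:
        return 0.0
    return sum(covered) / len(covered)


# Toy data: three neurons, two test inputs (made-up numbers).
original_inputs = [[0.9, 0.0, 0.1], [0.7, 0.0, 0.2]]
# A transformed input (say, a darkened image) that happens to
# activate a neuron the original inputs never exercised.
transformed_inputs = original_inputs + [[0.0, 0.0, 0.8]]

coverage_before = neuron_coverage(original_inputs, threshold=0.5)
coverage_after = neuron_coverage(transformed_inputs, threshold=0.5)
```

In this toy case only the first neuron is covered by the original inputs, and the transformed input additionally covers the third. A test generator in this style would keep transformed inputs that raise the metric.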
Shuffler: Fast and Deployable Continuous Code Re-Randomization
Proceedings of the Twelfth Symposium on Operating Systems Design and Implementation (OSDI '16), 2016
Describes Shuffler, a system that continuously randomizes an application's binary code at runtime, defeating code-reuse attacks. Shuffler is fast: it shuffles all code within tens of milliseconds, whereas cutting-edge ROP attacks need 10-100x more time to discover gadgets. Shuffler is egalitarian: leveraging the insight that randomization doesn't require a higher privilege authority, Shuffler shuffles itself, reducing the trusted computing base and making the approach applicable to kernels and hypervisors. Shuffler is deployable: its augmented binary analysis requires no modifications to the OS, compilers, or linkers.
Towards Making Systems Forget with Machine Unlearning
Proceedings of the 2015 IEEE Symposium on Security and Privacy (S&P '15), 2015
Describes our vision of forgetting systems that quickly and completely forget user data, including all derived data, for security, privacy, and usability. The paper focuses on making machine learning systems forget (hence the term machine unlearning).
Determinism Is Not Enough: Making Parallel Programs Reliable with Stable Multithreading
Communications of the ACM (2014)
This paper is geared toward a general audience. If you have time to read just one paper on our concurrency work, this is the paper to read. It describes our vision of stable multithreading (StableMT), a radical approach to making multithreading reliable, and summarizes our last five years of work on designing, building, and applying stable multithreading systems. The final version of this paper will appear in CACM.
Parrot: a Practical Runtime for Deterministic, Stable, and Reliable Threads
Proceedings of the 24th ACM Symposium on Operating Systems Principles (SOSP '13), November, 2013
Describes Parrot, a simple, deployable thread runtime system for improving reliability with low overhead. This is our most recent and best paper on stable and deterministic multithreading.
- Crane, our transparent state machine replication system.
- AppDoctor, our Android app checker.
- Parrot, our latest stable and deterministic multithreading system. It has two goals: (1) be practical and (2) be fast. By default, it schedules synchronizations in a round-robin manner, vastly reducing the set of schedules for reliability. When needed, it allows developers to supply performance hints for speed. Together with the code, we also released a benchmark suite with 100+ multithreaded programs, and Parrot's complete results on these programs.
- NeonGoby, a system for effectively detecting errors in alias analysis, one of the most crucial and widely used program analyses. If you have an LLVM-based alias analysis you want to check, give NeonGoby a try.
- Loom, a "live-workaround" system designed to quickly and safely bypass various types of concurrency errors at runtime. It contains a generic engine for live-updating multithreaded programs without restarts, which you can leverage if you want to build a live-update tool.
- eXplode, a storage system checker. It uses an approach we call in-situ model checking to thoroughly check general systems software in a lightweight manner.
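The round-robin scheduling of synchronizations mentioned for Parrot in the list above can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not Parrot's implementation. Threads are modeled as Python generators, each `yield` stands for one synchronization operation, and the names `worker` and `round_robin_schedule` are hypothetical.

```python
from typing import Dict, Iterator, List, Sequence, Tuple


def worker(ops: Sequence[str]) -> Iterator[str]:
    """Model a thread: each yielded string is one synchronization operation."""
    for op in ops:
        yield op


def round_robin_schedule(threads: Dict[str, Iterator[str]]) -> List[Tuple[str, str]]:
    """Give each live thread one synchronization per round, in a fixed order.

    Because turns are handed out round-robin, the same set of threads
    always produces the same global order of synchronizations.
    """
    order: List[Tuple[str, str]] = []
    runnable = list(threads.items())
    while runnable:
        still_running = []
        for name, thread in runnable:
            try:
                op = next(thread)
            except StopIteration:
                continue  # this thread has finished; drop it from the rotation
            order.append((name, op))
            still_running.append((name, thread))
        runnable = still_running
    return order


schedule = round_robin_schedule({
    "t1": worker(["lock", "unlock"]),
    "t2": worker(["lock"]),
})
```

Running the same two workers again yields the same interleaving, which is the property that shrinks the set of possible schedules. The performance hints described above are not modeled here.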
I'm fortunate to work or have worked with these brilliant people.
- Yang Tang, PhD student
- Xinhao Yuan, PhD student
- David Williams-King, PhD student
- Lingmei Weng, PhD student
- Kexin Pei, PhD student
- Linjie Zhu, PhD student
- Yinzhi Cao, Postdoc research scientist, 2014-2015, joined Lehigh University as a professor
- Yan Cui, Postdoc research scientist, 2013-2015, went to Intel
- Gang Hu, PhD, 2018, went to Google
- Georgios Koloventzos, MS, 2016
- Rui Gu, MS, 2017
- Karthik Jayaraman, MS, 2016
- Heming Cui, PhD, 2015, joined the University of Hong Kong as a professor
- Jingyue Wu, PhD, 2014, went to Google
- Chuliang Weng, Visiting research scientist, 2012
- Oren Laadan, Postdoc research scientist, 2010-2011, founded Cellrox
- John Gallagher, MS, 2011, went to FourSquare
- Chia-che Tsai, MS, 2011, went to Stony Brook for PhD
- Neetha Maria Sebastian, MS, 2011, went to Google
- Yunling Wang, MS, 2010, went to Microsoft
- Ben Warfield, MS, 2010, went to Wireless Generation
- Nathan Murith, MS, 2010, went to Autodesk
- Maoliang Huang, MS, 2010, went to FlexTrade Systems
- Patrick Huang, MS, 2009, went to Sony
I co-advise some students in the SSL lab.
Articles and Discussions about Research
Shuffler: Network World
Machine unlearning: The Stack, EurekAlert, The Atlantic
Peregrine: CACM, ACM Tech News, The Register, Columbia Engineering, TG Daily, Physorg.com
eXplode: Linux Weekly News
Static analysis: Linux Kernel Mailing List
Spring 2018 -- E6121: Reliable Software
Spring 2018 -- E6998-009: Security and Robustness of ML systems
Spring 2017 -- E6121: Reliable Software
Fall 2016 -- Teaching leave
Spring 2016 -- Sabbatical leave
Fall 2015 -- Sabbatical leave
Spring 2015 -- Teaching leave
Fall 2014 -- E6121: Reliable Software
Spring 2014 -- Teaching leave
Fall 2013 -- W4118: Operating Systems I
Spring 2013 -- Teaching leave
Fall 2012 -- E6121: Reliable Software
Spring 2012 -- W4118: Operating Systems I
Fall 2011 -- E6121: Reliable Software
Spring 2011 -- W4118: Operating Systems I
Fall 2010 -- E6998-1: Reliable Software
Spring 2010 -- W4118: Operating Systems I
Fall 2009 -- E6998-1: Reliable Software
Spring 2009 -- W4118: Operating Systems I
Fall 2008 -- E6998-2: How to Make Reliable Software
Support for Research
- ABIDES: Adaptive BInary Debloating and Security, ONR N00014-17-1-2788, 09/2017 - 08/2020    PIs: Georgios Portokalidis, Junfeng Yang, Vasileios P. Kemerlis
- Reliability and Security of Cyber-Physical Systems, Canon, 02/2017 - 01/2018    PIs: Junfeng Yang
- Validation of Distributed System Using Model Checking, Huawei, 11/2016 - 03/2017    PIs: Junfeng Yang
- Efficient Repair of Learning Systems via Machine Unlearning, NSF CNS-1564055, 09/2016 - 08/2020    PIs: Junfeng Yang, Yinzhi Cao
- Adapting Static and Dynamic Program Analysis to Effectively Harden Debloated Software, ONR N00014-16-1-2263, 03/2016 - 02/2019    PIs: Junfeng Yang, Georgios Portokalidis
- Availability of Large Scale Distributed Systems, WeChat, 03/2016 - 02/2017    PIs: Junfeng Yang
- Reliability and Security of Cyber-Physical Systems, Canon, 02/2016 - 01/2017    PIs: Junfeng Yang
- The YOLO Approach to Resilient Cyber-Physical Systems, ONR, 08/2015 - 07/2018    PIs: Simha Sethumadhavan, Junfeng Yang
- Efficiently, Effectively Detecting Mobile App Bugs with AppDoctor, Google, 02/2014 - 01/2015    PIs: Junfeng Yang
- Research Experiences for Undergraduates (REU) for LOOM: a Language and System for Bypassing and Diagnosing Concurrency Errors, NSF CNS-1340511, 06/2013 - 09/2013    PIs: Junfeng Yang
- Research Experiences for Undergraduates (REU) for Guanyin: a Thousand hands with a Thousand eyes for Distributed Software Checking, NSF CNS-1340506, 06/2013 - 09/2013    PIs: Junfeng Yang
- SHF: Medium: RacePro: Automatically Detecting API Races in Deployed Systems, NSF CCF-1162021, 09/2012 - 08/2016    PIs: Jason Nieh, Junfeng Yang
- Sloan Research Fellowship, Sloan Foundation, 2012 - 2016    PIs: Junfeng Yang
- Concurrency Attacks and Defenses, AFOSR YIP, 07/2012 - 06/2016    PIs: Junfeng Yang
- Transparently Extending Programs at Compilation to Prevent Bugs, ONR N00014-12-1-0166, 07/2012 - 06/2016    PIs: Junfeng Yang, Angelos Keromytis
- MEERKATS: Maintaining EnterprisE Resiliency via Kaleidoscopic Adaptation and Transformation of Software Services, DARPA MRC, 09/2011 - 01/2016    PIs: Angelos Keromytis, Roxana Geambasu, Junfeng Yang, Simha Sethumadhavan, Sal Stolfo  (Columbia as the lead institution, with GMU and Symantec)
- LOOM: a Language and System for Bypassing and Diagnosing Concurrency Errors, NSF CNS-1117805, 09/2011 - 08/2014    PIs: Junfeng Yang
- CAREER: Making Threads More Deterministic by Memoizing Schedules, NSF CAREER CNS-1054906, 02/2011 - 01/2017    PIs: Junfeng Yang
- SPARCHS: Symbiotic, Polymorphic, Autonomic, Resilient, Clean-slate, Host Security, DARPA CRASH, 10/2010 - 08/2015    PIs: Simha Sethumadhavan, Sal Stolfo, Angelos Keromytis, Junfeng Yang, David August
- SemGrep: a System for Improving Software Reliability through Semantic Similarity Bug Search, NSF CNS-1012633, 07/2010 - 06/2013    PIs: Junfeng Yang, Angelos Keromytis, Dawson Engler
- MINESTRONE, IARPA, 08/2010 - 01/2015    PIs: Angelos Keromytis, Junfeng Yang, Sal Stolfo  (Columbia as the lead institution, with Dawson Engler @ Stanford University, Anup Ghosh, Angelos Stavrou, and Michael Locasto @ George Mason University, and Marc Dacier, Matthew Elder, and Darrell Kienzle @ Symantec Corp)
- Guanyin: a Thousand hands with a Thousand eyes for Distributed Software Checking, NSF CNS-0905246, 09/2009 - 08/2014    PIs: Junfeng Yang, Gail Kaiser, Jason Nieh