![]() |
Junfeng YangAssistant ProfessorDepartment of Computer Science Columbia University 1214 Amsterdam Ave. 460 CSB Mail Code 0401 New York, NY 10027 Lab: 487 CSD Phone: (212) 939-7012 Fax: (212) 666-0140 [ News | Awards | Publications | Projects] [ People | Press | Teaching | Support ] |
My research centers on making reliable and secure systems (one-page research statement). Some of my current projects involve efficient and reliable multithreading, tools for the cloud, and operating systems support for reliability. After getting a B.S. from Tsinghua University, I earned my PhD in Dawson Engler's group at Stanford, where I created eXplode, a general, lightweight system for effectively finding storage system errors. This work has led to an OSDI best paper award, numerous bug fixes to real systems such as the Linux kernel, and a featured article in Linux Weekly news. In 2008, I worked at Microsoft Research Silicon Valley, extending eXplode to check production distributed systems. MoDist, the resultant system, is being transferred to Microsoft product groups. I'm now co-directing the Software Systems Lab (SSL) at Columbia University, where my recent work on reliable multithreading was featured in sites such as ACM Tech News. I won the Sloan Research Fellowship and the AFOSR YIP award, both in 2012; and the NSF CAREER award in 2011.
I'm looking for a few graduate students, postdocs, and undergraduate interns. If you know how to build systems/tools, we should talk.
Columbia undergraduate and master students: the above applies to you, too. Also, take a look at some projects available for academic credits.
Select Awards
- Sloan research fellowship, 2012
- Air Force Office of Scientific Research Young Investigator Program Award (AFOSR YIP), 2012
- NSF CAREER, 2011
- OSDI best paper, 2004
Draft Papers
Some of the most recent papers we've been working on:
-
Make Parallel Programs Reliable with Stable Multithreading
 [bib]
Columbia University Technical Report CUCS-006-13Summarizes our recent research on stable multithreading systems that dramatically simplify the interleaving behaviors of threads, addressing fundamental, open problems to improve the reliability of multithreaded programs. -
Effective Dynamic Detection of Alias Analysis Errors
 [bib]
Columbia University Technical Report CUCS-003-13Describes NeonGoby, a system for effectively detecting errors in alias analysis, one of the most important and widely used program analyses
Select Publications
-
Verifying Systems Rules Using Rule-Directed Symbolic Execution
 [bib]
Eighteenth International Conference on Architecture Support for Programming Languages and Operating Systems (ASPLOS '13), 2013
Describes Woodpecker, a system that leverages path slicing to speed up symbolic execution. It enables users to check systems rules, and avoids checking program paths irrelevant to the rule, drastically reducing the amount of redundant work. -
Sound and Precise Analysis of Parallel Programs through Schedule Specialization
 [bib]
Proceedings of the ACM SIGPLAN 2012 Conference on Programming Language Design and Implementation (PLDI '12), June, 2012
Describes a program analysis framework for analyzing multithreaded programs with high precision. The key idea is to statically analyze a multithreaded program toward only a small set of schedules to improve precision, then enforce these schedules at runtime for soundness. -
Efficient Deterministic Multithreading through Schedule Relaxation
 [bib]
Proceedings of the 23rd ACM Symposium on Operating Systems Principles (SOSP '11), October, 2011
Describes Peregrine, a system for efficiently making threads deterministic, addressing a key open challenge within the field of deterministic execution. -
Pervasive Detection of Process Races in Deployed Systems
 [bib]
Proceedings of the 23rd ACM Symposium on Operating Systems Principles (SOSP '11), October, 2011
Describes RacePro, a system for finding process races (e.g., multiple processes accessing a shared resource such as a file without proper synchronization). -
Practical Software Model Checking via Dynamic Interface Reduction
 [bib]
Proceedings of the 23rd ACM Symposium on Operating Systems Principles (SOSP '11), October, 2011
Our latest model checking paper. It describes a new reduction technique that decomposes a full distributed system into components and then explores the executions of these components in a divide-and-conquer way. -
Stable Deterministic Multithreading through Schedule Memoization
 [bib]
Proceedings of the Ninth Symposium on Operating Systems Design and Implementation (OSDI '10), October, 2010
Describes Tern, a system for making threads more deterministic and stable. The key idea is to memoize past schedules and reuse them when possible, much like the natural tendencies in animals and humans to repeat familiar routes to avoid possible hazards along unknown routes. -
Bypassing Races in Live Applications with Execution Filters
 [bib]
Proceedings of the Ninth Symposium on Operating Systems Design and Implementation (OSDI '10), October, 2010
Describes LOOM, a live-workaround system for fixing races in live applications. LOOM is safe (live-update will not introduce new errors), fast (negligible overhead for most benchmarks), and flexible (able to fix all bugs evaluated). -
MODIST: Transparent Model Checking of Unmodified Distributed Systems
 [talk | bib]
Proceedings of the Sixth Symposium on Networked Systems Design and Implementation (NSDI '09), April, 2009
Describes how we applied our in-situ model checking approach to find 10 protocol-level errors in three real distributed systems, including a production system that has been managing more than 100 thousand machines for over two years. Note this version has two minor calculation errors fixed. -
EXPLODE: a Lightweight, General System for Finding Serious Storage System Errors
 [talk | bib]
Proceedings of the Seventh Symposium on Operating Systems Design and Implementation (OSDI '06), November, 2006
Describes our in-situ model checking approach, which made it easy to thoroughly check real systems. We applied eXplode to 17 storage systems and found serious data-loss errors in every system checked. This paper is my favorite in describing our model checking approach. -
Using Model Checking to Find Serious File System Errors
 [talk | bib]  (Best paper award)
Proceedings of the Sixth Symposium on Operating Systems Design and Implementation (OSDI '04), December, 2004
Describes how we leveraged model checking to find serious errors in three widely used, well tested Linux file system errors. Some of these errors can vaporize your entire file systems.
People
I'm fortunate to work or have worked with these brilliant people.
- Chuliang Weng, Visiting Associate Research Scientist
- Jingyue Wu, PhD student
- Heming Cui, PhD student
- Yang Tang, PhD student
- Gang Hu, PhD student
- Xinhao Yuan, PhD student
- Oren Laadan, Postdoc 2011, founded Cellrox
- John Gallagher, MS 2011, went to FourSquare
- Chia-che Tsai, MS 2011, went to Stony Brook for PhD
- Neetha Maria Sebastian, MS 2011, went to Google
- Yunling Wang, MS 2010, went to Microsoft
- Ben Warfield, MS 2010, went to Wireless Generation
- Nathan Murith, MS 2010
- Maoliang Huang, MS 2010, went to FlexTrade Systems
- Patrick Huang, MS 2009, went to Sony
I co-advise some students in the SSL lab.
Articles and Discussions about Research
Peregrine: CACM,
ACM Tech News,
The Register,
Columbia Engineering,
TG Daily,
Physorg.com
MoDist: Softpedia
eXplode: Linux Weekly News
Static analysis: Linux Kernel Mailing List
Teaching
Fall 2012 -- E6121: Reliable Software
Spring 2012 -- W4118: Operating Systems I
Fall 2011 -- E6121: Reliable Software
Spring 2011 -- W4118: Operating Systems I
Fall 2010 -- E6998-1: Reliable Software
Spring 2010 -- W4118: Operating Systems I
Fall 2009 -- E6998-1: Reliable Software
Spring 2009 -- W4118: Operating Systems I
Fall 2008 -- E6998-2: How to
Make Reliable Software
Support for Research
- SHF: Medium: RacePro: Automatically Detecting API Races in Deployed Systems, NSF CCF-1162021, 09/2012 - 08/2016    PIs: Jason Nieh, Junfeng Yang
- Sloan Research Fellowship, Sloan Foundation, 2012 - 2014    PIs: Junfeng Yang
- Concurrency Attacks and Defenses, AFOSR YIP, 07/2012 - 06/2015    PIs: Junfeng Yang
- Transparently Extending Programs at Compilation to Prevent Bugs, ONR N00014-12-1-0166, 07/2012 - 06/2015    PIs: Junfeng Yang, Angelos Keromytis
- MEERKATS: Maintaining EnterprisE Resiliency via Kaleidoscopic Adaptation and Transformation of Software Services, DARPA MRC, 09/2011 - 09/2015    PIs: Angelos Keromytis, Roxana Geambasu, Junfeng Yang, Simha Sethumadhavan, Sal Stolfo  (Columbia as the lead institution, with GMU and Symantec)
- LOOM: a Language and System for Bypassing and Diagnosing Concurrency Errors, NSF CNS-1117805, 09/2011 - 08/2013    PIs: Junfeng Yang
- CAREER: Making Threads More Deterministic by Memoizing Schedules, NSF CAREER CNS-1054906, 02/2011 - 01/2016    PIs: Junfeng Yang
- SPARCHS: Symbiotic, Polymorphic, Autotomic, Resilient, Clean-slate, Host Security, DARPA CRASH, 10/2010 - 09/2014    PIs: Simha Sethumadhavan, Sal Stolfo, Angelos Keromytis, Junfeng Yang, David August
- SemGrep: a System for Improving Software Reliability through Semantic Similarity Bug Search, NSF CNS-1012633, 07/2010 - 06/2012    PIs: Junfeng Yang, Angelos Keromytis, Dawson Engler
- MINESTRONE, IARPA, 08/2010 - 07/2014    PIs: Angelos Keromytis, Junfeng Yang, Sal Stolfo  (Columbia as the lead institution, with Dawson Engler @ Stanford University, Anup Ghosh, Angelos Stavrou, and Michael Locasto @ George Mason University, and Marc Dacier, Matthew Elder, and Darrell Kienzle @ Symantec Corp)
- Guanyin: a Thousand hands with a Thousand eyes for Distributed Software Checking, NSF CNS-0905246, 09/2009 - 08/2013    PIs: Junfeng Yang, Gail Kaiser, Jason Nieh
