Spring 2007


Human Computing: The Rewards of Computer Science in Everyday Life
Elizabeth Mynatt
Georgia Institute of Technology
Wednesday, March 7th, 2007
ABSTRACT: Visions describing the pervasive use of computing technologies by the general population often fail to explain how these technologies made the transition from research laboratories and professional environments to living rooms, coffee shops and handbags. Although many technologies do disappear into the socio-technical infrastructure of everyday life, the path to invisibility often requires a complex cycle of human assimilation, learning, and adaptation with technology evolution before reaching coherent resting points of literacy, adoption and mastery.

In this talk I will discuss three case studies where these processes are visible today and the eventual endpoint for these technologies is far from certain. Consumer medical technologies for chronic healthcare offer the potential of sustaining quality of life and independence while improving treatment outcomes and decreasing healthcare costs. While many of these technologies leverage robust ubiquitous sensing capabilities, challenges in interpretation, motivation and privacy remain. In contrast, the growing literacy and assimilation of gaming technologies provides a unique glimpse into the future of media and a potential path into the workplace. Finally the quandary surrounding the lagging adoption of home networking technologies points to future challenges for the computing community.

BIOGRAPHY: Elizabeth Mynatt is associate professor in the College of Computing and director of the Graphics, Visualization and Usability Center (GVU) at the Georgia Institute of Technology. The center hosts fifty-five faculty drawn from the computer science, psychology, liberal arts, new media design, history of science and technology, engineering, architecture, management, and music. Mynatt played a pivotal role in creating the College of Computing Ph.D. program in Human-Centered Computing, integrating studies in human-computer interaction, learning sciences and technology, cognitive science, artificial intelligence, robotics, software engineering, and information security. In the last decade, Mynatt has directed a research program in ubiquitous computing and technologies adapted to everyday life. With work that began at Xerox PARC and has grown to fruition at Georgia Tech, she examines the pervasive presence of computation in everyday life. Mynatt earned her Bachelor of Science summa cum laude in computer science from North Carolina State University and her Master of Science and Ph.D. in computer science from Georgia Tech.


Query Execution in Column-Oriented Database Systems
Daniel Abadi
Massachusetts Institute of Technology
Wednesday, February 7th, 2007
ABSTRACT: Recent research on column-oriented database systems (DBMSs) has shown that these systems can outperform existing row-oriented DBMSs by one to two orders of magnitude on read-mostly query workloads like those found in data warehouses, decision support, and customer relationship management systems. In this talk, I will discuss this exciting new class of database systems and will provide an overview of the C-Store system that we have developed over the past two years at MIT. I will then focus on the design of the column-oriented query execution engine I have developed. In particular, I will discuss the impact on query performance of tuple construction (stitching together attributes from multiple columns into a row-oriented “tuple”) and operation on compressed data. Tuple construction allows column-oriented DBMSs to offer a standards-compliant relational database interface (e.g., ODBC, JDBC, etc); however, if done at the wrong point in a query plan, a significant performance penalty is paid. Similarly, data compression can improve query performance by an order of magnitude by trading cheap CPU cycles for expensive I/O bandwidth.
Democratizing content distribution
Michael J. Freedman
New York University
Wednesday, February 14th, 2007
ABSTRACT: In order to reach their large audiences, today’s Internet publishers primarily use content distribution networks (CDNs) to deliver content. Yet the architectures of the prevalent commercial systems are tightly bound to centralized control, static deployments, and trusted infrastructure, thus inherently limiting their scope to ensure cost recovery.

This talk describes a number of techniques (and the resulting systems) that realize highly-scalable cooperative content distribution. I show how to build a fully self-organizing CDN that efficiently delivers content using unreliable nodes, describe how to transparently dispatch clients to nearby servers, and touch on how to securely incentivize resource sharing.

These ideas have been implemented, deployed, and tested in production systems currently serving several terabytes of data to more than a million people every day. The view of the Internet gained from deploying these systems answered long-standing questions about network configuration, while providing important insights that helped evolve our systems’ designs.

BIOGRAPHY: Michael J. Freedman is a doctoral student at NYU, visiting Stanford since 2005. He received his M.Eng. and S.B. degrees from MIT. His research interests span distributed systems, security, networking, and cryptography, with a focus both on principled designs and real deployments.
High-Radix Interconnection Networks
John Kim
Stanford University
Monday, February 19th, 2007
ABSTRACT: Over the past twenty years, the pin bandwidth available to a chip has increased by approximately an order of magnitude every five years. This increasing pin bandwidth can be effectively utilized by creating high-radix routers with large number of skinny ports instead of low-radix routers with fat ports. The use of high-radix routers leads to lower cost and better performance but utilizing high-radix routers in the network presents many difficult challenges. In this talk, I will address two issues in high-radix interconnection networks: the design and microarchitecture of a high-radix router and a cost-efficiently high-radix topology.

Conventional microarchitecture does not scale to high radix since the complexity of the allocators in the routers scale quadratically with the radix. I will present a hierarchical router organization that results in a distributed, complexity-effective microarchitecture and maintains high performance. Through collaboration with Cray, this microarchitecture was implemented in the Cray YARC router. In the second part of the talk, I will describe the flattened butterfly: a cost-efficient topology for high-radix network. Compared to a folded-Clos topology, the flattened butterfly provides approximately 2x reduction in cost per performance on balanced traffic while maintaining the same cost per performance on adversarial traffic pattern. I will also show how this topology can be extended to on-chip networks to achieve lower latency and lower energy consumption.

BIOGRAPHY: John Kim is a PhD candidate in the Concurrent VLSI Architecture group at Stanford University, working with Professor Bill Dally. John’s research interest includes computer architecture and interconnection networks. Some of his research in high-radix networks has directly impacted the interconnection networks used in the latest Cray BlackWidow network. John received his B.S. and M.Eng from the Department of Electrical Engineering at Cornell University. He has also worked on the design of several microprocessors at Motorola and Intel.
Temporal Memory Streaming
Thomas Wenisch
Carnegie Mellon University
Monday, February 26th, 2007
ABSTRACT: While semiconductor scaling has steadily improved processor performance, scaling trends in memory technology have favored improving density over access latency. Because of this processor/memory performance gap — often called the memory wall — modern server processors spend over half of execution time stalled on long-latency memory accesses. To improve average memory response time for existing software, architects must design mechanisms that issue memory requests earlier and with greater parallelism. Commercial server applications present a particular challenge for memory system design because their large footprints, complex access patterns, and frequent chains of dependent misses are not amenable to existing approaches for hiding memory latency. Despite their complexity, these applications nonetheless execute repetitive code sequences, which give rise to recurring access sequences — a phenomenon I call temporal correlation. In this talk, I present Temporal Memory Streaming, a memory system design paradigm where hardware mechanisms observe repetitive access sequences at runtime and use recorded sequences to stream data from memory in parallel and in advance of explicit requests.
BIOGRAPHY: Tom Wenisch is completing his Ph.D. in Electrical and Computer Engineering at Carnegie Mellon University, specializing in computer architecture. Tom’s current research includes memory streaming, multiprocessor memory system design and computer system performance evaluation. His future research will focus on multi-core/multiprocessor systems, with particular emphasis on improving system programmability and debuggability.
Multiprocessor Architectures for Programmability
Luis Ceze
University of Illinois at Urbana-Champaign
Wednesday, February 28th, 2007
ABSTRACT: While multiprocessor hardware is finally becoming ubiquitous, enticing many programmers to write parallel programs is going to be very challenging. For this reason, I believe that the main problem that confronts computer architects today is designing computer architectures that help simplify parallel programming.

In this talk I will present two novel computer architecture techniques that help simplify parallel programming. The first one is Bulk — a framework of primitives for manipulating in hardware sets of addresses in bulk. Bulk’s primitives are used as building blocks to support interactions between multiple threads, enabling high-programmability environments such as thread-level speculation, transactional memory, and high-performance sequential memory consistency. The second technique is architectural support for data-centric synchronization. The technique, called Colorama, associates concurrency control constraints with data, providing an attractive alternative to the traditional code-centric approach of locks and transactions. Together, these two techniques offer promising directions in the critical area of novel multiprocessor architectures for programmability.

BIOGRAPHY: Luis Ceze is a PhD candidate in the Department of Computer Science at the University of Illinois Urbana-Champaign (UIUC). His doctoral research has focused on computer architectures to decrease the hardware complexity and improve the programmability of multiprocessor systems. He has co-authored over 20 papers in computer architecture, programming models, and systems. He has also participated in the Blue Gene, Cyclops, and PERCS projects at IBM. He has received awards from UIUC for research and academic accomplishments, and from IBM for contribution to the Blue Gene project.
Deputy: Dependent Types for Safe Systems Software
Jeremy Condit
UC Berkeley
Monday, March 5th, 2007
ABSTRACT: Programming language tools offer powerful mechanisms for improving the safety and reliability of systems code. This talk presents Deputy, a type system and compiler for enforcing type and memory safety in real-world C programs such as Linux device drivers and the Linux kernel itself. Deputy’s type system uses dependent types, a language mechanism that allows programmers to describe common C idioms in an intuitive fashion.

The Deputy project offers contributions to both systems and programming languages. From a systems perspective, Deputy is attractive because it can provide fine-grained safety guarantees in a modular and incremental fashion; for example, Deputy can be used to enforce type and memory safety in Linux device drivers without requiring changes to the kernel. The SafeDrive recovery system for Linux device drivers uses Deputy for isolation and failure detection, and as a result, it is both simpler and faster than previous systems for isolating software extensions. From a language perspective, Deputy shows how to reason about dependent types in imperative code. Deputy has fewer restrictions on mutation than previous systems, and it uses run-time checks and several inference techniques to ensure decidability and usability.

BIOGRAPHY: Jeremy Condit is a graduate student who is currently completing his Ph.D. at the University of California, Berkeley. His research interests focus on using programming language tools and techniques to address the challenges of building large software systems. He received his A.B. summa cum laude from Harvard University, and he received his M.S. from the University of California, Berkeley. He is a recipient of the NSF Graduate Research Fellowship, and he received the Best Paper Award at ETAPS 2005 from the European Association for Programming Languages and Systems.
Improving the performance of highly reliable software systems
Ed Nightingale
University of Michigan
Monday, March 19, 2007
ABSTRACT: Commodity operating systems still retain the design principles developed when processor cycles were scarce and RAM was precious. These out-dated principles have led to performance/functionality trade-offs that are no longer needed or required; I have found that, far from impeding performance, features such as safety, consistency and energy-efficiency can often be added while improving performance over existing systems. I will describe my work developing Speculator, which provides facilities within the operating system kernel to track and propagate causal dependencies. Using Speculator, I will show that distributed and local file systems can provide strong consistency and safety guarantees without the poor performance these guarantees usually entail.
BIOGRAPHY: Ed Nightingale is a Ph.D. candidate at the University of Michigan. His research focuses on experimental software systems, especially operating systems, distributed systems and mobile computing.
COMMSYN: On-chip Communication Architecture Synthesis for Multi-Processor Systems-on-Chip
Sudeep Pasricha
UC Irvine
Wedenesday, March 21, 2007
ABSTRACT: Multi-processor systems-on-chip (MPSoC) are highly sophisticated embedded systems that consist of one or more programmable processors running software, memory modules for data storage and other hardware components connected together via an on-chip communication infrastructure. Such MPSoC architectures represent heterogeneous systems offering flexible parallel processing resources for the implementation of emerging applications in the multimedia, networking, aeronautics, automotive and telecommunication domains. One of the most time-consuming activities in a typical MPSoC design cycle is the exploration and implementation of its on-chip communication architecture, since inter-component communication has a critical impact on system performance, power consumption, reliability and overall cost.

In this talk, I will discuss an innovative framework (COMMSYN) that I have developed to automate the process of exploration and synthesis of on-chip communication architectures for MPSoC architectures. The framework accepts a high-level communication constraint graph of the system as input and generates an enhanced architecture model with hardware/software components interconnected via a well defined communication infrastructure, while satisfying multiple design constraints, including power, performance, cost and area. I will describe in detail how the framework comprehensively synthesizes communication architecture topology and protocol parameters while satisfying multiple designer constraints, and show results of applying COMMSYN on industrial-strength MPSoC examples. I will wrap up with a snapshot of some of the other novel features in COMMSYN which make it state-of-the-art, such as physical implementation awareness during synthesis and support for co-synthesis with the memory architecture.

BIOGRAPHY: Sudeep Pasricha is a Ph.D. candidate affiliated with the Center for Embedded Computer Systems at the University of California, Irvine, working with Prof. Nikil Dutt. Sudeep’s research interests include design automation and CAD for embedded multi-processor systems-on-chip (MPSoC), system-level modeling languages and methodologies, networks-on-chip (NoC), middleware for distributed systems and computer architecture. He has given several tutorials on the topic of on-chip communication architecture design at leading conferences, and his research on the exploration and design of on-chip communication architectures is being actively used in industry, at Fujitsu and Conexant. He received a Best Paper Award Nomination at the Design Automation Conference (DAC) in 2005 and a Best Paper Award at the Asia and South-Pacific Design Automation Conference (ASP-DAC) in 2006.
A Lightweight, General Approach to Finding Serious Storage System Errors
Junfeng Yang
Stanford University
Monday, April 2, 2007
ABSTRACT: Storage systems such as file systems, databases, and RAID systems have a simple, basic contract: you give them data, they do not lose or corrupt it. Unfortunately, their code is exceptionally hard to get right: it must anticipate all failures and crashes, and always correctly recover from them, regardless of the current system state. In this talk, I will describe an approach we invented that makes it easy to systematically check real storage systems for errors. We used a novel adaption from model checking, a comprehensive, heavyweight formal verification technique, that makes our approach systematic while being lightweight. We applied this approach to a broad range of real storage systems, and found serious data-loss bugs in every system checked, typically with little effort.
BIOGRAPHY: Junfeng Yang is a Ph.D. candidate in the Computer Science Department at Stanford. His research interests span the areas of systems, security, software engineering and programming languages, focusing on practically checking large, real-world software systems. Junfeng Yang received his MS in Computer Science from Stanford in 2002, and his BS in Computer Science from Tsinghua University, Beijing, China in 2000. He is a receipt of the Siebel Scholarship, the Stanford School of Engineering fellowship. He received the Best Paper Award of OSDI 2004.
Lightweight Heterogeneous and Reconfigurable Embedded Systems for Medical
Ani Nahapetian
Wednesday, April 4, 2007
ABSTRACT: The proliferation of increasingly lightweight and heterogeneous embedded systems has created significant research challenges that must be addressed before these devices can reach their full potential in enhancing our lives. Specifically with medical applications, these devices have the potential to increase the quantity and the quality of the medical data collected in a much more immediate, but also unobtrusive way. They must, however, leave a small footprint while still being capable of large amounts of computation and adaptability.

In this talk, I will present some dynamic reconfigurability approaches which can provide software level adaptability coupled with hardware speed and low-power execution required by these systems. Secondly, since the devices will be deployed near, on, or even in the body, I will present low power techniques to control heat dissipation and extend system lifetime tailored to the submicron level and/or the multiprocessor realm which are posed to dominate. Finally, I will preview my medical embedded tracking system that harnesses local, efficient sensor data to trigger and enhance the accuracy and efficiency of tracking applications, specifically targeted to medical and nursing applications.

BIOGRAPHY: Ani Nahapetian is a PhD candidate at UCLA. Her research interests lie in embedded systems and CAD, with a special focus on medical embedded systems, dynamic reconfiguration, and low power scheduling. More information can be found at http://www.cs.ucla.edu/~ani.
Looking Beyond Performance: Processors for Time Travel
Satish Narayanasamy
Monday, April 9, 2007
ABSTRACT: The processor industry is at an inflection point. In the past, performance was the driving force behind the processor industry. But in the coming many-core era, improving programmability and reliability of the system will be at least as important as improving raw performance. To meet this vision, I will present a processor feature called BugNet that assists programmers in understanding software failures.

Reproducing software failures is a significant challenge. The problem is severe especially for multi-threaded programs because the causes of failure can be non-deterministic in nature. BugNet continuously logs a program’s execution while sacrificing very little performance (~1%). If the program crashes, the developer can use the log to debug the failure by time traveling to a past instant in failed program’s execution time. I will also talk about two software tools built using the BugNet logging mechanism. One is a data race analysis tool, which is currently used at Microsoft. Another is a tool that enabled Intel to easily port their ASIM architectural simulator to Mac OSX, which was never attempted before because of the effort involved.

BIOGRAPHY: Satish Narayanasamy is a Ph.D. candidate in Computer Science at the University of California, San Diego. His research interests include computer architecture and systems support for software development and system reliability. He has received two IEEE Top Picks awards in recognition of his industry relevant contributions.
From web servers to databases to storage systems: A methodological approach to system design
Bianca Schroeder
Carnegie Mellon University
Monday, April 16, 2007
ABSTRACT: Modern computer systems are complex, and designing systems with good performance and high reliability is a major challenge. In this talk, I will show how a measurement-driven “methodological” approach to system design can create better systems. The approach combines real-world measurements with techniques from statistical workload and failure analysis, user behavior characterization, analytical modeling, performance evaluation, and scheduling and queueing theory to gain insights into system behavior that lead to better design ideas. Specific applications we consider in this talk include:
(*) How to schedule connections in a web server to combat transient overload;
(*) How to provide QoS for database transactions;
(*) How to exploit client behavior patterns to maximize system performance;
(*) How to improve reliability of large-scale clusters and storage systems.
BIOGRAPHY: Bianca Schroeder is currently a postdoctoral researcher in the Computer Science Department at Carnegie Mellon University working with Garth Gibson. She received her doctorate from the Computer Science Department at Carnegie Mellon University under the direction of Mor Harchol-Balter in 2005. She is a two-time winner of the IBM PhD fellowship and her work has won two best paper awards. Her recent work on system reliability has been featured in articles at a number of news sites, including Computerworld, Slashdot, StorageMojo and eWEEK.

Dr. Schroeder’s research focuses on the design and implementation of computer systems. The methods she is using in her work are inspired by a broad array of disciplines, including performance modeling and analysis, workload and fault characterization, machine learning, and scheduling and queueing theory. Her work spans a number of different areas in computer systems, including high-performance computing systems, web servers, computer networks, database systems and storage systems.

Constructing Network Systems to Analyze a Zillion Forms of Ill-Will
Ruoming Pang
Princeton University
Wednesday, April 18, 2007
ABSTRACT: A deep problem in building systems to defend against Internet attacks is the immense range of semantics and activity that such a system must grapple with. This talk presents my work in developing and then applying rich semantic analysis of network traffic in this context—research I pursued in the context of the “Bro” system that watches network traffic in real-time to analyze behavior of systems, applications, and users. I will first develop how semantic traffic analysis plays a central role in understanding the nature of “Internet background radiation”—malicious, blind-targeting traffic generated by worms, botnets, and other malwares. I will then turn to addressing the difficult systems issues that arise when attempting to apply such analysis across numerous and complex application protocols used in today’s Internet. Prior to my work, Bro required extensive hand-coding in C++ for each Internet protocol it analyzed. As a powerful alternative, I developed “binpac”—a language/compiler for generating parsers from abstract declarations of protocol syntax and semantics. Analogous to “yacc”, “binpac” both significantly reduces the complexity of building parsers and yet yields generated parsers that are as efficient as manually coded ones.
BIOGRAPHY: Ruoming Pang is a Ph.D. candidate in Computer Science at Princeton University and a Software Engineer at Google. His interests center around building rich software systems—especially systems for automated analysis of complex, real-world data—and programming language approaches in support of building such systems. Ruoming is one of the primary developers of the Bro network intrusion detection system, and in his copious spare time enjoys programming contests and math puzzles.