ABSTRACT:
Machine learning is an interdisciplinary field which seeks to develop
both the mathematical foundations and practical applications of
systems that learn, reason and act. Machine learning draws from many
fields, ranging from Computer Science, to Engineering, Psychology,
Neuroscience, and Statistics. Because uncertainty, data, and inference
play a fundamental role in the design of systems that learn,
statistical methods have recently emerged as one of the key components
of the field of machine learning.
In particular, Bayesian methods,
based on the work of Reverend Thomas Bayes in the 1700s, describe how
probabilities can be used to represent the degrees of belief of a
rational agent. Bayesian methods work best when they are applied to
models that are flexible enough to capture to complexity of real-world
data. Recent work on non-parametric Bayesian methods provides this
flexibility. I will touch upon key developments in the field, including
Gaussian processes, Dirichlet processes, and the Indian buffet process
(IBP). Focusing on the IBP, I will describe how this can be used in a
number of applications such as collaborative filtering,
bioinformatics, cognitive modelling, independent components analysis,
and causal discovery. Finally, I will outline the main challenges in
the field: how to develop new models, new fast inference algorithms,
and compelling applications.
BIOGRAPHY:
Zoubin Ghahramani is Professor of Information Engineering at the
University of Cambridge, UK, and is also Associate Research
Professor of Machine Learning at Carnegie Mellon University, USA. He
obtained BA and BSE degrees from University of Pennsylvania, and a
PhD in 1995 from MIT working with Prof Mike Jordan. He did a postdoc
in Computer Science at University of Toronto working with Prof Geoff
Hinton. His work has included research on human sensorimotor
control, cognitive science, statistics, and machine learning. His
current focus is on Bayesian approaches to statistical machine
learning, and has recently started a research programme on machine
learning applications to information retrieval. He has published
over 100 peer reviewed papers, and serves on the editorial boards of
several leading journals in the field, including JMLR, JAIR, Annals
of Statistics, Machine Learning, and Bayesian Analysis. He is
Associate Editor in Chief of IEEE Transactions on Pattern Analysis
and Machine Intelligence, currently the IEEE's highest impact
journal. He also serves on the Board of the International Machine
Learning Society, and was Program Chair of the 2007 International
Machine Learning Conference.
ABSTRACT:
Miniaturization and Moore's Law have enabled us to combine sensing, computation and wireless communication in integrated, low-power devices, and to embed networks of these devices in the physical world. By placing sensing devices up close to the physical phenomena, we are now able to study details in space and time that were previously unobservable. Looking back over the past few years, we have made significant progress toward the vision of programmable, multi-modal, multi-scale observatories. We have made our greatest strides in these applications using: judicious application of server-side and in situ processing, mobility at multiple scales, and multi-scale data and models as context for in situ measurements. We are now applying these lessons learned and technical approaches to human as well as natural systems, in particular by exploring use of the installed base of image, location, and acoustic sensors that we all carry around in our pockets or on our belts---mobile phones. In this talk I will draw upon experiences with pilots and prototypes at CENS.
BIOGRAPHY:
Deborah Estrin (Ph.D. MIT, 1985; BSEE UCB, 1980) is a Professor of Computer Science at UCLA, holds the Jon Postel Chair in Computer Networks, and is Founding Director of the NSF-funded Center for Embedded Networked Sensing (CENS). CENS' mission is to explore and develop innovative, end-to-end, distributed sensing systems, from ecosystems to human systems. Estrin's earlier work addressed Internet protocol design and scaling, in particular, inter-domain and multicast routing. Since the late 90's her work has focused on multi-disciplinary, experimental-systems research as applied to a variety of environmental monitoring challenges. Most recently this work includes participatory-sensing systems, at the personal and community level, leveraging the location, acoustic, image, and attached-sensor data streams increasingly available from mobile phones.
Estrin chaired the 1998 ISAT study on sensor networks and the 2001 NRC study on Networked Embedded Computing which produced the report Embedded Everywhere. She served as a founding member of the National Ecological Observatory Network (NEON) Advisory board, and is currently a member of the NRC Computer Science and Telecommunications Board (CSTB), and TTI/Vanguard. Estrin was selected as the first ACM-W Athena Lecturer in 2006, was awarded the Anita Borg Institute's Women of Vision Award for Innovation in 2007, and was elected as a member of the American Academy of Arts and Sciences in 2007.
ABSTRACT:
A significant cause of the unreliability of end-to-end communications on
the Internet is route instability: dynamic changes in routers' selected
paths. Instability is becoming even more problematic due to the
increasing prevalence of real-time applications and concerns about the
scalability of the Internet routing architecture. Yet Route Flap
Damping, the main mechanism for combating instability, has introduced
unexpected pathologies and reduced availability.
This talk describes a more principled approach to stabilizing Internet
routing. We identify general approaches to achieve stability, and
quantify their inherent tradeoffs with other objectives via upper and
lower bounds. I will describe Stable Route Selection (StaRS), a new
approach which uses flexibility in route selection to improve stability
without sacrificing availability. Simulation and experimental results
show that StaRS improves stability and end-to-end reliability while
deviating only slightly from preferred routes, and closely approaching
our theoretical lower bound. These results indicate that StaRS is a
promising, easily deployable way to safely stabilize Internet routing.
StaRS's stability improvements are enabled by dramatic heterogeneity in
route failure patterns. I will present the case that StaRS is an
instance of a much more general principle: that heterogeneity ---
variation in reliability, processing speed, bandwidth, or other metrics
--- should quite often be viewed as an advantage. This thesis is
supported by practical and theoretical results in a variety of settings
including distributed hash tables, overlay multicast, and job
scheduling.
BIOGRAPHY:
Brighten Godfrey's research concerns distributed and networked systems,
including Internet routing architecture, distributed algorithms,
analysis of networks, peer-to-peer systems and overlay networks. He is
presently a Ph.D. candidate advised by Ion Stoica at UC Berkeley.
ABSTRACT:
Driven by the hardware shift to multi-core architectures, concurrency is
being brought into mainstream software development. Unfortunately,
concurrent programs are prone to concurrency bugs, because of the
inherent complexity of concurrency and the sequential thinking habits of
programmers. Concurrency bugs' non- deterministic property also brings a
lot of trouble to developers. Improving the reliability of concurrent
programs is a critical and urgent task.
This talk presents recent work on understanding, detecting, and exposing
concurrency bugs. I will first present two techniques that detect
concurrency bugs from the angle of programmers' synchronization
intention. The first one is AVIO (technology transfer to Intel). AVIO
automatically infers programmers' atomicity intentions from correct
execution. It detects violations to the inferred intentions and reports
concurrency bugs. The second technique is MUVI. MUVI automatically
infers variable correlation relationship from the source code and
detects unsynchronized concurrent accesses to correlated variables.
During this talk, I will also briefly discuss some findings from our
characteristics study of real-world concurrency bugs. These findings
have motivated the above bug detection work and inspired us to improve
concurrent program testing.
BIOGRAPHY:
Shan Lu is a Ph.D. candidate in the Department of Computer Science at
the University of Illinois at Urbana-Champaign. Her research areas are
software systems and computer architecture. She is interested in the
software reliability problem, with focus on concurrent software systems.
She received the W.J. Poppelbaum Memorial Award from University of
Illinois in 2007 for research and academic accomplishments. Her recent
work on concurrency bug detection is under technology transfer to Intel
and was selected into IEEE Micro Top Picks 2006.
ABSTRACT:
Today, computer security resembles an arms race: the bad guys constantly
find new ways to break in, and being safe requires staying one step
ahead of them in cutting off avenues of attack. This strategy is simply
too risky and too expensive in the long run. In this talk, I will argue
that we need to address security at a much more fundamental level, and I
will show how re-designing operating systems, network protocols, and
hardware can provide a solid foundation for building applications in a
way that eliminates or radically reduces vulnerabilities.
Much of the challenge in building secure applications stems from the
fact that real systems are constantly evolving, and that most
programmers are not security-conscious, resulting in code rife with bugs
that cause security vulnerabilities. Instead of trying to fix all code,
this talk will argue that we should protect data, by controlling how it
can move through the system. The key insight is that data protection
cuts across layers: any piece of data in an application can also be
viewed as memory or files by the OS, or as physical pages by the
hardware. Consequently, even data in buggy applications can be protected
by the OS or by hardware, despite the latter two being at a much lower
level of abstraction.
In particular, I will first describe how a low-level information flow
control mechanism can be provided by a small OS kernel, hardware, or
network protocol, and then show how the same mechanism can be used
throughout the system to enforce security policies ranging from those
traditionally found in Unix to those that can ensure the privacy of user
data in a web server built from largely untrusted code.
BIOGRAPHY:
Nickolai Zeldovich is a postdoc at Stanford University, where he
recently received his Ph.D. Previously he received M.Eng. and S.B.
degrees from MIT. His research interests are in security, operating
systems, and networking.
ABSTRACT:
Over the years Moore's Law has provided exponential growth in the raw
computational resources available to hardware architects. At the same
time, however, chip fabrication costs have also skyrocketed, resulting
in expenses that relatively few institutions can afford. As part of
my dissertation, I have proposed "brick and mortar" chips to mitigate
these high fabrication costs while offering the performance and
integration of a modern ASIC. This work includes several
architectural design choices and innovations, including a polymorphic
on-chip network design.
I will demonstrate multiple modes of network
customization that this design allows, including topology and
buffering resources. This single polymorphic network fabric can be
configured to mimic the topology and performance of optimally designed
application-specific networks with no appreciable overhead. This
network is not only a critical enabler of brick and mortar-based
designs, but has broad applicability to any chip requiring an on-chip
network. When used with brick and mortar, however, low-cost,
high-performance custom chips can become a reality.
BIOGRAPHY:
Martha Mercaldi Kim is a Ph.D. candidate in Computer Science and
Engineering at the University of Washington. Prior to moving to
Seattle she earned her B.A. in Computer Science at Harvard University
and a M. Eng. in Embedded Systems Design from the University of Lugano
in Lugano, Switzerland. Her research interests are in computer
architecture, particularly its boundaries: in the hardware/ software
interaction, as well as the architecture/circuit interaction. As a
graduate student she was a member of the WaveScalar research group and
developed an architecture to enable reduced-cost chip designs.
ABSTRACT:
Building distributed systems is particularly difficult because of the
asynchronous, heterogeneous, and failure-prone environment where these
systems must run. This asynchrony makes verifying the correctness of
systems implementations even more challenging. Tools for building
distributed systems must strike a compromise between reducing programmer
effort and increasing system efficiency. Mace is a C++ language
extension, compiler, runtime, and toolset, that translates a concise but
expressive distributed system specification into a C++ implementation.
Mace exploits a natural decomposition of distributed systems into a
layered, event-driven state machine.
A key design principle of Mace is
to separate each service algorithm from the implementation mechanics
(serialization, dispatch, synchronization, etc.), debugging code (logging
and property testing), and its utility services (lower-level services
providing a specified interface). Our experience indicates that
precisely because Mace imposes limits on the design structure of
distributed systems, it supports the implementation of a wide variety of
high-level supporting tools, including model checking, simulation, live
debugging, and visualization.
Mace is fully operational, has been in
development for four years, and has been used to build a wide variety of
Internet-ready distributed systems. This talk will describe both the
Mace programming language design and MaceMC, the first model checker
that can find liveness violations in unmodified systems implementations.
ABSTRACT:
With the advent of data centers and "cloud computing", distributed
systems are becoming much larger and far more sophisticated, with
computation spread over thousands of hosts and complex execution
paths. In this talk I will discuss new approaches to securing and
understanding these complex systems.
I will first describe how we can build more robust systems using a new
trusted primitive called Attested Append-Only Memory (A2M). We trade
off assumptions on trusted components for improved Byzantine fault
bounds of safety and liveness. I will then present a way of
characterizing the complexity of general networked systems. I will
describe a metric based on distributed state dependencies, and apply
it to routing and classical distributed systems.
BIOGRAPHY:
Byung-Gon Chun is a postdoctoral researcher at the International
Computer Science Institute, funded by Intel Corporation. He received
his Ph.D. in Computer Science in 2007 from the University of
California at Berkeley. His research interests span distributed
systems and networks with emphasis on fault tolerance, security,
complexity, and system troubleshooting.
ABSTRACT:
Network-enabled mobile devices are quickly becoming ubiquitous in the
lives of ordinary people, but current technologies for providing
ubiquitous global connectivity between these devices still require
experts to set up and manage. Users must allocate and maintain global
domain names in order to connect to their devices globally via DNS,
they must allocate a static IP address and run a home server to use
Mobile IP or set up a virtual private network, they must configure
firewalls to permit desired remote access traffic while filtering
potentially malicious traffic from unknown parties, and so on. This
model of "management by experts" works for organizations with
administrative staff, but is infeasible for most consumers who wish to
set up and manage their own personal networks.
The Unmanaged Internet Architecture (UIA) is a suite of design
principles and experimental protocols that provide robust, efficient
global connectivity among mobile devices while relying for
configuration only on simple, intuitive management concepts. UIA uses
"personal names" rather than traditional global names as handles for
accessing personal devices remotely. Users assign these personal names
via an ad hoc device introduction process requiring no central
allocation. Once assigned, personal names bind securely to the global
identities of their target devices independent of network location.
Each user manages one namespace, shared among all the user's devices
and always available on each device. Users can also name other users
to share resources with trusted acquaintances. Devices with naming
relationships automatically arrange connectivity when possible, both
in ad hoc networks and using global infrastructure when available. We
built a prototype implementation of UIA that demonstrates the utility
and feasibility of these design principles. The prototype includes an
overlay routing layer that leverages the user's social network to
provide robust connectivity in spite of network failures and
asymmetries such as NATs, a new transport protocol implementing a
novel stream abstraction that more effectively supports the highly
parallelized and media-oriented applications demanded on mobile
devices, and a flexible security framework based on proof-carrying
authorization (PCA) that provides "plug-in" interoperability with
existing secure naming and authentication systems.
BIOGRAPHY:
Bryan Ford began his systems research career as an undergraduate in
the Flux group at the University of Utah, where he developed novel
kernel structuring and component reuse techniques. After a break to
join Phobos Inc., a successful networking startup, he returned to
research as a graduate student at MIT, where he has pursued a diverse
array of interests including programming languages, peer-to-peer and
ubiquitous device networking, storage systems, and virtual machines.
ABSTRACT:
Digital information plays an increasingly critical role in scientific
research, military systems and other enterprises, and this trend has
important implications. First, many systems are more and more being
distributed over a global network of datacenters, which is emerging as
an important distributed systems paradigm. Second, storage systems in
these environments must ensure the durability, integrity, and
accessibility of digital data, and do so under potentially turbulent
conditions. For example, in large scale distributed systems, servers
continuously fail; data should remain durable despite constant failure.
Antiquity is a distributed storage system designed for these sorts of
challenging environments. It maintains data securely, consistently, and
with high availability in a dynamic wide-area environment. At the core
of the system is a novel secure log structure that permits Antiquity to
guarantee the integrity of stored data, even under extreme stress. Data
is replicated on multiple servers in a manner that ensures that it can
be retrieved later even when some replicas are inaccessible. Moreover,
unlike prior fault-tolerant systems, the Antiquity fault-tolerance
protocols can handle high levels of node churn, regenerating data on the
fly when necessary to handle faults ranging from server outages to
Byzantine (malicious) attacks.
Further, I will present SMFS, a remote mirroring solution targeted for
settings where high-speed high-latency links connect a pair of
datacenters. SMFS provides strong disaster tolerance guarantees with
asynchronous performance-mirroring response times are more typical of
high-speed LAN setting. Not only does the approach provide reliability
through mirroring, but there are conditions under which it offers
dramatic power savings. Longer term, we see SMFS and Antiquity as two
examples of a family of innovative solutions addressing a range of
demanding problems seen in turbulent, mission-critical, and power
constrained settings.
BIOGRAPHY:
Hakim Weatherspoon is currently a post-doctoral fellow at Cornell
University. His work covers various aspects of information systems,
distributed systems, network systems, and peer-to-peer systems with
focus on fault-tolerance, reliability, security, and performance of
Internet-scale systems. He previously received his PhD from University
of California, Berkeley in Computer science.
ABSTRACT:
Chip multiprocessors (CMPs) are now commonplace and the number of
processors on each machine is growing rapidly. While parallel machines
are becoming ubiquitous, the difficulty of writing parallel programs
remains a significant obstacle to its widespread adoption. In this
talk, I will present two ideas that tackle this problem.
First, I will present a hybrid transactional memory scheme. One major
source of difficulty in writing parallel programs is implementing
correct synchronization between threads. Locks are the most common
synchronization mechanism but have a number of well-known
pitfalls. Transactional memory is an alternative mechanism that makes
parallel programming easier. Early transactional memory proposals were
implemented either entirely in hardware or in software. Both
approaches have their limitations. We proposed the first hybrid
hardware/software transactional memory scheme that approaches the
performance of a hardware scheme in the common case and gracefully
falls back to a software scheme when a transaction exceeds the
capability of the hardware.
Second, I will present an architectural proposal that supports fine
grained parallelism. To harness the parallelism on a CMP, applications
must expose their thread-level parallelism to the hardware. One common
approach to doing this is to decompose a program into parallel "tasks"
and allow an underlying software layer to schedule these tasks to
different threads. Software task scheduling can provide good parallel
performance as long as tasks are large compared to the software
overheads. However, a significant number of RMS applications have
small tasks for which software task schedulers achieve only limited
parallel speedups. This work proposed a hardware technique to
accelerate dynamic task scheduling on scalable CMPs. This proposal has
relatively simple hardware and delivers close to optimal performance
of a number of RMS benchmarks with fine-grained parallelism.
BIOGRAPHY:
Sanjeev Kumar is a Staff Researcher in Microprocessor Technology
Labs at Intel Corporation. Prior to joining Intel in 2002, he received
his Ph.D in Computer Science from Princeton University. The goal of
his research is to make it easier to write correct parallel programs
that achieve the desired performance objectives. His current research
has two components: parallelizing mainstream applications to obtain an
in-depth understanding of such applications, and proposing hardware
and software techniques that enable such applications on mainstream
parallel machines.