Operating Systems II

OPERATING SYSTEMS II	COMS E6118, Dept of Computer Science, Columbia University


INDIVIDUAL AND TEAM PROJECTS
This course has two projects, an individual mini-project and a team final project. Further details regarding the individual mini-project are available here. The remainder of this section discusses the final project.
The goal of the final project is to provide the opportunity for you to investigate in depth a particular area of operating systems and conduct operating systems research. The size of the project can vary, but thinking of it as a Usenix conference paper is a reasonable model. Final projects are to be undertaken in teams of two students. If you have a project that you feel warrants a third participant, please check with me first. A list of students in the class and their interests is available. Feel free to use it to find your final project partners.
For this project, you need to pose a question, design a framework in which to answer the question, conduct the research, and write up your experience and results. There will be four deliverables for this project which will count toward your final project grade: a project proposal and research plan (20%), an extended abstract (20%), an in-class presentation (20%), and a final report (40%). Further information about the deliverables is available here.
The topic of the final project is largely up to you, though some project topics are suggested below. You need not pick your final project from this list, but if you decide on a project not on this list, please check with me before fully committing to the project. Some guidelines for choosing a project are: (1) the work can be completed in less than three months, (2) we have the required hardware and software in-house to enable you to conduct the necessary research, (3) the research project has something to do with operating systems, (4) the project is structured in such a way that you can have quantitative results, and (5) you will learn something from doing this project. Thanks to Margo Seltzer for a number of the ideas on this page.
Project Suggestions

Disk Space Management: Proportional share resource management has been proposed for CPU scheduling. Can proportional share resource management be applied to disk space management in an intelligent manner that is more effective than standard disk quotas? A key issue that needs to be addressed is how to reclaim disk space from users that end up using more than their proportional share of disk space. Design and implement an algorithm for proportional share disk space management.
Process Checkpoint/Restart in Linux: A process checkpoint/restart mechanism allows a user to stop a process, save its state to disk, copy that state to another machine, and restart that process on the new machine. Build a Linux device driver that implements a process checkpoint/restart mechanism. Be careful to state your assumptions, if any, about the nature of processes that can be checkpointed and restarted using your mechanism.
Shrinking Linux: An installation of RedHat Linux takes up several hundred megabytes of disk space, even without installing the window system. Can you develop a version of Linux that compiles itself and fits on a floppy disk or two? What is the minimum space requirement of a Linux system that can compile itself? You should consider removing functionality from Linux (virtual memory, support for 13 different file systems, etc.) as part of this project.
Queueing of I/O Requests: Approximately ten or fifteen years ago, Kirk McKusick conducted a study of existing disk drives and determined that the operating system could do a better job of scheduling disk operations than the drives themselves. That is, the system achieved better I/O throughput when requests were issued to the disk one at a time and the operating system controlled the ordering (as opposed to sending a large number of requests to the disk and allowing it to do the ordering). Does this result hold with today's devices? Given that SCSI hides much of the disk geometry and that the processors on disks are more intelligent, one might hypothesize that this is no longer true. Prove or refute this hypothesis.
Power Management: There has been much research on managing power consumption to extend the battery life of laptop computers. Compare the power management performance of Linux and Windows NT. Which operating system does a better job? Survey the literature and implement a power management technique in Linux to improve its power management performance.
Stackable File Systems under Windows NT: NT has a different file system interface than Unix's vnode interface. NT's I/O subsystem defines its file system interface. NT Filter Drivers are optional software modules that can be inserted above or below existing file systems. The task of these Filter Drivers is to intercept and possibly extend file system functionality. One example of an NT filter driver is its virus signature detector. It is therefore possible to emulate file system stacking under NT, the way it is done in BSD4.4 or through other work such as the Wrapfs stackable templates (Solaris, Linux, and FreeBSD). Create a stackable template layer for Windows NT, and use it to implement a transparent encryption file system. Existing stackable templates (Wrapfs, Cryptfs) for Unix systems will be provided for this project.
Decomposition of Functionality for Distributed Systems: The client/server model is used for many distributed applications and systems. An interesting question is how does one decide on the decomposition of application functionality between the client and the server? Is the goal to minimize network bandwidth because networks are slow, or maximize client computation because desktop systems are fast, or something else? For instance, thin client systems such as VNC move all computations to the server and just send pixel updates to the client. As a result, clients do not have much computing power but networks need to have more bandwidth and less latency for the system to run well. Select a client/server distributed system or application and examine how the resource requirements of the system change depending on how the functionality is partitioned.
Middleware Resource Management: Management of machine resources such as CPU, memory, disk, etc. is traditionally done within the operating system. However, this typically requires operating system kernel source code modifications, which may be difficult to deploy. An alternative is to implement resource management functionality at the middleware level in user space outside of the kernel. A device driver may also be implemented to provide access to lower level hardware information, such as interrupts. Design a proportional share middleware resource management framework that does as much resource management as possible outside of the kernel.
Distributed Middleware Management: Suppose you have a set of applications and a set of machines that you can use to run them. What is the best way to allocate the applications to the machines to make the best use of available machine resources? Design a distributed middleware framework that can assign applications to distributed machines to provide good load balancing behavior across the machines. As a further step, enable this framework to function effectively across a heterogeneous set of machines with different operating systems.
x86 Application-specific Kernel: Commercial operating systems often provide lots of functionality that goes unused when the system in question is just being used as a dedicated server for a single application. If the operating system were designed with the application in mind instead of the other way around, the operating system could be much simpler and deliver much better performance for the given application. Choose a performance-sensitive application (e.g. web server, streaming video, etc.) and write an x86 operating system from scratch with a basic command interpreter that is tailored to the needs of the application.
Adaptive QoS with OS Support for Multimedia Applications: Multimedia applications are often able to adapt their behavior to make the best use of available resources. For instance, a video application might reduce its frame rate or show a lower resolution image if it is not possible to show a high resolution video sequence at normal frame rate due to insufficient computing resources. Ideally, these applications would adapt in such a way as to maximize the quality of their results, but how to decide what is better quality for an end user is an open problem. If we knew the right way to define quality, better OS mechanisms could be designed to cooperate with applications in managing resources. Choose a multimedia application and a meaningful quality metric for that application. Measure the quality of its results for various adaptation mechanisms. Develop an OS mechanism that will help the application choose the right adaptation based on available resources.
OS Benchmarking: Operating system performance can be measured in many different ways. One can measure its interactive performance, how well it supports a web server, how reliable the system is, how well databases run on it, etc. Conduct an exhaustive survey of operating system benchmarking techniques and discuss the tradeoffs of the different approaches. What aspects of operating system performance are hard to measure? Select a representative set of these techniques and benchmark two operating system platforms. Discuss your results.
OS Scalability: Future data centers will provide computational services for thousands of users. How well do existing operating systems scale in meeting the demands of large numbers of activities? For instance, does scheduling overhead grow linearly or worse so that systems that are heavily loaded waste time in the operating system exactly when they don't have the time to spare? Compare the scalability of two operating systems by measuring their performance under heavy workloads.
Real-time Performance: A number of commercial operating systems (Linux, NT, Solaris) claim to provide support for real-time applications. Measure and compare the real-time performance of two operating systems. For instance, if a process has the highest priority in the system, how does its dispatch latency change as the number of other processes running in the system increases?
Proportional Share Linux: Design and implement a complete proportional share resource management system for Linux, including management of CPU, memory, locks, and network resources. Enable your system to assign shares to groups of processes as well as individual processes. The share assignment mechanisms and policies should provide similar functionality to the ticket mechanisms and policies of Lottery scheduling.
Real-time Multiprocessor Scheduling: Much work has been done on uniprocessor scheduling for real-time processes, but hardware vendors are moving toward multiprocessor platforms and the work in multiprocessor scheduling is more limited. Quantitatively evaluate the effectiveness of Solaris running on a 4-processor multiprocessor in meeting the time constraints of real-time processes.
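Several of the suggestions above (Disk Space Management, Middleware Resource Management, Proportional Share Linux) build on proportional share allocation, and the Proportional Share Linux suggestion refers to the ticket mechanisms of Lottery scheduling. As a rough illustration only, the following Python sketch shows the basic idea of a ticket lottery: each client holds some number of tickets, a winning ticket is drawn at random, and over many draws each client's share of wins tends toward its share of tickets. The function names, the client names, and the ticket counts are hypothetical and chosen for illustration; this is not a kernel implementation and is not meant as the expected design for any project.

```python
import random
from collections import Counter


def lottery_pick(tickets, rng):
    """Return one client, chosen with probability proportional to its tickets.

    `tickets` maps a client name to a non-negative integer ticket count.
    (Hypothetical interface, for illustration only.)
    """
    total = sum(tickets.values())
    if total <= 0:
        raise ValueError("at least one client must hold a ticket")
    winner = rng.randrange(total)  # winning ticket number in [0, total)
    # Walk the clients, treating each as owning a contiguous range of tickets.
    for client, count in tickets.items():
        if winner < count:
            return client
        winner -= count
    raise AssertionError("unreachable: winner always falls in some range")


def simulate(tickets, rounds, seed=0):
    """Hold `rounds` lotteries and count how often each client wins."""
    rng = random.Random(seed)
    return Counter(lottery_pick(tickets, rng) for _ in range(rounds))


if __name__ == "__main__":
    # Assumed example: client A holds three times as many tickets as client B.
    wins = simulate({"A": 3, "B": 1}, rounds=10000)
    for client, count in sorted(wins.items()):
        print(client, count / 10000)
```

A real system would also have to decide what a "win" grants (a time quantum, a disk block, a page frame) and how to handle clients that do not use what they are given, which is where most of the design work in these projects lies.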

Jason Nieh, nieh@cs.columbia.edu