Individual written problems are to be done individually. Group
programming problems are to be done in your assigned programming
groups. All homework submissions are to be made via Courseworks.
Refer to the homework policy section on the
class web
site for further details.
Programming problems are to be done in your assigned groups using the VM that has been assigned to your group. For all programming problems you will be required to submit source code, a kernel patch, a README file documenting your files and code, and a test run of your programs. In addition, you should submit a cover sheet using either homework_work.txt or homework_nonwork.txt, depending on whether or not the programming assignment is completely working or not. For source code submissions, you only need to submit new source code files that you created and kernel source code files that you changed. You should clearly indicate your names, email addresses, and assigned group number on your submission. Each group is required to submit one writeup for the programming assignment.
The image of the kernel you build for this assignment should be vmlinuz.hmwk4. Grading for this assignment will be done based on vmlinuz.hmwk4. You should base your changes on the 2.6.11 kernel that you installed for Homework 2, but without the modifications you made for the new systems calls you wrote for Homework 2. In particular, submitted kernel patches should be patched against your modified KDB-enabled kernel from Homework 2, i.e. do not include KDB code in your patch.
CWRR Scheduling
The Linux scheduler repeatedly switches between all the running tasks on the system, attempting to give a fair amount of CPU time to each task. Unfortunately, this means that multi-threaded applications will get more CPU time just because they contain more threads. Similarly, users who run more processes will also get more CPU time. An alternative is to introduce a notion called a container, which contains a group of processes. Fair-share scheduling with containers is a strategy that shares the CPU among sets of processes according to the container that is associated with those processes. For example, suppose Mary, Jane, and Sam are each assigned a separate container and using a machine that uses fair-share scheduling. Suppose Mary has one runnable process, Jane has three, and Sam has six. Fair-share scheduling views the ten runnable processes in three containers, and each container receives one-third of the CPU cycles to allocate to the processes that it owns. So Mary's single process would get about 33 percent of the available CPU cycles, each of Jane's three processes would get roughly 11 percent of the available CPU cycles, and each of Sam's six processes would get about 5.5 percent of the CPU cycles. Suppose Jane created two more processes. Jane would still in total get one-third of the CPU, so that each of her processes would now get about 6.7% of the CPU.
Add a new scheduling policy to support fair-share scheduling. Call this policy CWRR, for container weighted round-robin scheduling. CWRR should introduce a container abstraction and use fair-share scheduling based on each process's assigned container. At each invocation of the scheduler, we will use a hierachical scheme to choose which task to run: first, a container is chosen, then, a task within that container's set is chosen. If we allow each container's chosen task the same amount of time on the CPU, each container should be represented equally.
Since we are making two choices in each scheduling decision, we need two algorithms: one to choose the container, and one to choose one of that container's tasks. Let's start by assuming we will use a Round-Robin scheme to decide which container to choose. We keep a queue of containers, and whichever container was chosen last time we scheduled, we choose the next container in the queue for this schedule. We then choose a task within this container's set of tasks (more on this later), and let it run for a fixed amount of time (the time quantum). Now every container gets an equal amount of CPU time.
Now let's say that some containers are more equal than others. Imagine that we associate an integer, which we'll call a weight, with each container, and when a user is chosen, we let the container's chosen tasks run for W time quanta (instead of a single time quantum), where W is the container's weight. In this way we can specify that some containers get more CPU time than others, and also how much more. A container with a weight of 3 will get 50% more CPU time than a container with weight 2. This is called proportional sharing. More specifically, this implementation of proportional sharing is called Weighted Round-Robin.
We still haven't specified how a task is to be chosen once we choose a container. For simplicity, use a simple round-robin (RR) scheduling algorithm to do that. The intracontainer RR scheduler should use the same default timeslice as the Linux scheduler uses for a task with the default nice value. Otherwise, the RR scheduler should work the same as CWRR except that it schedules tasks, not containers, and there are no different task weights.
(50 pts) Implement a new task scheduler, which we will call Container-based Weighted Round-Robin, or CWRR:
Most of the code of interest for this assignment is
in kernel/sched.c and
include/linux/sched.h. These are probably not
the only files you will need to look at, but they're a
good start. While there is a fair amount of code in these
files, a key goal of this assignment is for you to
understand how to abstract the scheduler code so that you
learn in detail the parts of the scheduler that are
crucial for this assignment and ignore the parts that are
not.
Implement a container abstraction. Your abstraction
should include a container identifier, cid, to identify the
particular instance of a container, much like a pid
identifies a task. For simplicity, make the cid an integer
type. Any set of processes should be assignable to a
container. For example, you should not assume that all
processes associated with a container belong to the same
user. One user may have several containers, and a
container may have processes associated with different
users. You should use sched_setscheduler() to
assign a task to a container, with the cid passed in as
the scheduling parameter. If the cid is -1, a new
container should be allocated and assigned to the task.
If a container no longer contains any tasks, it should be
removed. Similarly, sched_getscheduler()
should return the scheduling policy of the process and
sched_getparam() should return the cid of a
task if it is assigned to a container. Other aspects of
the container abstraction implementation should be guided
by the implementation requirements discussed below.
Implement container weights. Add two system calls,
getcweight() and
setcweight(), with the following
prototypes:
int getcweight(int cid);
int setcweight(int cid, int weight);
Give getcweight() syscall number 223,
and setcweight() number
251. getcweight() should return the weight
of the specified container, or -1 on error. The default container
weight should be 1. setcweight() should
give the specified container the specified weight, and return
0, or -1 on error.
For either call, a cid value of -1 indicates the
current container (that is, the owner of the task that is
making the system call). Otherwise, the range of valid
cids should be 0-65535. A bad cid should result in a
return value of -1 and errno set to
EINVAL. Only root is allowed to call
setcweight(); a call by any other user should
return -1 and set errno to EACCES. Also, a
container's weight must be greater than zero. Any attempt
to set a bad weight is an error, and errno should be set
to EINVAL.
Put your code for these two system calls at the bottom
of kernel/sched.c.
Your scheduler should operate alongside the existing
Linux scheduler. Therefore, you should add a new
scheduling policy, SCHED_CWRR. The value of
SCHED_CWRR should be 4. Only tasks
whose policy is set to SCHED_CWRR (normally
done via the system call
sched_setscheduler()) should be considered
for selection by your new scheduler.
Your scheduler and the existing scheduler should interoperate as follows:
If there are any running tasks whose policy is
SCHED_RR or SCHED_FIFO
(these are real-time tasks), one of them should be
chosen before any other, according to the existing
Linux scheduler rules.
If there are any running tasks whose policy is
SCHED_CWRR (your scheduler), one of them
should be chosen before any SCHED_NORMAL
(the default scheduler) tasks, according to the rules
outlined.
If all running tasks have a policy of
SCHED_NORMAL (or SCHED_BATCH),
the default scheduler should be used to choose a task,
according to the default scheduler's rules.
In other words, there should be a strict priority
relationship where SCHED_RR and
SCHED_FIFO are first,
SCHED_CWRR is second, and
SCHED_NORMAL and SCHED_BATCH are last.
Making your scheduler operate alongside the existing may seem like an unnecessary difficulty, but it spares you the (significant) difficulty of dealing with scheduling during initialization, so you can get a system up and running before any of your own code goes into action.
If a container who has no runnable SCHED_CWRR
tasks has a SCHED_CWRR task become runnable,
that container should go to the end of the round-robin queue
(should be the last container to get a turn).
The base time quantum should be 100 ms. This means
that if a user's container has a weight of 5, she gets 500 ms at a
time with which to run her SCHED_CWRR
tasks. Note that this does not mean that every one of her
SCHED_CWRR tasks runs for 500 ms. How long a
task is allowed to run depends on the intracontainer RR scheduler.
Note that you will need to change the timer interrupt
handler so that schedule gets called at the right
time. See scheduler_tick() in
kernel/sched.c.
The SCHED_CWRR scheduling flag should be
inherited by the child of any forking parent. In
other words, if a parent process that belongs to a
container creates a child process, the child process
should belong to the same container.
Linux uses an array data structure for keeping track of tasks for scheduling. Tasks in this array are indexed by priority. You may find this data structure useful for implementing CWRR. However, your scheduler should be designed to work with any number of containers and should not assume only a fixed number of containers.
Hint: when you see the
need_resched field of the currently running
task get set to 1, this is a signal to some assembly code
on the return path of the interrupt handler. It causes
schedule() to get called just before
returning from the current interrupt handler, and is the
correct way to make a schedule happen from within an
interrupt handler, where you can't call
schedule() directly.
(10 pts) Create a test program that creates some number of processes and assigns different numbers of them to containers with different weights, then measures CPU usage as the processes run to show that your CWRR scheduler behaves properly. Submit your test program source file and a writeup of your results.