W4118 Fall 2000 - Homework 3

Homework 3
W4118 Fall 2000
DUE: Monday, 10/23/2000 at the beginning of class

Submission Instructions:

All non-programming problems in this assignment are to be done by yourself. All group programming problems in this assignment are to be done in your assigned groups. Each of you should turn in one hardcopy for the non-programming problems. Each group should turn in one hardcopy for the programming problems. Both hardcopies should have your name and email address clearly written on the first page.

Programming problems are to be done in your assigned groups using the VM that has been assigned to your group. For all programming problems you will be required to submit source code, a README file documenting your files and code, and a test run of your programs. For source code submissions, you only need to submit new source code files that you created and kernel source code files that you changed. You should clearly indicate your names, email addresses, and assigned group number on your submission. Each group is required to submit one writeup for the programming assignment using the SUBMIT program. Refer to the homework submission page on the class web site for additional submission instructions.

Non-Programming Problems:

Exercise numbers refer to the course textbook. Each problem is worth 4 points unless otherwise indicated.

Exercise 6.1
Exercise 6.3
Exercise 6.4
Exercise 6.6
Exercise 6.8
Exercise 6.10
Exercise 7.2
Exercise 7.5
Exercise 7.10
Exercise 7.12

Programming Problems:

The image of the kernel you build for this assignment should be kern2. Grading for this assignment will be done based on kern2.

Background

The purpose of this assignment is to gain better understanding of a commercial scheduler implementation (the Linux scheduler), to extend this scheduler with new scheduling policies, and to evaluate your implementation.

The basic operation of the Linux scheduler is as follows. When a scheduling decision is to be made, the schedule function (/usr/src/linux/kernel/sched.c) is invoked. Its purpose is to select the next process be allocated the CPU. When reading through the scheduler code for the first time, it is easiest to 'skip' over the multiprocessor (__SMP__) aspects and focus on the flow and the purposeful simplicity of the operations.

Within the schedule function, the current process is termed prev and the default for the next process is the idle (swapper) task. The schedule function traverses the run queue to find the best process to schedule next by starting with the init task. For each process on the run queue, it calcuates a goodness value. As it traverses the run queue, if a task is found with a greater goodness value, the scheduler replaces its pointer to the current next task with this task. When it reaches the end of the list, if unable to find a process with a nonzero 'goodness' value, it recalculates the scheduling state used to determine the goodness of SCHED_OTHER processes, then traverses the list again. Once it selects a task, bookkeeping is performed and then architecture-specific functions are used to actually switch to this next task.

The goodness of a task is calculated differently depending on the scheduling policy. For the real time policies (SCHED_RR and SCHED_FIFO), this goodness is the priority (rt_priority) of the process plus a 'boost' to make sure runnable real-time processes always have more goodness than non-realtime processes. Non-realtime processes are scheduled under the SCHED_OTHER policy. Under SCHED_OTHER, goodness is based primarily on the counter and the priority of the process. The counter is the number of ticks a process will be allowed to run when it is allocated the CPU. If a process was interrupted during it's last allocation of the CPU, the counter value is the number of ticks remaining from the process' last allocation. Note the difference between the priority and the rt_priority of a process.

Now we get to the fun part!

(10 pts.) A proportional share scheduler is one that allocates CPU cycles to processes in proportion to each process's assigned weight. For example, if processes P1 and P2 are both runnable and have assigned weights 3 and 1, a proportional share scheduler will give P1 3 times as many CPU cycles as P2. How can SCHED_OTHER be used to provide proportional share behavior? Write a simple while loop program to test the proportional share behavior of SCHED_OTHER. Your test program's processing loop should not contain I/O operations as they would skew your results. Run 5 different copies of the program concurrently such that the respective processes behave as though they are assigned weights 1, 2, 3, 4, 5.
(10 pts.) Show that the CPU time given to each process is in proportion to its weight. Since utilities such as top and ps will be too coarse grained to collect the information you require, you will need a more suitable mechanism to collect the data. Start by simply inserting printk statements in the scheduler to gauge the accuracy of your policy. Obviously, the number of printk statements should be few as the scheduler is time critical and is called very frequently.
For your final report, use statistics collected by a small program which calls proc_info to retrieve the system and user times of each of the 5 processes at 1 second intervals. You may use your own code for the proc_info system call, or the code posted in the solutions for Homework 2.
(15 pts.) Extend the Linux scheduler with a weighted round-robin scheduler. Like other schedulers, it should have a name. Call it SCHED_WRR. SCHED_WRR should be implemented such that it schedules in O(1) time and uses a time quantum of 1 jiffy for a process with weight 1. You will need to modify the sched_setscheduler system call to allow users to select your new scheduling policy and to assign a weight to a process. Use the same while loop program as before to test your implementation of SCHED_WRR by running 5 different copies of the program concurrently, with respective weights 1, 2, 3, 4, 5.
(5 pts.) As in the case of SCHED_OTHER, show that the CPU time given to each process is in proportion to its weight. For your final report, use statistics collected by a small program which calls proc_info to retrieve the system and user times of each of the 5 processes at 1 second intervals. Based on your measurements, is SCHED_WRR or SCHED_OTHER more accurate in proportionally allocating CPU cycles? Conceptually, which scheduler should be more accurate? Do your measurements support your conceptual understanding? Why or why not?
(15 pts.) One innovative form of scheduling is known as lottery scheduling, which is more or less what it sounds like. An operating system's processes are all given individual amounts of tickets. A random number is generated and the process owning the winning ticket gets to run for a quantum. Every time the schedule function is activated, a new random number is generated and a new process is selected to run. Therefore, if there are 3 processes A, B, and C, assigned 1,2, and 3 tickets respectively for a total of 6 tickets, process A has a 1/6 chance of winning, process B has a 1/3 chance of winning, and process C has a 1/2 chance of winning.
Extend the Linux scheduler with a lottery scheduler. Like other schedulers, it should have a name. Call it SCHED_LOT. You will need to modify the sched_setscheduler system call to allow users to select your new scheduling policy and to assign a weight to a process. Use the same while loop program as before to test your implementation of SCHED_LOT by running 5 different copies of the program concurrently, with respective weights 1, 2, 3, 4, 5.
(5 pts.) As in the case of SCHED_OTHER and SCHED_WRR, show that the CPU time given to each process is in proportion to its weight. For your final report, use statistics collected by a small program which calls proc_info to retrieve the system and user times of each of the 5 processes at 1 second intervals. Compare how accurate SCHED_LOT is compared to SCHED_WRR and SCHED_OTHER in proportionally allocating CPU cycles. Based on your measurements, which scheduler is more accurate? Conceptually, which scheduler should be more accurate? Do your measurements support your conceptual understanding? Why or why not?

Additional Information: