You will build your own Linux kernel for this class. For each homework assignment, you will build a new Linux kernel. The image of the kernel you build for this assignment should be vmlinuz.hmwk5. Grading for this assignment will be done based on vmlinuz.hmwk5.
For this problem, you will implement a new kernel synchronization primitive. Read the "How Processes Are Organized" section in Chapter 3, and the "Synchronization Primitives", "Atomic Operations", "Spin Locks" and "Read/Write Spin Locks" sections in Chapter 5 in Understanding the Linux Kernel. In Chapter 3, pay particular attention to the material about wait queues. Note that the synchronization primitives in Linux generally do not have owners and are not recursive. No owners means that there's nothing stopping any piece of code from unlocking every lock in the kernel (and bringing the kernel to a fiery screeching halt, probably). Not recursive means that the owner of a lock (even if there was such a thing) can't obtain the lock twice.
(25 pts.) Design and implement a new kernel synchronization primitive: a barrier. Barriers are used in parallel programming to ensure that all processes complete one stage of computation before moving on to another (for example, a parallelized matrix multiply). Given a barrier of size N, all processes calling the barrier primitive will block until all N processes have called it, after which each may continue. The barrier is then reset back to N. Note that no other ordering is expressed or implied; e.g., the last process to arrive at the barrier may be the first process to leave or vice versa. Implement the following new system calls in the Linux kernel, which should be numbered 318 to 320, respectively:
int barriercreate(int num);
Creates a new
barrier of size num,
returning the barrier ID on success, -1 on failure. int barrierdestroy(int barrierID);
Destroy the
barrier with the given barrier ID and notifies any processes waiting on
the barrier to leave it. Return number of processes notified on success
and -1 on failure. int barrierwait(int barrierID);
Blocks an
individual process until N processes have arrived; must verify that the
barrier ID is valid. Return 0 on success and -1 on failure. The
barrier being destroyed while a process waits is also a failure
condition that should return -1. You should begin by thinking carefully about the data structures
that
you will need to solve this problem. Your system will need to support
having multiple barriers open at the same time, so you will probably
need a set of barrier descriptor data structures, each of which
identifies a barrier. Those data structures will need to be put in a
list from which your code can find the appropriate barrier descriptor
correponding to the barrier that you need. Space for the barrier
descriptors should be dynamically allocated, most likely using the
kernel functions kmalloc()
and kfree()
.
You should not make any assumptions about whether the system is a uniprocessor or multiprocessor system. Be sure to properly synchronize access to your data structures. Moreover, be sure to select the appropriate synchronization primitives such that they are both correct and efficient, in this order. For instance, you should prefer a spinlock over a semaphore if-and-only-if you don't plan to sleep while holding it.
You can choose to work at the level of wait queues using either
the
associated low-level routines such as add_wait_queue()
, remove_wait_queue()
,
or the higher-level routines such
as prepare_to_wait()
, finish_wait()
. You
can
find code examples both in the book and in the kernel. If you prefer
to use functions such as interruptible_sleep_on()
and sleep_on()
,
then plan carefully because they can be
racy in certain circumstances.
HINT: a useful method to guarantee the validity of a data
structure in
the presence of concurrent create, access and delete operations, is to
maintain a reference count for the object. Then, the object should only
be freed by the last user. The file system, for example, keeps a
reference
count for inodes: when a process opens a file the reference count of
the
inode is incremented. Thus if another process deletes the file, the
inode remains valid in the system (albeit invisible) because its count
is positive. The count is decremented when the process closes the
corresponding file descriptor, and if it reaches zero then the inode
is freed. A reference count must be protected against concurrent
access,
either using explicit synchronization or using atomic types (see atomic_t
in Chapter 5 in Understanding the Linux Kernel).
You are designing a completely new facility to add to the
kernel. You
will need to create a new implementation file and change the Makefile
so that it compiles your new file. You are also likely to need to
change a few parts of the existing kernel. You should properly handle
errors that may occur and report meaningful error codes, e.g. -ENOMEM
in the barrier that a memory allocation fails.
As always, you are encouraged to look for existing kernel code that
performs similar tasks and follow the coventions and practices it
provides.
(5 pts.) Write an application program to test your new kernel functions. Your test program should show that the kernel functions work for the usual cases, for example, when all N processes wait on a barrier, and also for boundary conditions such as (but not limited to)
barrierwait()
is called;barrierdestroy()
is
called.You should create a patch to store changes you've made to the kernel and to submit your changes for this problem. If you've changed 50 lines of code, a patch will be about 50 lines long. This is much easier to store, email, or move around than the full 283,000-line source tree.
If you are using the KDB patch you applied from Homework 2 (which we recommend), your patch
should be from after that patch is applied; i.e., KDB diffs should not
show up in the patch.
Git is a perfect tool to create a patch and apply it. First back up your .config file, then do a ``make distclean'' in your changed source tree (you don't want object files, config files, etc. in your patch). Then, run git format-patch -k --stdout v2.6.18 > ../homework5.patch. Please run git commit for each step of the homework.