COMS W4118: Operating Systems I

Dept. of Computer Science, Columbia University

Spring 2013 — Kaustubh Joshi

Homework 2

Homework 2 is due Monday 3/4 at 12:01 AM ET.

Errata/Additions:

  1. In Section II of the programming assignment, you are being asked to pass several arrays to the kernel. You should not copy these arrays into a local array defined in your syscall function (why?). Allocate memory for them dynamically using kmalloc. Be sure to make sure you free everything you allocate even if your syscalls return through an error codepath. Also remember that the mallocing can cause your code to be put to sleep.
  2. This programming assignment may have several places where we ask the question (why?) in parentheses. In your written assignment, you can answer these questions for extra credit (up to 5 points per why?). Your answer must be your own - do not copy your groupmate's answer.
  3. Several of you have asked about how to find the pid based on the process name for processes such as ksoftirqd. We're not asking you to ensure that your program works with kernel threads (would it make sense to color them?), but you should ensure that any user level process can be identified correctly. To make things simpler still, you can assume that we will not test your code with any preexisting process owned by root (except /bin/sh or any test programs we run as root).
  4. Clarification: because Linux doesn't distinguish between processes and threads during task management, your syscall must work correctly even if it is passed a thread id (tid) instead of a process id (pid). The thread id's of a process's threads can be viewed from the /proc/<pid>/task directory. The correct behavior in this case is to assign the color to all the threads of the process containing the thread. Note however, that your test programs are not required to support identifying a process through one of its thread ids.
  5. Several people have asked about the rationale behind clearing a process's color after a vfork. Well, what you've been asked to do in the assignment is a simplification. In a full fledged system that implements an approach like colors, a new process would be allowed to loose its color only after it executes one of the exec calls. In the extra credit portion (the why?), you may answer why it is permissible for a process to loose its color after an exec.
  6. Important update to submission instructions: When submitting your programming assignment, please include the binaries of all test programs in your submission. Also, in addition to the patch you are required to submit (which contains your entire commit history), you should also submit a diff of your current code against a clean version of the kernel. To produce such a diff, use the following command: git diff hw2_clean.. > groupN_hw2.diff Submit both the .patch and .diff files, but check only the diff file using the checkpatch.pl script.

All written problems in this assignment are to be done alone, while the programming assignment is a group assignment. Submit the written and programming assignments separately using the instructions on the homeworks page. Periodically check back on the course page for any errata or updates to submission instructions.

Written Assignment (40 pts)

Exercise numbers X.Y refer to the exercise problem Y in Chapter X in the course textbook, Operating Systems Concepts 9e. Each problem is worth 5 points. Make your answers concise. You'll lose points for verbosity. Your written assignment directory should contain a single ASCII file called hw2.txt with your solutions and a completed teammate evaluation form for the programming assignment.

  1. 3.11: Explain the role of the init process on UNIX and Linux systems in regards to process termination.

  2. 3.17: Using the program shown below, explain what the output will be at lines X and Y and why.

        #define SIZE 5
        int nums[SIZE] = {0,1,2,3,4};
    
        int main() {
            int i ;
            pid_t pid;
    
            pid = fork();
    
            if (pid == 0) {
                for (i = 0; i < SIZE; i++) {
                    nums[i] *= -i;
                    printf("CHILD: %d ", nums[i]); /* LINE X */
                }
            } else if (pid > 0) {
                wait(NULL);
                for (i = 0; i < SIZE; i++) {
                    printf("PARENT: %d ", nums[i]); /* LINE Y */
                }
            }
        }
      

  3. 3.15: Give an example of a situation in which ordinary pipes are more suitable than named pipes and an example of a situation in which named pipes are more suitable than ordinary pipes.

  4. 3.16: Consider the RPC mechanism. Describe the undesirable consequences that could arise from not enforcing either the “at most once” or “exactly once” semantic. Describe possible uses for a mechanism that has neither of these guarantees.

  5. 3.18: What are the benefits and the disadvantages of each of the following? Consider both the system level and the programmer level.

    1. Synchronous and asynchronous communication
    2. Automatic and explicit buffering
    3. Send by copy and send by reference
    4. Fixed-sized and variable-sized messages

  6. 4.7: Under what circumstances does a multithreaded solution using multiple kernel threads provide better performance than a single-threaded solution on a single-processor system?

  7. 4.15: Consider the following code fragment:

        pid t pid;
        pid = fork();
        if (pid == 0) { /* child process */
            fork();
            thread_create( . . .);
        }
        fork();
      
    1. How many unique processes are created?
    2. How many unique threads are created?

  8. Read 4.22 about how to compute π (Pi) using Monte-Carlo simulations. The following is a single threaded program (without much error checking) to compute π. You need to convert this program into a multithreaded program that is optimized for a quad-core system. But the catch is that you are not allowed to use locks, mutexes, or any other synchronization primitives. How will you ensure the correctness of your program? (Note: you don't need add comprehensive error checking. Just make sure your solution is free of race conditions.)

        #include <stdio.h>
        #include <stdlib.h>
        #include <stdint.h>
        #include <time.h>
        #include <math.h>
    
        uint64_t hit;
        uint64_t run;
    
        int main(int argc, char* argv[]) {
            /* Seed the random number generator */
            srand(time(NULL));
    
            for (run = 0; run < UINT32_MAX; run++) {
                double x = (double)rand()/(double)RAND_MAX;
                double y = (double)rand()/(double)RAND_MAX;
                if (sqrt(x*x + y*y) <= 1) {
                    hit++;
                }
            }
            printf("Pi is %f\n", 4*(double)hit/(double)run);
        }
      

Group Programming Assignment: Colors (60 pts)

In this programming assignment, you will be working with and modifying the Android kernel, which is a slightly modified version of Linux. Because the kernel needs to contain device drivers that support specific pieces of hardware, different Android devices use different kernel branches. We will be testing your modified kernels on the Android emulator, for which the relevant kernel is goldfish-2.6.29. A preconfigured VM containing the Android emulator, an Atom x86-based Android virtual device image, a git repository containing the goldfish kernel, and the compiler tools required to build the kernel has been created for your group on the CLIC lab machines. Instructions on how to use this environment (and on how to create one on your own laptop if you wish) are available here.

As the deliverable, you will be required to submit a kernel patch against the goldfish-2.6.29 branch in your git repo showing all kernel modifications you have made (including any added files), and the user mode test programs detailed in the assignment. To produce such a patch, make sure you've committed all your changes, and then go to the root of the kernel tree and type git format-patch hw2_clean.. --stdout > groupN_hw2.patch where N is your group number. This command should create a single groupN_hw2.patch file to submit. You should supply a README file documenting your files and code that explains any way in which your solution differs from what was assigned, and any assumptions you made. In addition, you are required to also fill out and include this cover sheet to indicate whether your code has any problems or not, and what attempts you have made to fix them.

All code must be written in C and a Makefile must be provided with a default target that compiles all user-mode programs. When using gcc, the -Wall switch must be used to display warnings. In addition, you must check your kernel patches against the Linux coding guidelines using the scripts/checkpatch.pl script in the kernel tree. You can find a tutorial on how to use checkpatch and general kernel hacking tips here. Before submitting, make sure you remove any files produced during compilation (.o files, executable).

Description

As companies increasingly rely on corporate functions accessed through mobile devices, people often find themselves carrying multiple phones: one for personal use, and another, locked down one for work. It would be nice to have to carry only a single device but security of sensitive corporate data is a concern. While there are different ways to tackle this problem, e.g., having multiple virtual phones running on the same physical device, in this assignment we are going to implement part of a solution based on mandatory access control that is inspired by SEAndroid and AT&T Toggle.

The idea is to assign each process (and later in the semester, files) a "color" representing the environment, e.g. work or home, that it's part of, and preventing processes with different colors from communicating with each other or sharing data. We are going to do this by adding new system calls to the Android kernel that will allow a user mode program to set or get a process's color. Then, we will modify the Android binder to ensure that two processes with different colors cannot use it to communicate with each other. As we've discussed in class, the binder is an Android-specific IPC mechanism that support's Android's remote procedure call model and is used by applications to access services provided by other applications such as the contact list, email, or the phone dialer. You can find more details about how the binder is implemented and used here, here and here.

Deliverables

  1. Part I (5 points): Compile the Android goldfish kernel and boot it in the emulator environment. Make sure you've booted using your custom kernel by going to the "Settings" app on the phone, and selecting the "About phone" tab. Information about your custom kernel is shown in the "Kernel version" section. Your deliverables for the first part of the assignment are the strings found in this section. Paste them as-is in your README file.
  2. Part II (30 points): Add two new system calls that allow the caller to set a process's color in it's task_struct structure and to retrieve it. Assume that colors are defined as a 16 bit integer (u_int16_t from the file include/linux/types.h), and each process has a default color of 0. Put all your code in the file kernel/color.c. As you read about Linux system calls, you will learn that each system has a syscall number that is used to invoke it. The function definitions and syscall numbers for the system calls you will implement are as follows:
          /* Syscall number 223. nr_pids contains the number of entries in
             the pids, colors, and the retval arrays. The colors array contains the
             color to assign to each pid from the corresponding position of
             the pids array. Returns 0 if all set color requests
             succeed. Otherwise, The array retval contains per-request
             error codes -EINVAL for an invalid pid, or 0 on success. 
           */
          int set_colors(int nr_pids, pid_t *pids, u_int16_t *colors, int *retval);
    
          /* Syscall number 251. Gets the colors of the processes
             contained in the pids array. Returns 0 if all set color requests
             succeed. Otherwise, an error code is returned. The array
             retval contains per-request error codes: -EINVAL for an
             invalid pid, or 0 on success.
           */
          int get_colors(int nr_pids, pid_t *pids, u_int16_t *colors, int *retval);
      

    Adhere to the following rules when implementing your system calls:

    • Make sure to register your system call on other architectures. Your code must work for i386, x86_64, and arm. The provided system call numbers are valid only for the x86 32 bit architecture that we will test in the emulator. Other architectures may require different syscall numbers. So, do not hardcode constants - use #define instead. See the appropriate arch/<arch>/include/unistd.h file for each architecture. Document the numbers you used for the other architectures in your README file.
    • Make sure you check all the parameters. Remember, the kernel executes with elevated privileges and the caller may be a malicious program. Never trust pointers received from user space! See documentation for the copy_to_user and copy_from_user functions. Return an EFAULT error code if any of the pointers are invalid.
    • Only the root (or sudo) is allowed to change a process's color. Return -EACCES (see here for a description of standard error codes) if the caller isn't root (you may need to access the credentials structure found in task_struct) or use a function like current_uid().
    • Make sure that all threads belonging to the same process are marked with the same color. You may need to learn about the tgid field in the task_struct.
    • Color information must be propagated to children across the fork and clone system calls, but not the vfork system call (why?).

    Hints: Read LKD, Ch. 5 about system calls and how to add new ones. Read the code for the getuid() and getpid() in kernel/timer.c to see simple syscall examples. Don't forget to add the color.c file to the appropriate Makefile, and also to your git repository (using git add). You can find the process descriptor in the include/linux/sched.h file under the name task_struct. The Linux kernel provides efficient data structures to find tasks. Investigate the find_task_by_pid() and related family of functions. When manipulating the process list, make sure you use appropriate locking primitives to prevent the process descriptor changing while you are working with it. Remember that you should never sleep while holding a lock on the process list or a task_struct.

  3. Part III (10 points): Prevent two processes with different colors from communicating through the binder unless the color of either process is 0. The binder implementation can be found in drivers/staging/android/binder.c and drivers/staging/android/binder.h. You don't need to understand all the code - the function where you will need to do the permissions check is called binder_transaction which is called by the binder_ioctl IOCTL (I/O control) call. We'll learn about I/O and IOCTLs later in the semester. For now, all you need to know is that RPCs through the binder result in an IOCTL call to this function. To return an access denied error when the colors of the caller and callee do not match, you should not cause the ioctl syscall to fail (which would just result in the application trying to make the connection again), but instead, return a binder specific error. Look at the code to understand how other errors are handled by the binder, and return a BR_FAILED_REPLY binder error.

    To test your implementation, you may try the following use cases. Assigning different colors to the com.android.email (email app) and android.process.acore (contacts provider) processes should cause contact auto-completion when composing emails to stop working. Similarly, assigning different colors to the com.android.calendar (calendar app) and the com.android.providers.calendar (calendar provider) processes should prevent the calendar app from showing your calendar.

  4. Part IV (15 points): Write the following three test programs to test your work. All programs should compile for and be runnable on the Android emulator.

    1. setcolors should accept through the command line pairs of process name and color id and sets the colors using the set_colors syscall. E.g., setcolors /system/bin/sh 1 com.android.email 2 com.android.phone 2 will set the color of the /system/bin/sh process to 1, and so on. Handle errors in the inputs gracefully by printing error messages, but doing as much work as is possible (i.e., set the colors for as many processes as you can). Hint: you can find the command line name of each process in the /proc directory. What do the numbered directories correspond to?
    2. getcolors should accept a list of process names, e.g., getcolors /system/bin/sh com.process.acore com.android.email, and return their colors. Again, handle errors in the inputs gracefully.
    3. forktest should accept command lines with the following format: forktest <delay> fork|vfork|clone cmdline. On running, it should read the command line arguments, and sleep for delay seconds (use the sleep libc function). On waking up, it should either execute a fork(), clone(), or vfork() syscall, and execute the program specified in the rest of the command line cmdline using exec(). E.g., forktest 10 vfork forktest 10 fork sleep 10 will sleep for 10 sec, and then use vfork to recursively call a copy of forktest that sleeps for 10 sec and uses fork to call sleep 10.

Error Handling

Check the return values of all functions utilizing system resources. Do not blithely assume all requests for memory will succeed and all writes to a file will occur correctly. Your code should handle errors properly. Many failed function calls should not be fatal to a program. Typically, a system call will return -1 in the case of an error (malloc will return NULL). If a function call sets the errno variable (see the function's man page to find out if it does), you should use the result of strerror(errno) as part of your error message. As far as system calls are concerned, you will want to use one of the mechanisms described in Reporting Errors or Error Reporting.