Homework 1
W4118 Fall 2008
DUE: Friday 9/19/2008 at 11:59pm EST

All non-programming and programming problems in this assignment are to be done by yourself. Group collaboration is not permitted for this assignment. All homework submissions are to be made via Courseworks. Refer to the homework policy section on the class web site for further details.

Non-Programming Problems:

Exercise numbers refer to the course textbook, Modern Operating Systems. Each problem is worth 4 points.

  1. What are three main purposes of an operating system?

  2. Ch 1 Exercise 3

  3. Ch 1 Exercise 6

  4. Ch 1 Exercise 7

  5. Ch 1 Exercise 13

  6. Ch 1 Exercise 14

  7. Ch 1 Exercise 17

  8. Ch 1 Exercise 18

  9. Ch 1 Exercise 23

  10. Consider the various definitions of an operating system. Consider whether the operating system should include applications such as web browsers and mail programs. Argue both that it should and that it should not, and support your answer.

Programming Problems:

For all programming problems you will be required to submit source code, a README file documenting your files and code, and a test run of your programs. The README should explain any way in which your solution differs from what was assigned, and any assumptions you made. Refer to the homework submission page on the class web site for additional submission instructions.

  1. Shell Utility tar (20 Pts.) Most of the commands you type into a UNIX shell are separate executable programs that are invoked by the shell with the arguments you specify. For example, the files /bin/rm, /bin/cp, and /usr/bin/tar are programs executed by your shell when you remove, copy, or store and extract files from an archive file.

    Write wtar in C. Your wtar should assume each of its command line arguments is a filename, open that file and write its contents to standard output, in order and preserving the files permissions. Directories are ignored, so there is no recursion.

    Example Run

    # Archive Files:
    #
    # Following will archive the contents of the student directory.
    # user> wtar /home/student/* > student.wtar
    #
    # Extract archive:
    #
    # user> wtar-x student.wtar
    # user> ls -la
    # -rw-r--r--   1 student  studentgrp      235 Sep  3 13:04 file1.txt
    # -rwxr-xr-x   1 student  studentgrp      303 Sep  3 13:04 binary_ls
    # ...
    # user>
    	

  2. Simple Shell (25 Pts.) Even the shell itself is just another user program. The files /bin/sh, /bin/bash, and /bin/tcsh are all executable files for shells. The only thing special about your login shell is that it is listed in your login record so that /bin/login (the program that prompts you for your password) knows what program to start when you log in. If you run ``cat /etc/passwd'', you'll see the login records of the machine and the login shell program listed as the last field.

    1. Write a simple shell in C. Your shell should read a line from standard input, parse the line to get the command and its arguments, fork, and exec the command. Your shell should wait for commands to finish, and should exit when it receives the command exit. When executing a command, the shell should first look for the command in the current directory, and, if not found, search the directories defined in a special variable, path.

      • Using the system function is not allowed, as it just invokes the system's /bin/sh shell to do all the work.

      • You may assume command line arguments are separated by whitespace. Don't do anything special for backslashes, quotes, ampersands or other characters that are ``special'' in other shells. Note that this means commands in your shell will not be able to operate on filenames with spaces in them!

      • You may set a reasonable maximum on the number of command line arguments, but your shell should handle input lines of any length.

      • The executable file for your shell should be named sh. When testing, make sure you execute your sh and not /bin/sh.

    2. The path variable holds a list of possible paths in which to search for executables. The list of paths is empty by default. You should implement a builtin command to control this variable: path [+|- /some/dir]

      • path (without arguments) displays all the entries in the list separated by colons, e.g. "/bin:/usr/bin".

      • path + /some/dir appends the given pathname to the path list.

      • path - /some/dir removes the given pathname from the path list.

  3. Timing Methodologies (15 Pts.) Modern x86 PCs such as those in the CLIC lab have a tsc hardware register that counts processor cycles. The register can be read using the rdtsc instruction, which can be used to provide very accurate timing measurements. There are several macros used to read this register, as described in the include file asm/msr.h. These macros can be wrapped through a function like getcycles below, which takes a pointer to a long long and packs it with the number of cpu cycles used since the system was last booted. inline void getcycles (long long int * cycles){ unsigned long low; long high; rdtsc(low,high); *cycles = high; *cycles <<= 32; *cycles |= low; }
    1. Write a function in C called gethosttime that takes a long long cycles value as an argument and returns the equivalent long long in nanoseconds. Your function should be portable across machines, so it will need to determine the CPU speed of the machine on which it is being called. One way to do this in Linux is to parse the data contained in the file /proc/cpuinfo. The /proc filesystem (where /proc/cpuinfo resides), is a virtual filesystem that presents a way for userspace to communicate with the kernel and vice-versa. Various statistics can be read from the files in the /proc filesystem and /proc/cpuinfo, as its name suggests, contains information regarding the CPU.
    2. Using gethosttime, figure out how long it takes to execute the getcycles function. In addition, time the gettimeofday system call. You should be careful to do the measurement in such a way that minimizes the overhead of doing the measurement. In particular, you cannot directly measure the time of a function accurately using another function whose overhead to execute is comparable to the function being measured.
    3. Use both gethosttime and the gettimeofday system call to time the inner for-loop of the following bit of code: for (x=0; x<1000; x++) { for (y=0; y<100; y++) { /* inner loop starts here */ z = x + y; } /* inner loop ends here */ } Your writeup should include the mean and standard deviation of loop iterations, and provide an explanation for any large variations in your results. Based on your measurements, indicate which timing method is more accurate and explain why.

    Additional Requirements

  1. All code (including test programs) must be written in C.
  2. Use a makefile to control the compilation of your code. The makefile should have at least a default target that builds all assigned programs.
  3. When using gcc to compile your code, use the -Wall switch to ensure that all warnings are displayed. Do not be satisfied with code that merely compiles; it should compile with no warnings. You will lose points if your code produces warnings when compiled.
  4. Check the return values of all functions utilizing system resources. Do not blithely assume all requests for memory will succeed and all writes to a file will occur correctly. Your code should handle errors properly. Many failed function calls should not be fatal to a program. Typically, a system call will return -1 in the case of an error (malloc will return NULL). If a function call sets the errno variable (see the function's man page to find out if it does), you should use the result of strerror(errno) as part of your error message. As far as system calls are concerned, you will want to use one of the mechanisms described in Reporting Errors or Error Reporting.

    Tips

  1. For this assignment, your primary reference will be Programming in C. You might also find the Glibc Manual useful.
  2. Many questions about functions and system behaviour can be found in the system manual pages; type in man function to get more information about function. If function is a system call, man 2 function can ensure that you receive the correct man page, rather than one for a system utility of the same name.
  3. If you are having trouble with basic Unix/Linux behavior, you might want to check out the resources section of the class webpage.
  4. A lot of your problems have happened to other people. If you have a strange error message, you might want to try searching for the message on Google.