HW3: HTTP Server

Submission

Part 0: Basic single-process / single-threaded web server (Not graded)

Required tasks

Recommended tasks

Deliverables

Part 1: Multi-process web server

The basic version of the HTTP server has a limitation: it can handle only one connection at a time. This is a serious limitation because a malicious client could take advantage of it and prevent the server from processing additional requests simply by sending an incomplete HTTP request. In this part we improve the situation by creating additional processes to handle requests.

The easiest way (from the programmer’s point of view) to handle multiple connections simultaneously is to create additional child processes with the fork() system call. Each time a new connection is accepted, instead of processing the request within the same process, we create a new child process by calling fork() and let it handle the request.

The child process inherits the open client socket and processes the request, generating a response. After the response has been sent, the child process terminates.
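The flow described above can be sketched as follows. This is only a minimal illustration: forkHandler(), the ConnHandler callback type, and the sayHello() placeholder are hypothetical names, and the skeleton's own request-handling code would take the place of the callback.

```c
#include <sys/types.h>
#include <unistd.h>

// Hypothetical callback type: serves one client on clntSock.
typedef void (*ConnHandler)(int clntSock);

// Placeholder handler, present only to make the sketch complete.
static void sayHello(int clntSock)
{
    ssize_t n = write(clntSock, "hello\n", 6);
    (void) n;
}

// Fork a child that handles clntSock. Returns the child's pid in the
// parent, or -1 if fork() failed. Never returns in the child.
static pid_t forkHandler(int servSock, int clntSock, ConnHandler handle)
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        // Child: it never calls accept(), so it has no use for the
        // listening socket.
        close(servSock);
        handle(clntSock);
        close(clntSock);
        _exit(0);
    }

    // Parent: the child owns the connection now. While the parent
    // keeps its copy open, the client will not see the connection
    // close when the child is done.
    close(clntSock);
    return pid;
}
```

In the accept loop, the parent would call something like forkHandler(servSock, clntSock, sayHello) right after accept() returns and then go straight back to accept().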

Required tasks

Recommended tasks

Requirements (and hints)

  1. Note that the two socket descriptors – the server socket and the new connected client socket – are duplicated when the server forks. Make sure to close anything you don’t need as early as possible. Think about these:

  2. Don’t let your children become zombies… At least not for too long. Make sure the parent process calls waitpid() promptly after one or more child processes have terminated.

  3. Modify the logging so that it includes the process id of the child process that handled the request.
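One possible way to approach hints 2 and 3 is sketched below. The helper names reapChildren() and logWithPid() are made up for this example, and the log format is an assumption rather than the skeleton's actual format.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Reap every child that has already terminated, without blocking.
// Returns the number of children reaped.
static int reapChildren(void)
{
    int reaped = 0;
    // WNOHANG makes waitpid() return 0 instead of blocking while
    // children are still running; it returns -1 when none are left.
    while (waitpid(-1, NULL, WNOHANG) > 0)
        reaped++;
    return reaped;
}

// Prefix a log line with the pid of the process that writes it.
// When called from the child, this is the pid that handled the request.
static void logWithPid(const char *line)
{
    fprintf(stderr, "[%d] %s\n", (int) getpid(), line);
}
```

reapChildren() could be called once per iteration of the accept loop, or in response to SIGCHLD; which of the two fits better depends on how quickly zombies need to disappear.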

Part 2: Interprocess communication through shared memory

Reading assignment

Required tasks

Requirements, hints, and recommended order of tasks

  1. Since multiple child processes will need to update the stats, you need to keep them in a shared memory segment. Use anonymous memory mapping described in APUE 15.9.

  2. Perform the hit test from Part 1 and see if your code keeps accurate stats. The request counts may or may not be correct due to race conditions.

  3. Now use a POSIX semaphore, described in APUE 15.10, to synchronize access to the stats. A few things to think about:

  4. Repeat the performance test and verify that the stats are accurate.

Part 3: Directory listing

The skeleton http-server.c does not handle directory listing. When a requested URL is a directory, it simply responds with 403 Forbidden.

Tasks

Requirements and hints

Part 4: Directory listing without running /bin/ls (0 points)

This part is optional and will not be graded.

Tasks

Requirements and hints
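Assuming the goal of this part is to build the listing inside the server process with opendir() and readdir() rather than by running /bin/ls, a minimal sketch might look like the following. The function name listDirectory() and the one-name-per-line output are illustrative choices, not part of the skeleton.

```c
#include <dirent.h>
#include <stdio.h>
#include <string.h>

// Write the names in the directory at path to out, one per line,
// skipping "." and "..". Returns the number of names written, or -1
// if the directory cannot be opened.
static int listDirectory(const char *path, FILE *out)
{
    DIR *dir = opendir(path);
    if (dir == NULL)
        return -1;

    int count = 0;
    struct dirent *entry;
    while ((entry = readdir(dir)) != NULL) {
        if (strcmp(entry->d_name, ".") == 0 ||
            strcmp(entry->d_name, "..") == 0)
            continue;
        fprintf(out, "%s\n", entry->d_name);
        count++;
    }
    closedir(dir);
    return count;
}
```

readdir() returns entries in no particular order, so output that should match ls would additionally need sorting.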

Part 5: Multi-threaded web server

POSIX threads provide a light-weight alternative to child processes. Instead of creating child processes to handle multiple HTTP requests simultaneously, we will create a new POSIX thread for each HTTP request.

Required tasks

  1. Modify the original skeleton code (i.e., part0/http-server.c) so that the web server creates a new POSIX thread after it accepts a new client connection, and the new thread handles the request and terminates afterwards.

  2. Two library functions used by the skeleton http-server.c are not thread-safe. You must replace them with their thread-safe counterparts in your code.

    In your README.txt, identify the two functions and describe how you fixed them.

  3. Test this implementation by connecting to it from multiple netcat clients simultaneously.
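Task 1 could be approached along these lines. This is a sketch under assumptions: spawnHandlerThread(), struct ThreadArg, and the sayHello() placeholder are hypothetical, and the thread-safety fixes from task 2 are not shown.

```c
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>

// Hypothetical callback type: serves one client on clntSock.
typedef void (*ConnHandler)(int clntSock);

struct ThreadArg {
    int clntSock;
    ConnHandler handle;
};

// Placeholder handler, present only to make the sketch complete.
static void sayHello(int clntSock)
{
    ssize_t n = write(clntSock, "hello\n", 6);
    (void) n;
}

static void *threadMain(void *p)
{
    // Copy the argument and free it: each thread owns its own copy,
    // so the main thread can go on to the next connection.
    struct ThreadArg arg = *(struct ThreadArg *) p;
    free(p);
    arg.handle(arg.clntSock);
    close(arg.clntSock);
    return NULL;
}

// Start a detached thread that handles clntSock. Returns 0 on
// success or an errno-style error number on failure.
static int spawnHandlerThread(int clntSock, ConnHandler handle)
{
    struct ThreadArg *arg = malloc(sizeof(*arg));
    if (arg == NULL)
        return ENOMEM;
    arg->clntSock = clntSock;
    arg->handle = handle;

    pthread_t tid;
    int err = pthread_create(&tid, NULL, threadMain, arg);
    if (err != 0) {
        free(arg);
        return err;
    }
    // Detached threads release their resources when they finish,
    // so nobody has to call pthread_join().
    pthread_detach(tid);
    return 0;
}
```

Passing the socket through a heap-allocated argument avoids handing the thread a pointer to a local variable that the main loop overwrites on the next accept(). Unlike the fork() version, the main thread must not close clntSock itself, because threads share one descriptor table.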

Recommended tasks

Requirements and hints

Deliverables

  1. Makefile, http-server.c, and other source files as usual

  2. In your README.txt: how you fixed the two non-thread-safe function calls

Part 6: Pre-created pool of threads

Reading

Read the following Q&A at StackOverflow.com:

In parts 6 and 7, we will implement the two methods described in the article.

Required tasks

  1. Instead of creating a new thread for each new client connection, pre-create a fixed number of worker threads in the beginning. Each of the pre-created worker threads will act like the original skeleton web server – i.e., each thread will be in a for(;;) loop, repeatedly calling accept().

  2. Test this implementation by connecting to it from multiple netcat clients simultaneously.
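A pre-created pool in which every worker calls accept() on the same listening socket might be sketched as follows. startWorkers(), struct WorkerArg, and sayHello() are hypothetical names, and the choice to keep looping on EINTR and ECONNABORTED is an assumption.

```c
#include <errno.h>
#include <pthread.h>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical callback type: serves one client on clntSock.
typedef void (*ConnHandler)(int clntSock);

struct WorkerArg {
    int servSock;
    ConnHandler handle;
};

// Placeholder handler, present only to make the sketch complete.
static void sayHello(int clntSock)
{
    ssize_t n = write(clntSock, "hello\n", 6);
    (void) n;
}

static void *workerMain(void *p)
{
    const struct WorkerArg *arg = p;
    for (;;) {
        // All workers block here on the same listening socket; each
        // new connection is handed to exactly one of them.
        int clntSock = accept(arg->servSock, NULL, NULL);
        if (clntSock == -1) {
            if (errno == EINTR || errno == ECONNABORTED)
                continue;
            return NULL;        // unexpected error: this worker stops
        }
        arg->handle(clntSock);
        close(clntSock);
    }
}

// Create n detached worker threads sharing *arg, which must stay
// valid for as long as the workers run. Returns how many started.
static int startWorkers(struct WorkerArg *arg, int n)
{
    int started = 0;
    for (int i = 0; i < n; i++) {
        pthread_t tid;
        if (pthread_create(&tid, NULL, workerMain, arg) != 0)
            break;
        pthread_detach(tid);
        started++;
    }
    return started;
}
```

After startWorkers() returns, the main thread has nothing left to do except stay alive, for example by blocking in pause() or by joining the workers if they are created joinable instead.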

Recommended tasks

Requirements and hints

Part 7: Blocking queue

Required tasks

  1. Modify the code so that only the main thread calls accept(). The main thread puts the client socket descriptor into a blocking queue, and wakes up the worker threads which have been blocked waiting for client requests to handle.

  2. Test this implementation by connecting to it from multiple netcat clients simultaneously.
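A blocking queue of socket descriptors can be built from a mutex and condition variables. The sketch below is one possible shape, not the reference solution: the fixed capacity QUEUE_CAP, the ring-buffer layout, and the function names are all assumptions.

```c
#include <pthread.h>

#define QUEUE_CAP 64            // hypothetical capacity

struct BlockingQueue {
    int items[QUEUE_CAP];       // ring buffer of client sockets
    int head;                   // index of the oldest item
    int count;                  // number of items stored
    pthread_mutex_t mutex;
    pthread_cond_t notEmpty;
    pthread_cond_t notFull;
};

static void queueInit(struct BlockingQueue *q)
{
    q->head = 0;
    q->count = 0;
    pthread_mutex_init(&q->mutex, NULL);
    pthread_cond_init(&q->notEmpty, NULL);
    pthread_cond_init(&q->notFull, NULL);
}

// Called by the main thread after accept(). Blocks while the queue
// is full.
static void queuePut(struct BlockingQueue *q, int clntSock)
{
    pthread_mutex_lock(&q->mutex);
    while (q->count == QUEUE_CAP)
        pthread_cond_wait(&q->notFull, &q->mutex);
    q->items[(q->head + q->count) % QUEUE_CAP] = clntSock;
    q->count++;
    pthread_cond_signal(&q->notEmpty);  // wake one waiting worker
    pthread_mutex_unlock(&q->mutex);
}

// Called by worker threads. Blocks while the queue is empty.
static int queueTake(struct BlockingQueue *q)
{
    pthread_mutex_lock(&q->mutex);
    // The loop (rather than an if) re-checks the condition after
    // every wakeup, which also covers spurious wakeups.
    while (q->count == 0)
        pthread_cond_wait(&q->notEmpty, &q->mutex);
    int clntSock = q->items[q->head];
    q->head = (q->head + 1) % QUEUE_CAP;
    q->count--;
    pthread_cond_signal(&q->notFull);
    pthread_mutex_unlock(&q->mutex);
    return clntSock;
}
```

Each worker would then loop on queueTake(), handle the connection, and close the socket, while the main thread loops on accept() followed by queuePut().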

Recommended tasks

Requirements and hints

Part 8: Listening on multiple ports

Required tasks

  1. Modify the code so that the web server takes not just one, but multiple port numbers as command line arguments, followed by the web root as the last argument. The web server will bind and listen on all of the ports.

  2. Test this implementation by connecting to it from multiple netcat clients simultaneously to different ports.

Recommended tasks

Requirements and hints

Part 9: Nonblocking accept()

Part 8 has a flaw. Between select() and accept(), there is a chance that the client connection gets reset. If that happens, accept() will block. In order to handle that case, we need to make the server socket nonblocking.

Tasks

  1. Modify the code so that createServerSocket() puts the server socket into nonblocking mode.

  2. Now accept() will never block. In those cases where it would have blocked, it will now fail with certain errno values. Read the man page to find out which errno values you need to handle.

Part 10: Printing request statistics on SIGUSR1

Recall part 2, where we implemented a special admin URL /statistics to fetch a web server request statistics page. In this part, we will implement an alternate mechanism to print statistics.

For this part, we have to go back and start from our part 3 code, which is the last version of http-server with multiple processes (before we switched to multi-threading in part 5).

Tasks

  1. Modify the code from part 3 so that when the web server receives a SIGUSR1 signal, it will print the statistics at that time to standard error.

  2. Test it by sending the signal with the kill command while the web server is blocked in the accept() call.

  3. Test it by sending the signal with the kill command to a child process while the child process is in the middle of receiving an HTTP request. Describe what happens and explain why.

Requirements and hints

Deliverables

  1. Makefile, http-server.c, and other source files as usual

  2. In your README.txt: explanation for task #3.

Part 11: Server-side bash scripts (0 points)

This part is optional and will not be graded. You may skip to part 12.

This part is a challenge for those of you hackers who are complaining that this assignment has been too easy so far.

In this part, we will enable server-side bash scripts. When a requested URL is an executable script, the web server will run it using /bin/bash, and send back the output of the script.

The web server will ensure that the script will not run longer than a fixed amount of time. The server will also terminate the script if the HTTP client (i.e. the browser) closes the TCP connection while the script is still running.

Getting this right is actually pretty hard. You are not expected to handle every single corner case. (In fact, our solution doesn’t handle all cases either.) But you can get close. We suggest you approach this in the following order:

  1. Implement support for server-side scripts

  2. Terminate the script when the HTTP connection is closed

  3. If the script does not respond to SIGTERM (because it’s catching it or ignoring it), send SIGKILL.

  4. Limit the time that the script can run even if the HTTP client is willing to wait.
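Step 3, escalating from SIGTERM to SIGKILL, might look like the sketch below. terminateChild(), the polling interval, and the millisecond grace period are assumptions made for the example; the sketch does not cover running the script or enforcing the overall time limit.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

// Ask the child to exit with SIGTERM, poll for up to graceMs
// milliseconds, then force it with SIGKILL. Returns the number of
// the signal that ended the child, 0 if it exited on its own, or -1
// on error.
static int terminateChild(pid_t pid, int graceMs)
{
    const struct timespec tick = { 0, 10 * 1000 * 1000 };   // 10 ms
    int status = 0;
    pid_t r = 0;

    kill(pid, SIGTERM);
    for (int waited = 0; waited < graceMs; waited += 10) {
        r = waitpid(pid, &status, WNOHANG);
        if (r != 0)
            break;              // child is gone, or waitpid() failed
        nanosleep(&tick, NULL);
    }

    if (r == 0) {
        // Still running after the grace period. SIGKILL cannot be
        // caught or ignored.
        kill(pid, SIGKILL);
        do {
            r = waitpid(pid, &status, 0);
        } while (r == -1 && errno == EINTR);
    }

    if (r != pid)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

A script that ignores SIGTERM makes terminateChild() return SIGKILL once the grace period has passed; a script with the default disposition makes it return SIGTERM almost immediately.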

Part 12: Pre-forked pool of processes

Recall that in part 6 we pre-created a pool of worker threads. Here, we will pre-fork a pool of worker child processes.

What the child processes do is also similar to what the threads did in part 6: they all sit in an infinite loop, repeatedly calling accept(). In part 13, we will change this model in much the same way as we did in part 7. In part 7, we passed open socket descriptors to worker threads using a blocking queue. In part 13, we will pass open socket descriptors to worker processes using a UNIX domain socket.

Required tasks

  1. Pre-fork a fixed number of processes. Each child process will run a for (;;) loop, in which it will call accept() and handle the client connection.

  2. Change the code so that only the parent process will handle SIGUSR1 for dumping statistics.
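Pre-forking can be sketched as a loop around fork() that runs once at start-up. preforkWorkers() and the WorkerLoop callback type are hypothetical, and having the children ignore SIGUSR1 is just one assumed way to satisfy task 2.

```c
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical callback type: the accept loop run by each child.
typedef void (*WorkerLoop)(int servSock);

// Fork up to n worker processes and store their pids in pids[].
// Returns the number of workers actually started.
static int preforkWorkers(int servSock, int n, WorkerLoop loop, pid_t *pids)
{
    int started = 0;
    for (int i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid == -1)
            break;
        if (pid == 0) {
            // Assumption: children opt out of the statistics dump by
            // ignoring SIGUSR1, so only the parent reacts to it.
            signal(SIGUSR1, SIG_IGN);
            loop(servSock);     // normally never returns
            _exit(0);
        }
        pids[started++] = pid;
    }
    return started;
}
```

The listening socket must be created before the workers are forked so that every child inherits it. The parent keeps the pids so that it can later wait for the workers or signal them.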

Recommended tasks

Part 13: Passing socket descriptors to child processes

In this part, instead of all the child processes calling accept(), only the parent process will call accept(), and it will pass each connected socket to a child process (chosen by round robin) through a UNIX domain socket.

Here are the sendConnection() and recvConnection() functions, which send and receive open file descriptors through a UNIX domain socket. (You don’t need to understand this code. It is there for you to copy, paste, and use in your http-server.c.)

// Send clntSock through sock.
// sock is a UNIX domain socket.
static void sendConnection(int clntSock, int sock)
{
    struct msghdr msg;
    struct iovec iov[1];

    union {
      struct cmsghdr cm;
      char control[CMSG_SPACE(sizeof(int))];
    } ctrl_un;
    struct cmsghdr *cmptr;

    msg.msg_control = ctrl_un.control;
    msg.msg_controllen = sizeof(ctrl_un.control);

    cmptr = CMSG_FIRSTHDR(&msg);
    cmptr->cmsg_len = CMSG_LEN(sizeof(int));
    cmptr->cmsg_level = SOL_SOCKET;
    cmptr->cmsg_type = SCM_RIGHTS;
    *((int *) CMSG_DATA(cmptr)) = clntSock;

    msg.msg_name = NULL;
    msg.msg_namelen = 0;

    iov[0].iov_base = "FD";
    iov[0].iov_len = 2;
    msg.msg_iov = iov;
    msg.msg_iovlen = 1;

    if (sendmsg(sock, &msg, 0) != 2)
        die("Failed to send connection to child");
}

// Returns an open file descriptor received through sock.
// sock is a UNIX domain socket.
static int recvConnection(int sock)
{
    struct msghdr msg;
    struct iovec iov[1];
    ssize_t n;
    char buf[64];

    union {
      struct cmsghdr cm;
      char control[CMSG_SPACE(sizeof(int))];
    } ctrl_un;
    struct cmsghdr *cmptr;

    msg.msg_control = ctrl_un.control;
    msg.msg_controllen = sizeof(ctrl_un.control);

    msg.msg_name = NULL;
    msg.msg_namelen = 0;

    iov[0].iov_base = buf;
    iov[0].iov_len = sizeof(buf);
    msg.msg_iov = iov;
    msg.msg_iovlen = 1;

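    for (;;) {
        // recvmsg() overwrites msg_controllen with the length actually
        // received, so restore the full buffer size before every call.
        msg.msg_controllen = sizeof(ctrl_un.control);
        n = recvmsg(sock, &msg, 0);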
        if (n == -1) {
            if (errno == EINTR)
                continue;
            die("Error in recvmsg");
        }
        // Messages with client connections are always sent with 
        // "FD" as the message. Silently skip unsupported messages.
        if (n != 2 || buf[0] != 'F' || buf[1] != 'D')
            continue;

        if ((cmptr = CMSG_FIRSTHDR(&msg)) != NULL
            && cmptr->cmsg_len == CMSG_LEN(sizeof(int))
            && cmptr->cmsg_level == SOL_SOCKET
            && cmptr->cmsg_type == SCM_RIGHTS)
            return *((int *) CMSG_DATA(cmptr));
    }
}
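The functions above need a UNIX domain socket between the parent and each child. One way to set that up, sketched under assumptions, is a socketpair() per child created before the fork, plus a cursor for the round-robin choice. struct Channel, createChannel(), and nextChild() are hypothetical names, and SOCK_DGRAM is an assumed socket type chosen because it keeps each "FD" message separate.

```c
#include <sys/socket.h>

// One UNIX domain socket pair per child. The parent passes
// parentEnd to sendConnection(); the child passes childEnd to
// recvConnection().
struct Channel {
    int parentEnd;
    int childEnd;
};

// Create the pair. Call this before fork() so that the child
// inherits childEnd. Returns 0 on success, -1 on error.
static int createChannel(struct Channel *ch)
{
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) == -1)
        return -1;
    ch->parentEnd = sv[0];
    ch->childEnd = sv[1];
    return 0;
}

// Round robin: return the index of the child that should get the
// next connection and advance the cursor.
static int nextChild(int *cursor, int nChildren)
{
    int chosen = *cursor;
    *cursor = (*cursor + 1) % nChildren;
    return chosen;
}
```

After fork(), each side would normally close the end it does not use. Once sendConnection() has returned, the parent can also close its copy of the client socket, since the child now holds its own descriptor for the connection.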

Required tasks

Recommended tasks

Requirements and hints

Part 14: Daemonization (0 points)

This part is optional and will not be graded.

In this part, we will make our web server a daemon process. Daemons in UNIX systems are programs that run as background processes typically providing essential system services to users and other programs. See APUE chapter 13 for more information.

Tasks

Daemonizing your web server is super-easy. Here is all you have to do:

  1. At program start-up (i.e., at the beginning of the main() function, perhaps after checking the arguments), call daemonize() from APUE 13.3.

  2. The daemonize() function will detach the running process from its controlling terminal, so printing to stdout or stderr won’t work anymore. You need to replace the printf() and fprintf() statements with syslog(), described in APUE 13.4.
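Replacing the print statements with syslog() might look like the following sketch. The helper names, the "http-server" identifier, the LOG_DAEMON facility, and the message layout are assumptions for illustration; daemonize() itself comes from APUE and is not shown.

```c
#include <syslog.h>

// Call once at start-up. LOG_PID adds the process id to every
// message; "http-server" is the tag that appears in the log.
static void initLogging(void)
{
    openlog("http-server", LOG_PID, LOG_DAEMON);
}

// Stands in for a printf()-style request log line.
static void logRequest(const char *clientIP, const char *requestLine,
                       int statusCode)
{
    syslog(LOG_INFO, "%s \"%s\" %d", clientIP, requestLine, statusCode);
}

// Stands in for an fprintf(stderr, ...) error message. In syslog()
// format strings, %m expands to the text for the current errno.
static void logError(const char *what)
{
    syslog(LOG_ERR, "%s: %m", what);
}
```

Where the messages end up depends on the system's syslog configuration, so it is worth checking the log files on the test machine after the first run.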

If you are doing this part, I recommend that you read APUE chapter 13 to learn about daemon processes.

Good luck!


Acknowledgment

This series of assignments was co-designed by Jae Woo Lee and Jan Janak as a prototype for a mini-course on advanced UNIX systems and network programming.

Jan Janak wrote the solution code.

Jae Woo Lee is a lecturer, and Jan Janak is a researcher, both at Columbia University. Jan Janak is a founding developer of the SIP Router Project, the leading open-source VoIP platform.


Last updated: 2016-02-07