Read the skeleton code provided (http-server.c). Make sure you understand the code completely.
Test http-server.c using netcat (
Measure the performance of the basic web server
Use an HTTP traffic generator to measure how many requests the web server can handle per second
You can find an open-source HTTP traffic generator or write a simple shell script yourself that calls wget in a loop
Use a sizable file (a hi-res image or a short movie) for testing so that it takes a measurable time for a request to complete
Note that we are not doing a serious performance measurement study. Measuring performance correctly and accurately is not an easy thing to do. The absolute value of our measurements does not mean much. Our goal is only to highlight the differences between the different versions of the web server.
Even then, it is entirely possible that rudimentary measurement techniques may not show much performance difference at all, due to various factors like caching. We’ll see. (I haven’t done any measurements myself, so I don’t know. :)
The basic version of the HTTP server has a limitation: it can handle only one connection at a time. This is a serious limitation because a malicious client could take advantage of this weakness and prevent the server from processing additional requests by sending an incomplete HTTP request. In this part we improve the situation by creating additional processes to handle requests.
The easiest way (from a programmer’s point of view) to handle multiple connections simultaneously is to create additional child processes with the fork() system call. Each time a new connection is accepted, instead of processing the request within the same process, we create a new child process by calling fork() and let it handle the request.
The child process inherits the open client socket and processes the request, generating a response. After the response has been sent, the child process terminates.
Modify the code so that the web server forks after it accepts a new client connection, and the child process handles the request and terminates afterwards.
Test this implementation by connecting to it from multiple netcat clients simultaneously.
Do performance testing. Do you see any difference from part 1?
Note that the two socket descriptors – the server socket and the new connected client socket – are duplicated when the server forks. Make sure to close anything you don’t need as early as possible. Think about these:
Does the parent process need the client socket? Should it close it? If so, when? If the parent closes it, should the child close it again?
Does the child process need the server socket? Should it close it? What would happen if it doesn’t close it?
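The fork-per-connection flow and the descriptor questions above can be sketched as follows. This is only one possible arrangement, not the reference solution: serve_in_child, demo_handler, servSock, and clntSock are illustrative names that may not match http-server.c, and the handler is a stand-in for the real request-processing code.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Stand-in for the real request handler (hypothetical): sends a fixed reply. */
void demo_handler(int clntSock)
{
    const char *msg = "hello\n";
    if (write(clntSock, msg, strlen(msg)) < 0)
        perror("write");
}

/*
 * Hand one accepted connection to a new child process.
 * Returns the child's pid in the parent, or -1 if fork() failed.
 */
pid_t serve_in_child(int servSock, int clntSock, void (*handler)(int))
{
    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        close(clntSock);          /* this client cannot be served */
        return -1;
    }
    if (pid == 0) {
        /* Child: it never calls accept(), so drop the listening socket. */
        close(servSock);
        handler(clntSock);
        close(clntSock);
        _exit(0);                 /* handler must flush any stdio it used */
    }
    /* Parent: the child has its own copy, so close ours right away.
     * Otherwise the connection stays open after the child finishes. */
    close(clntSock);
    return pid;
}
```

In the accept loop, the parent would call serve_in_child(servSock, clntSock, handler) right after accept() returns and then go straight back to accepting. Reaping the child is a separate concern.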
Don’t let your children become zombies… At least not for too long.
Make sure the parent process calls waitpid() immediately after one or more child processes have terminated.
waitpid() inside the main
for (;;) loop? Obviously we cannot let
waitpid() block until a child process terminates – we’d be back to where we started then. You will need to call
waitpid() in a non-blocking way. (Hint: look into the
WNOHANG flag.) But even if you make it non-blocking, can you make your parent process call it immediately after a child process terminates? What if the parent process is being blocked on
Modify the logging so that it includes the process id of the child process that handled the request.
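The non-blocking waitpid() call mentioned in the hint can be sketched like this. It is a minimal illustration: reap_children is a hypothetical helper name, and the sketch deliberately leaves open the question of where and when the parent should call it.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Reap every child that has already terminated, without blocking.
 * Returns the number of children reaped by this call.
 */
int reap_children(void)
{
    int reaped = 0;

    for (;;) {
        int status;
        pid_t pid = waitpid(-1, &status, WNOHANG);

        if (pid > 0) {            /* one zombie collected; look for more */
            reaped++;
            continue;
        }
        if (pid == 0)             /* children exist, but none has exited yet */
            break;
        if (errno == EINTR)       /* interrupted by a signal; try again */
            continue;
        break;                    /* ECHILD: no children at all */
    }
    return reaped;
}
```

Looping until waitpid() returns 0 or -1 matters because several children may have terminated since the last call.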
The new version of
Performance testing result
APUE 14.8: read pages 525–527, skim or skip the rest
APUE 15.9: skim or skip pages 571–575, read pages 576–578
Modify the code so that the web server keeps request statistics. The
web server should respond to a special admin URL
/statistics with a
statistics page that looks something like this:
Server Statistics Requests: 50 2xx: 20 3xx: 10 4xx: 10 5xx: 10
Feel free to beautify the output.
Since multiple child processes will need to update the stats, you need to keep them in a shared memory segment. Use anonymous memory mapping described in APUE 15.9.
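A minimal sketch of keeping the counters in an anonymous shared mapping might look like the following. The struct layout and the names stats_create and stats_record are assumptions made for illustration. MAP_ANONYMOUS is not part of POSIX, so the sketch assumes a system such as Linux where it is available. The update is intentionally unsynchronized.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE           /* exposes MAP_ANONYMOUS on glibc */
#endif
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical counter layout; field names are illustrative. */
struct stats {
    unsigned long requests;       /* every request seen */
    unsigned long by_class[4];    /* 2xx, 3xx, 4xx, 5xx */
};

/*
 * Map zero-filled memory that remains shared across fork().
 * Must be called before the server starts forking. NULL on failure.
 */
struct stats *stats_create(void)
{
    void *p = mmap(NULL, sizeof(struct stats), PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/*
 * Count one response. Status codes outside 200-599 only bump the total.
 * No locking here, so concurrent updates can be lost.
 */
void stats_record(struct stats *s, int status)
{
    s->requests++;
    if (status >= 200 && status < 600)
        s->by_class[status / 100 - 2]++;
}
```

Because the mapping is created with MAP_SHARED before fork(), parent and children all see the same struct, and the handler for /statistics can read the totals directly.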
Perform the hit test from Part 1 and see if your code keeps accurate stats. The request counts may or may not be correct due to race conditions.
Now use a POSIX semaphore, described in APUE 15.10, to synchronize access to the stats. A few things to think about:
POSIX semaphores can be named or unnamed. Which is the better choice here?
Where should you put the
Are we using it as a counting semaphore or a binary semaphore?
Are any of the semaphore functions you are calling “slow” system calls? If so, what do you need to handle?
Repeat the performance test and verify that the stats are accurate.
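One way to combine the shared mapping with a semaphore is sketched below. It is an assumption-laden illustration rather than the intended solution: it picks an unnamed semaphore stored next to the counter, and locked_counter, counter_create, and counter_increment are hypothetical names. On older glibc versions it may need to be linked with -pthread, and unnamed process-shared semaphores are not supported on every platform.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE           /* exposes MAP_ANONYMOUS on glibc */
#endif
#include <errno.h>
#include <semaphore.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical layout: the semaphore lives in the same shared mapping. */
struct locked_counter {
    sem_t lock;                   /* used as a binary semaphore */
    unsigned long count;
};

/* Create the shared counter before forking. Returns NULL on failure. */
struct locked_counter *counter_create(void)
{
    struct locked_counter *c = mmap(NULL, sizeof *c, PROT_READ | PROT_WRITE,
                                    MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (c == MAP_FAILED)
        return NULL;
    /* pshared = 1: shared between processes; initial value 1: unlocked. */
    if (sem_init(&c->lock, 1, 1) != 0) {
        munmap(c, sizeof *c);
        return NULL;
    }
    return c;
}

/* Increment under the lock. Returns 0 on success, -1 on error. */
int counter_increment(struct locked_counter *c)
{
    /* sem_wait() can be interrupted by a signal, so retry on EINTR. */
    while (sem_wait(&c->lock) != 0) {
        if (errno != EINTR)
            return -1;
    }
    c->count++;
    return sem_post(&c->lock);
}
```

The critical section is kept as short as possible: only the counter update sits between sem_wait() and sem_post().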
The skeleton http-server.c does not handle directory listing. When a
requested URL is a directory, it simply responds with
/bin/ls -al on the requested directory and send out the result.
You can format it in HTML if you wish, but the raw output is fine too.
In order to take the output of the
ls command, you need to call
exec. Arrange the file descriptors so that the
ls output comes through the pipe.
Make sure you do not lose the multi-processing capability; that is, you still need to be able to serve multiple requests (whether they are files or directory listings) simultaneously.
Be diligent in closing the file descriptors that you don’t need as early as possible.
ls encounters an error, it will print things to
sure that the result you send to the browser includes them.
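The descriptor arrangement for capturing the ls output can be sketched as follows. This is a rough illustration under stated assumptions: send_ls_output and outfd are hypothetical names, outfd stands for the connected client socket, and the "--" argument is an addition that keeps a directory name starting with "-" from being parsed as an option.

```c
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Run "/bin/ls -al dir" in a child and copy everything it prints
 * (stdout and stderr) to outfd. Returns bytes copied, or -1 on error.
 */
long send_ls_output(const char *dir, int outfd)
{
    int fds[2];
    if (pipe(fds) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        close(fds[0]);                    /* child only writes */
        if (outfd > STDERR_FILENO)
            close(outfd);                 /* ls does not need the client */
        dup2(fds[1], STDOUT_FILENO);
        dup2(fds[1], STDERR_FILENO);      /* error messages go to the pipe too */
        close(fds[1]);                    /* the two duplicates stay open */
        execl("/bin/ls", "ls", "-al", "--", dir, (char *)NULL);
        perror("execl");                  /* reached only if exec failed */
        _exit(127);
    }

    close(fds[1]);    /* without this, read() below never sees end-of-file */
    long total = 0;
    char buf[4096];
    while (total >= 0) {
        ssize_t n = read(fds[0], buf, sizeof buf);
        if (n < 0 && errno == EINTR)
            continue;
        if (n <= 0)
            break;
        ssize_t off = 0;
        while (off < n) {                 /* write() may be partial */
            ssize_t w = write(outfd, buf + off, (size_t)(n - off));
            if (w < 0 && errno == EINTR)
                continue;
            if (w < 0) {
                total = -1;
                break;
            }
            off += w;
        }
        if (total >= 0)
            total += n;
    }
    close(fds[0]);
    while (waitpid(pid, NULL, 0) < 0 && errno == EINTR)
        ;                                 /* reap the ls child */
    return total;
}
```

If this runs inside the per-request child from the earlier parts, the main server process keeps accepting connections while ls runs.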
This part is easy. Instead of
readdir() system calls. See APUE 1.4 for an
You don’t have to mimic the output of
ls -al. Just the list of
filenames is fine – i.e., mimic the output of
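A filename-only listing built on opendir() and readdir() might be sketched like this. send_dir_names and outfd are hypothetical names, and the output is one name per line with no HTML formatting.

```c
#include <dirent.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Write each entry name in dir to outfd, one per line.
 * Returns the number of entries written, or -1 on error.
 */
int send_dir_names(const char *dir, int outfd)
{
    DIR *dp = opendir(dir);
    if (dp == NULL)
        return -1;

    int count = 0;
    struct dirent *ent;
    while ((ent = readdir(dp)) != NULL) {
        size_t len = strlen(ent->d_name);
        if (write(outfd, ent->d_name, len) != (ssize_t)len ||
            write(outfd, "\n", 1) != 1) {
            count = -1;                   /* client went away or write failed */
            break;
        }
        count++;
    }
    closedir(dp);
    return count;
}
```

No extra process or pipe is needed here, so the only descriptor to clean up is the directory stream itself.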
You may want to keep around the code from part 3. In part 5 (coming up
in HW3b), we will generalize it to run any
This series of assignments was co-designed by Jae Woo Lee and Jan Janak as a prototype for a mini-course on advanced UNIX systems and network programming.
Jan Janak wrote the solution code.
Jae Woo Lee is a lecturer, and Jan Janak is a researcher, both at Columbia University. Jan Janak is a founding developer of the SIP Router Project, the leading open-source VoIP platform.
Last updated: 2014-02-13