The projects below are available in the context of CS E6998-03 (Advanced Internet Services) or as a 3998/4901/6901 project. Please contact Prof. Henning Schulzrinne for details.

Names under each project indicate whether the project has been assigned. Some projects may be assigned to more than one group. Projects listing "NN" as a project team member are looking for additional students.

Internet Audio/Video-Related Projects

Web by phone

Using our telephone gateway, build a web-by-phone server that recognizes touch-tone (DTMF) digits and then reads a web page aloud using the Bell Labs TTS text-to-speech system. The gateway requires use of on-campus hardware resources, but you can implement the project without the hardware on any system with audio I/O. You may find a DTMF generator handy; it is available for a few dollars from your local Radio Shack store. Lynx may make a good browser to pre-process web content for reading aloud, particularly in its version for blind users.

The TTS system is available at ~hgs/multimedia/tts, with some random notes.

Francesco Caruso

Xin Jin

Email by phone

Similar to the previous project, allow reading of email messages through the phone. The caller should be able to navigate through messages and delete them. You can either retrieve messages into a file (using movemail, for example) or build/use a POP or IMAP client. You should be able to read subject lines and mail sources (From header). Usually, only the display name, not the user@host part, should be read.
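A minimal sketch of the "display name only" rule, assuming Python and its standard `email.utils` module; the function name `spoken_sender` and the fallback to the local part of the address are illustrative choices, not part of the project description:

```python
from email.utils import parseaddr


def spoken_sender(from_header: str) -> str:
    """Return the text to read aloud for a From header value."""
    display_name, address = parseaddr(from_header)
    if display_name:
        return display_name
    # No display name present: fall back to the part before the "@"
    # (an assumption; the project text does not specify this case).
    return address.split("@", 1)[0]
```

For a header such as `Jane Doe <jane@example.com>`, this would hand only "Jane Doe" to the text-to-speech system.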

The user should be able to step through the headers of the messages rather than reading the message.

When the message is being read, you should be able to stop, back up and cancel the reading. Quoted text included from earlier messages (lines prefixed with >) should be spoken with a different voice. Possible extensions include the ability to read only messages marked as 'urgent' in the subject line or those personally addressed to the recipient (rather than to a mailing list). Additional features include the ability to respond to a message, using a MIME voice attachment recorded from the phone. Also, it should be possible to play back messages with an audio attachment.

For testing, the system should have an input mode that accepts, via telnet or command line, the same DTMF commands as issued by the phone interface.

CoolMail and CollegeClub are examples of commercial services. Siemens has a product called Xpressions.

Serge Shamis, Jack Hsu and Jeff Stutz

Multimedia jukebox: catalogue

We have a robotic 500-CD-ROM "jukebox" changer in our lab. Using existing interface utilities, write a program that catalogues the jukebox CD-ROMs into a line-per-CD text file and/or web page. With that, write a program that locates and loads CDs by title. Allow the user to associate a particular waveform signature with an audio CD, so that CDs can be found after having been added and removed. (Suitable only for on-campus students.)

Tim Trampedach

Multimedia jukebox: audio CDs

Using the jukebox described in the previous project and available raw-disk reading routines, write a program that reads the data from the disk and serves it using an existing RTSP server.

Conversion of AVI to H.261

Using the H.261 codec in vic and the cinepak codec in xanim, convert standard AVI files to H.261, to allow streaming across the network using the RTSP server.

Write MPEG payload format

Write a module that splits an MPEG file into individual frames and wraps each into the necessary RTP payload header (see RFC xxx). Integrate into the RTSP server for generating streaming MPEG.
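As a starting point, the fixed 12-byte RTP header that precedes every payload can be sketched as follows. This is only an illustration in Python: the function name is hypothetical, payload type 32 is the static assignment for MPEG video in the RTP audio/video profile, and the MPEG-specific payload header required by the payload format specification is deliberately left out.

```python
import struct

RTP_VERSION = 2
MPV_PAYLOAD_TYPE = 32  # static payload type for MPEG video (90 kHz clock)


def rtp_fixed_header(seq, timestamp, ssrc, marker=False,
                     payload_type=MPV_PAYLOAD_TYPE):
    """Build the 12-byte fixed RTP header (no padding, extension or CSRCs)."""
    byte0 = RTP_VERSION << 6
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    # Network byte order: two single bytes, 16-bit sequence number,
    # 32-bit timestamp, 32-bit synchronization source identifier.
    return struct.pack("!BBHII", byte0, byte1,
                       seq & 0xFFFF,
                       timestamp & 0xFFFFFFFF,
                       ssrc & 0xFFFFFFFF)
```

Each MPEG frame (or fragment of a frame) would then be sent as this header, followed by the MPEG-specific header, followed by the frame data.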

Floor Control

Develop a scalable, multicast-based floor controller suitable for a distributed classroom. The floor control tool should send PMM messages to compatible multimedia tools to enable sending and receiving. By pressing a function key on the keyboard, students "raise their hand". The instructor can recognize a student by clicking on the student's name (or using a touch screen) or picking students in order by pressing a key on another computer. Students should be able to "lower their hand" and attach a note indicating the nature of the question. Consider the loss of packets in the network. Students may join class late and need to learn about any pending floor requests.

LPC integration into NeVoT

The LPC-10 low-rate audio codec by Andy Fingerhut, Washington University, is to be integrated into NeVoT, the network voice terminal. Attempts at optimization should be made (if you have some signal processing background). The processing requirements and audio quality (as a function of loss and packet size) are to be evaluated and compared to the current LPC codec.

MPEG Layer-3 audio codec integration into NeVoT

Integrate the MPEG layer 3 high-quality audio codec into NeVoT.

Recording function for NeVoT

Add a function to NeVoT, NeViT and/or the vic video tool that records incoming and outgoing packets into an rtpdump file. This is useful when recording events for later re-broadcast.

Screen grabber

Add functionality to NeViT or vic that grabs the screen or a single window. This greatly improves the quality of PowerPoint or Adobe Acrobat Reader slides compared to scan conversion followed by conversion back into digital format.

RTP translator

Using the RTP library, build an RTP translator to allow configurable translation between audio formats, using the audio encoding libraries of NeVoT. For example, an incoming high-bitrate stream using PCM encoding could be translated to GSM.

RSVP integration into NeVoT

Integrate RSVP capabilities into NeVoT. You can either install and use the existing Solaris or FreeBSD implementation or, more ambitiously, write an RSVP client that implements the RSVP protocol on top of UDP in user space. For this client, no local scheduling or admission control is necessary, just the generation of RSVP messages. NeVoT has to be modified to make reservations and send PATH messages appropriate for the selected encoding. Consider integrating a dynamic scheme so that a failed reservation leads to a reservation at a lower bit-rate. The implementation can be tested with the Sun or FreeBSD RSVP implementations or a Cisco router.

NeVoT reliability enhancements

Add packet-loss compensation to NeVoT, using forward error correction mechanisms and lower-fidelity delayed versions. Measure the effect.
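One simple forward error correction scheme is XOR parity over a group of packets, which can repair a single loss per group. The sketch below is written in Python for illustration only; it assumes equal-length audio packets, and the function names are hypothetical. It is one possible mechanism, not necessarily the one NeVoT should adopt.

```python
def xor_parity(packets):
    """Return a parity packet: the byte-wise XOR of all packets in the group."""
    parity = bytearray(len(packets[0]))
    for packet in packets:
        for i, byte in enumerate(packet):
            parity[i] ^= byte
    return bytes(parity)


def recover_lost(received, parity):
    """Rebuild the single missing packet (marked None) from the others."""
    missing = bytearray(parity)
    for packet in received:
        if packet is not None:
            for i, byte in enumerate(packet):
                missing[i] ^= byte
    return bytes(missing)
```

The cost is one extra packet per group and a delay of up to one group before a lost packet can be reconstructed, which is part of what the measurements should quantify.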

VCR interface for audio/video tools (RTSP client)

Build a Java or Tcl/Tk user interface that implements "VCR" functionality for on-demand access to stored audio and video using RTSP. The user interface contains the standard VCR control buttons (record, play, pause, rewind, fast forward), as well as a slider or similar element for positioning (by elapsed time) within the movie. The audio and video streams are delivered by the RTSP server being developed in my lab; streams are played back using standard audio and video tools such as NeVoT, vat, vic, rat or IP/TV.

As part of the project, you should develop a standalone library that parses RTSP responses. (Please discuss API with instructor.)

Shiva Bhakta

Alex Basile

Java RTSP server

David Jensen

NeVoT stereo placement

Enhance NeVoT to allow placing speakers acoustically at different parts of the stereo space, using, for example, delay and channel balance.

Measurements and comparisons of audio and video codecs for networks

Evaluate and compare the performance of audio and video codecs both through objective and subjective measurements. (Possibly several, coordinated projects.) Include the performance for random and correlated packet losses, including loss traces collected from the Internet.

Prepare and present a wide variety of standard audio and video sequences, showing different qualities and the effects of impairments, to be used for comparison (A/B tests) and teaching.

Loss Measurement Tool

Create a tool that combines traceroute and ping to indicate where in the path packets are lost and delayed. Present results graphically and try your tool on some domestic and international routes.

H.263 implementation

Based on existing H.261 implementations, implement an H.263 video codec for vic and/or NeViT (MBONE video conferencing).

Clock measurements

Measure clock accuracy of audio/video clocks on workstations and PCs.

Multimedia server

Develop a module for the Columbia RTSP media-on-demand server that reads and writes Microsoft ASF audio/video files.

John Pelak

Digital editing of live content

Design and develop one or more modules for creating digital effects for live video (such as title overlay, wipes, fades, insert) for use in the new CS/CVN digital video facility.

Classroom question manager

Design and develop an application that supports "raising your hand" in a distributed, Internet-based classroom. The mechanism should scale to very large classes, with hundreds of participants. Thus, instead of a central server, multicast UDP should be used. Students should be able to indicate the nature of the question so that an instructor can group or delay questions. An instructor should be able to call on students out-of-order and cancel questions. Calling on a student enables that student's audio and video transmission. Other students should, in general, be able to see the list of waiting students. Consider using the PMM (pattern-matching multicast) enabled audio and video applications NeVoT and vic. Your application has to deal with packet loss, but can assume that the computer clocks of all participants are synchronized, so that there is no ambiguity as to who raised her hand first. This application could also be useful for meetings, "Internet TV" shows with audience participation and town-hall-style gatherings.
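Because clocks are assumed synchronized, each participant can keep its own copy of the waiting list and order it by the timestamp carried in each request. A minimal Python sketch of that local state follows; the class and method names are hypothetical, and the multicast transport itself is not shown.

```python
class FloorQueue:
    """Local view of pending hand-raise requests, ordered by timestamp."""

    def __init__(self):
        self._requests = {}  # student -> (timestamp, note)

    def raise_hand(self, student, timestamp, note=""):
        # Requests may be retransmitted to cope with packet loss, so
        # processing the same request twice must not change the order.
        current = self._requests.get(student)
        if current is None or timestamp < current[0]:
            self._requests[student] = (timestamp, note)

    def lower_hand(self, student):
        # Also used when the instructor cancels a question.
        self._requests.pop(student, None)

    def waiting(self):
        """Students in the order they raised their hands (ties by name)."""
        ordered = sorted(self._requests.items(),
                         key=lambda item: (item[1][0], item[0]))
        return [student for student, _ in ordered]
```

Making `raise_hand` idempotent lets senders simply repeat their request periodically, which also helps late joiners learn about pending requests.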

John J. Lin

Telephony interface

Construct a software driver to connect the analog phone line interface to an IP telephony server, allowing dial-in and dial-out. (Details). A basic version connects every incoming phone call to a fixed Internet address and allows outgoing calls to any phone number.

For group projects, this gateway can be extended so that a phone caller can "dial up" an Internet address. The phone answers Welcome to the Internet telephony gateway service. Please enter the email address or extension, using * for the "at" sign. The user then "types" in the beginning of a name or email address using the phone's numeric keys (447 for hgs, for example), with the text-to-speech facility offering the list of ambiguous choices. You should be able to resolve both ambiguous user names (e.g., from the /etc/passwd file) and ambiguous host names (e.g., by acquiring a zone dump of the .com entries).
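The keypad lookup can be sketched as follows, assuming the standard telephone letter layout. This is an illustrative Python fragment; the function names are hypothetical, and in practice the candidate names would come from sources such as the /etc/passwd file.

```python
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}
LETTER_TO_DIGIT = {ch: digit for digit, letters in KEYPAD.items()
                   for ch in letters}


def to_digits(name):
    """Map a name to the key sequence a caller would press for it."""
    digits = []
    for ch in name.lower():
        if ch.isdigit():
            digits.append(ch)
        elif ch in LETTER_TO_DIGIT:
            digits.append(LETTER_TO_DIGIT[ch])
        # Other characters (dots, dashes) are skipped in this sketch.
    return "".join(digits)


def candidates(dialed, names):
    """Return all names whose key sequence starts with the dialed digits."""
    return [name for name in names if to_digits(name).startswith(dialed)]
```

With this mapping, `to_digits("hgs")` yields `447`, and any other user name that begins with the same key sequence shows up as an ambiguous choice to be read back to the caller.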

Consider also the restriction of outgoing phone calls.

Your system must handle the case when no outgoing phone line is available.

Weibin Zhao and Ying Zhang

Ko Uchiyama and Herbert Ignacio

RTP conformance tester

Develop a tool and test suite that tests RTP compliance and allows "stress testing" of RTP implementations.

RTP-based "AudioFile"

Currently, only a single application can use the audio device. Based on the NeVoT library, develop a daemon that multiplexes requests for an audio device. Modify xplay and playtool to use this to play audio files and/or develop a Netscape plug-in for Solaris that plays audio files using this mechanism or direct write. This could also serve as the basis for an X-like remote audio capability.

Background eliminator

Preferably using the video codecs already in vic or NeViT, construct a system that transmits only a (moving) person, without the background, and allows a virtual audience to be reassembled at the receiver.

(Some image processing background is helpful for this project.)

Remote-control web and conferencing camera

The Canon VC-C1 camera has a motorized zoom, pan and tilt which can be controlled remotely through a serial port. Serial port interfaces can be written in C or Tcl. As part of the project, write a generic library in C or Tcl for controlling the camera. Once a basic control interface API has been written, there are several choices for functionality:

  1. Implement a forms and cgi web interface using Perl or Tcl, to retrieve images (web cam). Accommodate several "concurrent" users. If an image for a particular position has been acquired recently (within a settable interval), return it immediately; otherwise, queue the request and serve requests in order or using motion optimization. The web cam visitor should be able to click on a particular part of the image to zoom in or re-center the camera. The web camera is connected to a SunVideo frame grabber board, which can grab and convert analog video. The software, sample code and documentation for the board can be found in /opt/SUNWits/Graphics-sw/xil. A commercial example can be found at Perceptual Robotics.
  2. Live camera control: Create a Tcl/Tk application that allows a camera operator to position the camera for a live feed (e.g., to tape a lecture). The video is distributed to the network via applications such as vic. The Tcl/Tk application should allow both relative (move over 15 degrees) and absolute positioning (pan to 5 degrees left-of-center). It should be possible to store the current setting by name, so that one can quickly zoom in on a student in a class, for example. Extension: The camera operator points the mouse to a point of the existing image and the camera either zooms in or re-centers on that point. (This may require modifying vic to recognize mouse clicks in the displayed image.)
  3. Integrate camera control into the existing RTSP media server, using the GET_PARAMETER and SET_PARAMETER RTSP commands.
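The "acquired recently" rule in the first option amounts to a small cache keyed by camera position. The sketch below uses Python purely for illustration (the project itself suggests Perl or Tcl); the class name, the position tuple and the injectable clock are all assumptions.

```python
import time


class ImageCache:
    """Keep the most recent image for each camera position."""

    def __init__(self, max_age_s, clock=time.monotonic):
        self.max_age_s = max_age_s  # the settable interval, in seconds
        self._clock = clock
        self._images = {}           # position -> (image bytes, capture time)

    def put(self, position, image):
        self._images[position] = (image, self._clock())

    def get(self, position):
        """Return a sufficiently fresh image, or None if one must be grabbed."""
        entry = self._images.get(position)
        if entry is None:
            return None
        image, captured_at = entry
        if self._clock() - captured_at <= self.max_age_s:
            return image
        return None
```

A request handler would call `get` first and only queue a camera move when it returns `None`.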

The camera is located in my office and can be accessed remotely through serial port A of

Jack Huang and Laura Zhou

Panagiotis G. Sebos

Directory Services for Internet Telephony

Compare the existing directory services for Internet telephony (Netscape/Insoft IS411, Vocaltec, Four11, Microsoft ULS, etc.). Some are documented, some need to be reverse-engineered. Consider distributed alternatives that allow sub-grouping and graphical representations (rooms, maps, VRML?). Install an implementation of LDAP and construct an interface to SIP.

Allandel Manipon

``Buddy List''

Using SIP, implement a ``buddy list'' (AOL term), where one can see which of one's friends are logged on (or active, i.e., with keyboard activity) at the time. It should be easy to invite a subset to a multimedia conference.

Moshe Sambol

Extending SIP with Notice Transport and Delivery Services

William Nagy

Calendar interface for Internet telephone

Write an interface to a calendar program (Schedule+, cm, CDE calendar, vcalendar compliant...) that automatically forwards or answers calls ("I'm in a meeting until 4:30", "I'm on vacation until August 26"). Allow the user to define keywords that govern behavior, e.g., "private" may disable forwarding calls. Use existing privacy indications ("Show Time and Text", "Show Time Only", "Show Nothing") to govern the detail to be provided, but allow configuration based on user groups. For example, the user might configure the application so that all callers from are shown time only, while a select group of people identified by their email addresses are shown full details. Everybody else is shown nothing (available/not available). You could use the Netscape addressbook to look up forwarding details. (For example, "meeting at Hilton, San Francisco" would look up the phone number for forwarding.)

Your program should be able to deal with overlapping appointments. For example, if the user is traveling to LA and has a meeting there, indicating the end of the meeting may not be particularly interesting, but rather the time to call back would be the time of return from the trip.

Consider parsing the calendar entry to understand header fields, e.g., a Location: line would indicate the appropriate forwarding location.

For the Sun calendar manager (cm), the calendar files are stored in /var/spool/calendar.

The application should be called as

busy caller

The application should return messages in the format of SIP cgi responses, for example:

Status: 480 Meeting with John Doe
Retry-After: Mon, 9 Feb 1998 17:37:17 +0100
Content-Length: 0

If there is currently no meeting scheduled, return 200 OK.
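A sketch of how such a response might be assembled, using Python's standard `email.utils.format_datetime` for the date. The function name `busy_response` and its parameters are hypothetical, and writing the no-meeting case as `Status: 200 OK` is an assumption based on the example above.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def busy_response(reason=None, retry_after=None):
    """Format a SIP cgi style response; no reason means the user is free."""
    if reason is None:
        lines = ["Status: 200 OK"]
    else:
        lines = [f"Status: 480 {reason}"]
        if retry_after is not None:
            # retry_after must be a timezone-aware datetime.
            lines.append(f"Retry-After: {format_datetime(retry_after)}")
    lines.append("Content-Length: 0")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    end = datetime(1998, 2, 9, 17, 37, 17,
                   tzinfo=timezone(timedelta(hours=1)))
    print(busy_response("Meeting with John Doe", end), end="")
```

Note that `format_datetime` zero-pads the day of the month, so the date is rendered as `09 Feb` rather than `9 Feb`.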

Johnny Yeung

Scheduling of repeated events

Develop a language that expresses sequences of events like daily newscasts, (almost) weekly lectures and the like in a space-efficient manner. Take into account daylight saving time shifts.

Bandwidth estimation and measurement

Based on sample packets (ICMP, UDP, or TCP) or the received data stream, attempt to estimate the bottleneck bandwidth for point-to-point links. Estimate the bandwidth distribution for multicast groups.
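One common approach, offered here only as a possible starting point, is the packet-pair technique: two packets sent back to back are spread out by the bottleneck link, so the packet size divided by their arrival spacing estimates the bottleneck bandwidth. The Python sketch below assumes the arrival gaps have already been measured; the function names are hypothetical, and the median is used to reduce the influence of cross traffic.

```python
from statistics import median


def packet_pair_estimates(packet_size_bytes, arrival_gaps_s):
    """One bandwidth estimate (bits per second) per measured packet pair."""
    return [packet_size_bytes * 8 / gap
            for gap in arrival_gaps_s if gap > 0]


def bottleneck_bandwidth_bps(packet_size_bytes, arrival_gaps_s):
    """Combine per-pair estimates into a single figure using the median."""
    estimates = packet_pair_estimates(packet_size_bytes, arrival_gaps_s)
    if not estimates:
        raise ValueError("no usable packet-pair measurements")
    return median(estimates)
```

For example, 1500-byte packets arriving 1.2 ms apart suggest a bottleneck of about 10 Mb/s.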

Use of Netscape addressbook for Internet telephony

Enhance the Netscape addressbook to allow dialing of Internet and POTS phone calls and to log incoming phone calls by person ("contact list") in the "notes" section of the address book entry. Use the address lists to manage group conferences. Your application should also offer a non-GUI interface to register a phone call and use the database. For example, it should be possible to invoke a program as ab_lookup Smith and get back, one per line, the email addresses of possible callees. Similarly, ab_log subject should add a log of this call to the entry for J. Smith.

Thomas S. Chin

RTCP for network management

Add RTCP capability to the tkined network manager and display tool, to allow monitoring of reception quality in a local or wide area network.

Service differentiation for data and real-time

Investigate options (to be discussed with instructor) to differentiate data and real-time services without using resource reservations.

Call controller

Using Java or Tcl/Tk, implement a call controller, as described in Personal Mobility for Multimedia Services in the Internet. The program should support both the calling and the called party, using SIP for signaling. Your project should be modular, consisting of a parser and generator for SIP requests and responses, a module for parsing and generating SDP descriptions, and a user interface for initiating and receiving calls. The call controller starts the necessary applications (media agents) to receive and send audio and video. The controller also terminates these applications (by sending them an appropriate signal). You can use the media agents described on the RTP page, e.g., NeVoT, vic, rat and vat. Initially, it is sufficient to simply start these agents, with parameters supplied from the command line. For example, to set up a video connection to port 3456 at, the vic video tool is started as vic

Christopher Tse, Janet Park

H.323 Implementation

Implement (a subset of) H.323, the Internet telephony standard. H.323 is supported by Intel Internet Video Phone and Microsoft NetMeeting, among others, but only on Windows 95.

The project only has to worry about the signaling aspects, since there are existing RTP tools that can exchange data with these H.323 implementations; they just cannot set up a call.

The project might start by implementing routines that parse and generate the Q.931 messages described in H.225.0 and then proceed to parse and generate the ASN.1 (PER) encoding.

Shiva Bhakta (to be completed end of summer 1998)

Network reliability

Design and implement a network reliability monitor. Test reachability of a number of sites periodically (e.g., via ping every ten minutes). If a site is not reachable, try to determine, using traceroute, whether this is a local problem or a problem affecting only one site or a problem affecting an Internet backbone.

Delay measurements

Using the phone system or a radio as a reference, measure the end-to-end delay of both commercial and research network audio (and maybe video) tools on different platforms. Determine whether delays are primarily due to the network, operating system and sound hardware or application.

Service Location Protocol

Implement the wide-area service location protocol.

Jack Caldwell

Last modified 1997-02-10 by Henning Schulzrinne