MPEG RTP Generation Module

Yuntai Kyong
Columbia University
New York, NY 10027



In this project, a software module was developed that splits an MPEG file
into individual frames, wraps each in the necessary RTP payload, and sets
the fields of the fixed RTP header and the extension header for the MPEG
stream according to the packetization rules and header field formats
described in RFC 2038. This module can be integrated into an RTSP server or
any generic server that delivers MPEG video streams over a network. A simple
RTP player was also developed for dumping MPEG RTP packets and verifying
the RTP-packetized MPEG video stream that it receives.


Table of Contents




Internet RFCs

Software Design and Modules Built

Program Documentation

Future Work




The goal of the project was to create a software library for the
playback and Internet transmission of MPEG video. To do so, it was
necessary to create a way to package MPEG video into Real-time Transport
Protocol (RTP) packets. RTP is a best-effort protocol geared toward
real-time transmission and thus toward multimedia data. This software was
developed for use with the Real Time Streaming Protocol (RTSP), but it can
also be used with any Internet-based MPEG video broadcasting
application. It supports packetization and the RTP header extension for
the MPEG2 transport stream and for MPEG1/2 elementary streams. The
packetization scheme for these streams and the header extension
specification are described in RFC 2038, RTP Payload Format for
MPEG1/MPEG2 Video.




RTP is a transport protocol for real-time applications. This protocol is
used for the transport of real-time data, including audio and video. It
can be used for media-on-demand as well as interactive services such as
telephony. RTP provides support for applications with real-time
properties such as continuous media (e.g., audio and video), including
timing reconstruction, loss detection, security and content
identification. UDP/IP is RTP's initial target networking environment.
RTP does not address resource reservation and does not guarantee
quality-of-service for real-time services. The data transport is
augmented by a control protocol (RTCP) to allow monitoring of the data
delivery in a manner scalable to large multicast networks, and to
provide minimal control and identification functionality. RTP and RTCP
are designed to be independent of the underlying transport and network
layers. RTP itself does not provide any mechanism to ensure timely
delivery or provide other quality-of-service guarantees, but relies on
lower-layer services to do so. It does not guarantee delivery or prevent
out-of-order delivery, nor does it assume that the underlying network is
reliable and delivers packets in sequence. The sequence numbers included
in RTP allow the receiver to reconstruct the sender's packet sequence, but
sequence numbers might also be used to determine the proper location of a
packet, for example in video decoding, without necessarily decoding
packets in sequence.
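
As a minimal sketch of how a receiver might use the 16-bit RTP sequence number for loss detection and reordering, the following C++ fragment compares sequence numbers with wraparound. The names seqDelta and SeqTracker are hypothetical and are not part of the module described here; the sketch also assumes that gaps are small compared with the 16-bit range and that the usual two's-complement conversion applies.

```cpp
#include <cassert>
#include <cstdint>

// Signed distance from sequence number `a` to `b` in 16-bit modular
// arithmetic: positive when `b` is ahead of `a`, negative when behind.
inline int seqDelta(std::uint16_t a, std::uint16_t b) {
    return static_cast<std::int16_t>(static_cast<std::uint16_t>(b - a));
}

// Hypothetical receiver-side bookkeeping (illustrative only).
struct SeqTracker {
    bool          started   = false;
    std::uint16_t highest   = 0;  // highest sequence number seen so far
    unsigned      lost      = 0;  // packets currently believed missing
    unsigned      reordered = 0;  // packets that arrived behind `highest`

    void onPacket(std::uint16_t seq) {
        if (!started) {
            started = true;
            highest = seq;
            return;
        }
        const int d = seqDelta(highest, seq);
        if (d > 0) {
            // Packet is ahead: everything skipped in between is a gap.
            lost += static_cast<unsigned>(d - 1);
            highest = seq;
        } else if (d < 0) {
            // Late packet: it fills one of the gaps counted earlier.
            ++reordered;
            if (lost > 0) --lost;
        }
        // d == 0 is a duplicate and is ignored.
    }
};
```

Feeding the tracker the sequence 65534, 65535, 1, 0 leaves no packets counted as lost and one counted as reordered, because the jump from 65535 to 1 is treated as a forward step of two across the wraparound.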


ISO/IEC JTC1/SC29 WG11 (also referred to as the MPEG committee) has
defined the MPEG2 standard (ISO/IEC 13818) [3]. RFC 2250 [2] describes a
packetization scheme to transport MPEG video and audio streams using
the Real-time Transport Protocol (RTP), version 2 [1, 5]. The MPEG1
specification is defined in three parts: System, Video and Audio. The
video and audio portions of the specification describe the basic format
of the video or audio stream. These formats define the Elementary
Streams (ES). The MPEG2 System specification defines an encapsulation of
the ES that contains Presentation Time Stamps (PTS), Decoding Time
Stamps and System Clock references, and performs multiplexing of MPEG2
compressed video and audio ES's with user data. The MPEG2 System
specification defines two system stream formats: the MPEG2 Transport
Stream (MTS) and the MPEG2 Program Stream (MPS). The MTS is tailored for
communicating or storing one or more programs of MPEG2 compressed data
and also other data in relatively error-prone environments. The MPS is
tailored for relatively error-free environments.


The RTSP protocol is designed to establish and manage multimedia content
sessions, enabling clients to request access to presentations and
conference participants to "pass the remote control" amongst
themselves as they share multimedia stream data.  RTSP is meant to
be transport agnostic, permitting its content to be delivered over TCP,
UDP, or RTP.  RTSP furthermore leverages existing internet
protocols for session description and resource reservation.  RTSP's
design is inspired by that of HTTP 1.1, and is intended to emulate and
co-exist with this protocol wherever possible.


Internet RFCs

As is to be expected when dealing with the MBone and the Internet, there are a
host of official standards to be followed. The Internet Engineering Task Force (IETF) is
responsible for most of these, and two IETF Request for Comments (RFC) documents,
RFC 1889 and RFC 1890, come into play for this project. More importantly, this project
falls within the scope of a third IETF document, RFC 2250, as well.

RFC 1889

IETF RFC 1889 [1] defines the Real-time Transport Protocol (RTP). This document is careful to
steer clear of defining any uses of the protocol. Nevertheless, all uses of RTP must
comply with RFC 1889.

RFC 1890

IETF RFC 1890 [5] is a companion document to RFC 1889 which outlines some
parameters for the use of RTP with different media types. Specifically, RFC 1890
defines payload type codes and timings for different media types.

RFC 2250

IETF RFC 2250 [2] had the most impact on this project. This document defines
a 32-bit header format to be used with packets of MPEG data. Formats for both audio
and video and for both encapsulated and elementary streams are defined. It is the
format for elementary video streams and the MPEG2 transport stream with which this
project needed to comply. In this project, the elementary stream part is implemented.
It includes the packetization of the elementary stream into RTP payloads and the setting
of the RTP fixed header, the MPEG video-specific header, and the MPEG-2 header extension according
to the rules described in this RFC.

For the MPEG video elementary stream, the following fragmentation rules are applied:
1. The MPEG Video_Sequence_Header, when present, will always be at the beginning of
an RTP payload.
2. An MPEG GOP_header, when present, will always be at the beginning of the RTP payload,
or will follow a Video_Sequence_Header.
3. An MPEG Picture_Header, when present, will always be at the beginning of an RTP payload,
or will follow a GOP_header.
In addition, each ES header must be completely contained within the packet, and the beginning
of a slice must either be the first data in a packet (after any MPEG ES headers) or must follow
some integral number of slices in a packet.
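
The rules above can be sketched as a small planning routine that decides which stream units share an RTP payload. This is only an illustration under stated assumptions, not the module's actual code: the names Unit, UnitKind and planPayloads are hypothetical, the units are assumed to have been located already, and a slice larger than the payload limit is simply placed alone in its own payload rather than being split further.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class UnitKind { SequenceHeader, GopHeader, PictureHeader, Slice };

struct Unit {
    UnitKind    kind;
    std::size_t size;  // size in bytes, including the start code
};

// Each payload is the list of indices (into `units`) that it carries.
using Payload = std::vector<std::size_t>;

std::vector<Payload> planPayloads(const std::vector<Unit>& units,
                                  std::size_t maxPayload) {
    std::vector<Payload> out;
    std::size_t used = 0;  // bytes already placed in the current payload
    for (std::size_t i = 0; i < units.size(); ++i) {
        const Unit& u = units[i];
        bool startNew = out.empty();
        if (!startNew) {
            const UnitKind prev = units[out.back().back()].kind;
            switch (u.kind) {
            case UnitKind::SequenceHeader:  // rule 1: always first
                startNew = true;
                break;
            case UnitKind::GopHeader:       // rule 2: first, or after sequence header
                startNew = (prev != UnitKind::SequenceHeader);
                break;
            case UnitKind::PictureHeader:   // rule 3: first, or after GOP header
                startNew = (prev != UnitKind::GopHeader);
                break;
            case UnitKind::Slice:           // slices follow headers or whole slices
                break;
            }
            // Never split a unit: if it does not fit, open a new payload.
            if (used + u.size > maxPayload) startNew = true;
        }
        if (startNew) {
            out.emplace_back();
            used = 0;
        }
        out.back().push_back(i);
        used += u.size;
    }
    return out;
}
```

Because every payload holds only whole units, each ES header is completely contained in one packet and each slice starts either right after the headers or after an integral number of slices.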

For MPEG ES encapsulation, the RTP fixed header fields are used as follows, as
described in [2]:

Payload Type: MPV designates the use of MPEG-1 and MPEG-2 video elementary streams [5].
The value 32 is assigned to the video elementary stream.
M: For video, set to 1 on a packet containing the MPEG frame end code, 0 otherwise.
PT: The stream ID of the video elementary stream is assigned.
Timestamp: 32-bit 90 kHz timestamp representing the presentation time of the MPEG picture.
It is the same for all packets that make up a picture. It may not be monotonically
increasing in a video stream if B pictures are present. For packets
that contain only a video sequence and/or GOP header, the timestamp is
that of the subsequent picture. For an MPEG elementary stream at 30 frames
per second, the timestamp increases by 90000/30 = 3000 per picture.
Since the actual frame rate is 29.97 frames per second (30000/1001), the
exact increment is 3003, so a timestamp adjustment algorithm should be
integrated to give a more accurate time reference to the decoder device.
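
A minimal sketch of how these fields might be filled in is given below. It assumes the standard 12-byte RTP fixed header of RFC 1889 (version 2, no padding, no CSRC entries) and a 90 kHz clock; the function names pictureTimestamp and buildRtpHeader are hypothetical and do not come from the module itself. The timestamp is derived from the picture's display index and an exact rational frame rate, which is one possible form of timestamp adjustment.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

constexpr std::uint8_t  kPayloadTypeMPV = 32;     // MPV payload type from the text
constexpr std::uint64_t kRtpClockHz     = 90000;  // 90 kHz RTP video clock

// RTP timestamp of the picture with display index `index`, for a frame rate
// of rateNum/rateDen frames per second (30000/1001 for 29.97 fps).
// Computing from the index avoids the drift that accumulating a rounded
// per-picture increment would cause.
inline std::uint32_t pictureTimestamp(std::uint32_t base, std::uint64_t index,
                                      std::uint32_t rateNum,
                                      std::uint32_t rateDen) {
    assert(rateNum != 0 && rateDen != 0);
    const std::uint64_t ticks = index * kRtpClockHz * rateDen / rateNum;
    return base + static_cast<std::uint32_t>(ticks);  // wraps modulo 2^32
}

// Serializes the 12-byte RTP fixed header in network byte order.
inline std::array<std::uint8_t, 12> buildRtpHeader(bool marker,
                                                   std::uint16_t seq,
                                                   std::uint32_t timestamp,
                                                   std::uint32_t ssrc) {
    std::array<std::uint8_t, 12> h{};
    h[0] = 0x80;  // V=2, P=0, X=0, CC=0
    h[1] = static_cast<std::uint8_t>((marker ? 0x80 : 0x00) | kPayloadTypeMPV);
    h[2] = static_cast<std::uint8_t>(seq >> 8);
    h[3] = static_cast<std::uint8_t>(seq & 0xFF);
    for (int i = 0; i < 4; ++i) {
        h[4 + i] = static_cast<std::uint8_t>(timestamp >> (24 - 8 * i));
        h[8 + i] = static_cast<std::uint8_t>(ssrc >> (24 - 8 * i));
    }
    return h;
}
```

With a frame rate of 30/1 the timestamp advances by 3000 per picture, and with 30000/1001 it advances by 3003. All packets of one picture would be built with the same timestamp, and the marker flag would be set only on the packet that ends the frame.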

- MPEG Video-specific header

For MPEG video, the following header is attached to each RTP packet after the RTP fixed
header:


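 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|    MBZ  |T|         TR        | |N|S|B|E|  P  | | BFC | | FFC |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                                AN              FBV     FFV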

Each field in this header is extracted from the most recent header, such as the
sequence header or picture header. This header also indicates the presence of a new sequence
header and the beginning or end of each slice.
The S bit indicates the presence of a new sequence header. The TR and P fields carry picture
information: the temporal reference and the picture type (frame type). The FBV, BFC, FFV and FFC
fields carry the motion vector parameters (full-pel flags and f_codes) for P and B pictures.
The B and E fields indicate the beginning and end of each slice. The receiver can use these
fields to determine the boundaries of an error.
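
As an illustration of the bit layout shown above, the following sketch packs the fields into one 32-bit word. The struct and function names are hypothetical; the bit positions follow the diagram, with bit 0 as the most significant bit. The comments on T, AN and N reflect the field definitions in RFC 2250 [2], which the text above does not spell out.

```cpp
#include <cassert>
#include <cstdint>

// Field names follow the header diagram; bit 0 is the most significant bit.
struct MpegVideoHeader {
    bool          t   = false;  // T: MPEG-2 header extension follows
    std::uint16_t tr  = 0;      // TR: temporal reference (10 bits)
    bool          an  = false;  // AN: active N bit
    bool          n   = false;  // N: new picture header
    bool          s   = false;  // S: sequence header present
    bool          b   = false;  // B: beginning of slice
    bool          e   = false;  // E: end of slice
    std::uint8_t  p   = 0;      // P: picture type (3 bits)
    bool          fbv = false;  // FBV: full_pel_backward_vector
    std::uint8_t  bfc = 0;      // BFC: backward_f_code (3 bits)
    bool          ffv = false;  // FFV: full_pel_forward_vector
    std::uint8_t  ffc = 0;      // FFC: forward_f_code (3 bits)
};

inline std::uint32_t packMpegVideoHeader(const MpegVideoHeader& h) {
    assert(h.tr < 1024 && h.p < 8 && h.bfc < 8 && h.ffc < 8);
    std::uint32_t w = 0;  // MBZ (bits 0-4) stays zero
    w |= static_cast<std::uint32_t>(h.t)   << 26;
    w |= static_cast<std::uint32_t>(h.tr)  << 16;
    w |= static_cast<std::uint32_t>(h.an)  << 15;
    w |= static_cast<std::uint32_t>(h.n)   << 14;
    w |= static_cast<std::uint32_t>(h.s)   << 13;
    w |= static_cast<std::uint32_t>(h.b)   << 12;
    w |= static_cast<std::uint32_t>(h.e)   << 11;
    w |= static_cast<std::uint32_t>(h.p)   << 8;
    w |= static_cast<std::uint32_t>(h.fbv) << 7;
    w |= static_cast<std::uint32_t>(h.bfc) << 4;
    w |= static_cast<std::uint32_t>(h.ffv) << 3;
    w |= static_cast<std::uint32_t>(h.ffc);
    return w;
}
```

The resulting word would be written to the packet in network byte order (most significant byte first), immediately after the 12-byte RTP fixed header.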

- MPEG-2 Video-specific header extension

For an MPEG video elementary stream, this header follows the MPEG video-specific header
described above. It may not be vital for decoding a picture, so its
inclusion is optional. If it is present, its fields should be copied from the
corresponding extensions following the most recent MPEG-2 picture coding extension, and
they remain constant for each RTP packet of a given picture.
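
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|X|E|f_[0,0]|f_[0,1]|f_[1,0]|f_[1,1]| DC| PS|T|P|C|Q|V|A|R|H|G|D|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+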

These values are copied directly from the corresponding extension header fields; their
meanings are explained in the MPEG2 standard.
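
The same packing approach can be sketched for this extension word. The names Mpeg2Extension and packMpeg2Extension are hypothetical; the field widths are read off the diagram above (bit 0 most significant), and the mapping of the one-bit flags to picture coding extension fields in the comments is an assumption based on the order in which those fields appear in the MPEG-2 picture coding extension.

```cpp
#include <cassert>
#include <cstdint>

struct Mpeg2Extension {
    bool         e = false;                     // E: extension data present
    std::uint8_t fcode[2][2] = {{0, 0}, {0, 0}}; // f_[s,t], 4 bits each
    std::uint8_t dc = 0;                        // DC: intra_dc_precision (2 bits)
    std::uint8_t ps = 0;                        // PS: picture_structure (2 bits)
    // One-bit flags in diagram order: T, P, C, Q, V, A, R, H, G, D
    // (assumed: top_field_first, frame_pred_frame_dct,
    //  concealment_motion_vectors, q_scale_type, intra_vlc_format,
    //  alternate_scan, repeat_first_field, chroma_420_type,
    //  progressive_frame, composite_display_flag).
    bool flags[10] = {};
};

inline std::uint32_t packMpeg2Extension(const Mpeg2Extension& x) {
    assert(x.dc < 4 && x.ps < 4);
    std::uint32_t w = 0;  // X (bit 0) is left as zero
    w |= static_cast<std::uint32_t>(x.e) << 30;
    int shift = 26;  // f_[0,0] occupies bits 2-5
    for (int s = 0; s < 2; ++s) {
        for (int t = 0; t < 2; ++t) {
            assert(x.fcode[s][t] < 16);
            w |= static_cast<std::uint32_t>(x.fcode[s][t]) << shift;
            shift -= 4;
        }
    }
    w |= static_cast<std::uint32_t>(x.dc) << 12;
    w |= static_cast<std::uint32_t>(x.ps) << 10;
    for (int i = 0; i < 10; ++i) {
        w |= static_cast<std::uint32_t>(x.flags[i]) << (9 - i);
    }
    return w;
}
```

Since the values stay constant for every packet of a given picture, a sender could pack this word once per picture and reuse it for each of that picture's packets.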

Software Design and Modules Built

To implement the functions described above, it was necessary to create three
software modules. The first reads MPEG video from a file. The second prepares MPEG
video frames for transmission on the Internet by packaging them into RTP packets,
fragmenting them if necessary. The third uses the second and broadcasts the RTP
packets onto the Internet.

- FileStream

This module reads raw MPEG video files. These files, which usually use the designation
.mv2, contain a single MPEG video elementary stream. The files are legal MPEG2 video
elementary streams that contain all headers. Files of this type are readily available on the
Internet (for example at the Internet Underground Music Archive).

The fileStream object, when configured with a filename, opens the file and loads a chunk
of data. It provides bit manipulation routines for the upper layer, the elementaryStream object.

- elementaryStream

The elementaryStream object abstracts the structure of an elementary stream. It uses the
bit-manipulation services provided by FileStream to parse the elementary stream structure. The
basic elementary stream structure is as follows:

| sequence | gop | picture | extension header | slices...
sequence: sequence header
gop     : group of picture header
picture : picture header
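
As a rough illustration of how these structural units can be located, the sketch below scans a buffer for MPEG video start codes (the three-byte prefix 00 00 01 followed by a code byte). It works on a plain byte buffer rather than on the FileStream bit routines, and the names StartCode, classify and findStartCodes are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

enum class StartCode {
    Picture, Slice, UserData, SequenceHeader, Extension, SequenceEnd, Gop, Other
};

// Maps the byte that follows the 00 00 01 prefix to a unit type.
inline StartCode classify(std::uint8_t code) {
    if (code == 0x00) return StartCode::Picture;
    if (code >= 0x01 && code <= 0xAF) return StartCode::Slice;
    switch (code) {
    case 0xB2: return StartCode::UserData;
    case 0xB3: return StartCode::SequenceHeader;
    case 0xB5: return StartCode::Extension;
    case 0xB7: return StartCode::SequenceEnd;
    case 0xB8: return StartCode::Gop;
    default:   return StartCode::Other;
    }
}

struct Marker {
    std::size_t offset;  // offset of the first 0x00 of the prefix
    StartCode   kind;
};

// Returns every start code found in data[0, len), in stream order.
inline std::vector<Marker> findStartCodes(const std::uint8_t* data,
                                          std::size_t len) {
    std::vector<Marker> out;
    for (std::size_t i = 0; i + 3 < len; ++i) {
        if (data[i] == 0x00 && data[i + 1] == 0x00 && data[i + 2] == 0x01) {
            out.push_back({i, classify(data[i + 3])});
            i += 3;  // continue after the code byte
        }
    }
    return out;
}
```

The offsets of consecutive markers give the extent of each header or slice, which is the information a fragmenter needs in order to place whole units into RTP payloads.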

This object is also the RTP fragmenter. Its job is to form RTP packets from incoming
MPEG video frames. It receives an MPEG video stream, and its output is suitable for use
with an RTP generator. It sets the RTP payload type as defined in [2] and is also responsible for
setting the RTP fixed header and extension header fields.

- rtpMPEGlib

This module encapsulates all the underlying details of bitstream manipulation, elementary
stream fragmentation and the setting of RTP header fields. It provides a simple application
programming interface (API) to the user of this library, which may be an RTSP server or another
RTP-compliant video broadcasting server. The user of this software module invokes the play() method
with a file name, a starting time relative to the current time, and a valid socket as parameters.
In Win32 the socket is of type SOCKET, and it is the user's responsibility to pass an established socket.

Program Documentation

This project is being developed in C++ using Microsoft Visual C++ 5.0 on Windows NT
Server Enterprise Edition 4.0 running the Windows NT 4 Option Pack with Service
Pack 3 applied. It is expected to run as an operating system service on Windows NT,
and relies on the WinSock API for communications with the TCP/IP stack.
The software module is available for download and can be compiled using the 'nmake'
utility of Microsoft Visual C++ 5.0. You need Microsoft Visual C++ 5.0
to compile this module, and you must modify the elementaryStream.mak file to include the
proper headers, which are located in the Visual C++ include directory.


Future Work

The development of this project has brought to mind many possibilities for future work.

Real-time encoders

This implementation of MPEG video elementary stream support is by no means complete. For
example, there is currently no source of MPEG video data other than prerecorded files.
It would be extremely useful to have software that could encode MPEG video in real
time or read from a piece of hardware that does so.

MPEG Video Stream Recorder

Another useful RTP module would be an MPEG video stream recorder. Since this new MPEG
video module is to be used to receive MPEG video across a network, it would also be quite useful
to be able to record these transmissions. While it is currently possible to capture the incoming RTP
frames, recording in this format would be wasteful because of the RTP overhead and would contain
unnecessary fragmentation. Also, standard MPEG video playback software would be unable to handle
the resulting recordings. Similarly, it would be possible to capture the video after its conversion
back to pure video picture samples.


[1] H. Schulzrinne, S. Casner, R. Frederick, V. Jacobson, "RTP: A Transport Protocol for
Real-Time Applications," RFC 1889, January 1996.
[2] D. Hoffman, G. Fernando, V. Goyal and M. Civanlar, "RTP Payload Format for
MPEG1/MPEG2 Video," RFC 2250, Internet Engineering Task Force, January 1998.
[3] Barry G. Haskell, Atul Puri, and Arun N. Netravali, "Digital Video: An Introduction
to MPEG-2," 1997.
[4] H. Schulzrinne, A. Rao, R. Lanphier, "Real Time Streaming Protocol (RTSP)," RFC 2326,
Internet Engineering Task Force, April 1998.
[5] H. Schulzrinne, "RTP Profile for Audio and Video Conferences with Minimal Control,"
RFC 1890, January 1996.
[6] RTP Resource
[7] RTSP Resource