Internet Engineering Task Force      Audio/Video Transport Working Group
INTERNET-DRAFT                                                    H. Liu
draft-ietf-avt-rtp-mp2t-00.txt                             Cisco Systems
                                                           March 9, 2000
                                                       Expires: 9/9/2000

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026 except that the right to
   produce derivative works is not granted.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

             Extension of RTP payload Type for Multiple Program
                          MPEG Transport Stream

Abstract

   This document defines a dynamic payload type that extends payload
   type 33, defined in RFC1890 [4] for MPEG2 transport streams. The
   usage of payload type 33 is defined in RFC2250 [1]. The purpose of
   this extension is to enhance RTP in such a way that it can also be
   used to deliver multiple program MPEG transport streams.

1. Introduction

   Recent efforts in the DVB and ATSC standards bodies have endorsed
   the MPEG2 transport stream format as the broadcast transport
   standard. With appropriate Internet standards such as RFC 2236 [2]
   for multicast, and RFC 2205 [3] and other RFCs for QoS, real-time
   Internet streaming becomes quite feasible. Also, with the technology
   advances in "last-mile" networks such as cable modem and DSL, it is
   quite possible to deliver high-bandwidth, broadcast-quality,
   individualized MPEG video directly to the home.

   Section 2 in RFC2250 defines how to encapsulate an MPEG2 transport
   stream into the RTP packet format. RFC2250 defines the base
   frequency for MPEG2 transport streams to be 90 kHz. While this base
   clock is adequate for an MPEG single program transport stream
   (SPTS), it loses precision for an MPEG multiple program transport
   stream (MPTS). Thus, a new payload type is needed when RTP is used
   to transport MPEG MPTS.

2. Extension to Payload Type for MP2T

   ISO/IEC 13818-1 [5] defines the MPEG transport packet format so that
   it can carry not only video and audio but also any program-related
   data. Furthermore, with ISO/IEC 13818-6, DSM-CC [6], and the Radio
   Frequency Interface Specification of the Data-Over-Cable Service
   Interface Specifications (DOCSIS), defined by CableLabs [7], the
   same packet format can be used to deliver IP data as well.

   According to ISO/IEC 13818-1, at the MPEG transport packet layer
   every transport packet, rather than a video or audio access unit, is
   a sample unit in time. Thus, the sample clock frequency should be
   suitable for the packet samples. In the satellite and cable
   industry, each video frequency carrier typically carries from
   27 Mbps to 54 Mbps. Each frequency carrier can contain up to 40,000
   MPEG transport packets per second. If the base clock frequency of
   90,000 Hz defined for payload type 33 is used, each packet spans
   only about 2.5 clock ticks. A round-off error can then account for
   an error on the order of 30%. Thus, the existing payload type 33 is
   not suitable to deliver MPEG MPTS.

   Besides, an MPEG2 system requires a clock with a very tight
   constraint. It allows only +/-500 ns of jitter. One tick of a 90 kHz
   clock is about 11 us, so a 90 kHz clock cannot satisfy this
   constraint. Thus, the existing payload type 33 is not suitable to
   deliver MPEG transport streams under this constraint.

   This document defines a new dynamic payload type of 96 that extends
   the base clock frequency for the timestamps to 27 MHz. Payload type
   33 uses 90 kHz.
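
   The clock-resolution argument above can be checked with a few lines
   of arithmetic. The following is a minimal sketch, not part of the
   payload format: the function names are illustrative, and the
   188-byte transport packet size is taken from ISO/IEC 13818-1 rather
   than stated in this section. It shows how many timestamp ticks one
   transport packet spans at 90 kHz and at 27 MHz, and how long one
   tick lasts compared with the +/-500 ns jitter bound.

```python
# Sketch of the tick arithmetic; names here are illustrative assumptions.
TS_PACKET_BITS = 188 * 8  # MPEG-2 transport packet size (ISO/IEC 13818-1)


def packets_per_second(bitrate_bps):
    """Transport packets per second carried at the given bitrate."""
    return bitrate_bps / TS_PACKET_BITS


def ticks_per_packet(clock_hz, packet_rate):
    """Timestamp ticks available to each packet at a given packet rate."""
    return clock_hz / packet_rate


def tick_duration_ns(clock_hz):
    """Duration of one clock tick in nanoseconds."""
    return 1e9 / clock_hz


if __name__ == "__main__":
    rate_54m = packets_per_second(54_000_000)  # roughly 36,000 packets/s
    for clock_hz in (90_000, 27_000_000):
        print(clock_hz,
              round(ticks_per_packet(clock_hz, rate_54m), 2),
              round(ticks_per_packet(clock_hz, 40_000), 2),
              round(tick_duration_ns(clock_hz), 1))
```

   Under these assumptions a 54 Mbps carrier gives about 2.5 ticks per
   packet at 90 kHz, and the 40,000 packets-per-second upper bound
   gives 2.25. At 27 MHz each packet spans several hundred ticks, and
   one tick (about 37 ns) is well inside the 500 ns bound.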
   Furthermore, the new profile defined in this draft introduces the
   concept of "piecewise" constant bitrate (CBR) and provides a precise
   timing model for every MPEG transport packet. This is discussed in
   more detail in Section 4. The meaning of the other RTP header
   parameters remains the same as in RFC2250.

   27 MHz is the same frequency as that of the Program Clock Reference
   (PCR) defined in ISO/IEC 13818-1. Since the MPEG transport stream
   may feed any NTSC/PAL receiver, the clock recovery has to be very
   conservative. The same requirements on the clock frequency and clock
   recovery rate from ISO/IEC 13818-1 apply here as well.

   Since a dynamic payload type is used, RFC2327 [8], Session
   Description Protocol (SDP), should be used to announce the session
   information. More specifically, the attribute parameter, a, in the
   SDP should be used. One example with a dynamic payload type of 96 is
   given below:

      a=rtpmap:96 MP2T/27000000

   Other parameters should be set following the directions of RFC2327.
   These parameters can be communicated between sender and receiver
   either through the Session Announcement Protocol or through the
   Real-Time Streaming Protocol (RTSP). The choice is out of the scope
   of this draft. However, it is critical that an RTP receiver obtains
   the SDP information and verifies its capability of handling it
   before processing any RTP packet. Without valid SDP information, the
   RTP receiver should respond with a BYE RTCP packet to close the RTP
   session.

3. A four-layer model

   When delivering an MPEG transport stream using the profile defined
   in this draft, the buffer model given in Fig. 1 is assumed.

   +-----------+     +-----------+     +-----------+     +-----------+
   |           |     |   MPEG    |     |   MPEG    |     |   MPEG    |
   |RTP packet |---->| transport |---->|    PES    |---->|A/V Compr. |
   |           |     |  packet   |     |  packet   |     |   layer   |
   +-----------+     +-----------+     +-----------+     +-----------+
   RTP timestamps     MPEG PCRs
         |                 |
         v                 v
   +-----------+     +-----------+     +-----------+     +-----------+
   |    IP     |     |   MPEG    | Vid.|   MPEG    |     |   MPEG    |
   | Dejitter  |---->| Transport |---->|Demultiplex|---->|Elementary |
   |  Buffer   |     |  Buffer   |\    |  Buffer   |     |  Buffer   |
   +-----------+     +-----------+ \   +-----------+     +-----------+
                           |        \
                           |         \ Audio
                           |          \                  +-----------+
                           |           \                 |   MPEG    |
                           |            ---------------> | Audio Main|
                           |                             |  Buffer   |
                           |                             +-----------+
                           |System info & other data
                           |                             +-----------+
                           |                             |   MPEG    |
                           +---------------------------> |System Main|
                                                         |  Buffer   |
                                                         +-----------+

      Figure 1: The Packet/Buffer Model for MPEG MPTS over RTP

   The buffer model shown in Figure 1 is an extension of the buffer
   model given in ISO/IEC 13818-1. In addition to the different MPEG
   packet layers, an RTP layer and its associated IP dejitter buffer
   are added.

   The model given in Fig. 1 implies two clock recovery functions. The
   timestamps from RTP packets are used to synchronize the receiver
   clock for the RTP session, while MPEG PCRs from MPEG transport
   packets are used to recover the MPEG decoder clock. An actual
   implementation may decide to merge the two clock recovery functions
   into one. This is out of the scope of the draft.

   The size of the dejitter buffer is determined by the maximal delay
   variation introduced by the IP network. However, there are many ways
   to determine the size of the dejitter buffer, statically or
   statistically. For example, the jitter can be computed from the
   "interarrival jitter" in the sender report of the corresponding RTCP
   session. Alternatively, the jitter can be computed from the
   guaranteed service block in the ADSpec of the RSVP PATH message if
   RSVP is used for QoS. This document does not specify any method to
   determine the buffer size.

   There are two benefits to using the layered approach given in
   Fig. 1. The first benefit is that all the standardized tables and
   new services defined by DVB and ATSC can be included without any
   extra work.
   Thus, any video content that is available from the digital broadcast
   industry can be easily delivered to the end user through an IP
   network with minimal work at the RTP layer.

   The second benefit of this layered model is that the IP dejittering
   layer does not need to be implemented at the decoder side. For
   example, one can first remove the jitter induced by the IP network
   and then broadcast the MPEG MPTS to the end customer.

   The purpose of the IP dejitter buffer is to recover the temporal
   relationship between MPEG transport packets at the buffer output. By
   doing so, as long as the MPEG MPTS is compliant at the ingress
   point, the output of the dejitter buffer should be compliant with
   ISO/IEC 13818-1. Thus, as long as QoS can be supported, the IP
   network can be viewed as another streaming transport method, just
   like ATM.

4. The Use of Timestamp

   In addition to its original purpose of clock synchronization, when
   the payload type of 33 is used with the marker bit set to one, the
   timestamp is also used for the following information:

   a. Derived piecewise constant bitrate (CBR) for the current RTP
      packet.

      It is assumed in this document that the session bitrate within
      any one RTP packet is constant. Knowing the size of the RTP
      packet (from the length field in the UDP packet) and the
      timestamps of the current and the next RTP packet, this constant
      bitrate can be derived. However, the bitrate can differ from one
      packet to the next. Thus, a more constrained VBR can be
      implemented by changing the channel bitrate at the boundary of
      each RTP packet.

   b. Derived timestamp for every MPEG transport packet encapsulated
      in the RTP packet.

      Since the session bitrate within one RTP packet is constant and
      the timestamp of the RTP packet represents the timestamp of the
      first MPEG packet, the timestamps of the remaining MPEG packets
      in the RTP packet can be derived from it.
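
   The two derivations in items a and b can be sketched in a few lines.
   This is a minimal illustration under stated assumptions, not a
   normative algorithm: the function names are hypothetical, the
   bitrate is computed over the MPEG payload bytes only (the draft does
   not say whether headers are counted), per-packet timestamps are
   rounded down to whole ticks, and the subtraction is taken modulo
   2**32 because RTP timestamps are 32-bit values.

```python
# Sketch of items a and b; names and rounding choices are assumptions.
TS_MODULUS = 2 ** 32  # RTP timestamps are 32-bit and wrap around


def timestamp_span(ts_current, ts_next):
    """Ticks between two consecutive RTP timestamps, allowing wraparound."""
    span = (ts_next - ts_current) % TS_MODULUS
    if span == 0:
        raise ValueError("consecutive RTP timestamps must differ")
    return span


def piecewise_bitrate(payload_bytes, ts_current, ts_next,
                      clock_hz=27_000_000):
    """Item a: constant bitrate (bits/s) assumed within one RTP packet."""
    return payload_bytes * 8 * clock_hz / timestamp_span(ts_current, ts_next)


def per_packet_timestamps(ts_current, ts_next, n_packets):
    """Item b: timestamps of the n_packets MPEG packets in one RTP packet."""
    span = timestamp_span(ts_current, ts_next)
    return [(ts_current + (i * span) // n_packets) % TS_MODULUS
            for i in range(n_packets)]


if __name__ == "__main__":
    # Five MPEG packets between RTP timestamps 200 and 400.
    print(per_packet_timestamps(200, 400, 5))
    # Seven 188-byte packets spread over 10528 ticks of the 27 MHz clock.
    print(piecewise_bitrate(7 * 188, 0, 10528))
```

   Evenly spacing the packets across the interval follows directly from
   the constant-bitrate assumption; a receiver that needs sub-tick
   precision could keep the fractional part instead of rounding down.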

      As an example, suppose the current RTP packet has a timestamp of
      200, the next RTP packet has a timestamp of 400, and the number
      of MPEG packets encapsulated in one RTP packet is five. The first
      MPEG packet has a timestamp of 200, the second 240, the third
      280, the fourth 320, and the fifth 360.

   Thus, the flow of packets out of the dejitter buffer is
   deterministic.

5. Limitation on the rate change when M is set to one

   The definition of M in this draft follows RFC2250. According to
   RFC2250, when M is set to one, it indicates that the session has
   switched to a different source. Thus, the receiver should reset its
   PLL and lock to the new clock immediately. Consequently, there is no
   timing relationship between the current RTP packet and the previous
   RTP packet. Thus, a rate change must be prohibited on the last RTP
   packet before any source switch.

6. References

   [1] D. Hoffman et al., "RTP Payload Format for MPEG1/MPEG2 Video",
       RFC 2250, Jan. 1998.

   [2] W. Fenner, "Internet Group Management Protocol, Version 2",
       RFC 2236, Nov. 1997.

   [3] R. Braden et al., "Resource ReSerVation Protocol (RSVP) -
       Version 1 Functional Specification", RFC 2205, Sept. 1997.

   [4] H. Schulzrinne, "RTP Profile for Audio and Video Conferences
       with Minimal Control", RFC 1890, Jan. 1996.

   [5] ISO/IEC. Generic Coding of Motion Pictures and Associated Audio:
       Systems. ISO/IEC Standard 13818-1, 1995.

   [6] ISO/IEC. Generic Coding of Motion Pictures and Associated Audio:
       Extension for DSM-CC. ISO/IEC Standard 13818-6, 1998.

   [7] CableLabs. Data-Over-Cable Service Interface Specifications:
       Radio Frequency Interface Specification. SP-RFIv1.1-I01-990311,
       1999.

   [8] M. Handley and V. Jacobson, "SDP: Session Description Protocol",
       RFC 2327, Apr. 1998.