Internet Draft                                               David Leon
Document: draft-leon-rtp-retransmission-00.txt             Viktor Varsa
Expires: December 13, 2001                                        Nokia
                                                              July 2001

                      RTP retransmission framework

Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

Abstract

   RTP retransmission is an effective packet loss recovery scheme for
   applications that can tolerate the increase in end-to-end delay that
   this scheme requires. This document describes an RTP retransmission
   framework. It requires that appropriate feedback from receivers to
   senders be available. Packets are retransmitted either in the form
   of a FEC stream or using an RTP retransmission stream whose payload
   format is defined in this document. The sender is able to indicate
   different packet priorities for retransmission using a layered RTP
   payload format defined in this document.

Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC-2119 [1].

Leon, Varsa          IETF draft - Expires December 2000               1

RTP retransmission framework                                  July 2001

Acknowledgements

   There are existing WG drafts for enhanced feedback and
   retransmission. This document is meant to contribute to this work.
   The layered RTP payload format proposed in this draft is adapted
   from the ideas presented in [2].

Table of Contents

   Abstract
   Terminology
   Acknowledgements
   1. Introduction
   2. Retransmission framework design goals
   3. Receiver feedback
   4. Packet retransmission by an RTP source
   5. Layered RTP transmission payload format
   6. Combined use of retransmission and RTP layers
   7. Signalling with SDP
   8. Security consideration
   9. Intellectual property considerations
   10. References
   11. Author's Addresses

1. Introduction

   RTP packet loss between a source and a receiver may significantly
   degrade the quality of the received signal. For example, in the case
   of video transmission, because of the nature of video coding
   algorithms, the loss of a video packet causes artifacts not only in
   the frame this packet belongs to but also in frames using this
   packet for prediction.

   Several techniques, such as forward error correction (FEC),
   retransmissions and video backchannel messages, may be considered to
   increase video transport robustness to packet loss. When choosing a
   technique for a particular system, the delay requirement of a given
   application has to be taken into account.

   In the case of multimedia conferencing, the end-to-end delay has to
   be at most a few hundred milliseconds in order to guarantee
   interactivity, which usually excludes the use of retransmission.
   On the other hand, in the case of streaming, latency is perceived
   by the user as part of the session setup delay and thus a latency of
   several seconds is tolerable. Retransmission may thus be considered
   for such applications.

   The current RTP specification [3] and AV profile [4] do not support
   a general retransmission framework. However, some retransmission
   mechanisms restricted to a particular media encoding and its payload
   format have been defined [5].

   There are existing proposals [6] [7] [8] for a general solution
   providing low-delay RTP feedback, which would in turn allow RTP
   retransmission: [6] defines the timing rules for receivers to send
   enhanced RTCP low-delay feedback messages; [7] defines new feedback
   message formats; [8] defines a new RTP payload format meant to be
   used for RTP retransmission.

   This document proposes a retransmission framework, which is
   different from that proposed in [8], in order to achieve the design
   goals identified in Section 2.

2. Retransmission framework design goals

   The retransmission framework presented in this document was designed
   to achieve the following goals:

   * There are no modifications to the payload format used to send the
     original RTP data (i.e. the data sent for the first time by an RTP
     source). This will allow a multicast conference to be joined by
     receivers which may or may not implement this draft.

   * RTCP statistics are meant to measure the network state (loss,
     jitter); the use of retransmissions does not skew these statistics
     in any way.

   * Signalling of packet priorities through RTP means is achieved
     through layered RTP transmission. Although layered RTP
     transmission and RTP retransmission are complementary, they are
     two separate techniques. They are thus defined independently of
     each other in this document. Layered RTP transmission is useful
     for purposes other than retransmission (e.g. receiver rate
     adaptation in multicast).
     In the future, the layered RTP payload format could be defined in
     its own draft and referenced by the retransmission framework.

   * The number of RTP layers an application may use is not limited.
     Upon reception of a packet belonging to a given layer, the
     receiver is able to detect if a packet in another layer was lost.

   * The framework reuses and enhances the existing FEC mechanisms.

   * Existing header compression protocols can compress the RTP streams
     used within this framework.

3. Receiver feedback

   In order to enable retransmissions, the receiver needs to inform the
   sender of which packets are lost. There are existing proposals
   [6] [7] [8] for a general solution providing low-delay RTP feedback,
   which would in turn allow RTP retransmission.

   RTCP is a general mechanism to provide feedback to the sender. As
   currently defined, it is meant to inform the sender of the reception
   quality and does not allow a receiver to indicate which packets were
   lost. RTCP should be extended in order to support faster feedback
   and additional reporting information. An RTCP packet format for
   packet loss reporting (ACK and NACK) needs to be defined, together
   with timing rules dictating when a receiver is allowed to send these
   feedback packets. The retransmission framework assumes that these
   will be defined in other documents, such as [6] and [7].

4. Packet retransmission by an RTP source

   This section addresses how the sender should retransmit data once it
   has decided to perform retransmission. Several solutions may be
   considered to that effect.

   An immediate solution may be to have the sender re-send the same RTP
   packet (i.e. the same RTP header and the same RTP data) it has
   previously sent. This does not meet the design goals of Section 2
   because the RTCP statistics would then be skewed. The sender would
   not get the correct packet loss and network jitter.
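
   The skew can be illustrated with a simplified form of the loss
   computation commonly used by RTP receivers (expected packets minus
   received packets, where duplicates count as received). This is only
   a sketch: the function name cumulative_lost and the sample sequence
   numbers are assumptions made for illustration and are not defined by
   this draft.

```python
def cumulative_lost(received_seqs):
    """Return expected minus received for a list of RTP sequence numbers.

    Simplifying assumptions (not from the draft): no sequence number
    wrap-around, and the lowest number seen is the first one sent.
    """
    expected = max(received_seqs) - min(received_seqs) + 1
    received = len(received_seqs)  # re-sent copies count as received
    return expected - received


# Packet 102 is lost in the network and never repaired.
without_resend = [100, 101, 103, 104]

# Same loss, but the sender re-sends packet 102 with its original header.
with_resend = [100, 101, 103, 104, 102]

print(cumulative_lost(without_resend))  # 1: the network loss is visible
print(cumulative_lost(with_resend))     # 0: the network loss is hidden
```

   In the second case the report no longer reflects the network state,
   although the network did drop a packet.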

   A new payload format, as proposed in [8], for streams for which
   retransmissions may be performed would be another approach. However,
   this implies that receivers which do not implement this format are
   not able to receive the original RTP data stream even if they do not
   want to use retransmissions. This does not meet the design goals of
   Section 2.

   This document proposes the use of a separate RTP stream to carry
   retransmitted data. The original RTP data stream and RTCP statistics
   are therefore unaffected. They can be understood by receivers
   whether or not they implement this RTP retransmission format.

   The payload format used for the retransmitted RTP stream SHOULD
   generally be the FEC payload format defined in [9]. However, if the
   endpoints are not FEC capable (for example, because of processing
   power limitations), the retransmission payload format RTX defined
   below SHOULD be used.

   In both cases (FEC or RTX format), this RTP stream will be sent as a
   separate stream from the original RTP stream, with its own sequence
   number (SN) space. The FEC or RTX stream will be sent either to a
   different transport address (as defined in [3]) from that of the RTP
   media stream or to the same transport address with its own SSRC. If
   a different transport address is used, the same SSRC value SHOULD be
   used for the original stream and the retransmission stream.

Use of FEC payload format

   The FEC payload format contains information that allows the sender
   to tell the receiver exactly which media packets have been used to
   generate the FEC packet.
   The FEC header as defined in [9] is shown below:

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |            SN base            |        length recovery        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |E| PT recovery |                     mask                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          TS recovery                          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   See [9] for the definition of the different fields. In particular,
   it is possible to retransmit a given packet by sending a FEC packet
   whose SN base is the sequence number of the retransmitted packet and
   whose mask is equal to zero.

   Choosing the FEC payload format allows a very modular design for
   packet loss recovery. The same RTP stream can be used to send both
   FEC and retransmitted RTP packets. Furthermore, the sender is able
   to make optimal use of the feedback from receivers, such as ACKs and
   NACKs, to determine how to compute the FEC codes. For example, if
   two receivers in the same multicast session require two different
   packets, the sender may send a single FEC packet generated from the
   two requested packets. This reduces the sent bit-rate and may be
   useful in particular for congestion control.

RTX payload format

   The RTX payload format is designed for endpoints which do not
   implement FEC. The structure of the RTX packet is very similar to
   that of a FEC packet. As shown below, an RTX packet is composed of
   an RTP header followed by the originally sent RTP packet, i.e. the
   original RTP header and RTP payload:

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          RTP Header                           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                  Original RTP packet Header                   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                  Original RTP Packet Payload                  |
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   The RTX RTP packet header fields are set as follows:

   The version field is set to 2.

   The padding bit, the extension bit, the CC value and the marker bit
   are set to 0. There is thus no CSRC list.

   The payload type value is dynamic, i.e. determined out-of-band.

   The SSRC value SHOULD be the same as the SSRC value of the original
   media stream if the original stream and the retransmitted stream are
   sent to different transport addresses. The SSRC value MUST be
   different from the original RTP stream SSRC if the retransmitted
   stream and the original RTP stream are sent to the same transport
   address.

   The sequence number has the standard definition: it MUST be one
   higher than the sequence number in the previously transmitted RTX
   packet.

   The timestamp MUST be set to the value of the media RTP clock at the
   instant the RTX packet is transmitted.

   The retransmitted packet payload is the original RTP packet, that is
   the original RTP header followed by the original RTP data.

   Note 1: It would be possible to define the retransmitted packet
   format with less overhead by having the RTP header P, X, CC and CSRC
   fields identical to the original packet format. The retransmitted
   packet payload would then carry the original PT, TS, SN and payload
   data. However, for the sake of simplicity we chose to have a copy of
   the original RTP packet in the retransmitted packet payload as
   described above.

5. Layered RTP transmission payload format

   This section defines a payload format for layered RTP transmission.
   The design of this payload format is based on [2].

   The goal of layered RTP transmission is to allow a data stream to be
   sent as separate RTP streams. This document does not provide a
   definition of a data stream.

   A well-known use of layered transmission is layered video coding,
   where the data stream is a set of encoded layers and each layer is
   sent as a separate RTP stream. However, the concept of layers, in
   this document, is not restricted to layered encoding.
   It can be applied to any stream where the classification of data
   into separate layers is beneficial to the application. In
   particular, it can be applied to send packets of different
   priorities as different layer RTP streams. The application can then
   define as many streams as the number of priorities it wants to use.

   Different layer RTP streams are sent either to the same transport
   address or to different transport addresses. If each layer is sent
   to its own transport address, the same SSRC value SHOULD be used
   across the different layers. Otherwise, each layer RTP stream MUST
   have its own SSRC.

   The packet structure is shown below:

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   |                          RTP header                           |
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |              XSN              |                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                               |
   |                   Media RTP Packet Payload                    |
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   The RTP header for a given layered RTP stream is used as follows.

   The V, P, X, CC, M and CSRC fields have their usual meaning.

   The PT value is dynamic. It MUST be associated out-of-band with both
   the layered RTP transmission format and the payload format used to
   encode the media in the RTP stream.

   The sequence number is incremented by one for each packet sent in
   that RTP layer.

   The SSRC value SHOULD be the same across layers if different
   transport addresses are used. The SSRC value MUST be different for
   each layer if the layers are sent to the same transport address.

   The packet payload is composed of a cross layer sequence number XSN,
   which is 16 bits long, followed by the media payload formatted
   according to its own payload format.

   The XSN field is incremented by one for each packet sent in the data
   stream, regardless of the layer onto which it is sent. This allows
   the RTP receiver to determine the decoding order of packets sent
   across multiple layer streams.

   The use of both a layer sequence number (i.e. the RTP header
   sequence number) and a cross layer sequence number allows the
   receiver to implement a robust packet loss detection mechanism. Upon
   reception of a packet in a given layer, the receiver is able to
   detect any packet loss which may have occurred on that layer thanks
   to the layer sequence number. In addition, it is able to detect,
   thanks to the cross layer sequence number, if a packet in another
   layer was lost.

   The receiver may also be able to know on which layer the packet loss
   occurred. In particular, in case of a single packet loss, the
   receiver can tell on which layer the packet was lost by checking the
   layer sequence numbers it has received for each layer. If only two
   layers are used, the receiver can always tell on which layer packets
   were lost, regardless of the number of consecutive lost packets.
   Therefore, the receiver may, upon reception of a packet on a given
   layer, report packet loss on another layer.

   As an example of packet loss detection, the following shows a
   sequence of packets when two layers B (base layer) and E
   (enhancement layer) are used. For each packet, the list gives its
   layer, its layer sequence number, its cross-layer sequence number
   and whether the packet is lost:

      E, SNe,   XSN
      B, SNb,   XSN+1
      E, SNe+1, XSN+2
      E, SNe+2, XSN+3, lost
      B, SNb+1, XSN+4, lost
      E, SNe+3, XSN+5, lost
      E, SNe+4, XSN+6, lost
      E, SNe+5, XSN+7

   Upon reception of the last packet in this series, the receiver is
   able to tell, thanks to the cross-layer sequence number, that 4
   packets were lost in total. Thanks to the layer sequence number, it
   can tell that three packets were lost on enhancement layer E and one
   packet was lost on layer B. It may then choose to report packet loss
   SNb+1 on the base layer only.

6. Combined use of retransmission and RTP layers

   If a receiver is allowed to send only limited feedback, it may
   choose to request retransmission depending on the layer this packet
   belongs to.
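
   The two-layer example of Section 5 and the layer-based selection
   described above can be sketched as follows. This is a minimal
   illustration only, assuming exactly two layers named "B" and "E",
   in-order arrival and no sequence number wrap-around. The function
   name losses_on_arrival and the concrete values SNe=10, SNb=20 and
   XSN=100 are hypothetical and are not defined by this draft.

```python
def losses_on_arrival(last_sn, last_xsn, layer, sn, xsn):
    """Attribute lost packets to layers when a new packet arrives.

    last_sn:  dict mapping layer name to the last layer SN received
    last_xsn: last cross-layer sequence number received on any layer
    layer, sn, xsn: fields of the packet that has just arrived

    Returns a dict mapping each layer to the list of lost layer SNs.
    Assumes two layers, in-order arrival and no wrap-around.
    """
    total_lost = xsn - last_xsn - 1              # gap in the XSN space
    own_lost = list(range(last_sn[layer] + 1, sn))  # gap on this layer
    other = "B" if layer == "E" else "E"
    other_count = total_lost - len(own_lost)     # remainder: other layer
    first = last_sn[other] + 1
    other_lost = list(range(first, first + other_count))
    return {layer: own_lost, other: other_lost}


# State after receiving (E, SNe, XSN), (B, SNb, XSN+1), (E, SNe+1, XSN+2)
# with the hypothetical values SNe=10, SNb=20, XSN=100.
last_sn = {"E": 11, "B": 20}
last_xsn = 102

# Packet (E, SNe+5, XSN+7) arrives after four packets were lost.
lost = losses_on_arrival(last_sn, last_xsn, "E", 15, 107)
print(lost)        # {'E': [12, 13, 14], 'B': [21]}

# With limited feedback, request retransmission for the base layer only.
to_request = lost["B"]
print(to_request)  # [21], i.e. SNb+1
```

   With more than two layers, the remainder of the XSN gap cannot be
   attributed to a single other layer in this way, so such a sketch
   would need per-layer state to be updated as packets arrive on each
   layer.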
   For example, in the case of layered video coding, a receiver could
   request retransmission only for base layer packets, or for the base
   layer and the first enhancement layer packets.

   When combining layered RTP streams and retransmission (with the
   retransmission format or the FEC format), it is possible to use
   either multiple RTP retransmission streams, i.e. each layered RTP
   stream has a corresponding RTP retransmission stream, or a single
   retransmission stream onto which retransmitted packets are sent
   regardless of the layer the packets belong to.

   There may be different reasons for a given application to implement
   a single or multiple retransmission streams.

   In the case where each layered RTP stream is sent to a separate
   multicast group, an implementation may choose to send multiple
   retransmission streams. Each retransmission stream could be sent to
   the same multicast group as its corresponding layered RTP stream.
   This way a receiver which is not subscribed to all the multicast
   groups will receive retransmitted packets only for the layers it is
   receiving.

   Retransmissions may be performed in unicast mode between the sender
   and each receiver in a multicast session so that packets are
   retransmitted only to receivers which request them. In that case, a
   single RTP retransmission stream is needed to retransmit packets
   from different layers.

   It is also possible to define the retransmitted stream as one of the
   layered RTP streams sent by the sender.
   In this case, the layered RTP format and retransmission format are
   concatenated as shown below:

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   |                          RTP header                           |
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |              XSN              |                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                               |
   |                      Original RTP Packet                      |
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Sending the retransmission stream as a layer stream allows receivers
   to detect the loss of retransmitted packets faster, thanks to the
   loss detection mechanism described in Section 5.

7. Signalling with SDP

FEC usage with SDP

   If the FEC payload format is used for retransmission, it is
   signaled with SDP as described in [9].

RTX usage with SDP

   If the RTX payload format defined above is used for retransmission,
   it can be signaled with SDP in a very similar way to FEC. The
   following text is adapted from [9]:

   The payload type number for the RTX stream is conveyed in the m line
   description of the original media stream, listed as if it were
   another valid encoding for the stream. There is no static payload
   type assignment for RTX, so dynamic payload type numbers MUST be
   used. The binding to the number is indicated by an rtpmap attribute.
   The name used in this binding is "retransmission".

   This does not necessarily mean that RTX is sent to the same
   transport address as the media. Instead, this information is
   conveyed through an fmtp attribute line. The presence of the RTX
   payload type on the m line of the media serves only to indicate to
   the receiver the RTP retransmission stream associated with its media
   stream.

   The format for the fmtp line for RTX is:

      a=fmtp:number port network-type address-type connection-address

   where 'number' is the payload type number present in the m line.
   'port' is the port number to which RTX is sent. The remaining three
   items - network type, address type, and connection address - have
   the same syntax and semantics as the c line from SDP.
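
   A receiver could split such an fmtp line as sketched below. The
   sketch assumes the five space-separated items described above
   (payload type number, port, network type, address type, connection
   address), as in the line a=fmtp:78 49172 IN IP4 224.2.17.12/127.
   The function name parse_rtx_fmtp and the dictionary keys are
   hypothetical and are not defined by this draft or by SDP.

```python
def parse_rtx_fmtp(line):
    """Split an RTX fmtp attribute line into its five items.

    Assumed layout (hypothetical helper, not a normative parser):
    a=fmtp:NUMBER PORT NETTYPE ADDRTYPE CONNECTION-ADDRESS
    """
    prefix = "a=fmtp:"
    if not line.startswith(prefix):
        raise ValueError("not an fmtp attribute line")
    fields = line[len(prefix):].split()
    if len(fields) != 5:
        raise ValueError("expected 5 items, got %d" % len(fields))
    number, port, nettype, addrtype, address = fields
    return {
        "payload_type": int(number),
        "port": int(port),
        # The last three items mirror the c line and could be handed
        # to whatever code already parses c lines.
        "network_type": nettype,
        "address_type": addrtype,
        "connection_address": address,
    }


info = parse_rtx_fmtp("a=fmtp:78 49172 IN IP4 224.2.17.12/127")
print(info["port"], info["connection_address"])  # 49172 224.2.17.12/127
```
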
   Reusing the c line syntax allows the fmtp line to be partially
   parsed by the same parser used on the c lines.

   The following is an example SDP for RTX:

      m=audio 49170 RTP/AVP 0 78
      c=IN IP4 224.2.17.12/127
      a=rtpmap:78 RTX/8000
      a=fmtp:78 49172 IN IP4 224.2.17.12/127

   The payload format 0 indicates that the original data is PCM audio.
   The payload format 78 indicates that retransmission is sent to the
   same multicast group and TTL as the audio, but on a port number two
   higher (49172).

Layered RTP payload format with SDP

   The payload type number for a layered RTP stream is conveyed in the
   m line of the media. There is no static payload type assignment for
   the layered RTP format, so dynamic payload type numbers MUST be
   used. The binding to the payload type number is indicated by an
   rtpmap attribute. The name used in the binding is "layer-XXX" where
   "XXX" is the actual media MIME subtype. The encoding parameters are
   then mapped as usual. This signals both that the layered RTP payload
   format is used and the actual payload format for the media encoding.

   The following is an example for MPEG-4 video:

      m=video 49170/2 RTP/AVP 78
      a=rtpmap:78 layer-MP4V-ES/90000
      a=fmtp:98 ... MPEG-4 payload format specific options

8. Security consideration

   Retransmission may cause network congestion. The packet
   retransmission rate should therefore be controlled.

9. Intellectual property considerations

   Nokia has filed patent applications that might possibly have
   technical relations to this contribution. On IPR related issues,
   Nokia refers to the Nokia Statement on Patent licensing, see
   http://www.ietf.org/ietf/IPR/NOKIA

10. References

   [1]  Bradner, S., "Key words for use in RFCs to Indicate Requirement
        Levels", BCP 14, RFC 2119, March 1997.

   [2]  Nilsson, M., Dalby, D., O'Donnell, J., "Layered audiovisual
        coding for multicast distribution on IP network", PV2000.

   [3]  Schulzrinne, H., Casner, S., Frederick, R. and V. Jacobson,
        "RTP: A Transport Protocol for Real-Time Applications",
        Internet Draft draft-ietf-avt-rtp-new-08.txt, July 2000.

   [4]  Schulzrinne, H. and Casner, S., "RTP Profile for Audio and
        Video Conferences with Minimal Control", Internet Draft
        draft-ietf-avt-profile-new-09.txt, July 2000.

   [5]  Turletti, T. and Huitema, C., "RTP Payload Format for H.261
        Video Streams", RFC 2032, October 1996.

   [6]  Wenger, S. and Ott, J., "RTCP-based Feedback: Concepts and
        Message Timing Rules", Internet Draft
        draft-wenger-avt-rtcp-feedback-02.txt, March 2001.

   [7]  Fukunaga, S., Sato, N., Yano, K., Miyazaki, A., Hata, K.,
        Hakenberg, R. and Burmeister, C., "Low Delay RTCP Feedback
        Format", Internet Draft draft-fukunaga-low-delay-rtcp-02.txt,
        February 2001.

   [8]  Miyazaki, A., Fukushima, Wiebke, T., Hata, K., Hakenberg, R.,
        Burmeister, C., Takatori, N., Okumura, S. and Ohno, T., "RTP
        Retransmission Payload Format", Internet Draft
        draft-ietf-avt-rtp-selret-01.txt, February 2001.

   [9]  Rosenberg, J. and Schulzrinne, H., "An RTP Payload Format for
        Generic Forward Error Correction", RFC 2733, December 1999.

11. Author's Addresses

   David Leon
   Nokia Research Center
   6000 Connection Drive        Phone:  1-972-374-1860
   Irving, TX. USA              Email:  david.leon@nokia.com

   Viktor Varsa
   Nokia Research Center
   6000 Connection Drive        Phone:  1-972-374-1861
   Irving, TX. USA              Email:  viktor.varsa@nokia.com