Network Working Group                                         M. Ramalho
Internet-Draft                                       Cisco Systems, Inc.
Expires: August 29, 2003                               February 28, 2003


                    RTP Payload Format for RGL Codec
                   draft-ramalho-rgl-rtpformat-01.txt

Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on August 29, 2003.

Copyright Notice

   Copyright (C) The Internet Society (2003). All Rights Reserved.

Abstract

   This document describes the RTP payload format for the RGL Codec
   (Version 1.0.0) described in draft-ramalho-rgl-desc-01.txt [4] and
   documented fully at www.vovida.org [10]. The necessary details for
   the use of the RGL codec with SDP are included in this document.

Ramalho                 Expires August 29, 2003                 [Page 1]

Internet-Draft      RTP Payload Format for RGL Codec       February 2003

Table of Contents

   1.   Introduction
   2.   Conventions
   3.   RGL Frame Specifics Necessary for the Understanding of the
        Proposed RTP Format
   4.   RTP Payload Format for RGL Codec
   4.1  Type One: One RGL frame in RTP payload
   4.2  Type Two: Two RGL Frames with Identical Number of Samples in
        RTP Payload
   4.3  Type Three: Three RGL Frames with Identical Number of Samples
        in RTP Payload
   4.4  Type Four: A Multiplicity of Fully Specified RGL Frames in RTP
        payload
   4.5  Notes on RTP Decoding
   5.   SDP Issues
   6.   Security Considerations
   7.   IANA considerations
        References
        Author's Address
        Intellectual Property and Copyright Statements

1. Introduction

   The RGL (short for Ramalho G.711 Lossless) Codec obtains lossless
   compression of speech/audio packet payloads encoded with ITU-T
   Recommendation G.711 PCM (mu-law or A-law) with trivial complexity
   and virtually no delay. The RGL Codec (Version 1.0.0) is described
   in draft-ramalho-rgl-desc-01.txt [4] and documented fully at
   www.vovida.org [11]. The RGL codec is freeware subject to the
   Vovida.org licensing terms found at www.vovida.org [12].

   The RGL codec performs lossless compression of G.711 encoded frames
   of arbitrary frame length. However, as described in the RGL Codec
   whitepaper at www.vovida.org [13], the recommended frame size for
   optimal RGL compression gains during (8 kHz sampled) speech segments
   is less than 20 milliseconds. For example, ten milliseconds is near
   optimum and corresponds to exactly 80 samples of 8 kHz sampled
   (G.711) speech/audio input. The RTP payload format described below
   accommodates RGL frames with up to 255 samples (31.875 milliseconds
   of speech/audio) when multiple RGL frames are packetized in a single
   packet.
   As the interfaces at the RGL end systems are often PSTN/GSTN
   networks, note that a range of up to 255 (G.711) samples includes
   convenient frame sizes for optimal transcoding into payloads of
   virtually all PSTN/GSTN transport systems (e.g., ATM VoAAL2 frame of
   44 bytes/5.5 milliseconds or ATM VoAAL5 frame of 48 bytes/6.0
   milliseconds).

   Additionally, the RTP payload format to be described allows for
   multiple RGL encoded frames (each of up to 255 samples) to be
   transported in one RTP packet. For example, this allows for two
   10 millisecond RGL frames per RTP packet (20 milliseconds being a
   common VoIP payload inter-packet interval). The 255 sample RGL frame
   limitation does not apply to RTP payloads containing only one RGL
   frame.

   The words "byte" and "octet" are used interchangeably in this
   document to denote eight bit words. In conformance with the Internet
   Protocol, all fields are carried in network byte order, that is,
   most significant byte (octet) first. Within a byte, the most
   significant bit is transmitted first. This byte order is commonly
   known as big endian. In this specification, bytes and bits shown on
   the left are more significant.

2. Conventions

   The keywords MUST, MUST NOT, REQUIRED, SHALL, SHALL NOT, SHOULD,
   SHOULD NOT, RECOMMENDED, NOT RECOMMENDED, MAY, and OPTIONAL, when
   they appear in this document, are to be interpreted as described in
   RFC2119 [2].

3. RGL Frame Specifics Necessary for the Understanding of the Proposed
   RTP Format

   This section outlines the RGL codec framing necessary for
   understanding the proposed RGL RTP frame format.

   The RGL codec compresses a frame of Y G.711 bytes into a compressed
   RGL frame that can be of length 1 to (Y+1) bytes.
   There are two implications of this compression relative to a native
   G.711 RTP payload: 1) one cannot a priori determine the length of a
   received RGL frame in bytes, and 2) on a relatively infrequent
   basis, the RGL frame is one byte *longer* than the corresponding
   native G.711 frame.

   The first implication means that, if more than one RGL frame is
   packetized in one RTP payload, information must be placed in the RTP
   payload (specifically a Table of Contents immediately after the RTP
   header) to demarcate the RGL frame boundaries. Note that this is not
   necessary for native G.711 RTP encodings, as Y bytes of G.711 RTP
   payload is equal to exactly Y G.711 samples.

   The second implication is that the RGL encoding occasionally expands
   the RGL frame relative to G.711. When expansion occurs, it has been
   explicitly limited to one byte in the design of the RGL codec.

   To accommodate a Table of Contents (TOC), when necessary, at the
   beginning of the RTP payload described below, the RGL codec was
   revised to Version 1 to create seven "reserved first RGL byte"
   codes. This was done by deleting an "anchor codepoint" that was
   never used for real world signals (see draft-ramalho-rgl-desc-01.txt
   [4] or documents at www.vovida.org [14] for details). The end result
   is that a (version 1.0.0 or higher) RGL encoder will never produce a
   first byte of the form {XXX11110} where {XXX} is not equal to {000}.
   In other words, the first byte in an RGL encoded frame will not be
   0x3E, 0x5E, 0x7E, 0x9E, 0xBE, 0xDE or 0xFE. This modification was
   accomplished such that a Version 1.0.0 (or higher) RGL encoder is
   backwards compatible with previous version RGL decoders (i.e., a RGL
   frame produced by a Version 1.0.0 RGL encoder will be successfully
   decoded by an earlier version RGL decoder).

   Knowing that these "reserved codes" can never be produced as the
   first byte of an RGL frame, we can use these codes to create a TOC
   for the various packetization cases to be described below.
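   The reserved-code rule above reduces to a bit test on a single byte.
   The following Python sketch is illustrative only; the names
   RESERVED_FIRST_BYTES and is_reserved_first_byte are hypothetical and
   are not taken from the RGL reference code.

```python
# The seven "reserved first RGL byte" codes listed in the text.
RESERVED_FIRST_BYTES = frozenset({0x3E, 0x5E, 0x7E, 0x9E, 0xBE, 0xDE, 0xFE})


def is_reserved_first_byte(value: int) -> bool:
    """Return True if value has the form XXX11110 with XXX != 000."""
    low_five_bits = value & 0x1F   # the "11110" part of the pattern
    top_three_bits = value & 0xE0  # the "XXX" part of the pattern
    return low_five_bits == 0x1E and top_three_bits != 0


# The bit test and the explicit list describe the same set of byte values.
matches = {b for b in range(256) if is_reserved_first_byte(b)}
assert matches == RESERVED_FIRST_BYTES
```

   Note that 0x1E ({000}{11110}) is not reserved, which is why the test
   on the top three bits is needed.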
   If one of these seven "reserved codes" is not the first byte in the
   RTP payload, then the RGL payload consists of only one RGL frame
   (Type One packetization below). For all other packetizations (more
   than one RGL frame in the RTP payload), one of the "reserved codes"
   will be the first byte of the RTP payload, and the particular
   reserved code will specify a TOC to describe the RGL frame
   packetization within the RTP payload.

   As a major goal of compression is to reduce overall session
   bandwidth, including RTP overhead, the proposed RTP format below has
   optimizations to reduce RTP header overhead for the most common RGL
   codec packetizations and the RGL compression characteristics noted
   above.

4. RTP Payload Format for RGL Codec

   The RTP payload format for the RGL codec conforms to the Real-Time
   Transport Protocol (RTP [5]) RFC 1889 in every aspect. An RTP packet
   for the RGL codec looks like:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |V=2|P|X|  CC   |M|     PT      |       sequence number         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                           timestamp                           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           synchronization source (SSRC) identifier            |
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
   \            contributing source (CSRC) identifiers             \
   /                     (zero up to fifteen)                      /
   \                             .....                             \
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   \               Zero or One RTP Header Extension                \
   /                      (only if X bit =1)                       /
   \                             .....                             \
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
   \                                                               \
   /  RTP PAYLOAD (TOC, if required, and one or more RGL frames)   /
   \                                                               \
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                                Figure 1

   The first twelve octets are present in every RTP packet as per RFC
   1889, while the list of CSRC identifiers is present only when
   inserted by a mixer.

   This profile follows the RTP profile recommendations in RFC 1890
   [6]. There are two bits in the mandatory twelve octet header that
   are defined by this profile, the extension (X) bit and the marker
   (M) bit.

   The payload specification used by this profile does not make use of
   the RTP header extension field to specify the RGL frame(s) in the
   RTP payload. Thus no payload extensions are defined herein and the
   usual setting of the X bit is zero. Note, however, that as per RFC
   1890 [6], RTP applications SHOULD NOT assume that the RTP header X
   bit is always zero and should be prepared to ignore the header
   extension.

   The RGL codec obtains high compression during periods of silence and
   low noise. For this reason, the use of VAD/SS (voice activity
   detection/silence suppression) is NOT RECOMMENDED. Therefore the
   value of the M bit MUST be set to zero if the RGL application does
   not use VAD/SS. If VAD/SS is employed despite the recommendation
   against using it, then the value of the M bit MUST be set consistent
   with RFC 1890 [6] (i.e., set to one to mark the beginning of a
   talkspurt).

   The following four subsections describe the four different types of
   RGL payloads: one RGL frame per RTP packet, two RGL frames per RTP
   packet (each with the same number of G.711 samples in each RGL
   frame), three RGL frames per RTP packet (each with the same number
   of G.711 samples in each RGL frame) and an arbitrary number of RGL
   frames per RTP payload (each with an arbitrary number of samples in
   each RGL frame in the payload).
4.1 Type One: One RGL frame in RTP payload

   Exactly one RGL frame is placed in the RTP payload. For this common
   case, the RGL frame is simply placed in the RTP payload by the RTP
   payload format encoder. Note from the text in the previous RGL
   detail section that, whatever the first byte of the RGL frame is, it
   will not be one of the "RGL reserved codes" (0x3E, 0x5E, 0x7E, 0x9E,
   0xBE, 0xDE and 0xFE). All packetizations other than this one require
   a TOC, and the first byte of the TOC is always one of the reserved
   codes. Thus, the RTP decoder identifies this packetization case by
   noting that the first byte is not one of these reserved codes.

   The number of samples contained in this payload is specified by the
   SDP [7] "ptime" parameter and is exactly (ptime*8) samples, with
   ptime expressed in milliseconds. If ptime isn't explicitly signaled
   in SDP, the default ptime for the RGL codec is used (which is
   defined later in the SDP section of this document to be 20 msec). As
   described in draft-ramalho-rgl-desc-01.txt [4], the RGL payload for
   a ptime=10 millisecond payload (this would map to 10*8 = 80 G.711
   samples) can be from one to 81 bytes long. If one desires to
   explicitly note the number of samples in a single RGL frame payload
   without using the SDP "ptime" parameter, RGL packetization "Type
   Four" (Section 4.4) MUST be used.

   The entire RTP payload is passed to the RGL decoder at the far-end
   by the RTP payload format decoder.

   The RTP encoder MAY choose to align the RTP payload to 32-bit word
   boundaries, although it is more bandwidth efficient not to do so.
   RGL frames are padded with zero bits by the RGL encoder to be an
   integer number of bytes long, but are not 32-bit word aligned. If
   the RTP encoder does align to 32-bit word boundaries, this is not a
   problem for the RGL decoder. That is, the RGL decoder calculates the
   end of the RGL frame by knowing the number of bits per sample (this
   is encoded in the first RGL byte) and the number of samples
   represented in the RGL frame (specified here via "ptime"). Thus, any
   extra bytes passed to the far-end RGL decoder after the end of an
   RGL frame are inconsequential to decoding the RGL frame (they are
   not used).

4.2 Type Two: Two RGL Frames with Identical Number of Samples in RTP
    Payload

   Exactly two RGL frames are placed in the RTP packet (each with the
   same number of G.711 samples in each RGL frame).

   The TOC for this case is a simple one: the "reserved code" 0x3E
   followed by the unsigned char "RGL_Size_1", followed by the two RGL
   frames.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0 0 1 1 1 1 1 0|  RGL_Size_1   |                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                               +
   \           Remainder of RTP PAYLOAD: Two RGL Frames            \
   /        (second RGL frame begins (RGL_Size_1+1+2) bytes        /
   \                     into the RTP payload)                     \
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                                Figure 2

   Field        Length       Meaning
   --------------------------------------------------------------------
   RGL_Size_1   8 bits       The first RGL frame begins immediately
                (unsigned,   after the RGL_Size_1 parameter.
                uchar8)      RGL_Size_1 is set to one less than the
                             length of the first RGL frame. Thus, the
                             size of this first RGL frame is exactly
                             [RGL_Size_1+1] bytes. Therefore, the
                             second RGL frame begins "RGL_Size_1+2"
                             bytes after the RGL_Size_1 parameter (or
                             equivalently "RGL_Size_1+3" bytes into
                             the RTP payload, as shown in Figure 2).

   Note that since a RGL frame length is at most one byte longer than
   the corresponding G.711 frame length, this packetization allows for
   RGL frames containing a maximum of 255 G.711 samples (31.875
   milliseconds of G.711) to be placed in this format.
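   The Type Two layout above can be exercised with a short Python
   sketch. This is a minimal illustration under the assumption that the
   caller already has the raw RTP payload bytes; the function name
   split_type_two and its error handling are hypothetical and are not
   part of this specification.

```python
from typing import Tuple

TYPE_TWO_CODE = 0x3E  # reserved first byte that announces a Type Two TOC


def split_type_two(payload: bytes) -> Tuple[bytes, bytes]:
    """Split a Type Two RTP payload into its two RGL frames."""
    # Smallest legal payload: code byte, RGL_Size_1, and two 1-byte frames.
    if len(payload) < 4 or payload[0] != TYPE_TWO_CODE:
        raise ValueError("not a Type Two payload")
    first_len = payload[1] + 1       # RGL_Size_1 is the frame length minus one
    second_start = 2 + first_len     # equals RGL_Size_1 + 3 bytes into payload
    if second_start >= len(payload):
        raise ValueError("payload too short for second RGL frame")
    # Anything after the first frame is handed over as the second frame;
    # trailing alignment bytes, if any, are left for the RGL decoder to ignore.
    return payload[2:second_start], payload[second_start:]


first, second = split_type_two(bytes([0x3E, 2, 0xA1, 0xA2, 0xA3, 0xB1, 0xB2]))
assert first == bytes([0xA1, 0xA2, 0xA3])
assert second == bytes([0xB1, 0xB2])
```

   The length of the second frame is not carried in the TOC, so the
   sketch simply passes the remainder of the payload to the decoder.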
   The oldest RGL frame MUST occur first, as per guidelines in
   draft-ietf-avt-profile-new-12 [9].

   The number of samples contained in this payload is specified by the
   SDP [7] "ptime" parameter and is exactly (ptime*8) samples.
   Therefore the number of samples in *each* RGL frame is exactly
   (ptime*8/2) samples. If ptime isn't explicitly signaled in SDP, the
   default ptime for the RGL codec is used (which is defined later in
   the SDP section of this document to be 20 msec). If one desires to
   explicitly note the number of samples in each of the RGL frames
   *without* using the SDP "ptime" parameter, RGL "Type Four"
   packetization (Section 4.4) MUST be used.

   Although the RTP payload is illustrated above as 32-bit word
   aligned, this need not be the case. RGL frames are padded with zero
   bits by the RGL encoder to be an integer number of bytes long, but
   are not 32-bit word aligned. If the RTP encoder does align to 32-bit
   word boundaries, this is not a problem for the RGL decoder. That is,
   the RGL decoder calculates the end of the RGL frame by knowing the
   number of bits per sample (this is encoded in the first RGL byte)
   and the number of samples represented in the RGL frame (specified
   via "ptime"). Thus, any extra bytes passed to the far-end RGL
   decoder after the end of an RGL frame are inconsequential to
   decoding the RGL frame (they are not used). Aligning the RTP payload
   to 32-bit boundaries may therefore result in extra bytes being
   passed to the RGL decoder for the last RGL frame in the RTP payload;
   these extra bytes are likewise inconsequential.

   Note that this packetization is anticipated to be a commonly used
   one, as both 10 msec and 20 msec are common VoIP packetization
   intervals for speech applications. If the RGL coder utilizes a 10
   msec frame (instead of the slightly less compression efficient 20
   msec), it would produce two 10 msec RGL frames during this interval
   and this packetization could be used.

4.3 Type Three: Three RGL Frames with Identical Number of Samples in
    RTP Payload

   Exactly three RGL frames are placed in the RTP packet (each with the
   same number of G.711 samples in each RGL frame).

   The TOC for this case is a simple one: the "reserved code" 0x5E
   followed by the unsigned char "RGL_Size_1", followed by the unsigned
   char "RGL_Size_2", followed by three RGL frames.

   A 30 msec packetization interval is also a common packetization
   interval for speech applications (e.g., when interfacing to systems
   that transcode to/from G.723). The RGL codec could either encode one
   30 msec frame and use Type One packetization, or encode two 15 msec
   frames and use Type Two packetization (15 msec RGL frames are
   generally more compression efficient), or encode three 10 msec
   frames and use this packetization (10 msec RGL frames are generally
   even more efficient). As 10 msec appears to be a universal frame
   size for speech codecs (e.g., G.729), it is envisioned that a three
   RGL frame packetization format is desirable. Therefore this
   packetization format has been specifically developed. If a more
   general RGL frame packetization is required (e.g., more than three
   RGL frames in a RTP payload), the Type Four packetization described
   in Section 4.4 MUST be used.
    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0 1 0 1 1 1 1 0|  RGL_Size_1   |  RGL_Size_2   |               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+               +
   \                                                               \
   /          Remainder of RTP PAYLOAD: Three RGL Frames           /
   \                                                               \
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                                Figure 4

   Field        Length       Meaning
   --------------------------------------------------------------------
   RGL_Size_1   8 bits       The first RGL frame begins immediately
                (unsigned,   after the RGL_Size_2 parameter.
                uchar8)      RGL_Size_1 is set to one less than the
                             length of the first RGL frame. Thus, the
                             size of this first RGL frame is exactly
                             [RGL_Size_1+1] bytes. Therefore, the
                             second RGL frame begins "RGL_Size_1+2"
                             bytes after the RGL_Size_2 parameter (or
                             equivalently "RGL_Size_1+4" bytes into
                             the RTP payload).

   RGL_Size_2   8 bits       The second RGL frame begins immediately
                (unsigned,   after the first RGL frame. RGL_Size_2 is
                uchar8)      set to one less than the length of the
                             second RGL frame. Thus, the size of this
                             second RGL frame is exactly
                             [RGL_Size_2+1] bytes. Therefore, the
                             third RGL frame begins "RGL_Size_2+1"
                             bytes after the beginning of the second
                             RGL frame.

   Note that since a RGL frame length is at most one byte longer than
   the corresponding G.711 frame length, this packetization allows for
   RGL frames containing a maximum of 255 G.711 samples (31.875
   milliseconds of G.711) to be placed in this format.

   The oldest RGL frame MUST occur first, as per guidelines in
   draft-ietf-avt-profile-new-12 [9].

   The number of samples contained in this payload is specified by the
   SDP [7] "ptime" parameter and is exactly (ptime*8) samples.
   Therefore the number of samples in *each* RGL frame is exactly
   (ptime*8/3) samples.
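   The Type Three offsets and the per-frame sample count can be checked
   with a small Python sketch. It is illustrative only: the names
   split_type_three and samples_per_frame are hypothetical, ptime is
   assumed to be a whole number of milliseconds, and rejecting a ptime
   that does not divide evenly is a choice made for this sketch rather
   than a rule stated in this document.

```python
from typing import Tuple

TYPE_THREE_CODE = 0x5E  # reserved first byte that announces a Type Three TOC


def split_type_three(payload: bytes) -> Tuple[bytes, bytes, bytes]:
    """Split a Type Three RTP payload into its three RGL frames."""
    # Smallest legal payload: 3 TOC bytes followed by three 1-byte frames.
    if len(payload) < 6 or payload[0] != TYPE_THREE_CODE:
        raise ValueError("not a Type Three payload")
    first_len = payload[1] + 1             # RGL_Size_1 + 1
    second_len = payload[2] + 1            # RGL_Size_2 + 1
    second_start = 3 + first_len           # RGL_Size_1 + 4 bytes into payload
    third_start = second_start + second_len
    if third_start >= len(payload):
        raise ValueError("payload too short for third RGL frame")
    return (payload[3:second_start],
            payload[second_start:third_start],
            payload[third_start:])


def samples_per_frame(ptime_ms: int, frames: int) -> int:
    """Samples in each equal-sized frame at 8 samples per millisecond."""
    total = ptime_ms * 8
    if total % frames != 0:
        raise ValueError("ptime does not divide evenly across the frames")
    return total // frames


frames = split_type_three(bytes([0x5E, 1, 0, 0x11, 0x12, 0x21, 0x31, 0x32]))
assert frames == (bytes([0x11, 0x12]), bytes([0x21]), bytes([0x31, 0x32]))
assert samples_per_frame(30, 3) == 80
```

   With the 20 msec default ptime, 160 samples do not divide evenly by
   three, so a sender using this packetization would normally signal a
   ptime such as 30.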
   If ptime isn't explicitly signaled in SDP, the default ptime for the
   RGL codec is used (which is defined later in the SDP section of this
   document to be 20 msec). If one desires to explicitly note the
   number of samples in each of the RGL frames *without* using the SDP
   "ptime" parameter, RGL "Type Four" packetization (Section 4.4) MUST
   be used.

   Although the RTP payload is illustrated above as 32-bit word
   aligned, this need not be the case. RGL frames are padded with zero
   bits by the RGL encoder to be an integer number of bytes long, but
   are not 32-bit word aligned. If the RTP encoder does align to 32-bit
   word boundaries, this is not a problem for the RGL decoder. That is,
   the RGL decoder calculates the end of the RGL frame by knowing the
   number of bits per sample (this is encoded in the first RGL byte)
   and the number of samples represented in the RGL frame (specified
   here via "ptime"). Thus, any extra bytes passed to the far-end RGL
   decoder after the end of an RGL frame are inconsequential to
   decoding the RGL frame (they are not used). Aligning the RTP payload
   to 32-bit boundaries may therefore result in extra bytes being
   passed to the RGL decoder for the last RGL frame in the RTP payload;
   these extra bytes are likewise inconsequential.

4.4 Type Four: A Multiplicity of Fully Specified RGL Frames in RTP
    payload

   The Table of Contents (TOC) for this case is more complicated: the
   "reserved code" 0xFE followed by a number of parameters that depends
   on the number of RGL frames in the RTP packet, followed by the RGL
   frames themselves. The parameter "Num_Frames" is set to the number
   of RGL frames in the RTP payload. The following illustration is an
   example assuming four RGL frames.
    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |1 1 1 1 1 1 1 0|  RGL_Size_1   |  Num_Samps_1  |  Num_Frames   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  RGL_Size_2   |  Num_Samps_2  |  RGL_Size_3   |  Num_Samps_3  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  RGL_Size_4   |  Num_Samps_4  |                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                               +
   \       Remainder of RTP PAYLOAD: One or more RGL Frames        \
   /            (second RGL frame, if it exists, begins            /
   \        [RGL_Size_1+1] bytes after the end of the TOC)         \
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                                Figure 6

   Field        Length       Meaning
   --------------------------------------------------------------------
   RGL_Size_1   8 bits       The first RGL frame begins immediately
                (unsigned,   after the last TOC parameter. RGL_Size_1
                uchar8)      is set to one less than the length of the
                             first RGL frame. Thus, the size of this
                             first RGL frame is exactly [RGL_Size_1+1]
                             bytes.

   RGL_Size_2   8 bits       The beginning of the second RGL frame, if
                (unsigned,   present, is [(RGL_Size_1+1)+1] bytes
                uchar8)      after the last TOC parameter. RGL_Size_2
                             is set to one less than the length of the
                             second RGL frame. Thus, the size of the
                             second RGL frame, if present, is exactly
                             [RGL_Size_2+1] bytes.

   RGL_Size_j   8 bits       The beginning of RGL frame j (for j>2),
                (unsigned,   if present, is
                uchar8)      [(RGL_Size_1+1)+ ... +(RGL_Size_(j-1)+1)+1]
                             bytes after the last TOC parameter.
                             RGL_Size_j is set to one less than the
                             length of the jth RGL frame. Thus, the
                             size of the jth RGL frame, if present, is
                             exactly [RGL_Size_j+1] bytes.

   Num_Samps_j  8 bits       The number of samples in RGL encoded
                (unsigned,   frame j. As noted below, the range of
                uchar8)      this parameter is one to 255 samples.

   Num_Frames   8 bits       The number of RGL frames in the RTP
                (unsigned,   payload. Although this parameter has a
                uchar8)      range from 0 to 255, the number of RGL
                             frames in the RTP payload is limited to
                             the integer number of RGL frames that can
                             be placed in the packet MTU. In practice,
                             this will limit the number of RGL frames
                             to well below 255. A minimum of one RGL
                             frame MUST be present; zero is not a
                             valid value.

   The parameters which reference frame index "j" above go from one to
   Num_Frames, inclusive.

   This payload format has been designed to accommodate one or more RGL
   frames; as such it is expected that the TOC will contain, at a
   minimum, the first four bytes (a 32-bit TOC). In all cases, the
   number of RGL frames in the RTP payload is specified in the uchar
   parameter "Num_Frames". If more than one RGL frame is to be placed
   in the payload, additional {RGL_Size_j, Num_Samps_j} two-byte tuples
   are added to the TOC after the Num_Frames parameter (as illustrated
   above). The number of such additional two-byte tuples added after
   the first 32-bit word in the RTP payload is therefore
   (Num_Frames-1). The end of the TOC is defined as the "Num_Samps_j"
   byte that is associated with the last RGL frame in the payload (or
   the Num_Frames byte when only one RGL frame is present). The TOC is
   not necessarily 32-bit aligned (it is for an odd number of RGL
   frames in the payload, however). The length of the TOC is therefore
   (2+2*Num_Frames) bytes. The first RGL frame follows immediately
   after the end of the TOC.

   RGL_Size_j is set to one less than the length of the jth RGL frame.
   Thus, the size of the jth RGL frame is exactly [RGL_Size_j+1] bytes.
   Since the jth RGL frame size is a minimum of one byte and RGL_Size_j
   can take values from zero to 255, the length of the jth RGL frame
   can be from 1 to 256 bytes. Since a RGL frame is at most one byte
   longer than the corresponding G.711 frame, this packetization allows
   for RGL frames containing a maximum of 255 samples (31.875
   milliseconds of G.711).
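   The Type Four TOC walk can be sketched in Python as follows. This is
   a hedged illustration, not a reference implementation: it assumes
   the byte order drawn in Figure 6 (reserved code, RGL_Size_1,
   Num_Samps_1, Num_Frames, then one {RGL_Size_j, Num_Samps_j} tuple
   per additional frame), and the name parse_type_four and its return
   shape are hypothetical.

```python
from typing import List, Tuple

TYPE_FOUR_CODE = 0xFE  # reserved first byte that announces a Type Four TOC


def parse_type_four(payload: bytes) -> List[Tuple[int, bytes]]:
    """Return a list of (Num_Samps_j, RGL frame bytes), oldest frame first."""
    if len(payload) < 4 or payload[0] != TYPE_FOUR_CODE:
        raise ValueError("not a Type Four payload")
    num_frames = payload[3]
    if num_frames == 0:
        raise ValueError("Num_Frames of zero is not a valid value")
    toc_len = 2 + 2 * num_frames           # TOC length given in the text
    if len(payload) < toc_len:
        raise ValueError("truncated TOC")
    # Entry for frame 1 sits in the first 32-bit word (assumed Figure 6 order).
    entries = [(payload[1] + 1, payload[2])]
    # Entries for frames 2..Num_Frames follow as two-byte tuples.
    for offset in range(4, toc_len, 2):
        entries.append((payload[offset] + 1, payload[offset + 1]))
    frames = []
    position = toc_len                     # first frame follows the TOC
    for frame_len, num_samples in entries:
        if position + frame_len > len(payload):
            raise ValueError("truncated RGL frame")
        frames.append((num_samples, payload[position:position + frame_len]))
        position += frame_len
    return frames


example = bytes([0xFE, 1, 80, 2, 0, 40, 0x11, 0x22, 0x33])
assert parse_type_four(example) == [(80, bytes([0x11, 0x22])),
                                    (40, bytes([0x33]))]
```

   Because every frame length is explicit in the TOC, any alignment
   bytes after the last frame are simply never read by this sketch.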
   The oldest RGL frame MUST occur first, as per guidelines in
   draft-ietf-avt-profile-new-12 [9].

   The number of samples contained in any RGL frame in this payload is
   specified by the corresponding Num_Samps_j parameter and overrides
   any "ptime" parameter that may be specified via SDP [7]. Due to
   this, care SHOULD be exercised to set the sum of the individual
   Num_Samps_j parameters consistent with "ptime" (or the other way
   around) if this packetization is used in SDP environments (or other
   media negotiation environments). In other words, if SDP is used, the
   number of samples contained in the entire payload would be exactly
   ptime*8 samples; the sum of all the Num_Samps_j parameters should
   therefore equal ptime*8.

   Although the RTP payload is illustrated above as 32-bit word
   aligned, this need not be the case. RGL frames are padded with zero
   bits by the RGL encoder to be an integer number of bytes long, but
   are not 32-bit word aligned. If the RTP encoder does align to 32-bit
   word boundaries, this is not a problem for the RGL decoder. That is,
   the RGL decoder calculates the end of the RGL frame by knowing the
   number of bits per sample (this is encoded in the first RGL byte)
   and the number of samples represented in the RGL frame (explicitly
   provided in this packetization via the appropriate Num_Samps_j
   parameter). Thus, any extra bytes passed to the far-end RGL decoder
   after the end of an RGL frame are inconsequential to decoding the
   RGL frame (they are not used). Aligning the RTP payload to 32-bit
   boundaries may therefore result in extra bytes being passed to the
   RGL decoder for the last RGL frame in the RTP payload; these extra
   bytes are likewise inconsequential.

   The number of RGL frames in the RTP payload MUST be an integer
   number (no fractional RGL frames) and this number MUST be chosen
   such that the resulting IP/UDP/RTP packet does not exceed the MTU
   (no packet fragmentation).
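   Since each packetization in Sections 4.1 through 4.4 is identified
   by the first byte of the RTP payload, a receiver can choose the
   decoding path with a single dispatch. The Python sketch below is a
   minimal illustration; the function name classify_payload and the
   returned labels are hypothetical.

```python
def classify_payload(payload: bytes) -> str:
    """Name the RGL packetization indicated by the first payload byte."""
    if not payload:
        raise ValueError("empty RTP payload")
    first = payload[0]
    # Reserved codes have the form XXX11110 with XXX != 000.
    is_reserved = (first & 0x1F) == 0x1E and (first & 0xE0) != 0
    if not is_reserved:
        return "type-one"        # payload is exactly one RGL frame
    if first == 0x3E:
        return "type-two"        # two frames, TOC of two bytes
    if first == 0x5E:
        return "type-three"      # three frames, TOC of three bytes
    if first == 0xFE:
        return "type-four"       # fully specified TOC
    # 0x7E, 0x9E, 0xBE and 0xDE have no packetization defined here;
    # such a payload is not handed to the RGL decoder.
    return "undefined-reserved"


assert classify_payload(bytes([0x1E, 0x00])) == "type-one"
assert classify_payload(bytes([0x3E, 0x00, 0x01, 0x02])) == "type-two"
assert classify_payload(bytes([0x7E])) == "undefined-reserved"
```

   Returning a distinct label for the undefined reserved codes lets the
   caller drop the packet without treating it as a Type One frame.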
4.5 Notes on RTP Decoding

   The following algorithm MAY be used for the RTP decoding of a RGL
   RTP packet.

   Step One:    If the first byte of the RTP payload is of the form
                {XXX11110} where {XXX} != {000} (a "reserved" RGL code
                is present), go to Step Two. Otherwise use Type One
                packetization to decode the RGL frame.

   Step Two:    If the first byte of the RTP payload is 0x3E, use Type
                Two packetization to decode the two RGL frames present
                in the RTP payload. Otherwise continue to Step Three.

   Step Three:  If the first byte of the RTP payload is 0x5E, use Type
                Three packetization to decode the three RGL frames
                present in the RTP payload. Otherwise continue to Step
                Four.

   Step Four:   If the first byte of the RTP payload is 0xFE, use Type
                Four packetization to decode the multiple RGL frames
                present in the RTP payload. Otherwise continue to Step
                Five.

   Step Five:   A "reserved code" for which a packetization has not
                been defined by this profile has been found. Do not
                further process the RTP payload (i.e., do not present
                it to the RGL decoder).

   The reserved codes other than 0x3E, 0x5E and 0xFE (i.e., 0x7E, 0x9E,
   0xBE and 0xDE) are reserved for future use in defining additional
   RGL packetization formats or for other future purposes. Step Five
   above thus provides a backwardly compatible mechanism for the RGL
   RTP decoder (dropping the packets) when encountering
   yet-to-be-defined "reserved RGL codes".

5. SDP Issues

   Until further experience is gained with the RGL codec, it is
   RECOMMENDED that the RGL codec be referred to as: RGLv1 (v1 for
   version 1). Additionally, as the RGL codec is not a defined SDP [7]
   static codec type, it must use an SDP dynamic payload type and be
   referenced via an SDP rtpmap attribute.
   Excepting Type Four packetizations, note that the number of samples
   in a RTP payload is determined by the optional SDP parameter "ptime"
   [7]. If a ptime other than the default is desired, ptime MUST be
   specified in the SDP.

   Putting this together, we have an example SDP of:

      m=audio 49232 RTP/AVP 94
      a=rtpmap:94 RGLv1/8000
      a=ptime:10

   Here the dynamic payload type 94 is used (as an example), the
   sampling rate is identical to that of PCMU/PCMA (i.e., 8 kHz), and
   the RGL frame size for an RTP payload containing one RGL frame is 10
   milliseconds (i.e., 80 G.711 samples in each RGL frame for Type One
   packetization). As noted previously, knowledge of ptime is only
   required for Type One, Type Two or Type Three packetizations; for
   Type Four packetizations the length of each RGL frame (in bytes) and
   the number of G.711 samples in each RGL frame are determined via the
   RGL TOC at the beginning of the RTP payload (which overrides ptime,
   if present).

   If the ptime line is not specified in the SDP, then the default
   ptime of 20 msec is used. The default ptime for the RGLv1 codec is
   defined here to be 20 milliseconds, thereby making it consistent
   with the default G.711 (PCMU/PCMA) packetization interval in RFC
   1890 [6]. However, it is RECOMMENDED that ptime be set such that
   each RGL frame compresses approximately 10 milliseconds of speech
   (see RGL Codec Whitepaper at www.vovida.org [15] for rationale) or a
   more appropriate value for interworking with existing or legacy
   PSTN/GSTN endpoints (e.g., 5.5 milliseconds for ATM VoAAL2).

6. Security Considerations

   RTP packets using the payload format defined in this specification
   are subject to the general security considerations discussed in [5]
   and any appropriate profile (e.g., [6]).

   As this format transports encoded speech, the main security issues
   include confidentiality and authentication of the speech itself.
   The payload format itself does not have any built-in security
   mechanisms. Confidentiality of the media streams is achieved by
   encryption; therefore external mechanisms, such as SRTP [8], MAY be
   used for that purpose. The data compression used with this payload
   format is applied end-to-end; hence encryption may be performed
   after compression with no conflict between the two operations. Note
   also that the RGL payload format is self-describing; if padding of
   the RGL payload is required by the encryption operation, the
   decoding of the RGL payload can occur at the far-end without
   knowledge of the amount of padding.

   A potential Denial-Of-Service (DOS) threat exists for data encodings
   using compression techniques that have non-uniform receiver-end
   computational load. The attacker can inject pathological datagrams
   into the stream which are complex to decode (e.g., inject hard or
   impossible inverse root-finding situations) and cause the receiver
   to become overloaded. The RGL codec, due to its trivial complexity,
   has bounded receiver-end load for any "bogus RGL" compressed frames
   and thus does not suffer from this fate. The only known DOS attack
   is simply a stream of more frames than the RTP/DSP flow can
   accommodate.

7. IANA considerations

   When and if the RGL codec becomes mainstream, IANA registration may
   be necessary.

References

   [1]   Bradner, S., "The Internet Standards Process -- Revision 3",
         BCP 9, RFC 2026, October 1996.

   [2]   Bradner, S., "Key words for use in RFCs to Indicate
         Requirement Levels", BCP 14, RFC 2119, March 1997.

   [3]   Narten, T. and H. Alvestrand, "Guidelines for Writing an IANA
         Considerations Section in RFCs", BCP 26, RFC 2434, October
         1998.

   [4]   Ramalho, M., "RGL Codec Description Document",
         draft-ramalho-rgl-desc-01.txt (work in progress), February
         2003.
[5] Schulzrinne, H., Casner, S., Frederick, R. and V. Jacobson, "RTP: A Transport Protocol for Real-Time Applications", RFC 1889, January 1996.

[6] Schulzrinne, H., "RTP Profile for Audio and Video Conferences with Minimal Control", RFC 1890, January 1996.

[7] Handley, M. and V. Jacobson, "SDP: Session Description Protocol", RFC 2327, April 1998.

[8] Baugher, M., Blom, R., Carrara, E., McGrew, D., Naslund, M., Norrman, K. and D. Oran, "The Secure Real-time Transport Protocol", draft-ietf-avt-srtp-05.txt (work in progress), June 2002.

[9] Casner, S. and H. Schulzrinne, "RTP Profile for Audio and Video Conferences with Minimal Control", draft-ietf-avt-profile-new-12 (work in progress), November 2001.

[10]

[11]

[12]

[13]

[14]

[15]

Author's Address

Michael A. Ramalho
Cisco Systems, Inc.
1802 Rue de la Porte
Wall Township, NJ 07719-3784
USA

Phone: +1.941.708.4650
EMail: mramalho@cisco.com

Intellectual Property Statement

The IETF takes no position regarding the validity or scope of any intellectual property or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; neither does it represent that it has made any effort to identify any such rights. Information on the IETF's procedures with respect to rights in standards-track and standards-related documentation can be found in BCP-11. Copies of claims of rights made available for publication and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementors or users of this specification can be obtained from the IETF Secretariat.
The IETF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights which may cover technology that may be required to practice this standard. Please address the information to the IETF Executive Director.

Full Copyright Statement

Copyright (C) The Internet Society (2003). All Rights Reserved.

This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works. However, this document itself may not be modified in any way, such as by removing the copyright notice or references to the Internet Society or other Internet organizations, except as needed for the purpose of developing Internet standards in which case the procedures for copyrights defined in the Internet Standards process must be followed, or as required to translate it into languages other than English.

The limited permissions granted above are perpetual and will not be revoked by the Internet Society or its successors or assignees.

This document and the information contained herein is provided on an "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Acknowledgement

Funding for the RFC Editor function is currently provided by the Internet Society.