Internet Draft                                            Ari Lakaniemi
Document: draft-lakaniemi-avt-rtp-amr-00.txt          Petri Koskelainen
March 10, 2000                                                    Nokia
Expires September 2000


                      RTP Payload Format for AMR


Status of this Memo

This document is an Internet-Draft and is in full conformance with
all provisions of Section 10 of RFC 2026.

Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as
Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six
months and may be updated, replaced, or obsoleted by other documents
at any time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.

1. Abstract

This document specifies the encapsulation of AMR (Adaptive
Multi-Rate) speech codec frames into the payload of the Real-time
Transport Protocol (RTP). The format enables encapsulation of one or
several AMR speech frames into one RTP packet. Mode adaptation and
discontinuous transmission (DTX) are also supported.

2. Conventions used in this document

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [1].

3. Introduction

This document specifies the encapsulation of Adaptive Multi-Rate
(AMR) speech codec [3] frames into the payload of the Real-Time
Transport Protocol (RTP) [2]. The AMR codec is a speech codec
developed by the European Telecommunications Standards Institute
(ETSI).
The AMR codec is standardized for GSM, and it has also been chosen by
the Third Generation Partnership Project (3GPP) as the mandatory
speech codec for third generation systems. AMR provides high speech
quality under a wide range of transmission conditions and is also
well suited to non-mobile applications.

AMR includes eight different speech coding modes, whose bit rates
range from 4.75 to 12.2 kbit/s. The sampling rate is 8000 Hz and
processing is performed on 20 ms frames. Three of the AMR speech
coding modes are also specified as speech codecs in other standards:
the 6.7 kbit/s mode as the ACELP codec specified in section 5.4 of
[4] (PDC-EFR), the 7.4 kbit/s mode as the IS-641 codec in TDMA [5],
and the 12.2 kbit/s mode as GSM EFR [6].

An AMR implementation according to [3] must support all eight coding
modes. A mode change can occur at any time during operation;
therefore the mode information is transmitted as part of each AMR
frame, which allows mode changes without any additional signaling.

The decoder may want to receive a particular AMR mode, e.g. for
capacity or quality reasons. This can be signaled to the other
end-point by including a mode request in a transmitted packet.

In addition to the speech codec, the AMR specifications also include
Discontinuous Transmission / Comfort Noise (DTX/CN) functionality
[7]. DTX/CN switches transmission off during silent parts of the
speech, and only CN parameter updates are sent at regular intervals.
The three codec standards that are part of AMR also have their own
DTX/CN schemes ([4][5][8]). To enable interoperability with
terminals supporting these standards, AMR can optionally support
these additional CN schemes.

4. Payload format

The RTP payload format for the AMR codec consists of a
variable-length payload header, followed by one or more AMR payload
frames. In most cases the actual payload data does not fill a whole
number of octets.
In these cases the unused bits in the last octet of the payload are
set to 0.

4.1. AMR payload header

The length of the AMR payload header is either 1 or 5 bits. The
header bits are defined as follows:

R (1 bit): Indicates whether the Mode Request field is present
   (R=1) or absent (R=0).

MR (4 bits): Optional field which is present only if R=1. If
   present, this field is used to request a specific AMR mode from
   the other end-point of the communication. The frame type indices
   from 0 to 7 (see Table 1) can be used in a mode request.

   +-+
   |R|
   +-+

Figure 1: Payload header with R=0

   +-+-+-+-+-+
   |R|  MR   |
   +-+-+-+-+-+

Figure 2: Payload header with R=1. Bits are stored into the MR field
from LSB to MSB.

Note that in multicast it is possible to receive conflicting mode
requests from different receivers. Therefore, in multicast the
sender can choose to ignore mode requests.

4.2. AMR payload frame

An AMR payload frame has variable size and consists of a 4-bit frame
type field, followed by the AMR speech or CN bits. Note that the AMR
payload frame format is exactly the same as the AMR Interface Format
2 (AMR IF2) defined in Annex A of [9]. The AMR payload frame is
defined as follows:

FT (4 bits): Indicates the mode of the AMR payload frame. The
   mapping of FT bits into AMR frame type is shown in Table 1.

SP (N bits): The speech/CN bits. The number of speech/CN bits
   depends on the frame type and is shown for each AMR frame type in
   Table 1. The bit order for all frame types is defined in [9].

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+- /// -+-+
   |  FT   |       Speech/CN bits        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+- /// -+-+

Figure 3: AMR payload frame. Bits are stored into the FT field from
LSB to MSB.
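The field layout described above can be sketched in C. This is a
minimal, non-normative sketch: the names bit_writer, bw_init, bw_put,
bw_octets, amr_put_header, amr_put_frame_type and amr_put_speech_bits
are hypothetical and are not defined by this specification. The sketch
assumes that fields are appended back to back, least significant bit
first, starting at the LSB of the first octet (the octet filling rule
given in Section 5), and that the caller supplies the speech/CN bits
already in the order defined in [9].

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical LSB-first bit writer (illustration only). */
typedef struct {
    uint8_t *buf;  /* output octets, zeroed by bw_init */
    size_t   cap;  /* capacity of buf in octets */
    size_t   pos;  /* next bit position; bit 0 is the LSB of octet 0 */
} bit_writer;

void bw_init(bit_writer *w, uint8_t *buf, size_t cap)
{
    memset(buf, 0, cap);  /* unused trailing bits therefore stay 0 */
    w->buf = buf;
    w->cap = cap;
    w->pos = 0;
}

/* Append the nbits low-order bits of value, least significant first. */
void bw_put(bit_writer *w, unsigned value, unsigned nbits)
{
    assert(nbits <= 16);
    assert(w->pos + nbits <= w->cap * 8);
    for (unsigned i = 0; i < nbits; i++) {
        if ((value >> i) & 1u)
            w->buf[w->pos / 8] |= (uint8_t)(1u << (w->pos % 8));
        w->pos++;
    }
}

/* Octets occupied so far, counting a partly filled last octet. */
size_t bw_octets(const bit_writer *w)
{
    return (w->pos + 7) / 8;
}

/* Payload header: R alone (1 bit), or R=1 followed by the 4-bit MR.
 * Pass mode_request < 0 when no mode is requested. */
void amr_put_header(bit_writer *w, int mode_request)
{
    if (mode_request < 0) {
        bw_put(w, 0, 1);                       /* R = 0 */
    } else {
        assert(mode_request <= 7);             /* frame types 0..7 */
        bw_put(w, 1, 1);                       /* R = 1 */
        bw_put(w, (unsigned)mode_request, 4);  /* MR */
    }
}

/* 4-bit frame type that starts every AMR payload frame. */
void amr_put_frame_type(bit_writer *w, unsigned ft)
{
    assert(ft <= 15);
    bw_put(w, ft, 4);
}

/* Append n speech/CN bits, one per array element (0 or 1),
 * assumed to be already in the order defined in [9]. */
void amr_put_speech_bits(bit_writer *w, const uint8_t *bits, size_t n)
{
    for (size_t i = 0; i < n; i++)
        bw_put(w, bits[i] & 1u, 1);
}
```

For the payload shown in Figure 4 (a mode request with MR=5, a frame
with FT=1 and 103 speech bits, then a frame with FT=0 and 95 speech
bits), the calls amr_put_header(&w, 5), amr_put_frame_type(&w, 1),
amr_put_speech_bits(...), amr_put_frame_type(&w, 0) and
amr_put_speech_bits(...) leave the writer at bit position 211, so
bw_octets() reports 27 octets, and the first octet is 0x2B.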
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |R|  MR   |  FT1  |                                             |
   +-+-+-+-+-+-+-+-+-+                                             +
   |                        SP1 (103 bits)                         |
   +                                                               +
   |                                                               |
   +                               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                               |  FT2  |                       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                       +
   |                                                               |
   +                         SP2 (95 bits)                         +
   |                                                               |
   +                                     +-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Figure 4: An RTP payload for AMR with mode request (R=1) and two AMR
payload frames (a 5.15 kbit/s frame followed by a 4.75 kbit/s frame).

   Frame type                      Speech/CN
   index        Mode               bits
   -----------------------------------------
    0           AMR 4.75            95
    1           AMR 5.15           103
    2           AMR 5.9            118
    3           AMR 6.7            134
    4           AMR 7.4            148
    5           AMR 7.95           159
    6           AMR 10.2           204
    7           AMR 12.2           244
    8           AMR CN              39
    9           GSM EFR CN          43
   10           IS-641 CN           38
   11           PDC-EFR CN          37
   12 - 14      For future use       -
   15           No transmission      0

Table 1: Definition of AMR frame types

5. Payload octet structure

The AMR payload is stored into octets starting from the LSB of the
first octet and filling each octet from LSB to MSB. Any unused bits
at the MSB end of the last octet of the payload are set to 0.

The octet structure is constructed as defined by the C-like pseudo
code below. Note that in this pseudo code the LSB is bit 0 and the
MSB is bit 7.
   Nh     - number of header bits in the payload
   Nf     - number of payload frames
   N(j)   - number of payload frame bits in frame j
   h(j)   - bit j of the payload header
   f(n,k) - bit k of the payload frame n
   b(n,k) - bit k in payload octet n
   UB     - denotes unused bit (set to value 0)

   /* header bits always fit into the first octet (Nh <= 5) */
   for (j = 0; j < Nh; j++)
      b(0,j) = h(j);
   c = Nh;                       /* number of bits written so far */

   for (j = 0; j < Nf; j++) {
      for (i = 0; i < N(j); i++) {
         n = c / 8;
         k = c % 8;
         b(n,k) = f(j,i);
         c++;
      }
   }

   /* pad only a partly filled last octet */
   if (c % 8 != 0) {
      n = c / 8;
      for (k = c % 8; k < 8; k++)
         b(n,k) = UB;
   }

Example: the payload illustrated in Figure 4 is packed into octets as
shown below. A mode request is used (R=1), requesting the 7.95 kbit/s
mode (MR=5). It is followed by an AMR 5.15 kbit/s payload frame (FT=1
plus 103 speech bits, denoted c(n)) and an AMR 4.75 kbit/s payload
frame (FT=0 plus 95 speech bits, denoted d(n)).

   Oct.|  MSB  |                Octet structure                |  LSB
   ----+-------+-----------------------------------------------+-------
     0 |   0       0       1       0       1       0       1       1
   ----+-------+-----------------------------------------------+-------
     1 |  ...     ...     ...     ...    c(2)    c(1)    c(0)      0
   ----+-------+-----------------------------------------------+-------
    13 | c(102)  c(101)  c(100)   ...     ...     ...     ...     ...
   ----+-------+-----------------------------------------------+-------
    14 |  d(3)    d(2)    d(1)    d(0)     0       0       0       0
   ----+-------+-----------------------------------------------+-------
    15 |  ...     ...     ...     ...     ...    d(6)    d(5)    d(4)
   ----+-------+-----------------------------------------------+-------
    25 | d(91)   d(90)   d(89)    ...     ...     ...     ...     ...
   ----+-------+-----------------------------------------------+-------
    26 |   UB      UB      UB      UB      UB    d(94)   d(93)   d(92)
   ----+-------+-----------------------------------------------+-------

6. RTP header usage

The timestamp of the RTP header must indicate the sampling time of
the first sample of the first frame in the packet. The timestamp is
expressed in samples: with a frame length of 20 ms and a sampling
rate of 8 kHz, the timestamp advances by 160 (samples) for each
frame.
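The timestamp arithmetic above can be sketched in C as follows. This
is an illustration only: the names AMR_SAMPLES_PER_FRAME and
amr_next_timestamp are hypothetical, and the function assumes that
the next packet starts immediately after the last frame of the
current one, i.e. that no DTX gap occurs in between.

```c
#include <assert.h>
#include <stdint.h>

enum {
    AMR_SAMPLE_RATE_HZ    = 8000,
    AMR_FRAME_MS          = 20,
    /* 8000 samples/s * 20 ms = 160 samples per frame */
    AMR_SAMPLES_PER_FRAME = AMR_SAMPLE_RATE_HZ * AMR_FRAME_MS / 1000
};

/* Timestamp of the packet that directly follows a packet with
 * timestamp ts carrying frames_in_packet successive 20 ms frames.
 * Unsigned arithmetic wraps modulo 2^32, like the 32-bit RTP
 * timestamp field. */
uint32_t amr_next_timestamp(uint32_t ts, unsigned frames_in_packet)
{
    assert(frames_in_packet > 0);
    return (uint32_t)(ts + (uint32_t)frames_in_packet
                           * (uint32_t)AMR_SAMPLES_PER_FRAME);
}
```

For example, a packet with timestamp 1000 that carries two frames is
followed by a packet with timestamp 1000 + 2 * 160 = 1320.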
All frames in a packet must be successive 20 ms frames, stored in the
order they are generated by the encoder.

The encoder shall set the marker bit (M) of the RTP header to 1 for
packets containing the first active speech frame after a non-speech
period. For all other packets the marker bit is set to 0.

7. References

[1] S. Bradner: "Key words for use in RFCs to Indicate Requirement
    Levels", RFC 2119

[2] H. Schulzrinne, S. Casner, R. Frederick, and V. Jacobson: "RTP:
    A transport protocol for real-time applications", IETF
    Audio/Video Transport Working Group, RFC 1889

[3] GSM 06.90: Adaptive Multi-Rate (AMR) speech transcoding

[4] RCR STD-27H, Personal Digital Cellular Telecommunication System
    RCR Standard

[5] TIA/EIA-136-Rev.A, part 410 - TDMA Cellular/PCS - Radio
    Interface, Enhanced Full Rate Voice Codec (ACELP). Formerly
    IS-641. TIA published standard, 1998

[6] GSM 06.60: Enhanced Full Rate (EFR) speech transcoding

[7] GSM 06.92: Comfort noise aspects for Adaptive Multi-Rate (AMR)
    speech traffic channels

[8] GSM 06.62: Comfort noise aspects for Enhanced Full Rate (EFR)
    speech traffic channels

[9] 3G TS 26.101: AMR Speech Codec Frame Structure, General
    description

8. Authors' Addresses

Ari Lakaniemi
Nokia Research Center
P.O.Box 407
FIN-00045 Nokia Group
Finland
Email: ari.lakaniemi@nokia.com

Petri Koskelainen
Nokia Research Center
P.O.Box 100
FIN-33721 Tampere
Finland
Email: petri.koskelainen@nokia.com

This Internet-Draft expires in September 2000.