                                                    Ari Lakaniemi, Nokia
Internet Draft                                  Petri Koskelainen, Nokia
Document: draft-lakaniemi-avt-mime-amr-00.txt    Johan Sjöberg, Ericsson
                                                          March 10, 2000
Expires September 2000

               MIME Type Registration of AMR Speech Codec

Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC 2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

1. Abstract

   This document defines the MIME name audio/AMR and its parameters for
   the Adaptive Multi-Rate (AMR) speech codec when used with RTP.

2. Conventions used in this document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [1].

3. Introduction

   The GSM Adaptive Multi-Rate (AMR) codec [2] is a speech codec
   developed by the European Telecommunications Standards Institute
   (ETSI). The AMR codec is standardized for GSM, and has also been
   chosen by the 3GPP as the mandatory speech codec for third
   generation systems. AMR provides high speech quality under a wide
   range of transmission conditions and is also well suited to
   non-mobile applications.

   AMR includes eight different speech coding modes, whose bit-rates
   range from 4.75 to 12.2 kbit/s.
   The sampling rate is 8000 Hz and processing is performed on 20 ms
   frames. Some of the AMR speech coding modes are speech codecs
   specified for other standards, e.g. AMR at 6.7 kbit/s is the ACELP
   codec specified in section 5.4 of [3] (PDC-EFR). Table 1 lists all
   AMR speech coding modes.

      index   bit-rate      note
      ---------------------------------
        0     4.75 kbit/s
        1     5.15 kbit/s
        2     5.9  kbit/s
        3     6.7  kbit/s   PDC-EFR
        4     7.4  kbit/s   IS-641 [4]
        5     7.95 kbit/s
        6     10.2 kbit/s
        7     12.2 kbit/s   GSM EFR [5]

      Table 1: AMR speech coding modes

   An AMR implementation according to [2] must support all eight coding
   modes. A mode change can occur at any time during operation, and
   therefore the mode information is transmitted in-band together with
   the speech bits to allow mode changes without any additional
   signaling. However, a decoder may want to receive only a certain AMR
   mode or a subset of AMR modes. Therefore, it is possible to request
   a specific set of AMR modes in the capability description, and the
   encoder must abide by this request. If no mode set is requested, the
   encoder can freely decide which AMR mode to use.

   Although in principle the AMR codec can perform a mode change at any
   time between any two modes, it is possible to set limitations on
   mode changes. The decoder can define the minimum number of frames
   between mode changes and can restrict mode changes to neighbouring
   modes only.

   In addition to the speech codec, the AMR specifications also include
   Discontinuous Transmission / comfort noise (DTX/CN) functionality
   [6]. DTX/CN switches the transmission off during silent parts of the
   speech, and only CN parameter updates are sent at regular intervals.
   The three codec standards that are part of AMR also have their own
   DTX/CN schemes ([3][4][7]). To enable interoperability with
   terminals supporting these standards, AMR can optionally be extended
   to support these CN schemes as well. The CN capabilities are
   signaled in the capability description.
   If no CN capabilities are reported, it is assumed that AMR CN is
   supported. If CN capabilities are reported, all supported CN types
   (including AMR CN) must be signaled.

   It is also possible to limit the number of AMR frames encapsulated
   into one RTP packet. This is an optional feature, and if no
   parameter is given in the capability description, the transmitter
   can encapsulate any number of AMR speech frames into one RTP packet.

4. MIME Registration

   The MIME name for the AMR codec is allocated from the IETF tree,
   since AMR is expected to be a widely used speech codec in VoIP
   applications.

   Media Type name:     audio

   Media subtype name:  AMR

   Required parameters: none

   Optional parameters:

      ptime:      Definition as usual in RTP audio.

      mode:       AMR mode. Possible values are: 0, ..., 7 (see
                  Table 1).

      mode-change-phase:
                  Defines a number N that restricts mode changes so
                  that they are only allowed on multiples of N. The
                  initial state of the phase is arbitrary. If this
                  parameter is not present, a mode change can happen
                  at any time.

      mode-change-flag:
                  If present, mode changes can be made to neighbouring
                  modes only. If not present, a change between any two
                  modes is allowed.

      amr-cn:     If present, GSM AMR DTX/CN is supported. Note that
                  if no CN capabilities are reported, AMR DTX/CN is
                  assumed to be supported.

      pdc-efr-cn: If present, PDC-EFR DTX/CN is supported, otherwise
                  not supported.

      is-641-cn:  If present, IS-641 DTX/CN is supported, otherwise
                  not supported.

      gsm-efr-cn: If present, GSM EFR DTX/CN is supported, otherwise
                  not supported.

      maxframes:  Maximum number of AMR speech frames in one RTP
                  packet. The receiver may set this parameter in order
                  to limit the buffering requirements or delay.

   Encoding considerations:
      See RFCXXXX for AMR RTP packetization (work in progress:
      draft-lakaniemi-avt-rtp-amr-00.txt /
      draft-sjoberg-avt-rtp-amr-00.txt).

   Security considerations: none

   Interoperability considerations:
      If CN capabilities are not signaled in the capability
      description, only AMR CN is supported.

   Person & email address to contact for further information:
      petri.koskelainen@nokia.com, ari.lakaniemi@nokia.com

   Intended usage: COMMON. It is expected that many VoIP applications
      (as well as mobile applications) will use this type.

   Author/Change controller:
      petri.koskelainen@nokia.com, ari.lakaniemi@nokia.com

5. Mapping to SDP Parameters

   Parameters are mapped to SDP [8] as usual. Example usage in SDP:

      m=audio 49120 RTP/AVP 97
      a=rtpmap:97 AMR/8000
      a=fmtp:97 mode=0,2,5,7; maxframes=2

6. References

   [1] Bradner, S., "Key words for use in RFCs to Indicate Requirement
       Levels", BCP 14, RFC 2119, March 1997

   [2] GSM 06.90: Adaptive Multi-Rate (AMR) speech transcoding

   [3] RCR STD-27H, Personal Digital Cellular Telecommunication System
       RCR Standard

   [4] TIA/EIA-136-Rev.A, part 410 - TDMA Cellular/PCS - Radio
       Interface, Enhanced Full Rate Voice Codec (ACELP). Formerly
       IS-641. TIA published standard, 1998

   [5] GSM 06.60: Enhanced Full Rate (EFR) speech transcoding

   [6] GSM 06.92: Comfort noise aspects for Adaptive Multi-Rate (AMR)
       speech traffic channels

   [7] GSM 06.62: Comfort noise aspects for Enhanced Full Rate (EFR)
       speech traffic channels

   [8] Handley, M. and V. Jacobson, "SDP: Session Description
       Protocol", RFC 2327, April 1998

7. Authors' Addresses

   Petri Koskelainen
   Nokia Research Center
   P.O.Box 100
   FIN-33721 Tampere
   Finland
   Email: petri.koskelainen@nokia.com

   Ari Lakaniemi
   Nokia Research Center
   P.O.Box 407
   FIN-00045 Nokia Group
   Finland
   Email: ari.lakaniemi@nokia.com

   Johan Sjöberg
   Ericsson Research
   Ericsson Radio Systems AB
   SE-16480 Stockholm
   SWEDEN
   Email: Johan.Sjoberg@ericsson.com

   This Internet-Draft expires in September 2000.
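
   The SDP mapping of Section 5 can be illustrated with a short,
   non-normative Python sketch. It parses an "a=fmtp" line such as the
   one in the example and applies the rule that AMR CN is assumed when
   no CN capabilities are reported. The names parse_amr_fmtp and
   supported_cn, and the layout of the returned dictionary, are
   assumptions made for this illustration only; they are not defined by
   this document.

```python
# Illustrative sketch only; names and return layout are hypothetical.

# Bit-rates in kbit/s, indexed by AMR mode (Table 1).
AMR_BITRATES_KBPS = [4.75, 5.15, 5.9, 6.7, 7.4, 7.95, 10.2, 12.2]

# Optional parameters from the MIME registration, grouped by value kind.
INT_PARAMS = {"ptime", "maxframes", "mode-change-phase"}
CN_PARAMS = {"amr-cn", "pdc-efr-cn", "is-641-cn", "gsm-efr-cn"}
FLAG_PARAMS = CN_PARAMS | {"mode-change-flag"}


def parse_amr_fmtp(line):
    """Parse 'a=fmtp:<pt> <params>' into (payload_type, params dict)."""
    prefix = "a=fmtp:"
    if not line.startswith(prefix):
        raise ValueError("not an fmtp attribute line")
    pt_text, _, rest = line[len(prefix):].partition(" ")
    params = {}
    for item in rest.split(";"):
        item = item.strip()
        if not item:
            continue
        name, sep, value = item.partition("=")
        name, value = name.strip(), value.strip()
        if name == "mode":
            # The mode set is a comma-separated list of Table 1 indices.
            modes = [int(v) for v in value.split(",")]
            if any(m < 0 or m >= len(AMR_BITRATES_KBPS) for m in modes):
                raise ValueError("AMR mode index must be in 0..7")
            params[name] = modes
        elif name in INT_PARAMS:
            params[name] = int(value)
        elif name in FLAG_PARAMS:
            # Presence alone signals the capability.
            params[name] = True
        else:
            # Unknown parameter: keep it without interpreting it.
            params[name] = value if sep else True
    return int(pt_text), params


def supported_cn(params):
    """Return the CN schemes implied by the parsed parameters."""
    reported = {name for name in CN_PARAMS if params.get(name)}
    # If no CN capabilities are reported, AMR CN is assumed.
    return reported if reported else {"amr-cn"}
```

   For the example of Section 5, parse_amr_fmtp yields payload type 97
   with the mode set 0, 2, 5, 7 and maxframes 2, and supported_cn
   reports AMR CN only, since no CN parameter is present.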
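
   The mode change restrictions of the mode-change-phase and
   mode-change-flag parameters can be illustrated with a non-normative
   Python sketch that checks a sequence of per-frame mode indices. The
   function name mode_changes_allowed and its arguments are
   hypothetical. The sketch also assumes one particular reading of
   "multiples of N" with an arbitrary initial phase, namely that all
   mode changes must fall on frame positions that are a multiple of N
   apart from the first observed change; this document does not spell
   out that interpretation.

```python
# Illustrative sketch only; the function name and the reading of the
# phase rule are assumptions, not definitions from this document.

def mode_changes_allowed(modes, phase_n=None, neighbours_only=False):
    """Check a per-frame list of AMR mode indices (0..7).

    phase_n         -- value of mode-change-phase, or None if absent
    neighbours_only -- True if mode-change-flag is present
    """
    # Frame positions at which the mode differs from the previous frame.
    change_frames = [
        i for i in range(1, len(modes)) if modes[i] != modes[i - 1]
    ]

    if neighbours_only:
        # Only steps to an adjacent mode index are permitted.
        for i in change_frames:
            if abs(modes[i] - modes[i - 1]) != 1:
                return False

    if phase_n is not None and change_frames:
        # The initial phase is arbitrary, so the first change fixes it;
        # later changes must be a multiple of N frames after it.
        first = change_frames[0]
        for i in change_frames:
            if (i - first) % phase_n != 0:
                return False

    return True
```

   Under these assumptions, the sequence 7, 7, 6, 6, 5, 5 is accepted
   with N = 2 and neighbouring modes only, while a direct change from
   mode 7 to mode 5 is rejected when mode-change-flag is present.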