MMUSIC Working Group                                        T. Schierl
Internet Draft
Document: draft-schierl-mmusic-layered-codec-00
Expires: December 2006                                       June 2006

Signaling media decoding dependency in Session Description Protocol (SDP)

Status of this Memo

By submitting this Internet-Draft, each author represents that any applicable patent or other IPR claims of which he or she is aware have been or will be disclosed, and any of which he or she becomes aware will be disclosed, in accordance with Section 6 of BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt.

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

This Internet-Draft will expire on December 18, 2006.

Copyright Notice

Copyright (C) The Internet Society (2006).

Abstract

This memo defines semantics that allow for signaling the decoding dependency of different media descriptions with the same media type in the Session Description Protocol (SDP). This is required, for example, if media data resulting from a layered media coding process is separated and carried in different transport streams.

Therefore, a new grouping type "DDP" (decoding dependency) is defined for use with the Grouping of Media Lines in the Session Description Protocol (RFC 3388). In addition, an attribute is specified that describes the relationship between the media streams of a "DDP" group, and attributes for describing the media properties are defined.
Schierl Standards Track [page 2]

Table of Contents

1. Introduction
2. Terminology
3. Motivation for media dependency signaling
4. Generic signaling in SDP for media dependency
   4.1. Design Principles
   4.2. Definitions
   4.3. Semantics
      4.3.1. SDP grouping semantics for decoding dependency
      4.3.2. Attribute for dependency signaling per media-stream
      4.3.3. Attributes for media/operation point description
5. Usage of new semantics in SDP
      5.1.1. General
      5.1.2. Usage with the SDP Offer/Answer Model
      5.1.3. Network elements not supporting dependency signaling
   5.2. Examples
6. Security Considerations
7. IANA Consideration
8. Acknowledgements
9. References
   9.1. Normative References
   9.2. Informative References
10. Author's Addresses
11. Intellectual Property Statement
12. Disclaimer of Validity
13. Copyright Statement
14. RFC Editor Considerations
15. Open Issues
16. Changes Log

1. Introduction

An SDP session description may contain several media descriptions, each identifying one media stream. A media description is identified by one "m=" line. If more than one "m=" line exists for the same media type, a receiver or network element may be unable to identify an existing relationship between those "m=" lines. This is certainly the case if the receiver or network element is not aware of the media-specific information, which may be carried within the "fmtp:" attribute.

Relationships such as dependencies between media streams may exist for different reasons, for example when bitstream partitions of a hierarchical media coding process (also known as a layered media coding process) or of a multi description coding (MDC) process are transported in different transport streams. SDP does not allow for signaling such relations.

2. Terminology

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14, RFC 2119 [RFC2119].

3. Motivation for media dependency signaling

There may be various reasons for dependencies between media descriptions with the same media type. The basic idea in all cases is the separation of partitions of a media bitstream for purposes such as increasing transport efficiency or allowing scalability in network elements. Two types of dependency are explained in more detail below:

o Layered/Hierarchical decoding dependency: One or more partition(s) of a layered media bitstream, also known as media layers, may be transported in different network streams. Such a scheme is used, for example, for layered multicast transmission, where the receiver can select a certain combination of those media streams in order to receive a certain level of quality or bit-rate.
[SDPnew] allows only for signaling a range of transport addresses or ports for a certain media description, but does not take into account that different combinations of the partitions of a layered media bitstream result in different operation points (each represented by a layer or a combination of layers) of the media bitstream. These operation points may require different media codec specific signaling in the "fmtp:" attribute. Further, the operation points in different transport streams may belong to different payload types. This may in particular be the case when using the Scalable Video Coding (SVC) extensions of H.264/MPEG-4 AVC (payload format [SVCpayld]). The base layer of a media stream of this layered media coding standard is plain H.264/MPEG-4 AVC for compatibility with old receivers; thus the base layer is transported using the native payload format of H.264 [RFC3984], whereas enhancement layers of such a media stream may be transported using the SVC payload format [SVCpayld].

Here, SVC is used as an example of a layered media coding standard in general. Audio coding standards can be of a layered nature as well. Since layered media coding standards in general may save cost in infrastructure for content generation and delivery as well as in transmission bandwidth, it is foreseeable that future coding standards will also follow the layered design.

o Equivalent decoding dependency: Dependencies between media streams do not necessarily have to be of a hierarchical nature, as is the case for layered media. The media streams may also be of equal relevance. Either all partitions of the media bitstream are required, or at minimum one partition is required for successful decoding, i.e. for (re-)constructing a valid media bitstream at the receiver.
An example of partitions of equal importance in different transport streams, where all partitions must be available for successful decoding: a video coding standard that allows for differentiated, high-quality coding of the color components. In such a case it may be helpful to transport the color components in different transport streams.

An example of partitions of equal importance in different transport streams, where at minimum only one partition must be available for successful decoding: a multi description coding (MDC) process. In such a process, equal partitions of a bitstream are generated that can be decoded independently, i.e. each partition is a valid media bitstream. Each additionally received partition of the MDC stream may enhance the quality of the media.

4. Generic signaling in SDP for media dependency

4.1. Design Principles

For the separate transport of partitions of a media bitstream in different transport streams, the media description of SDP is assumed to be the only multiplexing point for the transport protocol, i.e. dependency signaling is only feasible between media descriptions that are described by an "m=" line and have been assigned a media identification attribute ("mid") as defined in [RFC3388].

4.2. Definitions

Media stream: As used in [SDPnew].

Media bitstream: A valid, decodable stream of binary data produced by a media encoder.

Decoding dependency: Partitions of a media bitstream may be separated for transport or scalability purposes and must be re-interleaved at the receiver. The result of the re-interleaving process is a valid media bitstream.

Equivalent decoding dependency: A media bitstream may be separated into partitions for transport purposes, where all partitions of the bitstream are required for reconstructing a valid media bitstream.
Hierarchical/layered coding dependency: Partitions of a layered media bitstream can be removed or added for scaling the quality of the media and the bit-rate of the stream. These partitions of the bitstream have a hierarchical dependency. A partition may depend on one or more partition(s). The dependencies between the layered bitstream partitions form a directed graph.

Operation point: A subset of a layered media bitstream, including all partitions required for reconstructing a valid media bitstream. This subset of the media represents a certain level or point of quality. Layers of the media bitstream that are not required for decoding the operation point do not belong to it.

Multi description coding (MDC) dependency: A bitstream resulting from a multi description coding process can be separated into sub-bitstreams, where each of the sub-bitstreams can be decoded independently, i.e. each sub-bitstream represents a valid media bitstream. A combination of several of these sub-bitstreams may result in higher quality than decoding a smaller number of sub-bitstreams.

Fine Granularity Scalability (FGS): This capability of a media bitstream allows for truncating media frames or packets by cutting bytes from the end of a media frame or packet for bit-rate and quality reduction/adaptation. An example of a definition of this feature for video is given in [SVCpayld], but the feature is supported by media coding standards for audio as well.

4.3. Semantics

4.3.1. SDP grouping semantics for decoding dependency

This specification defines the new grouping semantics Decoding Dependency "DDP":

All media streams of a DDP group have the same type of coding dependency (as signaled by the attribute defined in Section 4.3.2) and belong to one media, i.e. one, several, or all media streams of a DDP group may be required for reconstructing a valid media bitstream.
This group type informs a receiver or a middlebox that the streams of the group need to be treated in a similar or the same way. For detailed knowledge about how to treat the streams, the new media-level attribute "depend", defined in Section 4.3.2, SHALL be used.

4.3.2. Attribute for dependency signaling per media-stream

This specification defines a new media-level value attribute, "depend". Its formatting in SDP is described by the following ABNF [RFC2234]. The "identification-tag" is defined in [RFC3388]:

   depend-attribute    = "a=depend:" dependency-type-tag
                         *(space identification-tag)
   dependency-type-tag = dependency
   dependency          = "lay" / "eql" / "mdc"

The "depend" attribute describes the decoding dependency. Different types of dependency are defined within this document. The "depend" attribute may be followed by a sequence of identification-tag(s) expressing the directly related media streams.

The following types of dependency are defined:

o lay: Layered decoding dependency - signals that the described media stream is a partition of a layered media bitstream and MUST have the streams identified by the following identification-tag(s) available for re-interleaving or re-constructing the valid media bitstream. The identification-tag(s) MUST be present for this type of dependency. Further, the described media stream represents one operation point of the layered media bitstream, i.e. all other media streams that belong to the same dependency group but are not identified by an identification-tag MAY be left out, for scalability or transport purposes, for the operation point given by this media description.

o eql: Equal decoding dependency - signals that the described media stream is a partition of a media bitstream and has the same importance for decoding as the remaining media streams in the group, i.e. all media streams of the group MUST be available for re-interleaving or re-constructing a valid media bitstream. This type of dependency does not require signaling of the media streams depended on.

o mdc: Multi descriptive decoding dependency - signals that the described media stream is a partition of a multi description coding (MDC) media bitstream, i.e. at minimum one stream of the group MUST be received to allow decoding of the media. Receiving more than one stream of the group may enhance the decodable quality of the media bitstream. This type of dependency does not require signaling of the media streams depended on.

4.3.3. Attributes for media/operation point description

Currently, two media-level attributes are defined for describing the media. These attributes define the property of the media itself or, if the media is transported separately, the property of the re-constructed operation point of the media description (after combining the media streams depended on into a valid media bitstream):

o "a=resolution:" This media-level attribute gives the maximum presentation size of the signaled video in terms of pixels. If the color space components are of different resolution, the resolution of the luminance component is indicated. The first value gives the maximum horizontal size of the video and the second value gives the maximum vertical size of the video in terms of pixels (e.g. "a=resolution:176 144").

o "a=fgscapability" If present, this media-level attribute indicates the so-called 'Fine Granularity Scalability (FGS)' capability, i.e. the capability of truncating network packets for bit-rate and quality reduction of a media stream. The minimal achievable network packet size SHALL be derived from the transport parameters.

5. Usage of new semantics in SDP

5.1.1. General

Senders and receivers that use the feature of separating a media stream for transport SHALL support the signaling defined in this specification.
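
As an illustration of what supporting this signaling could involve, the following Python sketch parses the "depend" attribute according to the ABNF in Section 4.3.2 and collects the media streams needed for one operation point of a layered ("lay") group. It is a minimal sketch and not part of this specification: the function names (parse_depend, media_dependencies, streams_for_operation_point) are hypothetical, tags are assumed to be separated by whitespace, and every referenced "mid" is assumed to be present in the description. Because the ABNF permits zero identification-tags and the base layer in example a.) of Section 5.2 carries none, the sketch accepts "a=depend:lay" without tags.

```python
DEPENDENCY_TYPES = {"lay", "eql", "mdc"}


def parse_depend(line):
    """Parse one "a=depend:" line into (dependency type, [identification tags])."""
    prefix = "a=depend:"
    if not line.startswith(prefix):
        raise ValueError("not a depend attribute: %r" % line)
    fields = line[len(prefix):].split()
    if not fields or fields[0] not in DEPENDENCY_TYPES:
        raise ValueError("unknown dependency type in %r" % line)
    return fields[0], fields[1:]


def media_dependencies(sdp_text):
    """Map each "mid" value to the parsed "depend" attribute of its media section."""
    deps = {}
    mid = dep = None
    # The trailing "m=" is a sentinel that flushes the last media section.
    for raw in sdp_text.splitlines() + ["m="]:
        line = raw.strip()
        if line.startswith("m="):
            if mid is not None and dep is not None:
                deps[mid] = dep
            mid = dep = None
        elif line.startswith("a=mid:"):
            mid = line[len("a=mid:"):]
        elif line.startswith("a=depend:"):
            dep = parse_depend(line)
    return deps


def streams_for_operation_point(deps, mid):
    """Return the sorted mids required to decode the operation point `mid`.

    Follows the identification-tags transitively (assumption: "lay" case,
    and every tag refers to a mid that exists in `deps`).
    """
    needed, todo = set(), [mid]
    while todo:
        current = todo.pop()
        if current in needed:
            continue
        needed.add(current)
        todo.extend(deps[current][1])
    return sorted(needed)


# Media-level lines taken from example a.) in Section 5.2 (first three streams).
EXAMPLE = """\
m=video 40000 RTP/AVP 94
a=mid:1
a=depend:lay
m=video 40002 RTP/AVP 95
a=mid:2
a=depend:lay 1
m=video 40004 RTP/AVP 96
a=mid:3
a=depend:lay 1 2
"""

if __name__ == "__main__":
    print(streams_for_operation_point(media_dependencies(EXAMPLE), "3"))
    # prints ['1', '2', '3']
```

A receiver that wants the operation point described by "mid" 3 would, under these assumptions, join the transport streams of mids 1, 2 and 3 and may leave out all other streams of the group.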
Using the information about the decoding dependency may give a network element more options for treating the media streams of a session. The network element therefore does not need to know details about the media (e.g. about the media format description), but SHALL use the information defined in this specification for treating the media streams.

5.1.2. Usage with the SDP Offer/Answer Model

If an Answerer does not understand the decoding dependency signaling, it may detect only the 'base' media of a layered media session, or only one media stream of an MDC media session. Thus, in both cases, an Answerer may not understand the full media description, but may be able to request a valid subset of the offered media. In the Equal decoding dependency case, an Answerer may not correctly understand the session description. If an Offerer is not able to interpret the decoding dependency signaling, the Offerer SHALL NOT offer the feature of separating a media into different transport sessions.

5.1.3. Network elements not supporting dependency signaling

Network elements that do not understand the new grouping type, but understand grouping in general, MAY detect a general requirement of treating the media streams of the group in a certain way.

Network elements that do not understand the decoding dependency signaling MAY treat all media streams of a session in the same way or MAY use their knowledge about the media format description for the treatment of media streams, if such knowledge exists.

Receivers that do not understand the signaling defined in this specification may detect only a subset of the separated media. Such a receiver may not understand the full media description, but may be able to understand and/or request a subset of the media.

5.2. Examples

a.) Example for signaling the transport of operation points of a layered video bitstream in different transport streams:

   v=0
   o=svcsrv 289083124 289083124 IN IP4 host.example.com
   s=LAYERED VIDEO SIGNALING Seminar
   t=0 0
   c=IN IP4 224.2.17.12/127
   a=group:DDP 1 2 3 4
   m=video 40000 RTP/AVP 94
   b=AS:96
   a=framerate:15
   a=resolution:176 144
   a=rtpmap:94 h264/90000
   a=mid:1
   a=depend:lay
   m=video 40002 RTP/AVP 95
   b=AS:64
   a=framerate:15
   a=resolution:320 240
   a=rtpmap:95 svc1/90000
   a=mid:2
   a=depend:lay 1
   m=video 40004 RTP/AVP 96
   b=AS:128
   a=framerate:30
   a=resolution:320 240
   a=fgscapability
   a=rtpmap:96 svc1/90000
   a=mid:3
   a=depend:lay 1 2
   m=video 40004 RTP/AVP 100
   c=IN IP4 224.2.17.13/127
   b=AS:256
   a=framerate:60
   a=resolution:640 480
   a=rtpmap:100 svc1/90000
   a=mid:4
   a=depend:lay 1 2 3

b.) Example for signaling the transport of streams of a multi description coding (MDC) video bitstream in different transport streams. An example for signaling Equal decoding dependency ("eql") is very similar and is therefore omitted:

   v=0
   o=mdcsrv 289083124 289083124 IN IP4 host.example.com
   s=MULTI DESCRIPTION VIDEO SIGNALING Seminar
   t=0 0
   c=IN IP4 224.2.17.12/127
   a=group:DDP 1 2 3
   m=video 40000 RTP/AVP 94
   a=mid:1
   a=depend:mdc
   m=video 40002 RTP/AVP 95
   a=mid:2
   a=depend:mdc
   m=video 40004 RTP/AVP 96
   c=IN IP4 224.2.17.13/127
   a=mid:3
   a=depend:mdc

6. Security Considerations

7. IANA Consideration

8. Acknowledgements

Funding for the RFC Editor function is currently provided by the Internet Society.

9. References

9.1. Normative References

[SDPnew] Handley, M., Jacobson, V., and C. Perkins, "SDP: Session Description Protocol", IETF work in progress, January 2006.

[RFC3388] Camarillo, G., Holler, J., and H. Schulzrinne, "Grouping of Media Lines in the Session Description Protocol (SDP)", RFC 3388, December 2002.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

[RFC2234] Crocker, D., Ed. and P. Overell, "Augmented BNF for Syntax Specifications: ABNF", RFC 2234, November 1997.

9.2. Informative References

[SVCpayld] Wenger, S., Wang, Y.-K., and T. Schierl, "RTP Payload Format for SVC Video", draft-wenger-avt-rtp-svc-02.txt, June 2006.

[RFC3984] Wenger, S., Hannuksela, M., Stockhammer, T., Westerlund, M., and D. Singer, "RTP Payload Format for H.264 Video", RFC 3984, February 2005.

10. Author's Addresses

Thomas Schierl
Fraunhofer HHI
Einsteinufer 37
D-10587 Berlin
Germany
Phone: +49-30-31002-227
Email: schierl@hhi.fhg.de

11. Intellectual Property Statement

The IETF takes no position regarding the validity or scope of any Intellectual Property Rights or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; nor does it represent that it has made any independent effort to identify any such rights. Information on the procedures with respect to rights in RFC documents can be found in BCP 78 and BCP 79.

Copies of IPR disclosures made to the IETF Secretariat and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementers or users of this specification can be obtained from the IETF on-line IPR repository at http://www.ietf.org/ipr.

The IETF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights that may cover technology that may be required to implement this standard. Please address the information to the IETF at ietf-ipr@ietf.org.

12. Disclaimer of Validity

This document and the information contained herein are provided on an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

13. Copyright Statement

Copyright (C) The Internet Society (2006). This document is subject to the rights, licenses and restrictions contained in BCP 78, and except as set forth therein, the authors retain all their rights.

14. RFC Editor Considerations

none

15. Open Issues

- This draft is written with the assumption that the media description ("m=" line) is the one and only multiplexing point. If payload type multiplexing (as used in draft-ietf-avt-rtp-retransmission-12) should be used with the signaling defined in this draft, a completely different approach, not based on the grouping semantics, may be needed.

- Missing reference to a layered audio codec.

- More detailed description of MDC. Missing examples for MDC, since no standard is available? MDC audio codecs?

- FGS capability signaling may be extended.

- Description of media operation points may be extended.

- Missing example for "Equal decoding dependency".

16. Changes Log