Internet Engineering Task Force      Audio-Video Transport Working Group
INTERNET-DRAFT                                                 W. Fenner
draft-ietf-avt-jpeg-00.txt                                Kaman Sciences
                                                                 L. Berc
                                           Digital Equipment Corporation
                                                            R. Frederick
                                                              Xerox PARC
                                                              S. McCanne
                                                  Lawrence Berkeley Labs
                                                       November 22, 1994
                                                         Expires: 5/1/95

              RTP Encapsulation of JPEG-compressed video.

Status of this Memo

   This document is an Internet Draft.  Internet Drafts are working
   documents of the Internet Engineering Task Force (IETF), its Areas,
   and its Working Groups.  (Note that other groups may also distribute
   working documents as Internet Drafts).

   Internet Drafts are draft documents valid for a maximum of six
   months.  Internet Drafts may be updated, replaced, or obsoleted by
   other documents at any time.  It is not appropriate to use Internet
   Drafts as reference material or to cite them other than as a
   "working draft" or "work in progress."

   Please check the I-D abstract listing contained in each Internet
   Draft directory to learn the current status of this or any other
   Internet Draft.

   Distribution of this document is unlimited.

Abstract

   This draft describes a packetization scheme for JPEG video over RTP.
   It is optimized for real-time video streams using constant JPEG
   parameters, as opposed to individual JPEG images coming from
   different sources.

   This document is a product of the Audio-Video Transport working
   group within the Internet Engineering Task Force.  Comments are
   solicited and should be addressed to the working group's mailing
   list at rem-conf@es.net and/or the author(s).

Expires March 1995                                              [Page 1]

Internet Draft        draft-ietf-avt-jpeg-00.txt           November 1994

1. Introduction

   This document describes the transport of JPEG-compressed video over
   RTP.  JPEG-compressed video has several unique features:

   +  Each frame is large, requiring fragmentation and reassembly.

   +  There is no easy way to recover from a lost segment: a lost
      segment means the whole frame is lost.  The JPEG spec specifies a
      method to recover, but not all hardware decoders can handle it.

2. RTP Usage

   The RTP timestamp uses a 65536 Hz clock.  The same timestamp value
   is used for all fragments of a single frame.  The RTP sequence
   number increases by one for each packet.  The RTP marker bit marks
   the end of a frame.

3. JPEG header

   A special header is added to each packet.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |      MBZ      |                Fragment Offset                |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |      Type     |       Q       |     Width     |     Height    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

3.1. MBZ: 8 bits

      This space is reserved for future use.

3.2. Fragment Offset: 24 bits

      The Fragment Offset is the offset, in bytes, of the current
      packet's data within the full JPEG frame.

3.3. Type: 8 bits

      The Type field describes the format of the JPEG data.  It encodes
      all of the JFIF options.  Types 0-127 are pre-determined by
      profiles, and types 128-255 are free to be redefined by the
      session protocol.

3.4. Q Factor (Q): 8 bits

      The Q Factor describes the current JPEG quantization table.  If
      1 <= Q <= 99, the algorithm devised by the Independent JPEG Group
      is used to calculate the JPEG quantization table.  If Q > 100, a
      custom quantization table is being used.  It is expected that the
      standard quantization tables will handle almost every possible
      case, and custom tables will be used rarely.

3.5. Width: 8 bits

      Width encodes the image width in multiples of 8 pixels (e.g. a
      width of 40 denotes an image 320 pixels wide).

3.6. Height: 8 bits

      Height encodes the image height in multiples of 8 pixels (e.g. a
      height of 30 denotes an image 240 pixels tall).

3.7. Data

      The data following the header is an entropy coded JPEG stream as
      defined in the JPEG standard.  JPEG markers are 0xFF bytes in the
      data stream.
      A "stuffed" 0x00 byte follows any 0xFF byte generated by the
      entropy coder (as per section B.1.1.5 of the JPEG standard).

4. Discussion

4.1. The Type Field

   The Type field encodes, in a single byte, all of the JFIF parameters
   that are expected to stay constant over the lifetime of a session.
   Two types are currently defined:

      Type 0: YUV 4:2:2 square pixels, 16x8 MCU, standard Huffman table

      Type 1: YUV 4:1:1 square pixels, 16x16 MCU, standard Huffman table

   A type may be dynamically defined during a session using the session
   protocol.

4.2. Fragmentation and Reassembly

   Since JPEG frames are large, they must be fragmented.  Frames should
   be fragmented into packets in a manner that avoids fragmentation at
   a lower level.  When restart markers are used, frames should be
   fragmented such that each packet contains one restart interval (see
   below).

4.3. Restart Markers

   Restart markers indicate a point in the JPEG stream at which the
   Huffman coder is reset, allowing partial decoding starting at that
   point.  The use of restart markers allows for robustness in the face
   of packet loss.

   However, not all hardware decoders support restart markers, meaning
   that such hardware will only be able to decode the first portion of
   a frame, up to a restart marker, and then fail.  Thus, for maximum
   interoperability, we do not include restart markers in the JPEG
   data.

   If restart markers are included, each packet should contain a single
   restart interval.  Since there is no way to tell a priori how much
   data will occur between restart markers, a restart interval might
   span multiple packets.  If a restart interval must be fragmented, it
   is preferable to create a short packet so that the next restart
   interval can occur at the beginning of a packet once again.

5. Security Considerations

   Security issues are not discussed in this memo.

6. Authors' Addresses

   William C. Fenner
   Kaman Sciences Corporation
   2560 Huntington Ave.
   Alexandria, VA 22303
   Phone: +1 202-404-7030
   Email: fenner@cmf.nrl.navy.mil

   Lance Berc
   Digital Equipment Corporation
   Email: berc@src.dec.com

   Ron Frederick
   Xerox PARC
   Email: frederick@parc.xerox.com

   Stephen McCanne
   Lawrence Berkeley Labs
   Email: mccanne@ee.lbl.gov

Appendix A

   The following code can be used to create a quantization table from a
   Q factor:

   /*
    * Table K.1 from JPEG spec.
    */
   static const int jpeg_luma_quantizer[64] = {
           16, 11, 10, 16, 24, 40, 51, 61,
           12, 12, 14, 19, 26, 58, 60, 55,
           14, 13, 16, 24, 40, 57, 69, 56,
           14, 17, 22, 29, 51, 87, 80, 62,
           18, 22, 37, 56, 68, 109, 103, 77,
           24, 35, 55, 64, 81, 104, 113, 92,
           49, 64, 78, 87, 103, 121, 120, 101,
           72, 92, 95, 98, 112, 100, 103, 99
   };

   /*
    * Table K.2 from JPEG spec.
    */
   static const int jpeg_chroma_quantizer[64] = {
           17, 18, 24, 47, 99, 99, 99, 99,
           18, 21, 26, 66, 99, 99, 99, 99,
           24, 26, 56, 99, 99, 99, 99, 99,
           47, 66, 99, 99, 99, 99, 99, 99,
           99, 99, 99, 99, 99, 99, 99, 99,
           99, 99, 99, 99, 99, 99, 99, 99,
           99, 99, 99, 99, 99, 99, 99, 99,
           99, 99, 99, 99, 99, 99, 99, 99
   };

   /*
    * Call MakeTables with the Q factor and two int[64] return arrays
    */
   void
   MakeTables(int q, int *lum_q, int *chr_q)
   {
           int i;
           int factor = q;

           /* Clamp the Q factor to the range 1..99 */
           if (q < 1) factor = 1;
           if (q > 99) factor = 99;

           /* Convert the Q factor to a percentage scale for the tables */
           if (q < 50)
                   q = 5000 / factor;
           else
                   q = 200 - factor*2;

           for (i=0; i < 64; i++) {
                   lum_q[i] = ( jpeg_luma_quantizer[i] * q + 50) / 100;
                   chr_q[i] = ( jpeg_chroma_quantizer[i] * q + 50) / 100;

                   /* Limit the quantizers to 1 <= q <= 255 */
                   if ( lum_q[i] < 1) lum_q[i] = 1;
                   if ( chr_q[i] < 1) chr_q[i] = 1;
                   if ( lum_q[i] > 255) lum_q[i] = 255;
                   if ( chr_q[i] > 255) chr_q[i] = 255;
           }
   }
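
   As an illustration of the header layout in section 3, the following
   is a minimal sketch of packing and parsing the 8-byte JPEG header.
   The names jpeg_hdr, jpeg_hdr_pack, jpeg_hdr_unpack and JPEG_HDR_LEN
   are hypothetical and are not defined by this draft.  The sketch
   assumes that the Fragment Offset is carried in network byte order,
   as is usual for RTP, and that a receiver ignores the MBZ byte; the
   draft does not state either point explicitly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names; the draft defines only the wire layout. */
#define JPEG_HDR_LEN 8

struct jpeg_hdr {
        uint32_t offset;  /* Fragment Offset: 24 bits, bytes into frame */
        uint8_t  type;    /* Type field (section 3.3) */
        uint8_t  q;       /* Q factor (section 3.4) */
        uint8_t  width;   /* image width in units of 8 pixels */
        uint8_t  height;  /* image height in units of 8 pixels */
};

/*
 * Write the header into out[0..7].
 * Assumption: multi-byte fields use network byte order.
 */
static void
jpeg_hdr_pack(const struct jpeg_hdr *h, uint8_t out[JPEG_HDR_LEN])
{
        /* The Fragment Offset must fit in 24 bits. */
        assert(h->offset <= 0xFFFFFFu);

        out[0] = 0;                             /* MBZ */
        out[1] = (uint8_t)(h->offset >> 16);
        out[2] = (uint8_t)(h->offset >> 8);
        out[3] = (uint8_t)(h->offset);
        out[4] = h->type;
        out[5] = h->q;
        out[6] = h->width;
        out[7] = h->height;
}

/*
 * Parse a header from the start of a packet payload.
 * Returns 0 on success, -1 if the payload is too short.
 * Assumption: the MBZ byte is ignored on receipt.
 */
static int
jpeg_hdr_unpack(const uint8_t *in, size_t len, struct jpeg_hdr *h)
{
        if (len < JPEG_HDR_LEN)
                return -1;

        h->offset = ((uint32_t)in[1] << 16) |
                    ((uint32_t)in[2] << 8)  |
                     (uint32_t)in[3];
        h->type   = in[4];
        h->q      = in[5];
        h->width  = in[6];
        h->height = in[7];
        return 0;
}
```

   For a 320x240 frame, a sender would set width to 40 and height to
   30, and would presumably call jpeg_hdr_pack once per fragment with
   offset set to the position of that fragment's first byte within the
   frame.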