Session Description

Requirements

We want to express

This list does not include user interaction, e.g., branching for alternate movie endings or different paths through an educational video. These requirements are partially addressed by scripting languages (such as Lingo) or content descriptions such as MHEG or HyTime.

Sequencing of media within a clip is not strictly required, since we can have inactive media, with the potential problems of having to allocate screen real-estate, invoke applications, etc. even though the media stream may never be invoked.

In its full generality, this requires a "railroad diagram", with streams diverging and joining along the time axis. However, the likely number of useful programs is small, so that enumeration should be sufficient.

The description should be

Syntax

SDP

SDP is described in an Internet draft.

We could enhance SDP with a per-media identifier such as

m=audio
#=French_Audio
and have a description at the beginning describing the possible alternatives. For example:
p=French_Audio,Video; English_Audio,Video

This offers neither the ability to re-use parameters across media nor a per-program description such as "French movie".
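To make the proposed extension concrete, the following is a minimal sketch of how a receiver might read such a line. The `p=` and `#=` fields are only the proposal made above, not part of SDP, and the function name `parse_programs` is hypothetical. The sketch assumes that semicolons separate alternatives and commas separate the media identifiers within one alternative.

```python
def parse_programs(line):
    """Split a proposed 'p=' line into a list of alternative programs.

    Each program is a list of media identifiers (the '#=' labels).
    Assumption: ';' separates alternatives, ',' separates media.
    """
    if not line.startswith("p="):
        raise ValueError("not a program line: %r" % line)
    programs = []
    for alternative in line[2:].split(";"):
        media_ids = [m.strip() for m in alternative.split(",") if m.strip()]
        if media_ids:
            programs.append(media_ids)
    return programs


programs = parse_programs("p=French_Audio,Video; English_Audio,Video")
# programs[0] is the French alternative, programs[1] the English one
```

The flat result illustrates the limitation noted above: an alternative is just a list of names, with no place to attach a title or shared parameters.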

We could also have a program description section, similar to the media descriptions. Generally, as a linear format, SDP gets increasingly messy as structure is added, since the boundary between levels is only determined by special flag characters. Also, this would not be upward compatible with the existing spec, so it is not clear what advantage this confers. Furthermore, the set of types is deliberately small and not meant to be extensible, with extensions handled through a somewhat awkward a= mechanism.

Per-media encoding alternatives can already be expressed in SDP, but they cannot be named individually.

".ini" file format

Works, but structure is only implied by the section headings. Inheritance is not clear. Not clear how long items are handled.

List format (SDF)

A hierarchical listing of the types session, program and media, where media can be declared at session or program level. Media and programs can be declared sequential, concurrent or alternative (pick one of). This is similar to a MIME message body. (We could use that format as well, but that's a different story.) This format is similar to Lisp lists and has, in variations, been widely used, for example, for the PICS system, query representation (Bowman and Dharap, Usenix Winter'93), and various SunOS descriptions, including the Sun calendar manager and the xview GUI generator.

This format is also similar to the one used in the Transparent Content Negotiation in HTTP specification.

(session
   (version 1.0)
   (rating PG-13)
   (screenplay "Gunther Grass")
   (cost $ 7)
   (summary "This movie describes the coming-of-age of a boy in pre-WW
II Dresden. It contains nudity, profanity and the typical German unhappy
ending.")

   (media
     (id   4567)
     (type video)
     (bitrate 80000)
   )

   (one-of
     (program
        (title  "Die Blechtrommel")
        (language de)    ; using the HTTP language identifiers
        (media (id 4567))

        (one-of
          (type audio)   ; this applies to anything at this level
          (sequence      ; the introductory music is at better quality
            (media
              (id 14)
              (encoding L16)
              (rating G)  ; just music
              (channels 1)
            )
            (media
              (id 12)
              (encoding PCMU)
              (start 0:12) ; somehow missing the first few minutes
              (rating PG)  ; this sound track has the salty parts bleeped out
              (channels 1)
            )
          )
          (media
            (id 13)
            (encoding DVI4)
            (channels 2)
          )
        )
      )
      (program
        (title  "The Tin Drum")
        (language en-us)
        (one-of
          (media
            (id 14)   ; for some reason, this was only dubbed in stereo
            (encoding L16)
            (channels 2)
          )
        )
      )
   )
)

Clearly, there are many possible variations on this syntax. The spacing and indentation are completely arbitrary. The syntax has the advantage of being easy to parse and not dependent on line boundaries, so that it easily survives mail transport, Netscape editors, etc. It naturally expresses hierarchies and inheritance.

This syntax requires slightly more space than the original SDP format. The SDP example

v=0
o=mhandley 2890844526 2890842807 IN IP4 126.16.64.4
s=Sd seminar
i=A seminar on the session description protocol
u=http://www.cs.ucl.ac.uk/staff/M.Handley/sdp.01.ps
e=M.Handley@cs.ucl.ac.uk (Mark Handley)
c=IN IP4 224.2.17.12/127
t=2873397496 2873404696
a=recvonly
m=audio 3456 VAT PCMU
m=video 2232 RTP H261
m=whiteboard 32416 UDP WB
a=orient:portrait
is rewritten compactly as
(session (v 0)(o mhandley 2890844526 2890842807 IN IP4 126.16.64.4)
(s Sd seminar)(i A seminar on the session description protocol)
(u http://www.cs.ucl.ac.uk/staff/M.Handley/sdp.01.ps)
(e M.Handley@cs.ucl.ac.uk (Mark Handley))
(c IN IP4 224.2.17.12/127)(t 2873397496 2873404696)(a recvonly)
(all (media (m audio 3456 VAT PCMU))
(media (m video 2232 RTP H261))
(media (m whiteboard 32416 UDP WB)(orient portrait))
))

When session descriptions are transmitted using SDAP, bandwidth and UDP packet size limitations make space efficiency an important consideration. Expressed in SDP, the session description takes 357 bytes, while the SDF version takes 420 bytes. Generally, each SDF parameter requires two "framing" bytes, the opening and closing parentheses, while SDP requires one, the line terminator. If space is at a premium, standard compression algorithms such as gzip can be applied. After gzip compression, the SDF example above is reduced to 315 bytes.
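The size comparison can be reproduced along the following lines. This is a sketch under stated assumptions, not the tool used to obtain the figures above. The helper names `sdp_to_sdf` and `sizes` are hypothetical, and the conversion is mechanical: every `x=value` line becomes `(x value)`, so attributes keep their `a` prefix rather than being renamed as in the hand-written example. Byte counts also depend on line terminators and compressor settings, so they need not match the numbers quoted in the text.

```python
import gzip


def sdp_to_sdf(sdp):
    """Rewrite SDP lines as a compact SDF string (mechanical mapping)."""
    session, media = [], []
    for line in sdp.splitlines():
        if not line.strip():
            continue
        key, _, value = line.partition("=")
        item = "(%s %s)" % (key, value)
        if key == "m":
            media.append([item])      # a new media block starts here
        elif media:
            media[-1].append(item)    # belongs to the latest media block
        else:
            session.append(item)      # session-level parameter
    blocks = "".join("(media " + "".join(m) + ")" for m in media)
    return "(session " + "".join(session) + "(all " + blocks + "))"


def sizes(description):
    """Return (raw bytes, gzip-compressed bytes) for a description."""
    raw = description.encode("ascii")
    return len(raw), len(gzip.compress(raw))


sdp = "v=0\nm=audio 3456 VAT PCMU\na=recvonly\n"
sdf = sdp_to_sdf(sdp)
sdp_sizes = sizes(sdp)
sdf_sizes = sizes(sdf)
```

For very short descriptions the fixed gzip header and trailer can outweigh the savings, so compression only pays off once a description reaches a few hundred bytes.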

The example re-uses the parameter abbreviations of SDP, gaining space efficiency at the cost of readability. For environments which are not as sensitive to encoding efficiency, longer parameter names could be used, with the same parameter having both a "human-readable", self-explanatory variant and the single-letter variant shown here. It should be noted that if maximum space efficiency is mandatory, a binary format is likely to be necessary.

There are still a number of unresolved issues:

Calendar
The interaction with calendaring efforts in the IETF and outside (vCalendar) should be addressed. In particular, the description of a sequence of active times should be coordinated between the two efforts. The ability to re-use a vCalendar spec within SDF is very desirable.

Instead of the current time specification, the human-readable ISO 8601 form should be considered: 847572017 (Sat Nov 9 15:40:17 EST 1996) becomes 19961109T204017Z, since the "Z" suffix denotes UTC, which is five hours ahead of EST.
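The conversion can be sketched as follows. The function name `to_iso8601_basic` is hypothetical, and the sketch assumes the value counts seconds since the Unix epoch, as in the example above. SDP t= fields carry NTP timestamps, counted from 1900, so 2208988800 would have to be subtracted first.

```python
from datetime import datetime, timezone


def to_iso8601_basic(unix_seconds):
    """Format seconds since 1970-01-01 UTC as ISO 8601 basic format in UTC."""
    moment = datetime.fromtimestamp(unix_seconds, tz=timezone.utc)
    return moment.strftime("%Y%m%dT%H%M%SZ")


stamp = to_iso8601_basic(847572017)
# stamp == "19961109T204017Z", i.e. 15:40:17 in EST (UTC-5)
```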