Internet Engineering Task Force MMUSIC WG Internet Draft Handley/Schulzrinne/Schooler/Rosenberg ietf-mmusic-sip-11.txt ISI/Columbia U./Caltech/Bell Labs. December 15, 1998 Expires: June 1999 SIP: Session Initiation Protocol STATUS OF THIS MEMO This document is an Internet-Draft. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as ``work in progress''. To learn the current status of any Internet-Draft, please check the ``1id-abstracts.txt'' listing contained in the Internet-Drafts Shadow Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe), munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or ftp.isi.edu (US West Coast). Distribution of this document is unlimited. ABSTRACT The Session Initiation Protocol (SIP) is an application- layer control (signaling) protocol for creating, modifying and terminating sessions with one or more participants. These sessions include Internet multimedia conferences, Internet telephone calls and multimedia distribution. Members in a session can communicate via multicast or via a mesh of unicast relations, or a combination of these. SIP invitations used to create sessions carry session descriptions which allow participants to agree on a set of compatible media types. SIP supports user mobility by proxying and redirecting requests to the user's current location. Users can register their current location. SIP is not tied to any particular conference control protocol. SIP is designed to be independent of the Handley/Schulzrinne/Schooler/Rosenberg [Page 1] Internet Draft SIP December 15, 1998 lower-layer transport protocol and can be extended with additional capabilities. This document is a product of the Multi-party Multimedia Session Control (MMUSIC) working group of the Internet Engineering Task Force. Comments are solicited and should be addressed to the working group's mailing list at confctrl@isi.edu and/or the authors. 1 Introduction 1.1 Overview of SIP Functionality The Session Initiation Protocol (SIP) is an application-layer control protocol that can establish, modify and terminate multimedia sessions or calls. These multimedia sessions include multimedia conferences, distance learning, Internet telephony and similar applications. SIP can invite both persons and "robots", such as a media storage service. SIP can invite parties to both unicast and multicast sessions; the initiator does not necessarily have to be a member of the session to which it is inviting. Media and participants can be added to an existing session. SIP can be used to initiate sessions as well as invite members to sessions that have been advertised and established by other means. Sessions can be advertised using multicast protocols such as SAP, electronic mail, news groups, web pages or directories (LDAP), among others. SIP transparently supports name mapping and redirection services, allowing the implementation of ISDN and Intelligent Network telephony subscriber services. These facilities also enable personal mobility. In the parlance of telecommunications intelligent network services, this is defined as: "Personal mobility is the ability of end users to originate and receive calls and access subscribed telecommunication services on any terminal in any location, and the ability of the network to identify end users as they move. Personal mobility is based on the use of a unique personal identity (i.e., personal number)." [1]. Personal mobility complements terminal mobility, i.e., the ability to maintain communications when moving a single end system from one subnet to another. SIP supports five facets of establishing and terminating multimedia communications: User location: determination of the end system to be used for communication; Handley/Schulzrinne/Schooler/Rosenberg [Page 2] Internet Draft SIP December 15, 1998 User capabilities: determination of the media and media parameters to be used; User availability: determination of the willingness of the called party to engage in communications; Call setup: "ringing", establishment of call parameters at both called and calling party; Call handling: including transfer and termination of calls. SIP can also initiate multi-party calls using a multipoint control unit (MCU) or fully-meshed interconnection instead of multicast. Internet telephony gateways that connect Public Switched Telephone Network (PSTN) parties can also use SIP to set up calls between them. SIP is designed as part of the overall IETF multimedia data and control architecture currently incorporating protocols such as RSVP (RFC 2205 [2]) for reserving network resources, the real-time transport protocol (RTP) (RFC 1889 [3]) for transporting real-time data and providing QOS feedback, the real-time streaming protocol (RTSP) (RFC 2326 [4]) for controlling delivery of streaming media, the session announcement protocol (SAP) [5] for advertising multimedia sessions via multicast and the session description protocol (SDP) (RFC 2327 [6]) for describing multimedia sessions. However, the functionality and operation of SIP does not depend on any of these protocols. SIP can also be used in conjunction with other call setup and signaling protocols. In that mode, an end system uses SIP exchanges to determine the appropriate end system address and protocol from a given address that is protocol-independent. For example, SIP could be used to determine that the party can be reached via H.323 [7], obtain the H.245 [8] gateway and user address and then use H.225.0 [9] to establish the call. In another example, SIP might be used to determine that the callee is reachable via the PSTN and indicate the phone number to be called, possibly suggesting an Internet-to-PSTN gateway to be used. SIP does not offer conference control services such as floor control or voting and does not prescribe how a conference is to be managed, but SIP can be used to introduce conference control protocols. SIP does not allocate multicast addresses. SIP can invite users to sessions with and without resource reservation. SIP does not reserve resources, but can convey to the invited system the information necessary to do this. Handley/Schulzrinne/Schooler/Rosenberg [Page 3] Internet Draft SIP December 15, 1998 1.2 Terminology In this document, the key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" are to be interpreted as described in RFC 2119 [10] and indicate requirement levels for compliant SIP implementations. 1.3 Definitions This specification uses a number of terms to refer to the roles played by participants in SIP communications. The definitions of client, server and proxy are similar to those used by the Hypertext Transport Protocol (HTTP) (RFC 2068 [11]). The terms and generic syntax of URI and URL are defined in RFC 2396 [12]. The following terms have special significance for SIP. Call: A call consists of all participants in a conference invited by a common source. A SIP call is identified by a globally unique call-id (Section 6.12). Thus, if a user is, for example, invited to the same multicast session by several people, each of these invitations will be a unique call. A point-to-point Internet telephony conversation maps into a single SIP call. In a multiparty conference unit (MCU) based call-in conference, each participant uses a separate call to invite himself to the MCU. Call leg: A call leg is identified by the combination of Call-ID, To and From. Client: An application program that sends SIP requests. Clients may or may not interact directly with a human user. User agents and proxies contain clients (and servers). Conference: A multimedia session (see below), identified by a common session description. A conference can have zero or more members and includes the cases of a multicast conference, a full-mesh conference and a two-party "telephone call", as well as combinations of these. Any number of calls can be used to create a conference. Downstream: Requests sent in the direction from the caller to the callee (i.e., user agent client to user agent server). Final response: A response that terminates a SIP transaction, as opposed to a provisional response that does not. All 2xx, 3xx, 4xx, 5xx and 6xx responses are final. Initiator, calling party, caller: The party initiating a conference invitation. Note that the calling party does not have to be the Handley/Schulzrinne/Schooler/Rosenberg [Page 4] Internet Draft SIP December 15, 1998 same as the one creating the conference. Invitation: A request sent to a user (or service) requesting participation in a session. A successful SIP invitation consists of two transactions: an INVITE request followed by an ACK request. Invitee, invited user, called party, callee: The person or service that the calling party is trying to invite to a conference. Isomorphic request or response: Two requests or responses are defined to be isomorphic for the purposes of this document if they have the same values for the Call-ID, To, From and CSeq header fields. In addition, requests have to have the same Request-URI. Location server: See location service. Location service: A location service is used by a SIP redirect or proxy server to obtain information about a callee's possible location(s). Location services are offered by location servers. Location servers MAY be co-located with a SIP server, but the manner in which a SIP server requests location services is beyond the scope of this document. Parallel search: In a parallel search, a proxy issues several requests to possible user locations upon receiving an incoming request. Rather than issuing one request and then waiting for the final response before issuing the next request as in a sequential search , a parallel search issues requests without waiting for the result of previous requests. Provisional response: A response used by the server to indicate progress, but that does not terminate a SIP transaction. 1xx responses are provisional, other responses are considered final. Proxy, proxy server: An intermediary program that acts as both a server and a client for the purpose of making requests on behalf of other clients. Requests are serviced internally or by passing them on, possibly after translation, to other servers. A proxy interprets, and, if necessary, rewrites a request message before forwarding it. Redirect server: A redirect server is a server that accepts a SIP request, maps the address into zero or more new addresses and returns these addresses to the client. Unlike a proxy server , it does not initiate its own SIP request. Unlike a user agent server , it does not accept calls. Handley/Schulzrinne/Schooler/Rosenberg [Page 5] Internet Draft SIP December 15, 1998 Registrar: A registrar is a server that accepts REGISTER requests. A registrar is typically co-located with a proxy or redirect server and MAY offer location services. Ringback: Ringback is the signaling tone produced by the calling client's application indicating that a called party is being alerted (ringing). Server: A server is an application program that accepts requests in order to service requests and sends back responses to those requests. Servers are either proxy, redirect or user agent servers or registrars. Session: From the SDP specification: "A multimedia session is a set of multimedia senders and receivers and the data streams flowing from senders to receivers. A multimedia conference is an example of a multimedia session." (RFC 2327 [6]) (A session as defined for SDP can comprise one or more RTP sessions.) As defined, a callee can be invited several times, by different calls, to the same session. If SDP is used, a session is defined by the concatenation of the user name , session id , network type , address type and address elements in the origin field. (SIP) transaction: A SIP transaction occurs between a client and a server and comprises all messages from the first request sent from the client to the server up to a final (non-1xx) response sent from the server to the client. A transaction is identified by the CSeq sequence number (Section 6.17) within a single call leg. The ACK request has the same CSeq number as the corresponding INVITE request, but comprises a transaction of its own. Upstream: Responses sent in the direction from the user agent server to the user agent client. URL-encoded: A character string encoded according to RFC 1738, Section 2.2 [13]. User agent client (UAC), calling user agent: A user agent client is a client application that initiates the SIP request. User agent server (UAS), called user agent: A user agent server is a server application that contacts the user when a SIP request is received and that returns a response on behalf of the user. The response accepts, rejects or redirects the request. An application program MAY be capable of acting both as a client and a server. For example, a typical multimedia conference control Handley/Schulzrinne/Schooler/Rosenberg [Page 6] Internet Draft SIP December 15, 1998 application would act as a user agent client to initiate calls or to invite others to conferences and as a user agent server to accept invitations. The properties of the different SIP server types are summarized in Table 1. property redirect proxy user agent registrar server server server __________________________________________________________________ also acts as a SIP client no yes no no returns 1xx status yes yes yes yes returns 2xx status no yes yes yes returns 3xx status yes yes yes yes returns 4xx status yes yes yes yes returns 5xx status yes yes yes yes returns 6xx status no yes yes no inserts Via header no yes no no accepts ACK yes yes yes no Table 1: Properties of the different SIP server types 1.4 Overview of SIP Operation This section explains the basic protocol functionality and operation. Callers and callees are identified by SIP addresses, described in Section 1.4.1. When making a SIP call, a caller first locates the appropriate server (Section 1.4.2) and then sends a SIP request (Section 1.4.3). The most common SIP operation is the invitation (Section 1.4.4). Instead of directly reaching the intended callee, a SIP request may be redirected or may trigger a chain of new SIP requests by proxies (Section 1.4.5). Users can register their location(s) with SIP servers (Section 4.2.6). 1.4.1 SIP Addressing The "objects" addressed by SIP are users at hosts, identified by a SIP URL. The SIP URL takes a form similar to a mailto or telnet URL, i.e., user@host. The user part is a user name or a telephone number. The host part is either a domain name having a DNS SRV (RFC 2052 [14]), CNAME or A record (RFC 1035 [15]), or a numeric network address. See section 2 for a detailed discussion of SIP URL's. A user's SIP address can be obtained out-of-band, can be learned via existing media agents, can be included in some mailers' message headers, or can be recorded during previous invitation interactions. In many cases, a user's SIP URL can be guessed from their email address. Handley/Schulzrinne/Schooler/Rosenberg [Page 7] Internet Draft SIP December 15, 1998 A SIP URL address can designate an individual (possibly located at one of several end systems), the first available person from a group of individuals or a whole group. The form of the address, for example, sip:sales@example.com , is not sufficient, in general, to determine the intent of the caller. If a user or service chooses to be reachable at an address that is guessable from the person's name and organizational affiliation, the traditional method of ensuring privacy by having an unlisted "phone" number is compromised. However, unlike traditional telephony, SIP offers authentication and access control mechanisms and can avail itself of lower-layer security mechanisms, so that client software can reject unauthorized or undesired call attempts. 1.4.2 Locating a SIP Server When a client wishes to send a request, the client either sends it to a locally configured SIP proxy server (as in HTTP), independent of the Request-URI, or sends it to the IP address and port corresponding to the Request-URI. For the latter case, the client must determine the protocol, port and IP address of a server to send the request to. A client SHOULD follow the steps below to obtain this information. At each step, unless stated otherwise, the client SHOULD try to contact a server at the port number listed in the Request-URI. If no port number is present in the Request-URI, the client uses port 5060. If the Request-URI specifies a protocol (TCP or UDP), the client contacts the server using that protocol. If no protocol is specified, the client tries UDP (if UDP is supported), and if the attempt fails, then tries TCP (if TCP is supported). The client tries to find one or more addresses for the server by querying DNS. There are several different ways it may query DNS, and these are listed in the steps below. If a step elicits no addresses, the client SHOULD continue to the next step. However if a step elicits one or more addresses, but no SIP server at any of those addresses responds, then the client concludes the server is down and SHOULD NOT continue on to the next step. 1. If the host portion of the Request-URI is an IP address, the client contacts the server at the given address. If the host portion of the Request-URI is not an IP address, the client proceeds to the next step. 2. An implementation MAY use the experimental procedure in Appendix D at this step. This step relies on DNS SRV records, which are currently experimental and may change Handley/Schulzrinne/Schooler/Rosenberg [Page 8] Internet Draft SIP December 15, 1998 before being published as a standard. If this step is used, and no servers are returned in the DNS query, the client proceeds to the next step. 3. The client queries the name server for CNAME records with the QNAME=sip.host, where host is the host portion of the Request-URI, as described in RFC 2219 [16]. For each address in the answer, the client attempts to contact a server at that address. If, however, there were no CNAME records, the client goes to the next step. 4. The client queries the name server for A records with the QNAME=sip.host, where host is the host portion of the Request-URI, as described in RFC 2219 [16]. For each address in the answer, the client attempts to contact a server at that address. If, however, there were no A records, the client goes to the next step. 5. The client queries the name server for CNAME records for the host portion of the Request-URI. For each address in the answer, the client attempts to contact a server at that address. If, however, there were no CNAME records, the client goes to the next step. 6. The client queries the name server for A records for the host portion of the Request-URI. For each address in the answer, the client attempts to contact a server at that address. If there were no A records, the client stops, as it has been unable to locate a server. A client SHOULD rely on explicit network notifications to determine that a server is not reachable, rather than relying on timeouts. In particular, a client SHOULD be prepared to receive ICMP port, protocol, or host unreachable messages as a form of explicit notifications, and MAY be prepared to process other messages. (For socket-based programs: For TCP, connect() returns ECONNREFUSED if the client could not connect to a server at that address. For UDP, the socket needs to be bound to the destination address using connect() rather than sendto() or similar so that a second write() fails with ECONNREFUSED if there is no server listening) If the client finds the server is not reachable at a particular address, it SHOULD behave as if it had received a 400-class error response to that request. A client MAY cache a successful DNS query result. A successful query is one which contained records in the answer, and a server was contacted at one of the addresses from the answer. When the client wishes to send a request to the same host, it MUST start the search as if it had just received this answer from the name server. The Handley/Schulzrinne/Schooler/Rosenberg [Page 9] Internet Draft SIP December 15, 1998 client MUST expire the cache entry after the DNS time-to-live value has elapsed. If the client does not find a SIP server among the addresses listed in the cached answer, it MUST start the search at the beginning of the sequence described above. For example, consider a client which wishes to send a SIP request. The Request-URI for the destination is sip:user@company.com. The client only supports UDP. It would follow these steps: 1. The host portion is not an IP address, so the client goes to step 2 above. 2. The client decides to implement the optional SRV record step. The client does a DNS query of QNAME="sip.udp.company.com", QCLASS=IN, QTYPE=SRV. Since it doesn't support TCP, it omits the TCP query. There were no addresses in the DNS response, so the client goes to the next step. 3. The client does a DNS query of QNAME="sip.company.com", QCLASS=IN, QTYPE=CNAME. There were no addresses in the DNS response. The client therefore goes to the next step. 4. The client does a DNS query of QNAME="sip.company.com", QCLASS=IN, QTYPE=A. There is one answer. The client tries that address using UDP at port 5060, and succeeds at delivering the request. 1.4.3 SIP Transaction Once the host part has been resolved to a SIP server, the client sends one or more SIP requests to that server and receives one or more responses from the server. A request (and its retransmissions) together with the responses triggered by that request make up a SIP transaction. All responses to a request contain the same values in the Call-ID, CSeq, To, and From fields (with the possible addition of a tag in the To field (section 6.37)). This allows responses to be matched with requests. The ACK request following an INVITE is not part of the transaction since it may traverse a different set of hosts. If TCP is used, request and responses within a single SIP transaction are carried over the same TCP connection (see Section 10). Several SIP requests from the same client to the same server MAY use the same TCP connection or MAY use a new connection for each request. If the client sent the request via unicast UDP, the response is sent to the address contained in the next Via header field (Section 6.40) Handley/Schulzrinne/Schooler/Rosenberg [Page 10] Internet Draft SIP December 15, 1998 of the response. If the request is sent via multicast UDP, the response is directed to the same multicast address and destination port. For UDP, reliability is achieved using retransmission (Section 10). The SIP message format and operation is independent of the transport protocol. 1.4.4 SIP Invitation A successful SIP invitation consists of two requests, INVITE followed by ACK. The INVITE (Section 4.2.1) request asks the callee to join a particular conference or establish a two-party conversation. After the callee has agreed to participate in the call, the caller confirms that it has received that response by sending an ACK (Section 4.2.2) request. If the caller no longer wants to participate in the call, it sends a BYE request instead of an ACK. The INVITE request typically contains a session description, for example written in SDP (RFC 2327 [6]) format, that provides the called party with enough information to join the session. For multicast sessions, the session description enumerates the media types and formats that are allowed to be distributed to that session. For a unicast session, the session description enumerates the media types and formats that the caller is willing to receive and where it wishes the media data to be sent. In either case, if the callee wishes to accept the call, it responds to the invitation by returning a similar description listing the media it wishes to receive. For a multicast session, the callee SHOULD only return a session description if it is unable to receive the media indicated in the caller's description or wants to receive data via unicast. The protocol exchanges for the INVITE method are shown in Fig. 1 for a proxy server and in Fig. 2 for a redirect server. (Note that the messages shown in the figures have been abbreviated slightly.) In Fig. 1, the proxy server accepts the INVITE request (step 1), contacts the location service with all or parts of the address (step 2) and obtains a more precise location (step 3). The proxy server then issues a SIP INVITE request to the address(es) returned by the location service (step 4). The user agent server alerts the user (step 5) and returns a success indication to the proxy server (step 6). The proxy server then returns the success result to the original caller (step 7). The receipt of this message is confirmed by the caller using an ACK request, which is forwarded to the callee (steps 8 and 9). Note that an ACK can also be sent directly to the callee, bypassing the proxy. All requests and responses have the same Call- ID. Handley/Schulzrinne/Schooler/Rosenberg [Page 11] Internet Draft SIP December 15, 1998 +....... cs.columbia.edu .......+ : : : (~~~~~~~~~~) : : ( location ) : : ( service ) : : (~~~~~~~~~~) : : ^ | : : | hgs@lab : : 2| 3| : : | | : : henning | : +.. cs.tu-berlin.de ..+ 1: INVITE : | | : : : henning@cs.col: | \/ 4: INVITE 5: ring : : cz@cs.tu-berlin.de ========================>(~~~~~~)=========>(~~~~~~) : : <........................( )<.........( ) : : : 7: 200 OK : ( )6: 200 OK ( ) : : : : ( work ) ( lab ) : : : 8: ACK : ( )9: ACK ( ) : : ========================>(~~~~~~)=========>(~~~~~~) : +.....................+ +...............................+ ====> SIP request ....> SIP response ^ | non-SIP protocols | Figure 1: Example of SIP proxy server The redirect server shown in Fig. 2 accepts the INVITE request (step 1), contacts the location service as before (steps 2 and 3) and, instead of contacting the newly found address itself, returns the address to the caller (step 4), which is then acknowledged via an ACK request (step 5). The caller issues a new request, with the same call-ID but a higher CSeq, to the address returned by the first server (step 6). In the example, the call succeeds (step 7). The caller and callee complete the handshake with an ACK (step 8). The next section discusses what happens if the location service Handley/Schulzrinne/Schooler/Rosenberg [Page 12] Internet Draft SIP December 15, 1998 returns more than one possible alternative. 1.4.5 Locating a User A callee may move between a number of different end systems over time. These locations can be dynamically registered with the SIP server (Sections 1.4.7, 4.2.6). A location server MAY also use one or more other protocols, such as finger (RFC 1288 [17]), rwhois (RFC 2167 [18]), LDAP (RFC 1777 [19]), multicast-based protocols [20] or operating-system dependent mechanisms to actively determine the end system where a user might be reachable. A location server MAY return several locations because the user is logged in at several hosts simultaneously or because the location server has (temporarily) inaccurate information. The SIP server combines the results to yield a list of a zero or more locations. The action taken on receiving a list of locations varies with the type of SIP server. A SIP redirect server returns the list to the client as Contact headers (Section 6.13). A SIP proxy server can sequentially or in parallel try the addresses until the call is successful (2xx response) or the callee has declined the call (6xx response). With sequential attempts, a proxy server can implement an "anycast" service. If a proxy server forwards a SIP request, it MUST add itself to the end of the list of forwarders noted in the Via (Section 6.40) headers. The Via trace ensures that replies can take the same path back, ensuring correct operation through compliant firewalls and avoiding request loops. On the response path, each host MUST remove its Via, so that routing internal information is hidden from the callee and outside networks. A proxy server MUST check that it does not generate a request to a host listed in the Via sent-by, via- received or via-maddr parameters (Section 6.40). (Note: If a host has several names or network addresses, this does not always work. Thus, each host also checks if it is part of the Via list.) A SIP invitation may traverse more than one SIP proxy server. If one of these "forks" the request, i.e., issues more than one request in response to receiving the invitation request, it is possible that a client is reached, independently, by more than one copy of the invitation request. Each of these copies bears the same Call-ID. The user agent MUST return the same status response returned in the first response. Duplicate requests are not an error. 1.4.6 Changing an Existing Session In some circumstances, it is desirable to change the parameters of an existing session. This is done by re-issuing the INVITE, using the Handley/Schulzrinne/Schooler/Rosenberg [Page 13] Internet Draft SIP December 15, 1998 +....... cs.columbia.edu .......+ : : : (~~~~~~~~~~) : : ( location ) : : ( service ) : : (~~~~~~~~~~) : : ^ | : : | hgs@lab : : 2| 3| : : | | : : henning| : +.. cs.tu-berlin.de ..+ 1: INVITE : | | : : : henning@cs.col: | \/ : : cz@cs.tu-berlin.de =======================>(~~~~~~) : : | ^ | <.......................( ) : : | . | : 4: 302 Moved : ( ) : : | . | : hgs@lab : ( work ) : : | . | : : ( ) : : | . | : 5: ACK : ( ) : : | . | =======================>(~~~~~~) : : | . | : : : +.......|...|.........+ : : | . | : : | . | : : | . | : : | . | : : | . | 6: INVITE hgs@lab.cs.columbia.edu (~~~~~~) : | . ==================================================> ( ) : | ..................................................... ( ) : | 7: 200 OK : ( lab ) : | : ( ) : | 8: ACK : ( ) : ======================================================> (~~~~~~) : +...............................+ ====> SIP request ....> SIP response ^ | non-SIP protocols | Figure 2: Example of SIP redirect server Handley/Schulzrinne/Schooler/Rosenberg [Page 14] Internet Draft SIP December 15, 1998 same Call-ID, but a new or different body or header fields to convey the new information. This re INVITE MUST have a higher CSeq than any previous request from the client to the server. For example, two parties may have been conversing and then want to add a third party, switching to multicast for efficiency. One of the participants invites the third party with the new multicast address and simultaneously sends an INVITE to the second party, with the new multicast session description, but with the old call identifier. 1.4.7 Registration Services The REGISTER request allows a client to let a proxy or redirect server know at which address(es) it can be reached. A client MAY also use it to install call handling features at the server. 1.5 Protocol Properties 1.5.1 Minimal State A single conference session or call involves one or more SIP request-response transactions. Proxy servers do not have to keep state for a particular call, however, they MAY maintain state for a single SIP transaction, as discussed in Section 12. For efficiency, a server MAY cache the results of location service requests. 1.5.2 Lower-Layer-Protocol Neutral SIP makes minimal assumptions about the underlying transport and network-layer protocols. The lower-layer can provide either a packet or a byte stream service, with reliable or unreliable service. In an Internet context, SIP is able to utilize both UDP and TCP as transport protocols, among others. UDP allows the application to more carefully control the timing of messages and their retransmission, to perform parallel searches without requiring TCP connection state for each outstanding request, and to use multicast. Routers can more readily snoop SIP UDP packets. TCP allows easier passage through existing firewalls. When TCP is used, SIP can use one or more connections to attempt to contact a user or to modify parameters of an existing conference. Different SIP requests for the same SIP call MAY use different TCP connections or a single persistent connection, as appropriate. For concreteness, this document will only refer to Internet protocols. However, SIP MAY also be used directly with protocols such as ATM AAL5, IPX, frame relay or X.25. The necessary naming Handley/Schulzrinne/Schooler/Rosenberg [Page 15] Internet Draft SIP December 15, 1998 conventions are beyond the scope of this document. User agents SHOULD implement both UDP and TCP transport. Proxy, registrar, and redirect servers MUST implement both UDP and TCP transport. 1.5.3 Text-Based SIP is text-based, using ISO 10646 in UTF-8 encoding throughout. This allows easy implementation in languages such as Java, Tcl and Perl, allows easy debugging, and most importantly, makes SIP flexible and extensible. As SIP is used for initiating multimedia conferences rather than delivering media data, it is believed that the additional overhead of using a text-based protocol is not significant. 2 SIP Uniform Resource Locators SIP URLs are used within SIP messages to indicate the originator (From), current destination (Request-URI) and final recipient (To) of a SIP request, and to specify redirection addresses (Contact). A SIP URL can also be embedded in web pages or other hyperlinks to indicate that a particular user or service can be called via SIP. When used as a hyperlink, the SIP URL indicates the use of the INVITE method. The SIP URL scheme is defined to allow setting SIP request-header fields and the SIP message-body. This corresponds to the use of mailto: URLs. It makes it possible, for example, to specify the subject, urgency or media types of calls initiated through a web page or as part of an email message. A SIP URL follows the guidelines of RFC 2396 [12] and has the syntax shown in Fig. 3. The syntax is described using Augmented Backus-Naur Form (See Section C). Note that reserved characters have to be escaped and that the "set of characters reserved within any given URI component is defined by that component. In general, a character is reserved if the semantics of the URI changes if the character is replaced with its escaped US-ASCII encoding" [12]. The URI character classes referenced above are described in Appendix C. The components of the SIP URI have the following meanings. user: If the host is an Internet telephony gateway, the user field MAY also encode a telephone number using the notation of Handley/Schulzrinne/Schooler/Rosenberg [Page 16] Internet Draft SIP December 15, 1998 SIP-URL = "sip:" [ userinfo "@" ] hostport url-parameters [ headers ] userinfo = user [ ":" password ] user = *( unreserved | escaped | ";" | "&" | "=" | "+" | "$" | "," ) password = *( unreserved | escaped | ";" | "&" | "=" | "+" | "$" | "," ) hostport = host [ ":" port ] host = hostname | IPv4address hostname = *( domainlabel "." ) toplabel [ "." ] domainlabel = alphanum | alphanum *( alphanum | "-" ) alphanum toplabel = alpha | alpha *( alphanum | "-" ) alphanum IPv4address = 1*digit "." 1*digit "." 1*digit "." 1*digit port = *digit url-parameters = *( ";" url-parameter ) url-parameter = transport-param | user-param | method-param | ttl-param | maddr-param | other-param transport-param = "transport=" ( "udp" | "tcp" ) ttl-param = "ttl=" ttl ttl = 1*3DIGIT ; 0 to 255 maddr-param = "maddr=" host user-param = "user=" ( "phone" | "ip" ) method-param = "method=" Method tag-param = "tag=" UUID UUID = 1*( hex | "-" ) other-param = ( token | ( token "=" ( token | quoted-string ))) headers = "?" header *( "&" header ) header = hname "=" hvalue hname = 1*uric hvalue = *uric uric = reserved | unreserved | escaped reserved = ";" | "/" | "?" | ":" | "@" | "&" | "=" | "+" | "$" | "," digits = 1*DIGIT Figure 3: SIP URL syntax telephone-subscriber (Fig. 4). The telephone number is a special case of a user name and cannot be distinguished by a BNF. Thus, a URL parameter, user, is added to distinguish telephone numbers from user names. The phone identifier is to be used when connecting to a telephony gateway. Even without this parameter, recipients of SIP URLs MAY interpret the pre-@ part as a phone number if local restrictions on the name space for user name Handley/Schulzrinne/Schooler/Rosenberg [Page 17] Internet Draft SIP December 15, 1998 telephone-subscriber = global-phone-number | local-phone-number global-phone-number = "+" 1*phonedigit [isdn-subaddress] [post-dial] local-phone-number = 1*(phonedigit | dtmf-digit | pause-character) [isdn-subaddress] [post-dial] isdn-subaddress = ";isub=" 1*phonedigit post-dial = ";postd=" 1*(phonedigit | dtmf-digit | pause-character) phonedigit = DIGIT | visual-separator visual-separator = "-" | "." pause-character = one-second-pause | wait-for-dial-tone one-second-pause = "p" wait-for-dial-tone = "w" dtmf-digit = "*" | "#" | "A" | "B" | "C" | "D" Figure 4: SIP URL syntax; telephone subscriber allow it. If a server handles SIP addresses for another domain, it MUST URL- encode the "@" character (%40). The ";" character MUST be URL- encoded, as otherwise it is not possible to distinguish the case host;parameter and user;moreuser@host in one parsing pass. password: The SIP scheme MAY use the format "user:password" in the userinfo field. The use of passwords in the userinfo is NOT RECOMMENDED, because the passing of authentication information in clear text (such as URIs) has proven to be a security risk in almost every case where it has been used. host: The mailto: URL and RFC 822 email addresses require that numeric host addresses ("host numbers") are enclosed in square brackets (presumably, since host names might be numeric), while host numbers without brackets are used for all other URLs. The SIP URL requires the latter form, without brackets. The issue of IPv6 literal addresses in URLs is being looked at elsewhere in the IETF. SIP implementers are advised to keep up to date on that activity. port: The port number to send a request to. If not present, the procedures outlined in Section 1.4.2 are used to determine the port number to send a request to. Handley/Schulzrinne/Schooler/Rosenberg [Page 18] Internet Draft SIP December 15, 1998 URL parameters: SIP URLs can define specific parameters of the request. URL parameters are added after the host component and are separated by semi-colons. The transport parameter determines the the transport mechanism (UDP or TCP). UDP is to be assumed when no explicit transport parameter is included. The maddr parameter provides the server address to be contacted for this user, overriding the address supplied in the host field. This address is typically a multicast address, but could also be the address of a backup server. The ttl parameter determines the time-to-live value of the UDP multicast packet and MUST only be used if maddr is a multicast address and the transport protocol is UDP. The user parameter was described above. For example, to specify to call j.doe@big.com using multicast to 239.255.255.1 with a ttl of 15, the following URL would be used: sip:j.doe@big.com;maddr=239.255.255.1;ttl=15 The transport, maddr, and ttl parameters MUST NOT be used in the From and To header fields and the Request-URI; they are ignored if present. Headers: Headers of the SIP request can be defined with the "?" mechanism within a SIP URL. The special hname "body" indicates that the associated hvalue is the message-body of the SIP INVITE request. Headers MUST NOT be used in the From and To header fields and the Request-URI; they are ignored if present. hname and hvalue are encodings of a SIP header name and value, respectively. All URL reserved characters in the header names and values MUST be escaped. Method: The method of the SIP request can be specified with the method parameter. This parameter MUST NOT be used in the From and To header fields and the Request-URI; they are ignored if present. Table 2 summarizes where the components of the SIP URL can be used and what default values they assume if not present. Examples of SIP URLs are: sip:j.doe@big.com sip:j.doe:secret@big.com;transport=tcp sip:j.doe@big.com?subject=project sip:+1-212-555-1212:1234@gateway.com;user=phone Handley/Schulzrinne/Schooler/Rosenberg [Page 19] Internet Draft SIP December 15, 1998 default Req.-URI To From Contact external user -- x x x x x password -- x x x x host mandatory x x x x x port 5060 x x x x x user-param ip x x x x x method INVITE x x maddr-param -- x x ttl-param 1 x x transp.-param -- x x headers -- x x Table 2: Use and default values of URL components for SIP headers, Request-URI and references sip:1212@gateway.com sip:alice@10.1.2.3 sip:alice@example.com sip:alice sip:alice@registrar.com;method=REGISTER Within a SIP message, URLs are used to indicate the source and intended destination of a request, redirection addresses and the current destination of a request. Normally all these fields will contain SIP URLs. SIP URLs are case-insensitive, so that for example the two URLs sip:j.doe@example.com and SIP:J.Doe@Example.com are equivalent. All URL parameters are included when comparing SIP URLs for equality. SIP header fields MAY contain non-SIP URLs. As an example, if a call from a telephone is relayed to the Internet via SIP, the SIP From header field might contain a phone URL. 3 SIP Message Overview SIP is a text-based protocol and uses the ISO 10646 character set in UTF-8 encoding (RFC 2279 [21]). Senders MUST terminate lines with a CRLF, but receivers MUST also interpret CR and LF by themselves as line terminators. Except for the above difference in character sets, much of the message syntax is and header fields are identical to HTTP/1.1; rather than repeating the syntax and semantics here we use [HX.Y] to refer Handley/Schulzrinne/Schooler/Rosenberg [Page 20] Internet Draft SIP December 15, 1998 to Section X.Y of the current HTTP/1.1 specification (RFC 2068 [11]). In addition, we describe SIP in both prose and an augmented Backus- Naur form (ABNF). See section C for an overview of ABNF. Note, however, that SIP is not an extension of HTTP. Unlike HTTP, SIP MAY use UDP. When sent over TCP or UDP, multiple SIP transactions can be carried in a single TCP connection or UDP datagram. UDP datagrams, including all headers, SHOULD NOT be larger than the path maximum transmission unit (MTU) if the MTU is known, or 1500 bytes if the MTU is unknown. The 1500 bytes accommodates encapsulation within the "typical" ethernet MTU without IP fragmentation. Recent studies [22] indicate that an MTU of 1500 bytes is a reasonable assumption. The next lower common MTU values are 1006 bytes for SLIP and 296 for low-delay PPP (RFC 1191 [23]). Thus, another reasonable value would be a message size of 950 bytes, to accommodate packet headers within the SLIP MTU without fragmentation. A SIP message is either a request from a client to a server, or a response from a server to a client. SIP-message = Request | Response Both Request (section 4) and Response (section 5) messages use the generic-message format of RFC 822 [24] for transferring entities (the body of the message). Both types of messages consist of a start-line, one or more header fields (also known as "headers"), an empty line (i.e., a line with nothing preceding the carriage-return line-feed (CRLF)) indicating the end of the header fields, and an optional message-body. To avoid confusion with similar-named headers in HTTP, we refer to the headers describing the message body as entity headers. These components are described in detail in the upcoming sections. generic-message = start-line *message-header CRLF [ message-body ] Handley/Schulzrinne/Schooler/Rosenberg [Page 21] Internet Draft SIP December 15, 1998 start-line = Request-Line | ;Section 4.1 Status-Line ;Section 5.1 message-header = ( general-header | request-header | response-header | entity-header ) In the interest of robustness, any leading empty line(s) MUST be ignored. In other words, if the Request or Response message begins with one or more CRLF, CR, or LFs, these characters MUST be ignored. 4 Request The Request message format is shown below: Request = Request-Line ; Section 4.1 *( general-header | request-header | entity-header ) CRLF [ message-body ] ; Section 8 4.1 Request-Line The Request-Line begins with a method token, followed by the Request-URI and the protocol version, and ending with CRLF. The elements are separated by SP characters. No CR or LF are allowed except in the final CRLF sequence. Request-Line = Method SP Request-URI SP SIP-Version CRLF 4.2 Methods The methods are defined below. Methods that are not supported by a proxy or redirect server are treated by that server as if they were an OPTIONS method and forwarded accordingly. Methods that are not Handley/Schulzrinne/Schooler/Rosenberg [Page 22] Internet Draft SIP December 15, 1998 general-header = Accept ; Section 6.7 | Accept-Encoding ; Section 6.8 | Accept-Language ; Section 6.9 | Call-ID ; Section 6.12 | Contact ; Section 6.13 | CSeq ; Section 6.17 | Date ; Section 6.18 | Encryption ; Section 6.19 | Expires ; Section 6.20 | From ; Section 6.21 | Record-Route ; Section 6.29 | Timestamp ; Section 6.36 | To ; Section 6.37 | Via ; Section 6.40 entity-header = Content-Encoding ; Section 6.14 | Content-Length ; Section 6.15 | Content-Type ; Section 6.16 request-header = Authorization ; Section 6.11 | Contact ; Section 6.13 | Hide ; Section 6.22 | Max-Forwards ; Section 6.23 | Organization ; Section 6.24 | Priority ; Section 6.25 | Proxy-Authorization ; Section 6.27 | Proxy-Require ; Section 6.28 | Route ; Section 6.33 | Require ; Section 6.30 | Response-Key ; Section 6.31 | Subject ; Section 6.35 | User-Agent ; Section 6.39 response-header = Allow ; Section 6.10 | Proxy-Authenticate ; Section 6.26 | Retry-After ; Section 6.32 | Server ; Section 6.34 | Unsupported ; Section 6.38 | Warning ; Section 6.41 | WWW-Authenticate ; Section 6.42 Table 3: SIP headers supported by a user agent server or registrar cause a 501 (Not Implemented) response to be returned (Section 7). As in HTTP, the Method token is case-sensitive. Handley/Schulzrinne/Schooler/Rosenberg [Page 23] Internet Draft SIP December 15, 1998 Method = "INVITE" | "ACK" | "OPTIONS" | "BYE" | "CANCEL" | "REGISTER" 4.2.1 INVITE The INVITE method indicates that the user or service is being invited to participate in a session. The message body contains a description of the session to which the callee is being invited. For two-party calls, the caller indicates the type of media it is able to receive and possibly the media it is willing to send as well as their parameters such as network destination. A success response MUST indicate in its message body which media the callee wishes to receive and MAY indicate the media the callee is going to send. Not all session description formats have the ability to indicate sending media. A server MAY automatically respond to an invitation for a conference the user is already participating in, identified either by the SIP Call-ID or a globally unique identifier within the session description, with a 200 (OK) response. If a user agent receives an INVITE request for an existing call leg with a higher CSeq sequence number than any previous INVITE for the same Call-ID, it MUST check any version identifiers in the session description or, if there are no version identifiers, the content of the session description to see if it has changed. It MUST also inspect any other header fields for changes. If there is a change, the user agent MUST update any internal state or information generated as a result of that header. If the session description has changed, the user agent server MUST adjust the session parameters accordingly, possibly after asking the user for confirmation. (Versioning of the session description can be used to accommodate the capabilities of new arrivals to a conference, add or delete media or change from a unicast to a multicast conference.) This method MUST be supported by SIP proxy, redirect and user agent servers as well as clients. 4.2.2 ACK The ACK request confirms that the client has received a final response to an INVITE request. (ACK is used only with INVITE requests.) 2xx responses are acknowledged by client user agents, all other final responses by the first proxy or client user agent to receive the response. The Via is always initialized to the host that Handley/Schulzrinne/Schooler/Rosenberg [Page 24] Internet Draft SIP December 15, 1998 originates the ACK request, i.e., the client user agent after a 2xx response or the first proxy to receive a non-2xx final response. The ACK request is forwarded as the corresponding INVITE request, based on its Request-URI. See Section 10 for details. The ACK request MAY contain a message body with the final session description to be used by the callee. If the ACK message body is empty, the callee uses the session description in the INVITE request. A proxy server receiving an ACK request after having sent a 3xx, 4xx, 5xx, or 6xx response must make a determination about whether the ACK is for it, or for some user agent or proxy server further downstream. This determination is made by examining the tag in the To field. If the tag in the ACK To header field matches the tag in the To header field of the response, and the From, CSeq and Call-ID header fields in the response match those in the ACK, the ACK is meant for the proxy server. Otherwise, the ACK SHOULD be proxied downstream as any other request. It is possible for a user agent client or proxy server to receive multiple 3xx, 4xx, 5xx, and 6xx responses to a request along a single branch. This can happen under various error conditions, typically when a forking proxy transitions from stateful to stateless before receiving all responses. The various responses will all be identical, except for the tag in the To field, which is different for each one. It can therefore be used as a means to disambiguate them. This method MUST be supported by SIP proxy, redirect and user agent servers as well as clients. 4.2.3 OPTIONS The server is being queried as to its capabilities. A server that believes it can contact the user, such as a user agent where the user is logged in and has been recently active, MAY respond to this request with a capability set. A called user agent MAY return a status reflecting how it would have responded to an invitation, e.g., 600 (Busy). Such a server SHOULD return an Allow header field indicating the methods that it supports. Proxy and redirect servers simply forward the request without indicating their capabilities. This method MUST be supported by SIP proxy, redirect and user agent servers, registrars and clients. 4.2.4 BYE Handley/Schulzrinne/Schooler/Rosenberg [Page 25] Internet Draft SIP December 15, 1998 The user agent client uses BYE to indicate to the server that it wishes to release the call. A BYE request is forwarded like an INVITE request and MAY be issued by either caller or callee. A party to a call SHOULD issue a BYE request before releasing a call ("hanging up"). A party receiving a BYE request MUST cease transmitting media streams specifically directed at the party issuing the BYE request. If the INVITE request contained a Contact header, the callee SHOULD send a BYE request to that address rather than the From address. This method MUST be supported by proxy servers and SHOULD be supported by redirect and user agent SIP servers. 4.2.5 CANCEL The CANCEL request cancels a pending request with the same Call-ID, To, From and CSeq (sequence number only) header field values, but does not affect a completed request. (A request is considered completed if the server has returned a final status response.) A user agent client or proxy client MAY issue a CANCEL request at any time. A proxy, in particular, MAY choose to send a CANCEL to destinations that have not yet returned a final response after it has received a 2xx or 6xx response for one or more of the parallel-search requests. A proxy that receives a CANCEL request forwards the request to all destinations with pending requests. The Call-ID, To, the numeric part of CSeq and From headers in the CANCEL request are identical to those in the original request. This allows a CANCEL request to be matched with the request it cancels. However, to allow the client to distinguish responses to the CANCEL from those to the original request, the CSeq Method component is set to CANCEL. The Via header field is initialized to the proxy issuing the CANCEL request. (Thus, responses to this CANCEL request only reach the issuing proxy.) Once a user agent server has received a CANCEL, it MUST NOT issue a 2xx response for the cancelled original request. A redirect or user agent server receiving a CANCEL request responds with a status of 200 (OK) if the transaction exists and a status of 481 (Transaction Does Not Exist) if not, but takes no further action. In particular, any existing call is unaffected. The BYE request cannot be used to cancel branches of a parallel search, since several branches may, through intermediate proxies, find the same user agent server and Handley/Schulzrinne/Schooler/Rosenberg [Page 26] Internet Draft SIP December 15, 1998 then terminate the call. To terminate a call instead of just pending searches, the UAC must use BYE instead of or in addition to CANCEL. While CANCEL can terminate any pending request other than ACK or CANCEL, it is typically useful only for INVITE. 200 responses to INVITE and 200 responses to CANCEL are distinguished by the method in the Cseq header field, so there is no ambiguity. This method MUST be supported by proxy servers and SHOULD be supported by all other SIP server types. 4.2.6 REGISTER A client uses the REGISTER method to register the address listed in the To header field with a SIP server. A user agent MAY register with a local server on startup by sending a REGISTER request to the well-known "all SIP servers" multicast address "sip.mcast.net" (224.0.1.75). This request SHOULD be scoped to ensure it is not forwarded beyond the boundaries of the administrative system. This MAY be done with either TTL or administrative scopes[25], depending on what is implemented in the network. SIP user agents MAY listen to that address and use it to become aware of the location of other local users [20]; however, they do not respond to the request. A user agent MAY also be configured with the address of a registrar server to which it sends a REGISTER request upon startup. Requests are processed in the order received. Clients SHOULD avoid sending a new registration (as opposed to a retransmission) until they have received the response from the server for the previous one. Clients may register from different locations, by necessity using different Call-ID values. Thus, the CSeq value cannot be used to enforce ordering. Since registrations are additive, ordering is less of a problem than if each REGISTER request completely replaced all earlier ones. The meaning of the REGISTER request-header fields is defined as follows. We define "address-of-record" as the SIP address that the registry knows the registrand, typically of the form "user@domain" rather than "user@host". In third-party registration, the entity issuing the request is different from the entity being registered. To: The To header field contains the address-of-record whose registration is to be created or updated. Handley/Schulzrinne/Schooler/Rosenberg [Page 27] Internet Draft SIP December 15, 1998 From: The From header field contains the address-of-record of the person responsible for the registration. For first-party registration, it is identical to the To header field value. Request-URI: The Request-URI names the destination of the registration request, i.e., the domain of the registrar. The user name MUST be empty. Generally, the domains in the Request- URI and the To header field have the same value; however, it is possible to register as a "visitor", while maintaining one's name. For example, a traveller sip:alice@acme.com (To) might register under the Request-URI sip:atlanta.hiayh.org , with the former as the To header field and the latter as the Request-URI. The request is no longer forwarded once it reached the server whose authoritative domain is the one listed in the Request-URI. Call-ID: All registrations from a client SHOULD use the same Call-ID header value, at least within the same reboot cycle. Cseq: Registrations with the same Call-ID MUST have increasing CSeq header values. However, the server does not reject out-of-order requests. Contact: The request MAY contain a Contact header field; future non- REGISTER requests for the URI given in the To header field SHOULD be directed to the address(es) given in the Contact header. If the request does not contain a Contact header, the registration remains unchanged. This is useful to obtain the current list of registrations in the response. Registrations using SIP URIs that differ in one or more of host, port, transport-param or maddr- param (see Figure 3) from an existing registration are added to the list of registrations. Other URI types are compared according to the standard URI equivalency rules for the URI schema. If the URIs are equivalent to that of an existing registration, the new registration replaces the old one if it has a higher q value or, for the same value of q, if the ttl value is higher. All current registrations MUST share the same action value. Registrations that have a different action than current registrations for the same user MUST be rejected with status of 409 (Conflict). A proxy server ignores the q parameter when processing non-REGISTER requests, while a redirect server simply returns that parameter in its Contact response header field. Handley/Schulzrinne/Schooler/Rosenberg [Page 28] Internet Draft SIP December 15, 1998 Having the proxy server interpret the q parameter is not sufficient to guide proxy behavior, as it is not clear, for example, how long it is supposed to wait between trying addresses. If the registration is changed while a user agent or proxy server processes an invitation, the new information SHOULD be used. This allows a service known as "directed pick-up". In the telephone network, directed pickup permits a user at a remote station who hears his own phone ringing to pick up at that station, dial an access code, and be connected to the calling user as if he had answered his own phone. A server MAY choose any duration for the registration lifetime. Registrations not refreshed after this amount of time SHOULD be silently discarded. Responses to a registration SHOULD include an Expires header (Section 6.20) or expires Contact parameters (Section 6.13), indicating the time at which the server will drop the registration. If none is present, one hour is assumed. Clients MAY request a registration lifetime by indicating the time in an Expires header in the request. A server SHOULD NOT use a higher lifetime than the one requested, but MAY use a lower one. A single address (if host-independent) MAY be registered from several different clients. A client cancels an existing registration by sending a REGISTER request with an expiration time (Expires) of zero seconds for a particular Contact or the wildcard Contact designated by a "*" for all registrations. Registrations are matched based on the user, host, port and maddr parameters. The server SHOULD return the current list of registrations in the 200 response as Contact header fields. It is particularly important that REGISTER requests are authenticated since they allow to redirect future requests (see Section 13.2). Beyond its use as a simple location service, this method is needed if there are several SIP servers on a single host. In that case, only one of the servers can use the default port number. Support of this method is RECOMMENDED. 4.3 Request-URI Handley/Schulzrinne/Schooler/Rosenberg [Page 29] Internet Draft SIP December 15, 1998 The Request-URI is a SIP URL as described in Section 2 or a general URI. It indicates the user or service to which this request is being addressed. Unlike the To field, the Request-URI MAY be re-written by proxies. When used as a Request-URI, a SIP-URL MUST NOT contain the transport-param, maddr-param, ttl-param, or headers elements. A server that receives a SIP-URL with these elements removes them before further processing. Typically, the UAC sets the Request-URI and To to the same SIP URL, presumed to remain unchanged over long time periods. However, if the UAC has cached a more direct path to the callee, e.g., from the Contact header field of a response to a previous request, the To would still contain the long-term, "public" address, while the Request-URI would be set to the cached address. Proxy and redirect servers MAY use the information in the Request-URI and request header fields to handle the request and possibly rewrite the Request-URI. For example, a request addressed to the generic address sip:sales@acme.com is proxied to the particular person, e.g., sip:bob@ny.acme.com , with the To field remaining as sip:sales@acme.com. At ny.acme.com , Bob then designates Alice as the temporary substitute. The host part of the Request-URI typically agrees with one of the host names of the receiving server. If it does not, the server SHOULD proxy the request to the address indicated or return a 404 (Not Found) response if it is unwilling or unable to do so. For example, the Request-URI and server host name can disagree in the case of a firewall proxy that handles outgoing calls. This mode of operation is similar to that of HTTP proxies. If a SIP server receives a request with a URI indicating a scheme other than SIP which that server does not understand, the server MUST return a 400 (Bad Request) response. It MUST do this even if the To header field contains a scheme it does understand. This is because proxies are responsible for processing the Request-URI; the To field is of end-to-end significance. 4.3.1 SIP Version Both request and response messages include the version of SIP in use, and follow [H3.1] (with HTTP replaced by SIP, and HTTP/1.1 replaced by SIP/2.0) regarding version ordering, compliance requirements, and upgrading of version numbers. To be compliant with this Handley/Schulzrinne/Schooler/Rosenberg [Page 30] Internet Draft SIP December 15, 1998 specification, applications sending SIP messages MUST include a SIP- Version of "SIP/2.0". 4.4 Option Tags Option tags are unique identifiers used to designate new options in SIP. These tags are used in Require (Section 6.30) and Unsupported (Section 6.38) fields. Syntax: option-tag = token See Section C for a definition of token. The creator of a new SIP option MUST either prefix the option with their reverse domain name or register the new option with the Internet Assigned Numbers Authority (IANA). For example, "com.foo.mynewfeature" is an apt name for a feature whose inventor can be reached at "foo.com". Individual organizations are then responsible for ensuring that option names don't collide. Options registered with IANA have the prefix "org.iana.sip.", options described in RFCs have the prefix "org.ietf.rfc.N", where N is the RFC number. Option tags are case- insensitive. 4.4.1 Registering New Option Tags with IANA When registering a new SIP option, the following information MUST be provided: o Name and description of option. The name MAY be of any length, but SHOULD be no more than twenty characters long. The name MUST consist of alphanum (See Figure 3) characters only; o Indication of who has change control over the option (for example, IETF, ISO, ITU-T, other international standardization bodies, a consortium or a particular company or group of companies); o A reference to a further description, if available, for example (in order of preference) an RFC, a published paper, a patent filing, a technical report, documented source code or a computer manual; o Contact information (postal and email address); Handley/Schulzrinne/Schooler/Rosenberg [Page 31] Internet Draft SIP December 15, 1998 This procedure has been borrowed from RTSP [4] and the RTP AVP [26]. 5 Response After receiving and interpreting a request message, the recipient responds with a SIP response message. The response message format is shown below: Response = Status-Line ; Section 5.1 *( general-header | response-header | entity-header ) CRLF [ message-body ] ; Section 8 SIP's structure of responses is similar to [H6], but is defined explicitly here. 5.1 Status-Line The first line of a Response message is the Status-Line, consisting of the protocol version (Section 4.3.1) followed by a numeric Status-Code and its associated textual phrase, with each element separated by SP characters. No CR or LF is allowed except in the final CRLF sequence. Status-Line = SIP-version SP Status-Code SP Reason-Phrase CRLF 5.1.1 Status Codes and Reason Phrases The Status-Code is a 3-digit integer result code that indicates the outcome of the attempt to understand and satisfy the request. The Reason-Phrase is intended to give a short textual description of the Status-Code. The Status-Code is intended for use by automata, whereas the Reason-Phrase is intended for the human user. The client is not required to examine or display the Reason-Phrase. Status-Code = Informational ;Fig. 5 | Success ;Fig. 5 Handley/Schulzrinne/Schooler/Rosenberg [Page 32] Internet Draft SIP December 15, 1998 | Redirection ;Fig. 6 | Client-Error ;Fig. 7 | Server-Error ;Fig. 8 | Global-Failure ;Fig. 9 | extension-code extension-code = 3DIGIT Reason-Phrase = * We provide an overview of the Status-Code below, and provide full definitions in Section 7. The first digit of the Status-Code defines the class of response. The last two digits do not have any categorization role. SIP/2.0 allows 6 values for the first digit: 1xx: Informational -- request received, continuing to process the request; 2xx: Success -- the action was successfully received, understood, and accepted; 3xx: Redirection -- further action needs to be taken in order to complete the request; 4xx: Client Error -- the request contains bad syntax or cannot be fulfilled at this server; 5xx: Server Error -- the server failed to fulfill an apparently valid request; 6xx: Global Failure -- the request cannot be fulfilled at any server. Figures 5 through 9 present the individual values of the numeric response codes, and an example set of corresponding reason phrases for SIP/2.0. These reason phrases are only recommended; they may be replaced by local equivalents without affecting the protocol. Note that SIP adopts many HTTP/1.1 response codes. SIP/2.0 adds response codes in the range starting at x80 to avoid conflicts with newly defined HTTP response codes, and adds a new class, 6xx, of response codes. SIP response codes are extensible. SIP applications are not required to understand the meaning of all registered response codes, though such understanding is obviously desirable. However, applications MUST understand the class of any response code, as indicated by the first digit, and treat any unrecognized response as being equivalent to the x00 response code of that class, with the exception that an unrecognized response MUST NOT be cached. For example, if a client receives an unrecognized response code of 431, it can safely assume Handley/Schulzrinne/Schooler/Rosenberg [Page 33] Internet Draft SIP December 15, 1998 that there was something wrong with its request and treat the response as if it had received a 400 (Bad Request) response code. In such cases, user agents SHOULD present to the user the message body returned with the response, since that message body is likely to include human-readable information which will explain the unusual status. Informational = "100" ; Trying | "180" ; Ringing | "181" ; Call Is Being Forwarded | "182" ; Queued Success = "200" ; OK Figure 5: Informational and success status codes Redirection = "300" ; Multiple Choices | "301" ; Moved Permanently | "302" ; Moved Temporarily | "303" ; See Other | "305" ; Use Proxy | "380" ; Alternative Service Figure 6: Redirection status codes 6 Header Field Definitions SIP header fields are similar to HTTP header fields in both syntax and semantics. In particular, SIP header fields follow the syntax for message-header as described in [H4.2]. The rules for extending header fields over multiple lines, and use of multiple message-header fields with the same field-name, described in [H4.2] also apply to SIP. The rules in [H4.2] regarding ordering of header fields apply to SIP, with the exception of Via fields, see below, whose order matters. Additionally, header fields which are hop-by-hop MUST appear before any header fields which are end-to-end. Proxies SHOULD NOT reorder Handley/Schulzrinne/Schooler/Rosenberg [Page 34] Internet Draft SIP December 15, 1998 Client-Error = "400" ; Bad Request | "401" ; Unauthorized | "402" ; Payment Required | "403" ; Forbidden | "404" ; Not Found | "405" ; Method Not Allowed | "406" ; Not Acceptable | "407" ; Proxy Authentication Required | "408" ; Request Timeout | "409" ; Conflict | "410" ; Gone | "411" ; Length Required | "413" ; Request Entity Too Large | "414" ; Request-URI Too Large | "415" ; Unsupported Media Type | "420" ; Bad Extension | "480" ; Temporarily not available | "481" ; Call Leg/Transaction Does Not Exist | "482" ; Loop Detected | "483" ; Too Many Hops | "484" ; Address Incomplete | "485" ; Ambiguous | "486" ; Busy Here Figure 7: Client error status codes Server-Error = "500" ; Internal Server Error | "501" ; Not Implemented | "502" ; Bad Gateway | "503" ; Service Unavailable | "504" ; Gateway Time-out | "505" ; SIP Version not supported Figure 8: Server error status codes header fields. Proxies add Via header fields and MAY add other hop- by-hop header fields. They can modify certain header fields, such as Max-Forwards 6.23 and "fix up" the Via header fields with "received" parameters as described in Section 6.40.1. Proxies MUST NOT alter any fields that are authenticated (see Section 13.2). Handley/Schulzrinne/Schooler/Rosenberg [Page 35] Internet Draft SIP December 15, 1998 Global-Failure | "600" ; Busy Everywhere | "603" ; Decline | "604" ; Does not exist anywhere | "606" ; Not Acceptable Figure 9: Global failure status codes The header fields required, optional and not applicable for each method are listed in Table 4 and Table 5. The table uses "o" to indicate optional, "m" mandatory and "-" for not applicable. A "*" indicates that the header fields are needed only if message body is not empty. See sections 6.15, 6.16 and 8 for details. The "where" column describes the request and response types with which the header field can be used. "R" refers to header fields that can be used in requests (that is, request and general header fields). "r" designates a response or general-header field as applicable to all responses, while a list of numeric values indicates the status codes with which the header field can be used. "g" and "e" designate general (Section 6.1) and entity header (Section 6.2) fields, respectively. If a header field is marked "c", it is copied from the request to the response. The "enc." column describes whether this message header field MAY be encrypted end-to-end. A "n" designates fields that MUST NOT be encrypted, while "c" designates fields that SHOULD be encrypted if encryption is used. The "e-e" column has a value of "e" for end-to-end and a value of "h" for hop-by-hop header fields. Other header fields can be added as required; a server MUST ignore header fields not defined in this specification that it does not understand. A proxy MUST NOT remove or modify header fields not defined in this specification that it does not understand. A compact form of these header fields is also defined in Section 9 for use over UDP when the request has to fit into a single packet and size is an issue. Table 6 in Appendix A lists those header fields that different client and server types MUST be able to parse. Handley/Schulzrinne/Schooler/Rosenberg [Page 36] Internet Draft SIP December 15, 1998 where enc. e-e ACK BYE CAN INV OPT REG __________________________________________________________ Accept R e - - - o o o Accept 415 e - - - o o o Accept-Encoding R e - - - o o o Accept-Encoding 415 e - - - o o o Accept-Language R e - o o o o o Accept-Language 415 e - o o o o o Allow 200 e - - - - m - Allow 405 e o o o o o o Authorization R e o o o o o o Call-ID gc n e m m m m m m Contact R e o - - o o o Contact 1xx e - - - o o - Contact 2xx e - - - o o o Contact 3xx e - o - o o o Contact 485 e - o - o o o Content-Encoding e e o - - o o o Content-Length e e o - - o o o Content-Type e e * - - * * * CSeq gc n e m m m m m m Date g e o o o o o o Encryption g n e o o o o o o Expires g e - - - o - o From gc n e m m m m m m Hide R n h o o o o o o Max-Forwards R n e o o o o o o Organization g c h - - - o o o Table 4: Summary of header fields, A--O 6.1 General Header Fields General header fields apply to both request and response messages. The "general-header" field names can be extended reliably only in combination with a change in the protocol version. However, new or experimental header fields MAY be given the semantics of general header fields if all parties in the communication recognize them to be "general-header" fields. Unrecognized header fields are treated as "entity-header" fields. 6.2 Entity Header Fields The "entity-header" fields define meta-information about the message-body or, if no body is present, about the resource identified by the request. The term "entity header" is an HTTP 1.1 term where the response body can contain a transformed version of the message Handley/Schulzrinne/Schooler/Rosenberg [Page 37] Internet Draft SIP December 15, 1998 where enc. e-e ACK BYE CAN INV OPT REG ___________________________________________________________________ Proxy-Authenticate 407 n h o o o o o o Proxy-Authorization R n h o o o o o o Proxy-Require R n h o o o o o o Priority R c e - - - o - - Require R e o o o o o o Retry-After R c e - - - - - o Retry-After 404,480,486 c e o o o o o o 503 c e o o o o o o 600,603 c e o o o o o o Response-Key R c e - o o o o o Record-Route R h o o o o o o Record-Route 2xx h o o o o o o Route R h - o o o o o Server r c e o o o o o o Subject R c e - - - o - - Timestamp g e o o o o o o To gc(1) n e m m m m m m Unsupported 420 e o o o o o o User-Agent g c e o o o o o o Via gc(2) n e m m m m m m Warning r e o o o o o o WWW-Authenticate 401 c e o o o o o o Table 5: Summary of header fields, P--Z; (1): copied with possible addition of tag; (2): UAS removes first Via header field body. The original message body is referred to as the "entity". We retain the same terminology for header fields but usually refer to the "message body" rather then the entity as the two are the same in SIP. 6.3 Request Header Fields The "request-header" fields allow the client to pass additional information about the request, and about the client itself, to the server. These fields act as request modifiers, with semantics equivalent to the parameters of a programming language method invocation. The "request-header" field names can be extended reliably only in combination with a change in the protocol version. However, new or experimental header fields MAY be given the semantics of "request- header" fields if all parties in the communication recognize them to be request-header fields. Unrecognized header fields are treated as "entity-header" fields. Handley/Schulzrinne/Schooler/Rosenberg [Page 38] Internet Draft SIP December 15, 1998 6.4 Response Header Fields The "response-header" fields allow the server to pass additional information about the response which cannot be placed in the Status- Line. These header fields give information about the server and about further access to the resource identified by the Request-URI. Response-header field names can be extended reliably only in combination with a change in the protocol version. However, new or experimental header fields MAY be given the semantics of "response- header" fields if all parties in the communication recognize them to be "response-header" fields. Unrecognized header fields are treated as "entity-header" fields. 6.5 End-to-end and Hop-by-hop Headers End-to-end headers MUST be transmitted unmodified across all proxies, while hop-by-hop headers MAY be modified or added by proxies. 6.6 Header Field Format Header fields ("general-header", "request-header", "response-header", and "entity-header") follow the same generic header format as that given in Section 3.1 of RFC 822 [24]. Each header field consists of a name followed by a colon (":") and the field value. Field names are case-insensitive. The field value MAY be preceded by any amount of leading white space (LWS), though a single space (SP) is preferred. Header fields can be extended over multiple lines by preceding each extra line with at least one SP or horizontal tab (HT). Applications MUST follow HTTP "common form" when generating these constructs, since there might exist some implementations that fail to accept anything beyond the common forms. message-header = field-name ":" [ field-value ] CRLF field-name = token field-value = *( field-content | LWS ) field-content = < the OCTETs making up the field-value and consisting of either *TEXT-UTF8 or combinations of token, tspecials, and quoted-string> The relative order of header fields with different field names is not significant. Multiple header fields with the same field-name may be present in a message if and only if the entire field-value for that header field is defined as a comma-separated list (i.e., #(values)). Handley/Schulzrinne/Schooler/Rosenberg [Page 39] Internet Draft SIP December 15, 1998 It MUST be possible to combine the multiple header fields into one "field-name: field-value" pair, without changing the semantics of the message, by appending each subsequent field-value to the first, each separated by a comma. The order in which header fields with the same field-name are received is therefore significant to the interpretation of the combined field value, and thus a proxy MUST NOT change the order of these field values when a message is forwarded. Field names are not case-sensitive, although their values may be. 6.7 Accept The Accept header follows the syntax defined in [H14.1]. The semantics are also identical, with the exception that if no Accept header is present, the server SHOULD assume a default value of application/sdp. This request-header field is used only with the INVITE, OPTIONS and REGISTER request methods to indicate what media types are acceptable in the response. Example: Accept: application/sdp;level=1, application/x-private, text/html 6.8 Accept-Encoding The Accept-Encoding request-header field is similar to Accept, but restricts the content-codings [H3.4.1] that are acceptable in the response. See [H14.3]. The syntax of this header is defined in [H14.3]. The semantics in SIP are identical to those defined in [H14.3]. 6.9 Accept-Language The Accept-Language header follows the syntax defined in [H14.4]. The rules for ordering the languages based on the q parameter apply to SIP as well. When used in SIP, the Accept-Language request-header field can be used to allow the client to indicate to the server in which language it would prefer to receive reason phrases, session descriptions or status responses carried as message bodies. A proxy MAY use this field to help select the destination for the call, for example, a human operator conversant in a language spoken by the caller. Handley/Schulzrinne/Schooler/Rosenberg [Page 40] Internet Draft SIP December 15, 1998 Example: Accept-Language: da, en-gb;q=0.8, en;q=0.7 6.10 Allow The Allow entity-header field lists the set of methods supported by the resource identified by the Request-URI. The purpose of this field is strictly to inform the recipient of valid methods associated with the resource. An Allow header field MUST be present in a 405 (Method Not Allowed) response and SHOULD be present in an OPTIONS response. Allow = "Allow" ":" 1#Method 6.11 Authorization A user agent that wishes to authenticate itself with a server -- usually, but not necessarily, after receiving a 401 response -- MAY do so by including an Authorization request-header field with the request. The Authorization field value consists of credentials containing the authentication information of the user agent for the realm of the resource being requested. Section 13.2 overviews the use of the Authorization header, and section 15 describes the syntax and semantics when used with PGP based authentication. 6.12 Call-ID The Call-ID general-header field uniquely identifies a particular invitation or all registrations of a particular client. Note that a single multimedia conference can give rise to several calls with different Call-IDs, e.g., if a user invites a single individual several times to the same (long-running) conference. For an INVITE request, a callee user agent server SHOULD NOT alert the user if the user has responded previously to the Call-ID in the INVITE request. If the user is already a member of the conference and the conference parameters contained in the session description have not changed, a callee user agent server MAY silently accept the call, regardless of the Call-ID. An invitation for an existing Call-ID or session can change the parameters of the conference. A client Handley/Schulzrinne/Schooler/Rosenberg [Page 41] Internet Draft SIP December 15, 1998 application MAY decide to simply indicate to the user that the conference parameters have been changed and accept the invitation automatically or it MAY require user confirmation. A user may be invited to the same conference or call using several different Call-IDs. If desired, the client MAY use identifiers within the session description to detect this duplication. For example, SDP contains a session id and version number in the origin (o) field. The REGISTER and OPTIONS methods use the Call-ID value to unambiguously match requests and responses. All REGISTER requests issued by a single client SHOULD use the same Call-ID, at least within the same boot cycle. Since the Call-ID is generated by and for SIP, there is no reason to deal with the complexity of URL-encoding and case-ignoring string comparison. Call-ID = ( "Call-ID" | "i" ) ":" local-id "@" host local-id = 1*uric "host" SHOULD be either a fully qualified domain name or a globally routable IP address. If this is the case, the "local-id" SHOULD be an identifier consisting of URI characters that is unique within "host". Use of cryptographically random identifiers [27] is RECOMMENDED. If, however, host is not an FQDN or globally routable IP address (such as a net 10 address), the local-id MUST be globally unique, as opposed to unique within host. These rules guarantee overall global uniqueness of the Call-ID. The value for Call-ID MUST NOT be reused for a different call. Call-IDs are case-sensitive. Using cryptographically random identifiers provides some protection against session hijacking. Call-ID, To and From are needed to identify a call leg. The distinction between call and call leg matters in calls with third-party control. For systems which have tight bandwidth constraints, many of the mandatory SIP headers have a compact form, as discussed in Section 9. These are alternate names for the headers which occupy less space in the message. In the case of Call-ID, the compact form is i. For example, both of the following are valid: Handley/Schulzrinne/Schooler/Rosenberg [Page 42] Internet Draft SIP December 15, 1998 Call-ID: f81d4fae-7dec-11d0-a765-00a0c91e6bf6@foo.bar.com or i:f81d4fae-7dec-11d0-a765-00a0c91e6bf6@foo.bar.com 6.13 Contact The Contact general-header field can appear in INVITE, ACK, and REGISTER requests, and in 1xx, 2xx, 3xx, and 485 responses. In general, it provides a URL where the user can be reached for further communications. INVITE and ACK requests: INVITE and ACK requests MAY contain Contact headers indicating from which location the request is originating. This allows the callee to send future requests, such as BYE, directly to the caller instead of through a series of proxies. The Via header is not sufficient since the desired address may be that of a proxy. INVITE 2xx responses: A user agent server sending a definitive, positive response (2xx) MAY insert a Contact response header field indicating the SIP address under which it is reachable most directly for future SIP requests, such as ACK, within the same Call-ID. The Contact header field contains the address of the server itself or that of a proxy, e.g., if the host is behind a firewall. The value of this Contact header is copied into the Request-URI of subsequent requests for this call. If the response also contains a Record-Route header field, the address in the Contact header field is added as the last item in the Route header field. See Section 6.29 for details. The Contact value SHOULD NOT be cached across calls, as it may not represent the most desirable location for a particular destination address. INVITE 1xx responses: A UAS sending a provisional response (1xx) MAY insert a Contact response header. It has the same semantics in a 1xx response as a 2xx INVITE response. Note that CANCEL requests MUST NOT be sent to that address, but rather follow the same path as the original request. Handley/Schulzrinne/Schooler/Rosenberg [Page 43] Internet Draft SIP December 15, 1998 REGISTER requests: REGISTER requests MAY contain a Contact header field indicating at which locations the user is reachable. The REGISTER request defines a wildcard Contact field, "*", which MUST only be used with Expires: 0 to remove all registrations for a particular user. An optional "expires" parameter indicates the desired expiration time of the registration. If a Contact entry does not have an "expires" parameter, the Expires header field is used as the default value. If neither of these mechanisms is used, SIP URIs are assumed to expire after one hour. Other URI schemes have no expiration times. REGISTER 2xx responses: A REGISTER response MAY return all locations at which the user is currently reachable. An optional "expires" parameter indicates the expiration time of the registration. If a Contact entry does not have an "expires" parameter, the value of the Expires header field indicates the expiration time. If neither mechanism is used, the expiration time specified in the request, explicitly or by default, is used. 3xx and 485 responses: The Contact response-header field can be used with a 3xx or 485 (Ambiguous) response codes to indicate one or more alternate addresses to try. It can appear in responses to BYE, INVITE and OPTIONS methods. The Contact header field contains URIs giving the new locations or user names to try, or may simply specify additional transport parameters. A 300 (Multiple Choices), 301 (Moved Permanently), 302 (Moved Temporarily) or 485 (Ambiguous) response SHOULD contain a Contact field containing URIs of new addresses to be tried. A 301 or 302 response may also give the same location and username that was being tried but specify additional transport parameters such as a different server or multicast address to try or a change of SIP transport from UDP to TCP or vice versa. The client copies the "user", "password", "host", "port" and "user- param" elements of the Contact URI into the Request-URI of the redirected request and directs the request to the address specified by the "maddr" and "port" parameters, using the transport protocol given in the "transport" parameter. If "maddr" is a multicast address, the value of "ttl" is used as the time-to-live value. Note that the Contact header field MAY also refer to a different entity than the one originally called. For example, a SIP call connected to GSTN gateway may need to deliver a special information announcement such as "The number you have dialed has been changed." A Contact response header field can contain any suitable URI indicating where the called party can be reached, not limited to SIP URLs. For example, it can contain a phone or fax, mailto: (RFC 2368, Handley/Schulzrinne/Schooler/Rosenberg [Page 44] Internet Draft SIP December 15, 1998 [28]) or irc: URL. The following parameters are defined. Additional parameters may be defined in other specifications. q: The "qvalue" indicates the relative preference among the locations given. "qvalue" values are decimal numbers from 0 to 1, with higher values indicating higher preference. action: The "action" parameter is used only when registering with the REGISTER request. It indicates whether the client wishes that the server proxy or redirect future requests intended for the client. If this parameter is not specified the action taken depends on server configuration. In its response, the registrar SHOULD indicate the mode used. This parameter is ignored for other requests. expires: The "expires" parameter indicates how long the URI is valid. The parameter is either a number indicating seconds or a quoted string containing a SIP-date. If this parameter is not provided, the value of the Expires header field determines how long the URI is valid. Implementations MAY treat values larger than 2**32-1 (4294967295 or 136 years) as equivalent to 2**32-1. Contact = ( "Contact" | "m" ) ":" ("*" | (1# ( name-addr | addr-spec ) [ *( ";" contact-params ) ] [ comment ] )) name-addr = [ display-name ] "<" addr-spec ">" addr-spec = SIP-URL | URI display-name = *token | quoted-string contact-params = "q" "=" qvalue | "action" "=" "proxy" | "redirect" | "expires" "=" delta-seconds | <"> SIP-date <"> | extension-attribute extension-attribute = extension-name [ "=" extension-value ] Even if the "display-name" is empty, the "name-addr" form MUST be used if the "addr-spec" contains a comma, semicolon or question mark. The Contact header field fulfills functionality similar to the Location header field in HTTP. However, the HTTP header Handley/Schulzrinne/Schooler/Rosenberg [Page 45] Internet Draft SIP December 15, 1998 only allows one address, unquoted. Since URIs can contain commas and semicolons as reserved characters, they can be mistaken for header or parameter delimiters, respectively. The current syntax corresponds to that for the To and From header, which also allows the use of display names. Example: Contact: "Mr. Watson" ;q=0.7; expires=3600, "Mr. Watson" ;q=0.1 6.14 Content-Encoding Content-Encoding = ( "Content-Encoding" | "e" ) ":" 1#content-coding The Content-Encoding entity-header field is used as a modifier to the "media-type". When present, its value indicates what additional content codings have been applied to the entity-body, and thus what decoding mechanisms MUST be applied in order to obtain the media-type referenced by the Content-Type header field. Content-Encoding is primarily used to allow a body to be compressed without losing the identity of its underlying media type. If multiple encodings have been applied to an entity, the content codings MUST be listed in the order in which they were applied. All content-coding values are case-insensitive. The Internet Assigned Numbers Authority (IANA) acts as a registry for content-coding value tokens. See [3.5] for a definition of the syntax for content-coding. Clients MAY apply content encodings to the body in requests. If the server is not capable of decoding the body, or does not recognize any of the content-coding values, it MUST send a 415 "Unsupported Media Type" response, listing acceptable encodings in the Accept-Encoding header. A server MAY apply content encodings to the bodies in responses. The server MUST only use encodings listed in the Accept- Encoding header in the request. 6.15 Content-Length Handley/Schulzrinne/Schooler/Rosenberg [Page 46] Internet Draft SIP December 15, 1998 The Content-Length entity-header field indicates the size of the message-body, in decimal number of octets, sent to the recipient. Content-Length = ( "Content-Length" | "l" ) ":" 1*DIGIT An example is Content-Length: 3495 Applications SHOULD use this field to indicate the size of the message-body to be transferred, regardless of the media type of the entity. Any Content-Length greater than or equal to zero is a valid value. If no body is present in a message, then the Content-Length header field MUST be set to zero. If a server receives a UDP request without Content-Length, it MUST assume that the request encompasses the remainder of the packet. If a server receives a UDP request with a Content-Length, but the value is larger than the size of the body sent in the request, the client SHOULD generate a 400 class response. If there is additional data in the UDP packet after the last byte of the body has been read, the server MUST treat the remaining data as a separate message. This allows several messages to be placed in a single UDP packet. If a response does not contain a Content-Length, the client assumes that it encompasses the remainder of the UDP packet or the data until the TCP connection is closed, as applicable. Section 8 describes how to determine the length of the message body. 6.16 Content-Type The Content-Type entity-header field indicates the media type of the message-body sent to the recipient. The "media-type" element is defined in [H3.7]. Content-Type = ( "Content-Type" | "c" ) ":" media-type Examples of this header field are Content-Type: application/sdp Content-Type: text/html; charset=ISO-8859-4 Handley/Schulzrinne/Schooler/Rosenberg [Page 47] Internet Draft SIP December 15, 1998 6.17 CSeq Clients MUST add the CSeq (command sequence) general-header field to every request. A CSeq header field in a request contains the request method and a single decimal sequence number chosen by the requesting client, unique within a single value of Call-ID. The sequence number MUST be expressible as a 32-bit unsigned integer. The initial value of the sequence number is arbitrary, but MUST be less than 2**31. Consecutive requests that differ in request method, headers or body, but have the same Call-ID MUST contain strictly monotonically increasing and contiguous sequence numbers; sequence numbers do not wrap around. Retransmissions of the same request carry the same sequence number, but an INVITE with a different message body or different header fields (a "re-invitation") acquires a new, higher sequence number. A server MUST echo the CSeq value from the request in its response. If the Method value is missing in the received CSeq header field, the server fills it in appropriately. The ACK and CANCEL requests MUST contain the same CSeq value as the INVITE request that it refers to, while a BYE request cancelling an invitation MUST have a higher sequence number. A BYE request with a CSeq that is not higher should cause a 400 response to be generated. A user agent server MUST remember the highest sequence number for any INVITE request with the same Call-ID value. The server MUST respond to, and then discard, any INVITE request with a lower sequence number. All requests spawned in a parallel search have the same CSeq value as the request triggering the parallel search. CSeq = "CSeq" ":" 1*DIGIT Method Strictly speaking, CSeq header fields are needed for any SIP request that can be cancelled by a BYE or CANCEL request or where a client can issue several requests for the same Call-ID in close succession. Without a sequence number, the response to an INVITE could be mistaken for the response to the cancellation (BYE or CANCEL). Also, if the network duplicates packets or if an ACK is delayed until the server has sent an additional response, the client could interpret an old response as the response to a re- invitation issued shortly thereafter. Using CSeq also makes it easy for the server to distinguish different versions of Handley/Schulzrinne/Schooler/Rosenberg [Page 48] Internet Draft SIP December 15, 1998 an invitation, without comparing the message body. The Method value allows the client to distinguish the response to an INVITE request from that of a CANCEL response. CANCEL requests can be generated by proxies; if they were to increase the sequence number, it might conflict with a later request issued by the user agent for the same call. With a length of 32 bits, a server could generate, within a single call, one request a second for about 136 years before needing to wrap around. The initial value of the sequence number is chosen so that subsequent requests within the same call will not wrap around. A non-zero initial value allows to use a time-based initial sequence number, if the client desires. A client could, for example, choose the 31 most significant bits of a 32-bit second clock as an initial sequence number. Forked requests MUST have the same CSeq as there would be ambiguity otherwise between these forked requests and later BYE issued by the client user agent. Example: CSeq: 4711 INVITE 6.18 Date Date is a general-header field. Its syntax is: SIP-date = rfc1123-date See [H14.19] for a definition of rfc1123-date. Note that unlike HTTP/1.1, SIP only supports the most recent RFC1123 [29] formatting for dates. The Date header field reflects the time when the request or response is first sent. Thus, retransmissions have the same Date header field value as the original. The Date header field can be used by simple end systems without a battery-backed clock to acquire a notion of Handley/Schulzrinne/Schooler/Rosenberg [Page 49] Internet Draft SIP December 15, 1998 current time. 6.19 Encryption The Encryption general-header field specifies that the content has been encrypted. Section 13 describes the overall SIP security architecture and algorithms. This header field is intended for end- to-end encryption of requests and responses. Requests are encrypted based on the public key belonging to the entity named in the To header field. Responses are encrypted based on the public key conveyed in the Response-Key header field. Note that the public keys themselves may not be used for the encryption. This depends on the particular algorithms used. For any encrypted message, at least the message body and possibly other message header fields are encrypted. An application receiving a request or response containing an Encryption header field decrypts the body and then concatenates the plaintext to the request line and headers of the original message. Message headers in the decrypted part completely replace those with the same field name in the plaintext part. (Note: If only the body of the message is to be encrypted, the body has to be prefixed with CRLF to allow proper concatenation.) Note that the request method and Request-URI cannot be encrypted. Encryption only provides privacy; the recipient has no guarantee that the request or response came from the party listed in the From message header, only that the sender used the recipient's public key. However, proxies will not be able to modify the request or response. Encryption = "Encryption" ":" encryption-scheme 1*SP #encryption-params encryption-scheme = token encryption-params = token "=" ( token | quoted-string ) The token indicates the form of encryption used; it is described in section 13. The example in Figure 10 shows a message encrypted with ASCII-armored PGP that was generated by applying "pgp -ea" to the payload to be encrypted. Handley/Schulzrinne/Schooler/Rosenberg [Page 50] Internet Draft SIP December 15, 1998 INVITE sip:watson@boston.bell-telephone.com SIP/2.0 Via: SIP/2.0/UDP 169.130.12.5 From: To: T. A. Watson Call-ID: 187602141351@worcester.bell-telephone.com Content-Length: 885 Encryption: PGP version=2.6.2,encoding=ascii hQEMAxkp5GPd+j5xAQf/ZDIfGD/PDOM1wayvwdQAKgGgjmZWe+MTy9NEX8O25Red h0/pyrd/+DV5C2BYs7yzSOSXaj1C/tTK/4do6rtjhP8QA3vbDdVdaFciwEVAcuXs ODxlNAVqyDi1RqFC28BJIvQ5KfEkPuACKTK7WlRSBc7vNPEA3nyqZGBTwhxRSbIR RuFEsHSVojdCam4htcqxGnFwD9sksqs6LIyCFaiTAhWtwcCaN437G7mUYzy2KLcA zPVGq1VQg83b99zPzIxRdlZ+K7+bAnu8Rtu+ohOCMLV3TPXbyp+err1YiThCZHIu X9dOVj3CMjCP66RSHa/ea0wYTRRNYA/G+kdP8DSUcqYAAAE/hZPX6nFIqk7AVnf6 IpWHUPTelNUJpzUp5Ou+q/5P7ZAsn+cSAuF2YWtVjCf+SQmBR13p2EYYWHoxlA2/ GgKADYe4M3JSwOtqwU8zUJF3FIfk7vsxmSqtUQrRQaiIhqNyG7KxJt4YjWnEjF5E WUIPhvyGFMJaeQXIyGRYZAYvKKklyAJcm29zLACxU5alX4M25lHQd9FR9Zmq6Jed wbWvia6cAIfsvlZ9JGocmQYF7pcuz5pnczqP+/yvRqFJtDGD/v3s++G2R+ViVYJO z/lxGUZaM4IWBCf+4DUjNanZM0oxAE28NjaIZ0rrldDQmO8V9FtPKdHxkqA5iJP+ 6vGOFti1Ak4kmEz0vM/Nsv7kkubTFhRl05OiJIGr9S1UhenlZv9l6RuXsOY/EwH2 z8X9N4MhMyXEVuC9rt8/AUhmVQ== =bOW+ Figure 10: PGP Encryption Example Since proxies can base their forwarding decision on any combination of SIP header fields, there is no guarantee that an encrypted request "hiding" header fields will reach the same destination as an otherwise identical un-encrypted request. 6.20 Expires The Expires entity-header field gives the date and time after which the message content expires. This header field is currently defined only for the REGISTER and INVITE methods. For REGISTER, it is a request and response-header field. In a REGISTER request, the client indicates how long it wishes the registration to be valid. In the response, the server indicates the earliest expiration time of all registrations. The server MAY choose a shorter time interval than that requested by the client, but SHOULD NOT choose a longer one. For INVITE requests, it is a request and response-header field. In a request, the caller can limit the validity of an invitation, for Handley/Schulzrinne/Schooler/Rosenberg [Page 51] Internet Draft SIP December 15, 1998 example, if a client wants to limit the time duration of a search or a conference invitation. A user interface MAY take this as a hint to leave the invitation window on the screen even if the user is not currently at the workstation. This also limits the duration of a search. If the request expires before the search completes, the proxy returns a 408 (Request Timeout) status. In a 302 (Moved Temporarily) response, a server can advise the client of the maximal duration of the redirection. The value of this field can be either a SIP-date or an integer number of seconds (in decimal), measured from the receipt of the request. The latter approach is preferable for short durations, as it does not depend on clients and servers sharing a synchronized clock. Implementations MAY treat values larger than 2**32-1 (4294967295 or 136 years) as equivalent to 2**32-1. Expires = "Expires" ":" ( SIP-date | delta-seconds ) Two examples of its use are Expires: Thu, 01 Dec 1994 16:00:00 GMT Expires: 5 6.21 From Requests and responses MUST contain a From general-header field, indicating the initiator of the request. The From field MAY contain the "tag" parameter. The server copies the From header field from the request to the response. The optional "display-name" is meant to be rendered by a human-user interface. A system SHOULD use the display name "Anonymous" if the identity of the client is to remain hidden. The SIP-URL MUST NOT contain the "transport-param", "maddr-param", "ttl-param", or "headers" elements. A server that receives a SIP-URL with these elements removes them before further processing. Even if the "display-name" is empty, the "name-addr" form MUST be used if the "addr-spec" contains a comma, question mark, or semicolon. From = ( "From" | "f" ) ":" ( name-addr | addr-spec ) Handley/Schulzrinne/Schooler/Rosenberg [Page 52] Internet Draft SIP December 15, 1998 *( ";" addr-params ) addr-params = tag-param tag-param = "tag=" UUID UUID = 1*( hex | "-" ) Examples: From: "A. G. Bell" From: sip:+12125551212@server.phone2net.com From: Anonymous The "tag" MAY appear in the From field of a request. It MUST be present when it is possible that two instances of a user sharing a SIP address can make call invitations with the same Call-ID. The "tag" value MUST be globally unique and cryptographically random with at least 32 bits of randomness. A single user maintains the same tag throughout the call identified by the Call-ID. Call-ID, To and From are needed to identify a call leg. The distinction between call and call leg matters in calls with multiple responses to a forked request. The format is similar to the equivalent RFC 822 [24] header, but with a URI instead of just an email address. 6.22 Hide A client uses the Hide request header field to indicate that it wants the path comprised of the Via header fields (Section 6.40) to be hidden from subsequent proxies and user agents. It can take two forms: Hide: route and Hide: hop. Hide header fields are typically added by the client user agent, but MAY be added by any proxy along the path. If a request contains the "Hide: route" header field, all following proxies SHOULD hide their previous hop. If a request contains the "Hide: hop" header field, only the next proxy SHOULD hide the previous hop and then remove the Hide option unless it also wants to remain anonymous. A server hides the previous hop by encrypting the "host" and "port" parts of the top-most Via header field with an algorithm of its choice. Servers SHOULD add additional "salt" to the "host" and "port" Handley/Schulzrinne/Schooler/Rosenberg [Page 53] Internet Draft SIP December 15, 1998 information prior to encryption to prevent malicious downstream proxies from guessing earlier parts of the path based on seeing identical encrypted Via headers. Hidden Via fields are marked with the "hidden" Via option, as described in Section 6.40. A server that is capable of hiding Via headers MUST attempt to decrypt all Via headers marked as "hidden" to perform loop detection. Servers that are not capable of hiding can ignore hidden Via fields in their loop detection algorithm. If hidden headers were not marked, a proxy would have to decrypt all headers to detect loops, just in case one was encrypted, as the Hide: Hop option may have been removed along the way. A host MUST NOT add such a "Hide: hop" header field unless it can guarantee it will only send a request for this destination to the same next hop. The reason for this is that it is possible that the request will loop back through this same hop from a downstream proxy. The loop will be detected by the next hop if the choice of next hop is fixed, but could loop an arbitrary number of times otherwise. A client requesting "Hide: route" can only rely on keeping the request path private if it sends the request to a trusted proxy. Hiding the route of a SIP request is of limited value if the request results in data packets being exchanged directly between the calling and called user agent. The use of Hide header fields is discouraged unless path privacy is truly needed; Hide fields impose extra processing costs and restrictions for proxies and can cause requests to generate 482 (Loop Detected) responses that could otherwise be avoided. The encryption of Via header fields is described in more detail in Section 13. The Hide header field has the following syntax: Hide = "Hide" ":" ( "route" | "hop" ) 6.23 Max-Forwards The Max-Forwards request-header field may be used with any SIP method to limit the number of proxies or gateways that can forward the request to the next downstream server. This can also be useful when Handley/Schulzrinne/Schooler/Rosenberg [Page 54] Internet Draft SIP December 15, 1998 the client is attempting to trace a request chain which appears to be failing or looping in mid-chain. Max-Forwards = "Max-Forwards" ":" 1*DIGIT The Max-Forwards value is a decimal integer indicating the remaining number of times this request message is allowed to be forwarded. Each proxy or gateway recipient of a request containing a Max- Forwards header field MUST check and update its value prior to forwarding the request. If the received value is zero (0), the recipient MUST NOT forward the request. Instead, for the OPTIONS and REGISTER methods, it MUST respond as the final recipient. For all other methods, the server returns 483 (Too many hops). If the received Max-Forwards value is greater than zero, then the forwarded message MUST contain an updated Max-Forwards field with a value decremented by one (1). Example: Max-Forwards: 6 6.24 Organization The Organization general-header field conveys the name of the organization to which the entity issuing the request or response belongs. It MAY also be inserted by proxies at the boun