SIPPING Working Group                                            V. Hilt
Internet-Draft                             Bell Labs/Lucent Technologies
Intended status: Informational                                  D. Malas
Expires: April 18, 2007                           Level 3 Communications
                                                              I. Widjaja
                                           Bell Labs/Lucent Technologies
                                                             R. Terpstra
                                                  Level 3 Communications
                                                        October 15, 2006

          Session Initiation Protocol (SIP) Overload Control
                    draft-hilt-sipping-overload-00

Status of this Memo

   By submitting this Internet-Draft, each author represents that any applicable patent or other IPR claims of which he or she is aware have been or will be disclosed, and any of which he or she becomes aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups.  Note that other groups may also distribute working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time.  It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on April 18, 2007.

Copyright Notice

   Copyright (C) The Internet Society (2006).

Abstract

   Overload occurs in Session Initiation Protocol (SIP) networks when SIP servers have insufficient resources to handle all SIP messages they receive.  Even though the SIP protocol provides a limited overload control mechanism through its 503 response code, SIP servers are still vulnerable to overload.  This specification defines a new SIP overload control mechanism that protects SIP servers against overload.

Table of Contents

   1.  Introduction
   2.  Terminology
   3.  Design Considerations
     3.1.  Control Model
     3.2.  Applying the Overload Control Loop
     3.3.  Load Feedback
     3.4.  SIP Mechanism
     3.5.  Backwards Compatibility
   4.  SIP Application Considerations
     4.1.  How to Calculate Load Levels
     4.2.  Responding to a Load Level Message
     4.3.  Emergency Services Requests
     4.4.  Downstream Server Failures
     4.5.  B2BUAs (Back-to-Back User Agents)
     4.6.  Operations and Management
   5.  SIP Load Header
     5.1.  Generating the Load Header
     5.2.  Determining the Load Header Value
     5.3.  Determining the Throttle Parameter Value
     5.4.  Processing the Load Header
     5.5.  Using the Load Header Value
     5.6.  Using the Throttle Parameter Value
     5.7.  503 Responses
   6.  Syntax
   7.  Security Considerations
   8.  IANA Considerations
   Appendix A.  Acknowledgements
   9.  References
     9.1.  Normative References
     9.2.  Informative References
   Authors' Addresses
   Intellectual Property and Copyright Statements

1.  Introduction

   A Session Initiation Protocol (SIP) [2] server can be overloaded.  Overload occurs when a SIP server has insufficient resources to process all SIP requests and responses it receives.  SIP server overload poses a serious problem for a SIP network.  During periods of overload, the message processing capacity of a SIP network can be significantly degraded.  In particular, SIP server overload may lead to a situation in which the actual message throughput of a SIP network drops to a small fraction of its original capacity.  This is called congestion collapse.

   The SIP protocol provides a limited mechanism for overload control through its 503 response code.  However, it has been shown that this mechanism often cannot prevent SIP server overload and it cannot prevent congestion collapse in SIP networks.  In fact, the 503 response code mechanism may cause traffic to oscillate between SIP servers and thereby worsen an overload condition.  A detailed discussion of the SIP overload problem, the 503 response code and the requirements for a SIP overload control solution can be found in [5].

   This specification defines a new mechanism for SIP overload control.  The idea of this mechanism is to introduce a feedback control loop between a SIP server and its upstream neighbors to regulate the load that is forwarded to a server.  With this mechanism, a SIP server sends load feedback to its upstream neighbors.  Upstream neighbors use this information to adjust the load they forward downstream and, in particular, to throttle the load if a downstream neighbor reaches an overload condition.

   Overload occurs if a SIP server does not have sufficient resources to process all incoming SIP messages.  These resources may include CPU processing capacity, memory, I/O, or disk resources.  In some cases, they can also include external resources, such as a database or a DNS server.  Generally speaking, overload occurs in all cases in which a SIP server is in danger of not being able to process or respond to incoming SIP messages.

   Overload does not occur in, and hence overload control is not suitable for, conditions in which a SIP server is able to process incoming SIP requests (e.g., by rejecting them with an appropriate response code) but does not have the resources to complete the requests successfully.  These cases are covered by the respective response codes defined in other specifications.  Overload control MUST NOT be used in these scenarios.  For example, a PSTN gateway that runs out of trunk lines but still has plenty of capacity to process SIP messages should reject incoming INVITEs using a 488 (Not Acceptable Here) response [4].  Similarly, a SIP registrar that has lost connectivity to its registration database but is still capable of processing SIP messages should reject REGISTER requests with a 500 (Server Error) response [2].
   This specification is structured as follows: Section 3 discusses general design principles of a SIP overload control mechanism.  Section 4 discusses general considerations for applying SIP overload control.  Section 5 defines a SIP protocol extension for overload control and Section 6 introduces the syntax of this extension.  Section 7 and Section 8 discuss security and IANA considerations, respectively.

2.  Terminology

   In this document, the key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" are to be interpreted as described in BCP 14, RFC 2119 [1] and indicate requirement levels for compliant implementations.

3.  Design Considerations

   This section discusses some of the key design considerations for a SIP overload control mechanism.

3.1.  Control Model

   The following model for SIP overload control is based on a closed-loop control system and consists of the following functional components (see Figure 1 and Figure 2):

   o  Monitor: component that monitors the current load of the SIP processor on the receiving entity.  The monitor implements the mechanisms needed to measure the current usage of resources relevant for the SIP processor.  It reports load samples (S) to the Control Function.

   o  Control Function: component that implements the overload control algorithm, which decides whether actions need to be taken based on the current load.  The control function uses the load samples (S) provided by the monitor and determines if overload control commands (C) need to be issued to adjust the load sent to the SIP processor on the receiving entity.  By generating control commands, the control function can impact the load forwarded from the sending entity to the receiving entity.  In particular, it may throttle this load if necessary to prevent overload in the receiving entity.

   o  Actuator: component that acts on the control commands (C) it receives from the control function.  The control function relies on the actuator to properly execute control commands on the traffic forwarded from the sending to the receiving entity.  For example, a control command may instruct the actuator to shed 10% of the load destined to the receiving entity.  The actuator decides how the load reduction is achieved (e.g., by redirecting or rejecting requests).

   o  SIP Processor: component that processes SIP messages.  The SIP processor is the component that is protected by overload control.

   Three models for SIP overload control can be defined based on these components.  All models implement a closed-loop controller; however, they differ in the way the control system components are distributed across the entities that participate in overload control.  All models can be chained, i.e., a server can have the role of a receiving entity in the control loop with its upstream neighbor and act as a sending entity in another control loop with a downstream neighbor.

   In the first model, the control function is located on the receiving entity (see Figure 1).  Thus, overload control algorithms are implemented in the receiving entity.  The feedback sent from the receiving to the sending entity consists of control commands (C).  Since the control function is on the receiving entity, it is the receiving entity that decides how much traffic it wants to receive from a sending entity.  The receiving entity sends control commands to possibly multiple upstream neighbors.  All upstream neighbors of a receiving entity are therefore governed by the same control algorithm on the receiving entity.  This control algorithm may consider local policies when creating a control command for an upstream neighbor.  Since load samples are not conveyed to upstream neighbors in this model, this information is not available for load balancing and target selection in upstream neighbors.
   [Figure 1 is an ASCII diagram showing the sending entity (Server A) and the receiving entity (Server B).  On Server B, the Monitor observes the SIP Processor and reports load samples (S) to the Control Function, also located on Server B.  The Control Function sends control commands (C) to the Actuator on Server A, which acts on the SIP traffic forwarded to the SIP Processor on Server B.]

                Figure 1: Control Function on Receiving Entity

   In the second alternative, the control function is located on the sending entity (see Figure 2).  In this model, load samples (S) are reported from the receiving to the sending entity.  The sending entity decides how much traffic it forwards to the receiving entity based on the received load information.  The receiving entity can only indirectly impact the load it receives by adjusting the load samples (S) it is reporting.  In this model, each sending entity can implement its own overload control algorithm to generate control commands.  Since load information is available in the sending entity, the sending entity can use information from various downstream neighbors when generating control commands.

   [Figure 2 is an ASCII diagram showing the same two servers.  The Monitor on Server B (receiving entity) reports load samples (S) across to the Control Function on Server A (sending entity), which issues control commands (C) to the Actuator on Server A acting on the SIP traffic forwarded to Server B.]

                Figure 2: Control Function on Sending Entity

   Yet a third alternative is a variant of the first model.  In this model, the control function is located on the receiving entity.  However, in addition to control commands (C), the receiving entity also reports load samples (S) to the sending entity.  In this model, the traffic forwarded to the receiving entity is controlled by the receiving entity as in the first model.  However, the additional load information assists the sending entity in balancing load and finding under-utilized servers.

   OPEN ISSUE: Which model is preferable?  Having the control function on the receiving entity has the advantage that the receiving entity can implement overload control algorithms that can consider the specifics of this system (e.g., determine how early load needs to be reduced and which load values are critical for the system) and can apply throttling individually to each upstream neighbor.  Control commands need to be well defined so that they can be used unambiguously by the receiving and sending entity.  It also seems preferable to have the load status in the sending entity, which makes the third model attractive.
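   The following non-normative sketch (in Python; all class names, method names, and the throttling policy are illustrative assumptions, not defined by this specification) shows how the four functional components could be wired together in the first model, with the monitor and control function on the receiving entity and the actuator on the sending entity.

      # Non-normative sketch of the Section 3.1 components (first model:
      # control function on the receiving entity).  All names are illustrative.

      class Monitor:
          """Measures resource usage of the receiving entity's SIP processor."""
          def sample(self) -> int:
              # Return a load sample S in the range 0..100 (0 = idle).
              return 0  # placeholder measurement

      class ControlFunction:
          """Turns load samples S into control commands C (here: a throttle %)."""
          def command(self, sample: int) -> int:
              # Illustrative policy: start throttling above 80% load.
              return max(0, sample - 80) * 5  # 80 -> 0%, 100 -> 100%

      class Actuator:
          """Runs on the sending entity; sheds the requested share of traffic."""
          def __init__(self):
              self.throttle = 0
          def apply(self, command: int) -> None:
              self.throttle = command  # percentage of requests to divert/reject

      # Receiving entity (Server B): monitor + control function.
      monitor, control = Monitor(), ControlFunction()
      # Sending entity (Server A): actuator acting on forwarded traffic.
      actuator = Actuator()

      # One iteration of the closed loop: S flows to the control function,
      # C flows back to the actuator on the sending entity.
      actuator.apply(control.command(monitor.sample()))

   In the second model, the ControlFunction instance would simply live on the sending entity and consume the samples reported by the receiving entity's Monitor; the components themselves are unchanged.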
3.2.  Applying the Overload Control Loop

   A typical SIP request is processed by more than one SIP server.  Thus, the question arises as to how the overload control model is applied to multiple servers along the path of a SIP request.  In principle, the overload control loop can be applied hop-by-hop (i.e., pairwise between the servers on a path) or as one single control loop that stretches across the entire path from UAC to UAS.  The two alternatives are illustrated in Figure 3.

   [Figure 3 is an ASCII diagram contrasting the two alternatives: (a) hop-by-hop loop, in which each pair of neighboring servers (A-B, B-C, B-D) runs its own feedback loop, and (b) full path loop, in which a single feedback loop stretches from the terminating servers back to server A across the entire path.  Legend: ==> SIP request flow, <-- load feedback loop.]

                       Figure 3: Overload Control Loop

   The idea of hop-by-hop overload control is to have an individual feedback control loop between two neighboring SIP servers.  Each SIP server reports feedback to its upstream neighbors, i.e., the entities it receives SIP requests from.  The upstream neighbors adjust the amount of traffic they are forwarding to the SIP server based on this feedback.  A SIP server that receives traffic from multiple upstream neighbors has a separate feedback loop with each of the upstream neighbors.  The server can use local policies to determine how feedback is balanced across upstream neighbors.  For example, the server can treat all neighbors equally and provide the same feedback to all of them (e.g., ask all upstream neighbors to throttle traffic by 10% when load builds up).  It may also prefer some servers over others.  For example, it may throttle a less preferred upstream neighbor earlier than a preferred neighbor, or it may first throttle the neighbor that sends the most traffic.  In any case, a SIP server needs to provide feedback such that it does not get overloaded as the load increases.

   The upstream neighbors of a SIP server don't forward the load status they receive further upstream, since they can act on this information and resolve the overload condition if needed (e.g., by re-routing or rejecting traffic).  The upstream neighbor, of course, should report its own load status to its upstream neighbors.  If the upstream neighbor becomes congested itself, its upstream neighbors can take action based on the reported feedback and resolve the situation.  Thus, overload of a SIP server is resolved by its direct neighbors without the need to involve entities that are located multiple SIP hops away.

   Hop-by-hop overload control can effectively reduce the impact of overload on a SIP network and, in particular, can avoid signaling congestion collapse.  This is achieved by enabling SIP entities to gradually lower the amount of traffic they receive.  Thus, a server that reaches a high level of load can effectively offload the task of redirecting and rejecting messages to its upstream neighbors.  The approach of hop-by-hop overload control is simple and scales well to networks with many SIP entities.  It does not require a SIP entity to aggregate a large number of load status values or keep track of the load status of SIP servers it is not communicating with.  A SIP entity only needs to observe the load status of the downstream neighbors it is currently forwarding traffic to.
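   As an illustration of the per-neighbor feedback and local policies described above, the following non-normative sketch (Python; the threshold, weights, and function names are illustrative assumptions) keeps one feedback value per upstream neighbor and throttles less preferred neighbors more aggressively.

      # Non-normative sketch: a receiving server computes per-neighbor
      # throttle feedback.  Neighbor preferences are a local policy choice.

      def per_neighbor_throttle(load: int, neighbors: dict) -> dict:
          """load: own load status (0..100).
          neighbors: {neighbor_uri: preference_weight}, higher = more preferred.
          Returns {neighbor_uri: throttle_percent} to advertise to each neighbor."""
          if load <= 80:                      # no throttling below 80% load
              return {n: 0 for n in neighbors}
          base = (load - 80) * 5              # 80 -> 0%, 100 -> 100%
          feedback = {}
          for uri, weight in neighbors.items():
              # Less preferred neighbors (lower weight) are throttled harder.
              feedback[uri] = min(100, round(base / max(weight, 1)))
          return feedback

      # Example: prefer p1 (weight 2) over p2 (weight 1) when load is 90%.
      print(per_neighbor_throttle(90, {"sip:p1.example.com": 2,
                                       "sip:p2.example.com": 1}))
      # -> {'sip:p1.example.com': 25, 'sip:p2.example.com': 50}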
   The main goal of full path overload control is to provide overload control for a full path, from UAC to UAS.  Full path overload control considers load information from all SIP servers (including all proxies and the UAS).  A full path overload control mechanism has to be able to frequently collect the load status of all servers on the potential path of a SIP request and combine this data into meaningful load feedback.  The main problem of full path overload control is its inherent complexity.  A UAC or SIP server would have to monitor all potential paths it may use for SIP requests.

   OPEN ISSUE: which approach is preferable?  Hop-by-hop overload control seems to be much simpler and more manageable and does achieve the goal of avoiding overload in SIP servers.

3.3.  Load Feedback

   A SIP server frequently provides load feedback to its upstream neighbors.  Load feedback reports consist of two parts: i) the control commands (if any) and ii) the current load status value.  In addition, load feedback reports have an expiration time so that they can eventually be timed out.

   Control commands in a load feedback report instruct the actuator on the upstream neighbor to modify the traffic forwarded to the receiving SIP server.  The following control commands are defined:

   o  Throttle X%: instructs the actuator to reduce the amount of traffic it would normally forward to the receiving SIP server by X percent.  This requires that the actuator diverts or rejects X percent of the traffic destined to the receiving server.

   OPEN ISSUE: the throttle X% command has the advantage that it is simple.  It does not set a fixed boundary for the load.  E.g., if the offered load increases significantly (say by 20%), the forwarded load will increase even if the downstream server has sent a throttle 10% command.  Other possibilities for control commands are max. messages per second, message gapping, etc.  Also, there might be other commands that are useful.  Generally, the idea is to define a few well-defined commands that the receiving entity can send to the sending entity and for which it can expect a given, well-defined behavior.

   OPEN ISSUE: it probably makes sense to define a default control function algorithm that generates these control commands based on load numbers.

   The load status value is used to indicate to which degree the resources needed by a SIP server to process SIP messages are utilized.  Load status values range from 0 to 100.  A value of 0 indicates that the server is completely idle; a value of 100 indicates that all resources are fully utilized and the server is overloaded.  The algorithm used to determine the load status of a SIP server is specific to the type of resources needed by the server to process SIP messages.  Different SIP servers may have different resource requirements and constraints and therefore may use different algorithms to compute the load status value.  A common mechanism is to use the processor utilization to derive the load status.  However, other metrics such as memory usage or queue length may also be used.

   OPEN ISSUE: it might make sense to define a default algorithm to determine load (e.g., based on queue length or processor load).
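   A default algorithm of the kind mentioned in the open issue above could, for example, combine processor utilization and queue length.  The sketch below is non-normative; the metrics chosen, the queue limit, and the function name are illustrative assumptions rather than part of this specification.

      # Non-normative sketch: derive a 0..100 load status value from the
      # resources a server actually depends on.  Thresholds are illustrative.

      def load_status(cpu_util: float, queue_len: int, queue_limit: int) -> int:
          """cpu_util: processor utilization 0.0..1.0
          queue_len/queue_limit: current and maximum SIP message queue depth."""
          cpu_load = cpu_util * 100
          queue_load = (queue_len / queue_limit) * 100 if queue_limit else 0
          # The more constrained resource determines the reported load.
          return min(100, round(max(cpu_load, queue_load)))

      # Example: 65% CPU but a nearly full message queue -> load status 90.
      print(load_status(cpu_util=0.65, queue_len=900, queue_limit=1000))  # 90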
   The load conditions on a SIP server vary over time.  A load status report is therefore only valid for a limited amount of time.  The expiration time is part of the load report and is set by the reporting SIP server.  The rationale behind this is that the reporting SIP server can best determine how stable the current load conditions are (e.g., based on local history) and for how long the reported load can provide a good estimate for the actual load conditions.  Once a load status report has reached its expiration time, it may have become inaccurate since the load on the server may have changed.  The longer a load status report has been expired, the greater the uncertainty about the accuracy of its values.

   A SIP entity that has an expired load status report could discard it right after the expiration time.  However, this would instantly deactivate the control commands contained in this report, e.g., the throttling of load to the receiving SIP server.  This is clearly undesirable, in particular if the receiving server is currently under heavy load and had set a high throttling value.  An alternative approach is to fade out the effects of an expired load status report by gradually decreasing the throttle value over time until it reaches 0 and the amount of traffic that is forwarded is back to 100%.  Effectively, this implements a slow start mechanism.  Similarly, the load status value in a report can be decreased over time until it reaches 0, indicating that the server is not utilized at all.  If the downstream server is still overloaded, it can create a new status report with up-to-date values.  If the server has failed, this will be detected on the transport or SIP level.  Obviously, no traffic should be forwarded to a failed SIP server.

   Since SIP servers can use load status reports to continuously advertise the current level of load to upstream neighbors, this mechanism does not have the on-off semantics that can lead to traffic oscillation.  In fact, SIP proxies can use the load status information to balance load between alternative proxies.  Thus, this mechanism can help to evenly load downstream proxies, making best use of available resources.  However, this mechanism is not intended to replace specialized load balancing mechanisms.

3.4.  SIP Mechanism

   A SIP mechanism is needed to convey load status reports from the receiving to the sending SIP entity.  In principle, it would be possible to define a new SIP request that can be used to convey load status reports from the receiving to the sending entity.  However, sending separate load status requests from the receiving to the sending entity would create additional messaging overhead, which is undesirable during periods of overload.  It would also require each SIP server to keep track of all potential upstream neighbors.

   Another approach is therefore to define a new SIP header field for load information that can be inserted into SIP responses.  This approach has the advantage that it provides load feedback to all upstream SIP entities that are currently forwarding traffic to a SIP server with very little overhead.

   Piggybacking load information in SIP responses is problematic in scenarios where a SIP server receives single requests from many different upstream neighbors.  An edge proxy that communicates with many SIP endpoints is an example of such a SIP server.
   Since each endpoint only sends a single request, it can't decrease load by throttling future requests.  However, an edge proxy can reduce its load by rejecting a percentage of the requests it receives with a 503 response code and asking the endpoint to stop sending messages for some time.  The proxy can send 503 only to selected senders and therefore gradually reduce the amount of traffic it receives.  In this particular scenario, the use of 503 responses does not lead to traffic oscillation and can be used instead of an overload control mechanism that gradually throttles the load forwarded by a SIP server.

   Hop-by-hop overload control requires that the distribution of load status reports is limited to the next upstream SIP server.  This is achieved by adding the address of the next hop server (i.e., the destination of the load status report) to the Load header.

3.5.  Backwards Compatibility

   An important requirement for an overload control mechanism is that it can be gradually introduced into a network and that it functions properly if only a fraction of the servers support it.

   Hop-by-hop overload control does not require that all SIP entities in a network support it.  It can be used effectively between two adjacent SIP servers if both servers support this extension and does not depend on support from any other server or user agent.  The more SIP servers in a network support this mechanism, the more effective it is, since it includes more of the servers in the load reporting and offloading process.

   A SIP server may have multiple neighbors of which only some support overload control.  If this server issued a throttling command to all upstream neighbors, only those that support overload control would throttle their load.  Others would keep sending at the full rate and benefit from the throttling by the servers supporting this extension.  In other words, upstream neighbors that do not support overload control would be better off than those that do.  A SIP server should therefore fall back to the use of 503 responses towards upstream neighbors that do not support overload control.  A SIP server can issue throttling commands to overload control-enabled neighbors and 503 responses to all other neighbors.  This way, servers that support overload control are better off than servers that don't.

   OPEN ISSUE: how should an upstream neighbor indicate that it supports overload control?

4.  SIP Application Considerations

4.1.  How to Calculate Load Levels

   Calculating an element's load level is dependent on the limiting resource for the element.  There can be several contributing factors, such as CPU, memory, queue depth, calls per second, application threads, etc.  However, the underlying result is that the element has reached a limit such that it cannot reliably process any more calls.  If the element knows what its limiting resource is, it should be able to report different levels of load based on this resource.  In some instances, it could be a combination of resources.  Regardless, if the element knows what that resource is and the limit at which the element becomes 100% overloaded, then the element should also be able to recognize percentages of load prior to hitting 100%.  The element should have configuration parameters that map the percent of load to a defined load value as described by this document.  The element can then report its load level in SIP responses.
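   The configuration-driven mapping described above might look like the following non-normative sketch (Python); the breakpoint table, its values, and the function name are illustrative assumptions and not part of this specification.

      # Non-normative sketch: map utilization of the limiting resource onto
      # the reported load value via operator-configurable breakpoints.

      # Configured (utilization%, reported load value) breakpoints.
      LOAD_MAP = [(50, 0), (70, 40), (85, 60), (95, 80), (100, 100)]

      def reported_load(limiting_resource_util: float) -> int:
          """limiting_resource_util: 0..100 percent of the limiting resource."""
          for threshold, value in LOAD_MAP:
              if limiting_resource_util <= threshold:
                  return value
          return 100

      print(reported_load(72))   # -> 60: between the 70% and 85% breakpoints
      print(reported_load(100))  # -> 100: element is fully overloaded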
   In some instances, the element itself is not overloaded, but one of its resources is, such as a trunk group on a Media Gateway Controller (MGC) or a next-hop IP address to which a proxy sends.  In this case, the element may still want to report the load on the resource, so that the sending element may be able to limit the sending of future requests to that same resource until that resource alleviates its overloaded state.

4.2.  Responding to a Load Level Message

   When an element receives a load header with a value other than 20, the receiving element should use an algorithm to try to reduce the amount of future traffic it sends to the overloaded element.  As an alternative, control commands received by an element may determine that traffic sent to the overloaded element needs to be reduced.

   If the egress element is overloaded, the ingress element can start to reduce the load to the overloaded element.  Depending on the network configuration, this may result in sending a percentage of future calls that would have gone to the overloaded element to a different destination.  In other instances, reducing load means rejecting a percentage of calls from coming into the network.  If a resource on the egress element is overloaded, then the ingress element may alter its load-balancing algorithm to lower the percentage of calls it will offer to the overloaded resource.  This may be important, as there may be multiple egress points for reaching this same overloaded resource.

   The algorithm to reduce load is open to element and vendor implementation.  (Note: the load reduction algorithm does not specify the quantity of SIP messages or calls allowed by the proxy server; it simply specifies a treatment of new call attempts based on a specified load level.)  However, the algorithm should provide for tunable parameters to allow the network operator to optimize over time.  The settings for these parameters could be different for a carrier network versus an enterprise or VSP network.

   The goal of the algorithm is to alleviate load throughout the network.  This means avoiding the propagation of load from one element to another.  It also means trying to keep the network as full as possible without reaching 100% on any given element, except when all elements approach 100%.  This may not always be possible on all elements, because of the physical resources associated with the element.  For example, a Media Gateway (MG) and MGC may be tied directly to physical trunks in a given city or region.  If that MGC or MG becomes overloaded, there may not be a way to spread the load across the network, because the physical resources on those elements are the only path into or out of the network in that city or region.  However, for SIP resources, the network should be able to spread the load evenly across all elements as long as it does not result in quality issues.  Balancing load across a network is the responsibility of the operator, but the load parameter may help the operator to adjust the balancing in a more dynamic fashion by allowing the load-balancing algorithm to react to bursts or outages.
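   One possible load-reduction algorithm of the kind left open above is sketched here (non-normative; the diversion policy, the 80% threshold, and all names are illustrative assumptions): an ingress element lowers the share of new calls offered to an overloaded egress, offers the remainder to alternative destinations, and rejects them only if no alternative exists.

      # Non-normative sketch: react to a reported load value by diverting a
      # share of new call attempts away from the overloaded egress element.
      import random

      def route_new_call(primary: str, alternates: list, reported_load: int) -> str:
          """Returns the destination for a new call attempt, or 'reject'.
          reported_load: load value (0..100) advertised by the primary egress."""
          divert_percent = max(0, reported_load - 80) * 5   # tunable policy
          if random.randint(1, 100) > divert_percent:
              return primary                       # forward as usual
          if alternates:
              return random.choice(alternates)     # spread load elsewhere
          return "reject"                          # no alternative: reject the call

      # Example: egress p1 reports load 90 -> roughly half of new calls move to p2.
      print(route_new_call("sip:p1.example.com", ["sip:p2.example.com"], 90))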
4.3.  Emergency Services Requests

   It is generally recommended that proxy servers attempt to balance all SIP requests, and relative resources, to a maximum load value of 80.  In doing so, the servers are proactively tuned to allow an emergency services request attempt to be placed to any available upstream or downstream SIP device for immediate processing and delivery to the intended emergency services provider.  In some cases, a load value of 80 is simply impossible or difficult to maintain due to extraneous situations.  Since the downstream proxy server is providing load information to the upstream originating elements, the originating elements may use this data to begin alleviation treatments such as reducing the load forwarded.  When the proxy server receives an emergency services request, the request should not be subjected to the treatment described above; it should be processed immediately.  In the worst case, an emergency services request should be attempted immediately regardless of SIP device load state.  The load header is designed to proactively communicate and provide a common mechanism for addressing overloaded states to avoid situations in which emergency services requests are delayed or denied due to overload.

4.4.  Downstream Server Failures

   In some cases, the downstream proxy is too overloaded to respond with a load value or any SIP message.  When this occurs, the proxy server attempting to contact the downstream server may indicate a load level on behalf of the specified server to upstream proxy servers.  The proxy would accomplish this by specifying its own load level, and then specifying the questionable downstream proxy as described in a multi-server inclusive load header.  Optional text may describe a questionable downstream overloaded server for the application to respond as necessary in retries or future attempts.

4.5.  B2BUAs (Back-to-Back User Agents)

   Since B2BUAs tend to perform a topology-hiding function, it is probably undesirable to forward load level information of "hidden" proxies.  A B2BUA may remove "hidden" load information from the load value, but it may also indicate a load level on behalf of the "hidden" proxies.  This will allow topology hiding while limiting the potential of external message overloading.

4.6.  Operations and Management

   This header can provide a valuable tool for element operators.  If it becomes necessary to "bleed" off traffic from an element for either maintenance or removal, the load header could be manually manipulated by the application.  This will communicate to upstream elements the necessity to try alternative downstream elements for new call attempts.  In accomplishing this, the element will have a graceful method of removing itself as a preferred downstream choice.  In addition, the load header information can be captured within Call Detail Records (CDRs) and SNMP traps for use in service reports.  These service reports could be used for future network optimization.
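   As a sketch of the maintenance use described above (non-normative; the maintenance flag and all values are assumptions), an operator-controlled override could force the advertised load and throttle values upward so that upstream elements drain traffic away before the element is taken out of service.

      # Non-normative sketch: an operator override that "bleeds off" traffic by
      # advertising an artificially high load/throttle before maintenance.

      def advertised_feedback(measured_load: int, maintenance_mode: bool) -> dict:
          if maintenance_mode:
              # Tell upstream elements to stop offering new calls.
              return {"load": 100, "throttle": 100, "validity": 500}
          return {"load": measured_load,
                  "throttle": max(0, measured_load - 80) * 5,
                  "validity": 500}

      print(advertised_feedback(35, maintenance_mode=True))
      # -> {'load': 100, 'throttle': 100, 'validity': 500}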
5.  SIP Load Header

   This section defines a new SIP header for overload control, the Load header.  This header follows the above design considerations for an overload control mechanism.

5.1.  Generating the Load Header

   A SIP server compliant to this specification SHOULD regularly provide load feedback to its upstream neighbors in a timely manner.  It does so by inserting a Load header field into the SIP responses it is forwarding or creating.  The Load header is a new header field defined in this specification.

   The Load header can be inserted into all response types, including provisional, success, and failure responses.  A SIP server SHOULD insert a Load header into all responses.  A SIP server MAY choose to insert Load headers less frequently, for example, once every x milliseconds.  This may be useful for SIP servers that receive a very high number of messages from the same upstream neighbor or servers with a very low variability of the load measure.  In any case, a SIP server SHOULD try to insert a Load header into a response well before the previous Load header sent to the same upstream neighbor expires.  Only SIP servers that frequently insert Load headers into responses are protected against overload.

   A SIP server MUST insert the address of its upstream neighbor into the "target" parameter of the Load header.  It SHOULD use the address of the upstream neighbor found in the topmost Via header of the response for this purpose.  The "target" parameter enables the receiver of a Load header to determine if it should process the Load header (since it was generated by its downstream neighbor) or if the Load header needs to be ignored (since it was passed along by an entity that does not support this extension).  Effectively, the "target" parameter implements the hop-by-hop semantics and prevents the use of load status information beyond the next hop.

   A SIP server SHOULD add a "validity" parameter to the Load header.  The "validity" parameter defines the time in milliseconds during which the Load header should be considered valid.  The default value of the "validity" parameter is 500.  A SIP server SHOULD use a shorter "validity" time if its load status varies quickly and MAY use a longer "validity" time if the current load level is more stable.

5.2.  Determining the Load Header Value

   The value of the Load header contains the current load status of the SIP server generating this header.  Load header values range from 0 (idle) to 100 (overload) and MUST reflect the current level in the usage of SIP message processing resources.  For example, a SIP server that is processing SIP messages at a rate that corresponds to 50% of its maximum capacity must set the Load header value to 50.

5.3.  Determining the Throttle Parameter Value

   The value of the "throttle" parameter specifies the percentage by which the load forwarded to this SIP server should be reduced.  Possible values range from 0 (the load forwarded is reduced by 0%, i.e., all traffic is forwarded) to 100 (the load forwarded is reduced by 100%, i.e., no traffic is forwarded).  The default value of the "throttle" parameter is 0.  The "throttle" parameter value is determined by the control function of the SIP server generating the Load header.
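   The following non-normative sketch (Python; the helper name is hypothetical) shows how a server might assemble the Load header described in Sections 5.1 through 5.3 before inserting it into a response, using the upstream neighbor taken from the topmost Via as the target.

      # Non-normative sketch: build a Load header value as described in
      # Sections 5.1-5.3.  Field and helper names are illustrative only.

      def build_load_header(load: int, target_uri: str,
                            throttle: int = 0, validity_ms: int = 500) -> str:
          """load, throttle: 0..100; target_uri: address of the upstream neighbor
          (taken from the topmost Via of the response being forwarded)."""
          assert 0 <= load <= 100 and 0 <= throttle <= 100
          header = f"Load: {load}"
          header += f";throttle={throttle}"
          header += f";validity={validity_ms}"
          header += f";target={target_uri}"
          return header

      # Matches the example given in Section 6.
      print(build_load_header(80, "p1.example.com", throttle=20, validity_ms=500))
      # -> Load: 80;throttle=20;validity=500;target=p1.example.com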
5.4.  Processing the Load Header

   A SIP entity compliant to this specification MUST remove all Load headers from the SIP messages it receives before forwarding the message.  A SIP entity may, of course, insert its own Load header into a SIP message.

   A SIP entity MUST ignore all Load headers that were not addressed to it.  It MUST compare its own addresses with the address in the "target" parameter of the Load header.  If none of its addresses match, it MUST ignore the Load header.  This ensures that a SIP entity only processes Load headers that were generated by its direct neighbors.

   A SIP server MUST store the information received in Load headers from a downstream neighbor in a server load table.  Each entry in this table has the following elements:

   o  Address of the server from which the Load header was received.

   o  Time when the header was received.

   o  Load header value.

   o  Throttle parameter value (default value if not present).

   o  Validity parameter value (default value if not present).

   A SIP entity SHOULD slowly fade out the contents of Load headers that have exceeded their expiration time by additively decreasing the Load header and throttle parameter values until they reach zero.  This is achieved by using the following equation to access stored Load header and "throttle" parameter values.  Note that this equation is only used to access Load header and "throttle" parameter values; the result is not written back into the table.

      result = value - ((cur_t - rec_t) DIV validity) * 20

   If the result is negative, zero is used instead.  Value is the stored value of the Load header or the "throttle" parameter.  Cur_t is the current time in milliseconds, rec_t is the time the Load header was received.  Validity is the "validity" parameter value.  DIV is a function that returns the integer portion of a division.

   The idea behind this equation is to subtract 20 from the value for each validity period that has passed since the header was received.  A value of 100, for example, will be reduced to 80 after the first validity period and will be completely removed after 5 * validity milliseconds.  A stored Load header is removed from the table when the above equation returns zero for both the Load header and throttle parameter values.

   ISSUE: All proposed default values and the above equation should be considered a straw man proposal.

5.5.  Using the Load Header Value

   A SIP entity MAY use the Load header value to balance load or to find an underutilized SIP server.

5.6.  Using the Throttle Parameter Value

   A SIP entity compliant to this specification MUST honor "throttle" parameter values when forwarding SIP messages to a downstream SIP server.  A SIP entity applies the usual SIP procedures to determine the next hop SIP server as, e.g., described in [2] and [3].  After selecting the next hop server, the SIP entity MUST determine if it has a stored Load header from this server that has not yet fully expired.  If it has such a Load header and the header contained a throttle parameter that is non-zero, the SIP server MUST determine if it can or cannot forward the current request within the current throttle conditions.

   The SIP entity MAY use the following algorithm to determine if it can forward the request.  Other algorithms that lead to the same result may be used as well.  The SIP entity draws a random number between 1 and 100 for the current request.  If the random number is less than or equal to the throttle value, the request is not forwarded.  Otherwise, the request is forwarded as usual.

   The treatment of SIP requests that cannot be forwarded to the selected SIP server is a matter of local policy.  A SIP entity MAY try to find an alternative target or it MAY reject the request.
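   A combined, non-normative sketch of the table access rule from Section 5.4 and the forwarding decision from Section 5.6 is shown below (Python; the entry data structure and function names are illustrative).  It implements the faded access value result = value - ((cur_t - rec_t) DIV validity) * 20 and the random-number check against the throttle value.

      # Non-normative sketch: faded access to a stored Load header entry
      # (Section 5.4) and the probabilistic throttle check (Section 5.6).
      import random, time

      def faded(value: int, rec_t_ms: int, validity_ms: int, cur_t_ms: int) -> int:
          """result = value - ((cur_t - rec_t) DIV validity) * 20, floored at 0."""
          periods = (cur_t_ms - rec_t_ms) // validity_ms
          return max(0, value - periods * 20)

      def may_forward(entry: dict, cur_t_ms: int) -> bool:
          """entry: stored Load header data for the selected next-hop server."""
          throttle = faded(entry["throttle"], entry["received_ms"],
                           entry["validity"], cur_t_ms)
          if throttle == 0:
              return True
          # Draw 1..100; forward only if the number exceeds the throttle value.
          return random.randint(1, 100) > throttle

      now = int(time.time() * 1000)
      entry = {"load": 100, "throttle": 40, "received_ms": now - 700, "validity": 500}
      # One validity period (500 ms) has passed: throttle fades from 40 to 20.
      print(may_forward(entry, now))  # True in roughly 80% of calls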
5.7.  503 Responses

   A SIP server may determine that an upstream neighbor does not support this extension.  The SIP server SHOULD use the 503 response code to throttle traffic from upstream neighbors that do not support this extension.  This is important to ensure that SIP entities that do not support this extension don't receive preferred treatment over SIP entities that do.

   A SIP server SHOULD therefore send a 503 response to an upstream neighbor that does not support this extension as soon as it starts throttling load (i.e., generates Load headers with throttle parameters greater than zero).  A SIP server that has reached overload (i.e., a load close to 100) SHOULD use 503 responses in addition to the throttle parameter in the Load header.  If the proxy has reached a load of 100, it is very likely that upstream proxies have ignored the increasing load status reports and thus do not support this extension.  By sending a 503 response, an upstream proxy is enabled to use the traditional SIP overload control mechanisms.

6.  Syntax

   This section defines the syntax of a new SIP response header, the Load header.  The Load header field is used to advertise the current load status information of a SIP entity to its upstream neighbor.  The value of the Load header is an integer between 0 and 100, with the value of 0 indicating that the proxy is least overloaded and the value of 100 indicating that the proxy is most overloaded.

   The "target" parameter is mandatory and contains the URI of the next hop SIP entity for the response, i.e., the SIP entity to which the response is forwarded.  This is the entity that processes the Load header.

   The "throttle" parameter is optional and contains a number between 0 and 100.  It describes the percentage by which the load forwarded by the "target" SIP entity to the SIP server generating this header should be reduced.

   The "validity" parameter is optional and contains an indication of how long the reporting proxy is likely to remain in the given load status.

   The syntax of the Load header field is:

      Load         = "Load" HCOLON loadStatus
      loadStatus   = 0-100 SEMI serverID [ SEMI throttleRate ]
                     [ SEMI validMS ] [ SEMI generic-param ]
      serverID     = "target" EQUAL absoluteURI
      throttleRate = "throttle" EQUAL 0-100
      validMS      = "validity" EQUAL delta-ms
      delta-ms     = 1*DIGIT

   The BNF for absoluteURI and generic-param is defined in [2].  Table 1 is an extension of Tables 2 and 3 in [2].

      Header field    where   proxy   ACK  BYE  CAN  INV  OPT  REG
      ____________________________________________________________
      Load              r      ar      -    o    o    o    o    o

                        Table 1: Load Header Field

   Example:

      Load: 80;throttle=20;validity=500;target=p1.example.com

7.  Security Considerations

   Overload control mechanisms can be used by an attacker to conduct a denial-of-service attack on a SIP entity if the attacker can pretend that the SIP entity is overloaded.  When such a forged overload indication is received by an upstream SIP entity, it will stop sending traffic to the victim.  Thus, the victim is subject to a denial-of-service attack.

   An attacker can create forged load status reports by inserting itself into the communication between the victim and its upstream neighbors.  The attacker would need to add status reports indicating a high load to the responses passed from the victim to its upstream neighbor.  Proxies can prevent this attack by communicating via TLS.  Since load status reports have no meaning beyond the next hop, there is no need to secure the communication over multiple hops.

   Another way to conduct an attack is to send a message containing a high load status value through a proxy that does not support this extension.  Since this proxy does not remove the load status information, it will reach the next upstream proxy.
   If the attacker can make the recipient believe that the load status was created by its direct downstream neighbor (and not by the attacker further downstream), the recipient stops sending traffic to the victim.  A precondition for this attack is that the victim proxy does not support this extension, since it would otherwise not pass through load status information.  The attack also does not work if there is a stateful proxy between the attacker and the victim and only 100 (Trying) responses are used to convey the Load header.

8.  IANA Considerations

   [TBD.]

Appendix A.  Acknowledgements

   Many thanks to Jonathan Rosenberg for the discussions and suggestions.

9.  References

9.1.  Normative References

   [1]  Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

   [2]  Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, A., Peterson, J., Sparks, R., Handley, M., and E. Schooler, "SIP: Session Initiation Protocol", RFC 3261, June 2002.

   [3]  Rosenberg, J. and H. Schulzrinne, "Session Initiation Protocol (SIP): Locating SIP Servers", RFC 3263, June 2002.

   [4]  Schulzrinne, H. and J. Polk, "Communications Resource Priority for the Session Initiation Protocol (SIP)", RFC 4412, February 2006.

9.2.  Informative References

   [5]  Rosenberg, J., "Requirements for Management of Overload in the Session Initiation Protocol", draft-rosenberg-sipping-overload-reqs-01 (work in progress), June 2006.

Authors' Addresses

   Volker Hilt
   Bell Labs/Lucent Technologies
   101 Crawfords Corner Rd
   Holmdel, NJ  07733
   USA

   Email: volkerh@bell-labs.com

   Daryl Malas
   Level 3 Communications
   1025 Eldorado Blvd.
   Broomfield, CO
   USA

   Email: daryl.malas@level3.com

   Indra Widjaja
   Bell Labs/Lucent Technologies
   600-700 Mountain Avenue
   Murray Hill, NJ  07974
   USA

   Email: iwidjaja@lucent.com

   Rich Terpstra
   Level 3 Communications
   1025 Eldorado Blvd.
   Broomfield, CO
   USA

   Email: rich.terpstra@level3.com

Full Copyright Statement

   Copyright (C) The Internet Society (2006).

   This document is subject to the rights, licenses and restrictions contained in BCP 78, and except as set forth therein, the authors retain all their rights.

   This document and the information contained herein are provided on an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Intellectual Property

   The IETF takes no position regarding the validity or scope of any Intellectual Property Rights or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; nor does it represent that it has made any independent effort to identify any such rights.  Information on the procedures with respect to rights in RFC documents can be found in BCP 78 and BCP 79.
   Copies of IPR disclosures made to the IETF Secretariat and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementers or users of this specification can be obtained from the IETF on-line IPR repository at http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights that may cover technology that may be required to implement this standard.  Please address the information to the IETF at ietf-ipr@ietf.org.

Acknowledgment

   Funding for the RFC Editor function is provided by the IETF Administrative Support Activity (IASA).