SIPPING Working Group                                            V. Hilt
Internet-Draft                             Bell Labs/Lucent Technologies
Intended status: Informational                                  D. Malas
Expires: April 18, 2007                           Level 3 Communications
                                                              I. Widjaja
                                           Bell Labs/Lucent Technologies
                                                             R. Terpstra
                                                  Level 3 Communications
                                                        October 15, 2006

          Session Initiation Protocol (SIP) Overload Control
                    draft-hilt-sipping-overload-00

Status of this Memo

   By submitting this Internet-Draft, each author represents that any applicable patent or other IPR claims of which he or she is aware have been or will be disclosed, and any of which he or she becomes aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups.  Note that other groups may also distribute working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time.  It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on April 18, 2007.

Copyright Notice

   Copyright (C) The Internet Society (2006).

Abstract

   Overload occurs in Session Initiation Protocol (SIP) networks when SIP servers have insufficient resources to handle all SIP messages they receive.  Even though the SIP protocol provides a limited overload control mechanism through its 503 response code, SIP servers are still vulnerable to overload.  This specification defines a new SIP overload control mechanism that protects SIP servers against overload.

Table of Contents

   1.  Introduction
   2.  Terminology
   3.  Design Considerations
     3.1.  Control Model
     3.2.  Applying the Overload Control Loop
     3.3.  Load Feedback
     3.4.  SIP Mechanism
     3.5.  Backwards Compatibility
   4.  SIP Application Considerations
     4.1.  How to Calculate Load Levels
     4.2.  Responding to a Load Level Message
     4.3.  Emergency Services Requests
     4.4.  Downstream Server Failures
     4.5.  B2BUAs (Back-to-Back User Agents)
     4.6.  Operations and Management
   5.  SIP Load Header
     5.1.  Generating the Load Header
     5.2.  Determining the Load Header Value
     5.3.  Determining the Throttle Parameter Value
     5.4.  Processing the Load Header
     5.5.  Using the Load Header Value
     5.6.  Using the Throttle Parameter Value
     5.7.  503 Responses
   6.  Syntax
   7.  Security Considerations
   8.  IANA Considerations
   Appendix A.  Acknowledgements
   9.  References
     9.1.  Normative References
     9.2.  Informative References
   Authors' Addresses
   Intellectual Property and Copyright Statements

1.  Introduction

   A Session Initiation Protocol (SIP) [2] server can be overloaded.  Overload occurs when a SIP server has insufficient resources to process all SIP requests and responses it receives.  SIP server overload poses a serious problem for a SIP network.  During periods of overload, the message processing capacity of a SIP network can be significantly degraded.  In particular, SIP server overload may lead to a situation in which the actual message throughput of a SIP network drops to a small fraction of its original capacity.  This is called congestion collapse.

   The SIP protocol provides a limited mechanism for overload control through its 503 response code.  However, it has been shown that this mechanism often cannot prevent SIP server overload and it cannot prevent congestion collapse in SIP networks.  In fact, the 503 response code mechanism may cause traffic to oscillate between SIP servers and thereby worsen an overload condition.  A detailed discussion of the SIP overload problem, the 503 response code and the requirements for a SIP overload control solution can be found in [5].

   This specification defines a new mechanism for SIP overload control.  The idea of this mechanism is to introduce a feedback control loop between a SIP server and its upstream neighbors to regulate the load that is forwarded to a server.  With this mechanism, a SIP server sends load feedback to its upstream neighbors.  Upstream neighbors use this information to adjust the load they forward downstream and, in particular, to throttle the load if a downstream neighbor reaches an overload condition.

   Overload occurs if a SIP server does not have sufficient resources to process all incoming SIP messages.  These resources may include CPU processing capacity, memory, I/O, or disk resources.  In some cases, they can also include external resources, such as a database or a DNS server.  Generally speaking, overload occurs in all cases in which a SIP server is in danger of not being able to process or respond to incoming SIP messages.

   Overload does not occur in, and hence overload control is not suitable for, conditions in which a SIP server is able to process incoming SIP requests (e.g., by rejecting them with an appropriate response code) but does not have the resources to complete the requests successfully.  These cases are covered by the respective response codes defined in other specifications.  Overload control MUST NOT be used in these scenarios.  For example, a PSTN gateway that runs out of trunk lines but still has plenty of capacity to process SIP messages should reject incoming INVITEs using a 488 (Not Acceptable Here) response [4].  Similarly, a SIP registrar that has lost connectivity to its registration database but is still capable of processing SIP messages should reject REGISTER requests with a 500 (Server Error) response [2].
   This specification is structured as follows: Section 3 discusses general design principles of a SIP overload control mechanism.  Section 4 discusses general considerations for applying SIP overload control.  Section 5 defines a SIP protocol extension for overload control and Section 6 introduces the syntax of this extension.  Section 7 and Section 8 discuss security and IANA considerations, respectively.

2.  Terminology

   In this document, the key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" are to be interpreted as described in BCP 14, RFC 2119 [1] and indicate requirement levels for compliant implementations.

3.  Design Considerations

   This section discusses some of the key design considerations for a SIP overload control mechanism.

3.1.  Control Model

   The following model for SIP overload control is based on a closed-loop control system and consists of the following functional components (see Figure 1 and Figure 2):

   o  Monitor: component that monitors the current load of the SIP processor on the receiving entity.  The monitor implements the mechanisms needed to measure the current usage of resources relevant for the SIP processor.  It reports load samples (S) to the Control Function.

   o  Control Function: component that implements the overload control algorithm, which decides whether actions need to be taken based on the current load.  The control function uses the load samples (S) provided by the monitor and determines if overload control commands (C) need to be issued to adjust the load sent to the SIP processor on the receiving entity.  By generating control commands, the control function can impact the load forwarded from the sending entity to the receiving entity.  In particular, it may throttle this load if necessary to prevent overload in the receiving entity.

   o  Actuator: component that acts on the control commands (C) it receives from the control function.  The control function relies on the actuator to properly execute control commands on the traffic forwarded from the sending to the receiving entity.  For example, a control command may instruct the actuator to shed 10% of the load destined to the receiving entity.  The actuator decides how the load reduction is achieved (e.g., by redirecting or rejecting requests).

   o  SIP Processor: component that processes SIP messages.  The SIP processor is the component that is protected by overload control.

   Three models for SIP overload control can be defined based on these components.  All models implement a closed-loop controller; however, they differ in the way the control system components are distributed across the entities that participate in overload control.  All models can be chained, i.e., a server can have the role of a receiving entity in the control loop with its upstream neighbor and act as a sending entity in another control loop with a downstream neighbor.

   In the first model, the control function is located on the receiving entity (see Figure 1).  Thus, overload control algorithms are implemented in the receiving entity.  The feedback sent from the receiving to the sending entity consists of control commands (C).  Since the control function is on the receiving entity, it is the receiving entity that decides how much traffic it wants to receive from a sending entity.  The receiving entity sends control commands to possibly multiple upstream neighbors.  All upstream neighbors of a receiving entity are therefore governed by the same control algorithm on the receiving entity.  This control algorithm may consider local policies when creating a control command for an upstream neighbor.  Since load samples are not conveyed to upstream neighbors in this model, this information is not available for load balancing and target selection in upstream neighbors.
   [Figure 1 is an ASCII diagram showing the sending entity (Server A) and the receiving entity (Server B).  On Server B, the Monitor observes the SIP Processor and reports load samples (S) to the Control Function, also located on Server B.  The Control Function sends control commands (C) to the Actuator on Server A, which acts on the SIP traffic forwarded to the SIP Processor on Server B.]

                Figure 1: Control Function on Receiving Entity

   In the second alternative, the control function is located on the sending entity (see Figure 2).  In this model, load samples (S) are reported from the receiving to the sending entity.  The sending entity decides how much traffic it forwards to the receiving entity based on the received load information.  The receiving entity can only indirectly impact the load it receives by adjusting the load samples (S) it is reporting.  In this model, each sending entity can implement its own overload control algorithm to generate control commands.  Since load information is available in the sending entity, the sending entity can use information from various downstream neighbors when generating control commands.

   [Figure 2 is an ASCII diagram showing the same two servers.  The Monitor on Server B (receiving entity) reports load samples (S) across to the Control Function on Server A (sending entity), which issues control commands (C) to the Actuator on Server A acting on the SIP traffic forwarded to Server B.]

                Figure 2: Control Function on Sending Entity

   Yet a third alternative is a variant of the first model.  In this model, the control function is located on the receiving entity.  However, in addition to control commands (C), the receiving entity also reports load samples (S) to the sending entity.  In this model, the traffic forwarded to the receiving entity is controlled by the receiving entity as in the first model.  However, the additional load information assists the sending entity in balancing load and finding under-utilized servers.

   OPEN ISSUE: Which model is preferable?  Having the control function on the receiving entity has the advantage that the receiving entity can implement overload control algorithms that can consider the specifics of this system (e.g., determine how early load needs to be reduced and which load values are critical for the system) and can apply throttling individually to each upstream neighbor.  Control commands need to be well defined so that they can be used unambiguously by the receiving and sending entity.  It also seems preferable to have the load status in the sending entity, which makes the third model attractive.
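   The following non-normative sketch (in Python; all class names, method names, and the throttling policy are illustrative assumptions, not defined by this specification) shows how the four functional components could be wired together in the first model, with the monitor and control function on the receiving entity and the actuator on the sending entity.

      # Non-normative sketch of the Section 3.1 components (first model:
      # control function on the receiving entity).  All names are illustrative.

      class Monitor:
          """Measures resource usage of the receiving entity's SIP processor."""
          def sample(self) -> int:
              # Return a load sample S in the range 0..100 (0 = idle).
              return 0  # placeholder measurement

      class ControlFunction:
          """Turns load samples S into control commands C (here: a throttle %)."""
          def command(self, sample: int) -> int:
              # Illustrative policy: start throttling above 80% load.
              return max(0, sample - 80) * 5  # 80 -> 0%, 100 -> 100%

      class Actuator:
          """Runs on the sending entity; sheds the requested share of traffic."""
          def __init__(self):
              self.throttle = 0
          def apply(self, command: int) -> None:
              self.throttle = command  # percentage of requests to divert/reject

      # Receiving entity (Server B): monitor + control function.
      monitor, control = Monitor(), ControlFunction()
      # Sending entity (Server A): actuator acting on forwarded traffic.
      actuator = Actuator()

      # One iteration of the closed loop: S flows to the control function,
      # C flows back to the actuator on the sending entity.
      actuator.apply(control.command(monitor.sample()))

   In the second model, the ControlFunction instance would simply live on the sending entity and consume the samples reported by the receiving entity's Monitor; the components themselves are unchanged.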
3.2.  Applying the Overload Control Loop

   A typical SIP request is processed by more than one SIP server.  Thus, the question arises as to how the overload control model is applied to multiple servers along the path of a SIP request.  In principle, the overload control loop can be applied hop-by-hop (i.e., pairwise between the servers on a path) or as one single control loop that stretches across the entire path from UAC to UAS.  The two alternatives are illustrated in Figure 3.

   [Figure 3 is an ASCII diagram contrasting the two alternatives: (a) hop-by-hop loop, in which each pair of neighboring servers (A-B, B-C, B-D) runs its own feedback loop, and (b) full path loop, in which a single feedback loop stretches from the terminating servers back to server A across the entire path.  Legend: ==> SIP request flow, <-- load feedback loop.]

                       Figure 3: Overload Control Loop

   The idea of hop-by-hop overload control is to have an individual feedback control loop between two neighboring SIP servers.  Each SIP server reports feedback to its upstream neighbors, i.e., the entities it receives SIP requests from.  The upstream neighbors adjust the amount of traffic they are forwarding to the SIP server based on this feedback.  A SIP server that receives traffic from multiple upstream neighbors has a separate feedback loop with each of the upstream neighbors.  The server can use local policies to determine how feedback is balanced across upstream neighbors.  For example, the server can treat all neighbors equally and provide the same feedback to all of them (e.g., ask all upstream neighbors to throttle traffic by 10% when load builds up).  It may also prefer some servers over others.  For example, it may throttle a less preferred upstream neighbor earlier than a preferred neighbor, or it may first throttle the neighbor that sends the most traffic.  In any case, a SIP server needs to provide feedback such that it does not get overloaded as the load increases.

   The upstream neighbors of a SIP server don't forward the load status they receive further upstream, since they can act on this information and resolve the overload condition if needed (e.g., by re-routing or rejecting traffic).  The upstream neighbor, of course, should report its own load status to its upstream neighbors.  If the upstream neighbor becomes congested itself, its upstream neighbors can take action based on the reported feedback and resolve the situation.  Thus, overload of a SIP server is resolved by its direct neighbors without the need to involve entities that are located multiple SIP hops away.

   Hop-by-hop overload control can effectively reduce the impact of overload on a SIP network and, in particular, can avoid signaling congestion collapse.  This is achieved by enabling SIP entities to gradually lower the amount of traffic they receive.  Thus, a server that reaches a high level of load can effectively offload the task of redirecting and rejecting messages to its upstream neighbors.  The approach of hop-by-hop overload control is simple and scales well to networks with many SIP entities.  It does not require a SIP entity to aggregate a large number of load status values or keep track of the load status of SIP servers it is not communicating with.  A SIP entity only needs to observe the load status of the downstream neighbors it is currently forwarding traffic to.
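   As an illustration of the per-neighbor feedback and local policies described above, the following non-normative sketch (Python; the threshold, weights, and function names are illustrative assumptions) keeps one feedback value per upstream neighbor and throttles less preferred neighbors more aggressively.

      # Non-normative sketch: a receiving server computes per-neighbor
      # throttle feedback.  Neighbor preferences are a local policy choice.

      def per_neighbor_throttle(load: int, neighbors: dict) -> dict:
          """load: own load status (0..100).
          neighbors: {neighbor_uri: preference_weight}, higher = more preferred.
          Returns {neighbor_uri: throttle_percent} to advertise to each neighbor."""
          if load <= 80:                      # no throttling below 80% load
              return {n: 0 for n in neighbors}
          base = (load - 80) * 5              # 80 -> 0%, 100 -> 100%
          feedback = {}
          for uri, weight in neighbors.items():
              # Less preferred neighbors (lower weight) are throttled harder.
              feedback[uri] = min(100, round(base / max(weight, 1)))
          return feedback

      # Example: prefer p1 (weight 2) over p2 (weight 1) when load is 90%.
      print(per_neighbor_throttle(90, {"sip:p1.example.com": 2,
                                       "sip:p2.example.com": 1}))
      # -> {'sip:p1.example.com': 25, 'sip:p2.example.com': 50}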
   The main goal of full path overload control is to provide overload control for a full path, from UAC to UAS.  Full path overload control considers load information from all SIP servers (including all proxies and the UAS).  A full path overload control mechanism has to be able to frequently collect the load status of all servers on the potential path of a SIP request and combine this data into meaningful load feedback.  The main problem of full path overload control is its inherent complexity.  A UAC or SIP server would have to monitor all potential paths it may use for SIP requests.

   OPEN ISSUE: which approach is preferable?  Hop-by-hop overload control seems to be much simpler and more manageable and does achieve the goal of avoiding overload in SIP servers.

3.3.  Load Feedback

   A SIP server frequently provides load feedback to its upstream neighbors.  Load feedback reports consist of two parts: i) the control commands (if any) and ii) the current load status value.  In addition, load feedback reports have an expiration time so that they can eventually be timed out.

   Control commands in a load feedback report instruct the actuator on the upstream neighbor to modify the traffic forwarded to the receiving SIP server.  The following control commands are defined:

   o  Throttle X%: instructs the actuator to reduce the amount of traffic it would normally forward to the receiving SIP server by X percent.  This requires that the actuator diverts or rejects X percent of the traffic destined to the receiving server.

   OPEN ISSUE: the throttle X% command has the advantage that it is simple.  It does not set a fixed boundary for the load.  E.g., if the offered load increases significantly (say by 20%), the forwarded load will increase even if the downstream server has sent a throttle 10% command.  Other possibilities for control commands are max. messages per second, message gapping, etc.  Also, there might be other commands that are useful.  Generally, the idea is to define a few well-defined commands that the receiving entity can send to the sending entity and for which it can expect a given, well-defined behavior.

   OPEN ISSUE: it probably makes sense to define a default control function algorithm that generates these control commands based on load numbers.

   The load status value is used to indicate to which degree the resources needed by a SIP server to process SIP messages are utilized.  Load status values range from 0 to 100.  A value of 0 indicates that the server is completely idle; a value of 100 indicates that all resources are fully utilized and the server is overloaded.  The algorithm used to determine the load status of a SIP server is specific to the type of resources needed by the server to process SIP messages.  Different SIP servers may have different resource requirements and constraints and therefore may use different algorithms to compute the load status value.  A common mechanism is to use the processor utilization to derive the load status.  However, other metrics such as memory usage or queue length may also be used.

   OPEN ISSUE: it might make sense to define a default algorithm to determine load (e.g., based on queue length or processor load).
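   A default algorithm of the kind mentioned in the open issue above could, for example, combine processor utilization and queue length.  The sketch below is non-normative; the metrics chosen, the queue limit, and the function name are illustrative assumptions rather than part of this specification.

      # Non-normative sketch: derive a 0..100 load status value from the
      # resources a server actually depends on.  Thresholds are illustrative.

      def load_status(cpu_util: float, queue_len: int, queue_limit: int) -> int:
          """cpu_util: processor utilization 0.0..1.0
          queue_len/queue_limit: current and maximum SIP message queue depth."""
          cpu_load = cpu_util * 100
          queue_load = (queue_len / queue_limit) * 100 if queue_limit else 0
          # The more constrained resource determines the reported load.
          return min(100, round(max(cpu_load, queue_load)))

      # Example: 65% CPU but a nearly full message queue -> load status 90.
      print(load_status(cpu_util=0.65, queue_len=900, queue_limit=1000))  # 90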
   The load conditions on a SIP server vary over time.  A load status report is therefore only valid for a limited amount of time.  The expiration time is part of the load report and is set by the reporting SIP server.  The rationale behind this is that the reporting SIP server can best determine how stable the current load conditions are (e.g., based on local history) and for how long the reported load can provide a good estimate for the actual load conditions.  Once a load status report has reached its expiration time, it may have become inaccurate since the load on the server may have changed.  The longer a load status report has been expired, the greater the uncertainty about the accuracy of its values.

   A SIP entity that has an expired load status report could discard it right after the expiration time.  However, this would instantly deactivate the control commands contained in this report, e.g., the throttling of load to the receiving SIP server.  This is clearly undesirable, in particular if the receiving server is currently under heavy load and had set a high throttling value.  An alternative approach is to fade out the effects of an expired load status report by gradually decreasing the throttle value over time until it reaches 0 and the amount of traffic that is forwarded is back to 100%.  Effectively, this implements a slow start mechanism.  Similarly, the load status value in a report can be decreased over time until it reaches 0, indicating that the server is not utilized at all.  If the downstream server is still overloaded, it can create a new status report with up-to-date values.  If the server has failed, this will be detected on the transport or SIP level.  Obviously, no traffic should be forwarded to a failed SIP server.

   Since SIP servers can use load status reports to continuously advertise the current level of load to upstream neighbors, this mechanism does not have the on-off semantics that can lead to traffic oscillation.  In fact, SIP proxies can use the load status information to balance load between alternative proxies.  Thus, this mechanism can help to evenly load downstream proxies, making best use of available resources.  However, this mechanism is not intended to replace specialized load balancing mechanisms.

3.4.  SIP Mechanism

   A SIP mechanism is needed to convey load status reports from the receiving to the sending SIP entity.  In principle, it would be possible to define a new SIP request that can be used to convey load status reports from the receiving to the sending entity.  However, sending separate load status requests from the receiving to the sending entity would create additional messaging overhead, which is undesirable during periods of overload.  It would also require each SIP server to keep track of all potential upstream neighbors.

   Another approach is therefore to define a new SIP header field for load information that can be inserted into SIP responses.  This approach has the advantage that it provides load feedback to all upstream SIP entities that are currently forwarding traffic to a SIP server with very little overhead.

   Piggybacking load information in SIP responses is problematic in scenarios where a SIP server receives single requests from many different upstream neighbors.  An edge proxy that communicates with many SIP endpoints is an example of such a SIP server.
   Since each endpoint only sends a single request, it can't decrease load by throttling future requests.  However, an edge proxy can reduce its load by rejecting a percentage of the requests it receives with a 503 response code and asking the endpoint to stop sending messages for some time.  The proxy can send 503 only to selected senders and therefore gradually reduce the amount of traffic it receives.  In this particular scenario, the use of 503 responses does not lead to traffic oscillation and can be used instead of an overload control mechanism that gradually throttles the load forwarded by a SIP server.

   Hop-by-hop overload control requires that the distribution of load status reports is limited to the next upstream SIP server.  This is achieved by adding the address of the next hop server (i.e., the destination of the load status report) to the Load header.

3.5.  Backwards Compatibility

   An important requirement for an overload control mechanism is that it can be gradually introduced into a network and that it functions properly if only a fraction of the servers support it.

   Hop-by-hop overload control does not require that all SIP entities in a network support it.  It can be used effectively between two adjacent SIP servers if both servers support this extension and does not depend on support from any other server or user agent.  The more SIP servers in a network support this mechanism, the more effective it is, since it includes more of the servers in the load reporting and offloading process.

   A SIP server may have multiple neighbors of which only some support overload control.  If this server issued a throttling command to all upstream neighbors, only those that support overload control would throttle their load.  Others would keep sending at the full rate and benefit from the throttling by the servers supporting this extension.  In other words, upstream neighbors that do not support overload control would be better off than those that do.  A SIP server should therefore fall back to the use of 503 responses towards upstream neighbors that do not support overload control.  A SIP server can issue throttling commands to overload control-enabled neighbors and 503 responses to all other neighbors.  This way, servers that support overload control are better off than servers that don't.

   OPEN ISSUE: how should an upstream neighbor indicate that it supports overload control?

4.  SIP Application Considerations

4.1.  How to Calculate Load Levels

   Calculating an element's load level is dependent on the limiting resource for the element.  There can be several contributing factors, such as CPU, memory, queue depth, calls per second, application threads, etc.  However, the underlying result is that the element has reached a limit such that it cannot reliably process any more calls.  If the element knows what its limiting resource is, it should be able to report different levels of load based on this resource.  In some instances, it could be a combination of resources.  Regardless, if the element knows what that resource is and the limit at which the element becomes 100% overloaded, then the element should also be able to recognize percentages of load prior to hitting 100%.  The element should have configuration parameters that map the percent of load to a defined load value as described by this document.  The element can then report its load level in SIP responses.
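   The configuration-driven mapping described above might look like the following non-normative sketch (Python); the breakpoint table, its values, and the function name are illustrative assumptions and not part of this specification.

      # Non-normative sketch: map utilization of the limiting resource onto
      # the reported load value via operator-configurable breakpoints.

      # Configured (utilization%, reported load value) breakpoints.
      LOAD_MAP = [(50, 0), (70, 40), (85, 60), (95, 80), (100, 100)]

      def reported_load(limiting_resource_util: float) -> int:
          """limiting_resource_util: 0..100 percent of the limiting resource."""
          for threshold, value in LOAD_MAP:
              if limiting_resource_util <= threshold:
                  return value
          return 100

      print(reported_load(72))   # -> 60: between the 70% and 85% breakpoints
      print(reported_load(100))  # -> 100: element is fully overloaded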
   In some instances, the element itself is not overloaded, but one of its resources is, such as a trunk group on a Media Gateway Controller (MGC) or a next-hop IP address to which a proxy sends.  In this case, the element may still want to report the load on the resource, so that the sending element may be able to limit the sending of future requests to that same resource until that resource alleviates its overloaded state.

4.2.  Responding to a Load Level Message

   When an element receives a load header with a value other than 20, the receiving element should use an algorithm to try to reduce the amount of future traffic it sends to the overloaded element.  As an alternative, control commands received by an element may determine that traffic sent to the overloaded element needs to be reduced.

   If the egress element is overloaded, the ingress element can start to reduce the load to the overloaded element.  Depending on the network configuration, this may result in sending a percentage of future calls that would have gone to the overloaded element to a different destination.  In other instances, reducing load means rejecting a percentage of calls from coming into the network.  If a resource on the egress element is overloaded, then the ingress element may alter its load-balancing algorithm to lower the percentage of calls it will offer to the overloaded resource.  This may be important, as there may be multiple egress points for reaching this same overloaded resource.

   The algorithm to reduce load is open to element and vendor implementation.  (Note: the load reduction algorithm does not specify the quantity of SIP messages or calls allowed by the proxy server; it simply specifies a treatment of new call attempts based on a specified load level.)  However, the algorithm should provide for tunable parameters to allow the network operator to optimize over time.  The settings for these parameters could be different for a carrier network versus an enterprise or VSP network.

   The goal of the algorithm is to alleviate load throughout the network.  This means avoiding the propagation of load from one element to another.  It also means trying to keep the network as full as possible without reaching 100% on any given element, except when all elements approach 100%.  This may not always be possible on all elements, because of the physical resources associated with the element.  For example, a Media Gateway (MG) and MGC may be tied directly to physical trunks in a given city or region.  If that MGC or MG becomes overloaded, there may not be a way to spread the load across the network, because the physical resources on those elements are the only path into or out of the network in that city or region.  However, for SIP resources, the network should be able to spread the load evenly across all elements as long as it does not result in quality issues.  Balancing load across a network is the responsibility of the operator, but the load parameter may help the operator to adjust the balancing in a more dynamic fashion by allowing the load-balancing algorithm to react to bursts or outages.
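   One possible load-reduction algorithm of the kind left open above is sketched here (non-normative; the diversion policy, the 80% threshold, and all names are illustrative assumptions): an ingress element lowers the share of new calls offered to an overloaded egress, offers the remainder to alternative destinations, and rejects them only if no alternative exists.

      # Non-normative sketch: react to a reported load value by diverting a
      # share of new call attempts away from the overloaded egress element.
      import random

      def route_new_call(primary: str, alternates: list, reported_load: int) -> str:
          """Returns the destination for a new call attempt, or 'reject'.
          reported_load: load value (0..100) advertised by the primary egress."""
          divert_percent = max(0, reported_load - 80) * 5   # tunable policy
          if random.randint(1, 100) > divert_percent:
              return primary                       # forward as usual
          if alternates:
              return random.choice(alternates)     # spread load elsewhere
          return "reject"                          # no alternative: reject the call

      # Example: egress p1 reports load 90 -> roughly half of new calls move to p2.
      print(route_new_call("sip:p1.example.com", ["sip:p2.example.com"], 90))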
4.3.  Emergency Services Requests

   It is generally recommended that proxy servers attempt to balance all SIP requests, and relative resources, to a maximum load value of 80.  In doing so, the servers are proactively tuned to allow an emergency services request attempt to be placed to any available upstream or downstream SIP device for immediate processing and delivery to the intended emergency services provider.  In some cases, a load value of 80 is simply impossible or difficult to maintain due to extraneous situations.  Since the downstream proxy server is providing load information to the upstream originating elements, the originating elements may use this data to begin alleviation treatments such as reducing the load forwarded.  When the proxy server receives an emergency services request, the request should not be subjected to the treatment described above; it should be processed immediately.  In the worst case, an emergency services request should be attempted immediately regardless of SIP device load state.  The load header is designed to proactively communicate and provide a common mechanism for addressing overloaded states to avoid situations in which emergency services requests are delayed or denied due to overload.

4.4.  Downstream Server Failures

   In some cases, the downstream proxy is too overloaded to respond with a load value or any SIP message.  When this occurs, the proxy server attempting to contact the downstream server may indicate a load level on behalf of the specified server to upstream proxy servers.  The proxy would accomplish this by specifying its own load level, and then specifying the questionable downstream proxy as described in a multi-server inclusive load header.  Optional text may describe a questionable downstream overloaded server for the application to respond as necessary in retries or future attempts.

4.5.  B2BUAs (Back-to-Back User Agents)

   Since B2BUAs tend to perform a topology-hiding function, it is probably undesirable to forward load level information of "hidden" proxies.  A B2BUA may remove "hidden" load information from the load value, but it may also indicate a load level on behalf of the "hidden" proxies.  This will allow topology hiding while limiting the potential of external message overloading.

4.6.  Operations and Management

   This header can provide a valuable tool for element operators.  If it becomes necessary to "bleed" off traffic from an element for either maintenance or removal, the load header could be manually manipulated by the application.  This will communicate to upstream elements the necessity to try alternative downstream elements for new call attempts.  In accomplishing this, the element will have a graceful method of removing itself as a preferred downstream choice.  In addition, the load header information can be captured within Call Detail Records (CDRs) and SNMP traps for use in service reports.  These service reports could be used for future network optimization.
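   As a sketch of the maintenance use described above (non-normative; the maintenance flag and all values are assumptions), an operator-controlled override could force the advertised load and throttle values upward so that upstream elements drain traffic away before the element is taken out of service.

      # Non-normative sketch: an operator override that "bleeds off" traffic by
      # advertising an artificially high load/throttle before maintenance.

      def advertised_feedback(measured_load: int, maintenance_mode: bool) -> dict:
          if maintenance_mode:
              # Tell upstream elements to stop offering new calls.
              return {"load": 100, "throttle": 100, "validity": 500}
          return {"load": measured_load,
                  "throttle": max(0, measured_load - 80) * 5,
                  "validity": 500}

      print(advertised_feedback(35, maintenance_mode=True))
      # -> {'load': 100, 'throttle': 100, 'validity': 500}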
5.  SIP Load Header

   This section defines a new SIP header for overload control, the Load header.  This header follows the above design considerations for an overload control mechanism.

5.1.  Generating the Load Header

   A SIP server compliant to this specification SHOULD regularly provide load feedback to its upstream neighbors in a timely manner.  It does so by inserting a Load header field into the SIP responses it is forwarding or creating.  The Load header is a new header field defined in this specification.

   The Load header can be inserted into all response types, including provisional, success, and failure responses.  A SIP server SHOULD insert a Load header into all responses.  A SIP server MAY choose to insert Load headers less frequently, for example, once every x milliseconds.  This may be useful for SIP servers that receive a very high number of messages from the same upstream neighbor or servers with a very low variability of the load measure.  In any case, a SIP server SHOULD try to insert a Load header into a response well before the previous Load header sent to the same upstream neighbor expires.  Only SIP servers that frequently insert Load headers into responses are protected against overload.

   A SIP server MUST insert the address of its upstream neighbor into the "target" parameter of the Load header.  It SHOULD use the address of the upstream neighbor found in the topmost Via header of the response for this purpose.  The "target" parameter enables the receiver of a Load header to determine if it should process the Load header (since it was generated by its downstream neighbor) or if the Load header needs to be ignored (since it was passed along by an entity that does not support this extension).  Effectively, the "target" parameter implements the hop-by-hop semantics and prevents the use of load status information beyond the next hop.

   A SIP server SHOULD add a "validity" parameter to the Load header.  The "validity" parameter defines the time in milliseconds during which the Load header should be considered valid.  The default value of the "validity" parameter is 500.  A SIP server SHOULD use a shorter "validity" time if its load status varies quickly and MAY use a longer "validity" time if the current load level is more stable.

5.2.  Determining the Load Header Value

   The value of the Load header contains the current load status of the SIP server generating this header.  Load header values range from 0 (idle) to 100 (overload) and MUST reflect the current level in the usage of SIP message processing resources.  For example, a SIP server that is processing SIP messages at a rate that corresponds to 50% of its maximum capacity must set the Load header value to 50.

5.3.  Determining the Throttle Parameter Value

   The value of the "throttle" parameter specifies the percentage by which the load forwarded to this SIP server should be reduced.  Possible values range from 0 (the load forwarded is reduced by 0%, i.e., all traffic is forwarded) to 100 (the load forwarded is reduced by 100%, i.e., no traffic is forwarded).  The default value of the "throttle" parameter is 0.  The "throttle" parameter value is determined by the control function of the SIP server generating the Load header.
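   The following non-normative sketch (Python; the helper name is hypothetical) shows how a server might assemble the Load header described in Sections 5.1 through 5.3 before inserting it into a response, using the upstream neighbor taken from the topmost Via as the target.

      # Non-normative sketch: build a Load header value as described in
      # Sections 5.1-5.3.  Field and helper names are illustrative only.

      def build_load_header(load: int, target_uri: str,
                            throttle: int = 0, validity_ms: int = 500) -> str:
          """load, throttle: 0..100; target_uri: address of the upstream neighbor
          (taken from the topmost Via of the response being forwarded)."""
          assert 0 <= load <= 100 and 0 <= throttle <= 100
          header = f"Load: {load}"
          header += f";throttle={throttle}"
          header += f";validity={validity_ms}"
          header += f";target={target_uri}"
          return header

      # Matches the example given in Section 6.
      print(build_load_header(80, "p1.example.com", throttle=20, validity_ms=500))
      # -> Load: 80;throttle=20;validity=500;target=p1.example.com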
5.4.  Processing the Load Header

   A SIP entity compliant to this specification MUST remove all Load headers from the SIP messages it receives before forwarding the message.  A SIP entity may, of course, insert its own Load header into a SIP message.

   A SIP entity MUST ignore all Load headers that were not addressed to it.  It MUST compare its own addresses with the address in the "target" parameter of the Load header.  If none of its addresses match, it MUST ignore the Load header.  This ensures that a SIP entity only processes Load headers that were generated by its direct neighbors.

   A SIP server MUST store the information received in Load headers from a downstream neighbor in a server load table.  Each entry in this table has the following elements:

   o  Address of the server from which the Load header was received.

   o  Time when the header was received.

   o  Load header value.

   o  Throttle parameter value (default value if not present).

   o  Validity parameter value (default value if not present).

   A SIP entity SHOULD slowly fade out the contents of Load headers that have exceeded their expiration time by additively decreasing the Load header and throttle parameter values until they reach zero.  This is achieved by using the following equation to access stored Load header and "throttle" parameter values.  Note that this equation is only used to access Load header and "throttle" parameter values; the result is not written back into the table.

      result = value - ((cur_t - rec_t) DIV validity) * 20

   If the result is negative, zero is used instead.  Value is the stored value of the Load header or the "throttle" parameter.  Cur_t is the current time in milliseconds, rec_t is the time the Load header was received.  Validity is the "validity" parameter value.  DIV is a function that returns the integer portion of a division.

   The idea behind this equation is to subtract 20 from the value for each validity period that has passed since the header was received.  A value of 100, for example, will be reduced to 80 after the first validity period and will be completely removed after 5 * validity milliseconds.  A stored Load header is removed from the table when the above equation returns zero for both the Load header and throttle parameter values.

   ISSUE: All proposed default values and the above equation should be considered a straw man proposal.

5.5.  Using the Load Header Value

   A SIP entity MAY use the Load header value to balance load or to find an underutilized SIP server.

5.6.  Using the Throttle Parameter Value

   A SIP entity compliant to this specification MUST honor "throttle" parameter values when forwarding SIP messages to a downstream SIP server.  A SIP entity applies the usual SIP procedures to determine the next hop SIP server as, e.g., described in [2] and [3].  After selecting the next hop server, the SIP entity MUST determine if it has a stored Load header from this server that has not yet fully expired.  If it has such a Load header and the header contained a throttle parameter that is non-zero, the SIP server MUST determine if it can or cannot forward the current request within the current throttle conditions.

   The SIP entity MAY use the following algorithm to determine if it can forward the request.  Other algorithms that lead to the same result may be used as well.  The SIP entity draws a random number between 1 and 100 for the current request.  If the random number is less than or equal to the throttle value, the request is not forwarded.  Otherwise, the request is forwarded as usual.

   The treatment of SIP requests that cannot be forwarded to the selected SIP server is a matter of local policy.  A SIP entity MAY try to find an alternative target or it MAY reject the request.
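   A combined, non-normative sketch of the table access rule from Section 5.4 and the forwarding decision from Section 5.6 is shown below (Python; the entry data structure and function names are illustrative).  It implements the faded access value result = value - ((cur_t - rec_t) DIV validity) * 20 and the random-number check against the throttle value.

      # Non-normative sketch: faded access to a stored Load header entry
      # (Section 5.4) and the probabilistic throttle check (Section 5.6).
      import random, time

      def faded(value: int, rec_t_ms: int, validity_ms: int, cur_t_ms: int) -> int:
          """result = value - ((cur_t - rec_t) DIV validity) * 20, floored at 0."""
          periods = (cur_t_ms - rec_t_ms) // validity_ms
          return max(0, value - periods * 20)

      def may_forward(entry: dict, cur_t_ms: int) -> bool:
          """entry: stored Load header data for the selected next-hop server."""
          throttle = faded(entry["throttle"], entry["received_ms"],
                           entry["validity"], cur_t_ms)
          if throttle == 0:
              return True
          # Draw 1..100; forward only if the number exceeds the throttle value.
          return random.randint(1, 100) > throttle

      now = int(time.time() * 1000)
      entry = {"load": 100, "throttle": 40, "received_ms": now - 700, "validity": 500}
      # One validity period (500 ms) has passed: throttle fades from 40 to 20.
      print(may_forward(entry, now))  # True in roughly 80% of calls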
5.7.  503 Responses

   A SIP server may determine that an upstream neighbor does not support this extension.  The SIP server SHOULD use the 503 response code to throttle traffic from upstream neighbors that do not support this extension.  This is important to ensure that SIP entities that do not support this extension don't receive preferred treatment over SIP entities that do.

   A SIP server SHOULD therefore send a 503 response to an upstream neighbor that does not support this extension as soon as it starts throttling load (i.e., generates Load headers with throttle parameters greater than zero).  A SIP server that has reached overload (i.e., a load close to 100) SHOULD use 503 responses in addition to the throttle parameter in the Load header.  If the proxy has reached a load of 100, it is very likely that upstream proxies have ignored the increasing load status reports and thus do not support this extension.  By sending a 503 response, an upstream proxy is enabled to use the traditional SIP overload control mechanisms.

6.  Syntax

   This section defines the syntax of a new SIP response header, the Load header.  The Load header field is used to advertise the current load status information of a SIP entity to its upstream neighbor.  The value of the Load header is an integer between 0 and 100, with the value of 0 indicating that the proxy is least overloaded and the value of 100 indicating that the proxy is most overloaded.

   The "target" parameter is mandatory and contains the URI of the next hop SIP entity for the response, i.e., the SIP entity to which the response is forwarded.  This is the entity that processes the Load header.

   The "throttle" parameter is optional and contains a number between 0 and 100.  It describes the percentage by which the load forwarded by the "target" SIP entity to the SIP server generating this header should be reduced.

   The "validity" parameter is optional and contains an indication of how long the reporting proxy is likely to remain in the given load status.

   The syntax of the Load header field is:

      Load         = "Load" HCOLON loadStatus
      loadStatus   = 0-100 SEMI serverID [ SEMI throttleRate ]
                     [ SEMI validMS ] [ SEMI generic-param ]
      serverID     = "target" EQUAL absoluteURI
      throttleRate = "throttle" EQUAL 0-100
      validMS      = "validity" EQUAL delta-ms
      delta-ms     = 1*DIGIT

   The BNF for absoluteURI and generic-param is defined in [2].  Table 1 is an extension of Tables 2 and 3 in [2].

      Header field    where   proxy   ACK  BYE  CAN  INV  OPT  REG
      ____________________________________________________________
      Load              r      ar      -    o    o    o    o    o

                        Table 1: Load Header Field

   Example:

      Load: 80;throttle=20;validity=500;target=p1.example.com

7.  Security Considerations

   Overload control mechanisms can be used by an attacker to conduct a denial-of-service attack on a SIP entity if the attacker can pretend that the SIP entity is overloaded.  When such a forged overload indication is received by an upstream SIP entity, it will stop sending traffic to the victim.  Thus, the victim is subject to a denial-of-service attack.

   An attacker can create forged load status reports by inserting itself into the communication between the victim and its upstream neighbors.  The attacker would need to add status reports indicating a high load to the responses passed from the victim to its upstream neighbor.  Proxies can prevent this attack by communicating via TLS.  Since load status reports have no meaning beyond the next hop, there is no need to secure the communication over multiple hops.

   Another way to conduct an attack is to send a message containing a high load status value through a proxy that does not support this extension.  Since this proxy does not remove the load status information, it will reach the next upstream proxy.
   If the attacker can make the recipient believe that the load status was created by its direct downstream neighbor (and not by the attacker further downstream), the recipient stops sending traffic to the victim.  A precondition for this attack is that the victim proxy does not support this extension, since it would otherwise not pass through load status information.  The attack also does not work if there is a stateful proxy between the attacker and the victim and only 100 (Trying) responses are used to convey the Load header.

8.  IANA Considerations

   [TBD.]

Appendix A.  Acknowledgements

   Many thanks to Jonathan Rosenberg for the discussions and suggestions.

9.  References

9.1.  Normative References

   [1]  Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

   [2]  Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, A., Peterson, J., Sparks, R., Handley, M., and E. Schooler, "SIP: Session Initiation Protocol", RFC 3261, June 2002.

   [3]  Rosenberg, J. and H. Schulzrinne, "Session Initiation Protocol (SIP): Locating SIP Servers", RFC 3263, June 2002.

   [4]  Schulzrinne, H. and J. Polk, "Communications Resource Priority for the Session Initiation Protocol (SIP)", RFC 4412, February 2006.

9.2.  Informative References

   [5]  Rosenberg, J., "Requirements for Management of Overload in the Session Initiation Protocol", draft-rosenberg-sipping-overload-reqs-01 (work in progress), June 2006.

Authors' Addresses

   Volker Hilt
   Bell Labs/Lucent Technologies
   101 Crawfords Corner Rd
   Holmdel, NJ  07733
   USA

   Email: volkerh@bell-labs.com

   Daryl Malas
   Level 3 Communications
   1025 Eldorado Blvd.
   Broomfield, CO
   USA

   Email: daryl.malas@level3.com

   Indra Widjaja
   Bell Labs/Lucent Technologies
   600-700 Mountain Avenue
   Murray Hill, NJ  07974
   USA

   Email: iwidjaja@lucent.com

   Rich Terpstra
   Level 3 Communications
   1025 Eldorado Blvd.
   Broomfield, CO
   USA

   Email: rich.terpstra@level3.com

Full Copyright Statement

   Copyright (C) The Internet Society (2006).

   This document is subject to the rights, licenses and restrictions contained in BCP 78, and except as set forth therein, the authors retain all their rights.

   This document and the information contained herein are provided on an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Intellectual Property

   The IETF takes no position regarding the validity or scope of any Intellectual Property Rights or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; nor does it represent that it has made any independent effort to identify any such rights.  Information on the procedures with respect to rights in RFC documents can be found in BCP 78 and BCP 79.
   Copies of IPR disclosures made to the IETF Secretariat and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementers or users of this specification can be obtained from the IETF on-line IPR repository at http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights that may cover technology that may be required to implement this standard.  Please address the information to the IETF at ietf-ipr@ietf.org.

Acknowledgment

   Funding for the RFC Editor function is provided by the IETF Administrative Support Activity (IASA).