Problem:

Issue #184: maximum message sizes for SIP over UDP, truncated responses.

Explanation:

While the request may fit the MTU, the response may be sufficiently
large to exceed it and thus get fragmented. This appears likely only
with 2xx responses, since error responses typically don't contain
message bodies.

Responses could start large at the UAS, become large as they traverse
the proxies back to the UAC (e.g., because Record-Route headers are
being added), or originate as a large error response at a proxy.  The
last case seems least likely, but is possible.  For example, P2
receives a request over UDP.  It's less than 1500 bytes.  P2 adds its
own Via and Record-Route (RR).  The RR is big.  Now the request is over
the limit, so P2 sends it via TCP.  It gets the response over TCP.  Even
after stripping the Via, it's still too big, but the request came in
over UDP.

Another case is a proxy adding headers to a response for services, such
as R-P-ID, or even bodies.

Fragmentation leads to loss of efficiency; more seriously, NATs and
firewalls may not be able to forward fragments, thus causing all
packets to be lost.

Proposals:

- Proposal 1: Ignore the issue

  1. If you send a request bigger than MTU, you MUST use TCP.

  2. TCP is recommended between elements that may have an intervening NAT,
     if they are aware of such, or for elements that exchange a
     significant amount of traffic.
  
  3. Intermediate elements SHOULD NOT insert headers or bodies into
     requests or, in particular, responses that are already above half
     the MTU size, as this may cause the message to exceed the MTU size
     and thus incur fragmentation.

  Problems: may not fly past the IESG.
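
  The size rules above can be sketched as two checks.  This is only an
  illustration: the function names are hypothetical, the 1500-byte MTU
  is the figure from the example earlier in this note, and the sketch
  compares the raw SIP message length against the MTU without
  subtracting IP and UDP header overhead.

```python
# Hypothetical sketch of the Proposal 1 size rules; names are not from any SIP stack.
ASSUMED_MTU = 1500  # bytes; assumed value, taken from the P2 example in this note


def transport_for_request(message: bytes, mtu: int = ASSUMED_MTU) -> str:
    """Rule 1: a request bigger than the MTU must use TCP."""
    return "TCP" if len(message) > mtu else "UDP"


def may_insert(message: bytes, mtu: int = ASSUMED_MTU) -> bool:
    """Rule 3: do not insert headers or bodies once the message is above half the MTU."""
    return len(message) <= mtu // 2
```

  A proxy following this sketch would call may_insert() before adding,
  say, a service header, and skip the insertion when it returns False.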

- Proposal 2: Status code 499

  If the UAS needs to send a large response, it sends a 499 instead,
  which propagates back to the UAC. This is treated like a branch
  failure.

  The first proxy that detects a response that has grown too large
  converts the response to 499.

  Problems: 

  (1) Backwards-compatibility. Old UACs have no clue what this means and
  will not do the right thing.

  (2) Branches get lost, i.e., a perfectly serviceable branch that would
  generate a 2xx may not get asked if there's another "better" branch,
  simply because the response is large. This isn't a bug as such, but
  would yield different behavior if somebody switches from UDP to TCP
  initially, which is at least hard to explain.

- Proposal 3: Use redirection -- 307 (Temporary Redirect).

  The relevant entity issues a 307 response, with a Contact indicating
  the new transport, as in 

    Contact: <sip:foobar@example.com;transport=tcp>

  Normal 3xx behavior applies.

  As long as the 307 response is issued by a UAS, no special handling is
  necessary. The UAS needs to recognize the incoming TCP connection as
  being part of the same call, but this should happen based on standard
  dialog identification.

  If a proxy increases the message size beyond the MTU, it needs to
  convert a 2xx response received via UDP to a 307 response.  The
  upstream proxy or UAC then reissues the request over TCP.  To ensure
  the same routing of the request, the proxy needs to insert some
  state-identifying information in the Contact header.

  I believe this is backwards-compatible and avoids the "branch loss"
  problem. Also, proxies can do recursive resolution of 3xx, so the
  request doesn't have to go back all the way to the UAC.

  The difficult case is where a proxy at 1.2.3.4 got a 2xx from a
  downstream element over TCP, but the request that triggered it came
  in on UDP.  It wants to send the response, but it's too big to go over
  UDP.  So, it remembers the response, indexing it with a key, say with
  value 3, and issues a 307:

  307 Contact: sip:response-index.3@1.2.3.4

  When it gets the recursed INVITE, it immediately returns the 2xx
  stored under that key.  If the recursed request never arrives, the
  stored 2xx is never delivered, which would be bad.
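
  The store-and-redirect behavior described above might look roughly
  like the following.  This is a minimal sketch, not a proposed
  implementation: the class and method names are hypothetical, the
  1500-byte MTU is assumed, and only the "response-index.<key>" URI
  form is taken from the example.

```python
import itertools

ASSUMED_MTU = 1500  # bytes; assumed value


class ResponseStash:
    """Hypothetical proxy-side store for 2xx responses too large for UDP."""

    PREFIX = "response-index."

    def __init__(self, host: str, mtu: int = ASSUMED_MTU):
        self.host = host
        self.mtu = mtu
        self._keys = itertools.count(3)  # start at 3 only to mirror the example
        self._stored = {}

    def on_response(self, response: bytes, request_transport: str):
        """Return ("forward", response) or ("redirect", contact_uri) for a 307."""
        if request_transport == "UDP" and len(response) > self.mtu:
            key = next(self._keys)
            self._stored[key] = response
            return "redirect", f"sip:{self.PREFIX}{key}@{self.host}"
        return "forward", response

    def on_recursed_invite(self, request_uri: str):
        """Return the stored 2xx for a recursed INVITE, or None if unknown."""
        if request_uri.startswith("sip:"):
            request_uri = request_uri[len("sip:"):]
        user = request_uri.split("@", 1)[0]
        if not user.startswith(self.PREFIX):
            return None
        try:
            key = int(user[len(self.PREFIX):])
        except ValueError:
            return None
        # pop() so a stored response is handed out at most once
        return self._stored.pop(key, None)
```

  The sketch never expires entries, so a recursion that never arrives
  leaves the stored response in memory indefinitely; a real proxy would
  need some timeout for that case.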

- Proposal 3a: Variant of redirection

  Route 307 all the way back to the origin.  Restart the request with
  TCP from the UAC.

  - How does this fit with normal 3xx processing?

  - Requires UAC to speak TCP - but that's unavoidable in this case.

- Proposal 4: Always send large responses via TCP

  Even if a request comes in via UDP, send the response via TCP.

  Difficulties:
   - associating requests and responses. 
   - if source proxy is behind NAT or firewall, this may not work (but
     then, that proxy won't be able to receive incoming calls, either)
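
  A rough sketch of Proposal 4 and of the request/response association
  it needs.  All names are hypothetical, the 1500-byte MTU is assumed,
  and keying pending requests on Call-ID and CSeq is an assumption made
  for illustration; the point is only that the key must not depend on
  the transport the request was sent over.

```python
ASSUMED_MTU = 1500  # bytes; assumed value


def transport_for_response(response: bytes, request_transport: str,
                           mtu: int = ASSUMED_MTU) -> str:
    """Large responses go over TCP even if the request arrived via UDP."""
    return "TCP" if len(response) > mtu else request_transport


class PendingRequests:
    """Hypothetical table of outstanding requests, keyed independently of transport."""

    def __init__(self):
        self._pending = {}

    def sent(self, call_id: str, cseq: str, transport: str) -> None:
        """Record a request that was sent and is awaiting a response."""
        self._pending[(call_id, cseq)] = transport

    def match(self, call_id: str, cseq: str):
        """Return the transport the matching request used, or None if no match."""
        return self._pending.pop((call_id, cseq), None)
```

  With such a table, a response arriving on a TCP connection can still
  be matched to a request that was originally sent over UDP.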

Proposed Resolution: (3) or (4)

  If possible, I'd like to avoid the case of proxies turning 2xx into
  3xx.

See 
http://www.caida.org/outreach/papers/2001/Frag/
for recent measurements related to fragmentation.