Network Working Group                                          E. Cooper
Internet-Draft                                               P. Matthews
Expires: August 28, 2006                                           Avaya
                                                       February 24, 2006


           The Effect of NATs on P2P SIP Overlay Architecture
               draft-matthews-p2psip-nats-and-overlays-00

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on August 28, 2006.

Copyright Notice

   Copyright (C) The Internet Society (2006).

Abstract

   This document discusses the constraints that NATs put on the possible
   overlay architectures of a P2P SIP system.  Given what seems to be a
   reasonable set of assumptions on where nodes are deployed and the
   kinds of NATs they are located behind, the document concludes that a
   structured partial-mesh overlay network exhibiting a property known
   as "symmetric interest" is the most reasonable overlay architecture.


Cooper & Matthews        Expires August 28, 2006                [Page 1]

Internet-Draft              NATs and Overlays              February 2006


1.  Introduction

   In general terms, P2P overlays attempt to eliminate a central
   bottleneck in a system by taking the data traditionally stored on a
   server (or set of servers) and dispersing it amongst a number of
   peers.  Also in general terms, NAT boxes multiplex many "private" IP
   addresses onto a single "public" address.  As a result of this
   multiplexing function, a NAT which receives an unsolicited message on
   its "public" address cannot determine which "private" address should
   receive it.  Such messages are generally discarded.  In client-server
   network topologies this is not a problem, since servers are usually
   given "public" addresses and clients never receive unsolicited
   messages.  In P2P networks however, peers that cannot receive
   unsolicited messages cannot participate in the overlay.  It follows
   then, that the presence of NATs in the network topology has a major
   influence on the overlay architecture.

   Comments on this draft are solicited and should be addressed either
   to the authors or to the P2P-SIP mailing list at
   p2p-sip@cs.columbia.edu (see
   https://lists.cs.columbia.edu/cucslists/listinfo/p2p-sip).


2.  Scenario

   Figure 1 shows a set of peers that want to create a P2P SIP overlay
   network.  Though this set is rather small, it still illustrates some
   key points.


            ,-------.
          ,' P    P  `.                             ,-----.
         (          P  )                           (   P   )
          `. P   P   ,'                             `-----'
            `-------'                            NAT
                     NAT
                               _.------------.
                          ,--''               `---.
                       ,-'                         `-.
                      /                               \
                     /                                 \
                    (            Internet               )
                     \                                 /
                      \                               /
                       `-.                         ,-'
                          `---.               _.--'
             N A T             `------------''
         ,-.     ,-.                                ,-----.
        / P \   ( P )                             ,'       `.
       ( P   )   `-'                             (   P   P   )
        \  P/                                     `.       ,'
         `-'                                        `-----'

       Legend
       P   - Peer node
       NAT - NAT box


   Figure 1: Example Scenario

   In this figure we see six clouds.  Five represent subnets containing
   peers and one represents the Internet.  Some of the subnets contain
   just a single peer while others contain multiple peers.  One of the
   subnets uses public IP addresses, while the other subnets have NATs
   between them and the Internet and thus use private addresses.  Two of
   the subnets are sitting behind the same NAT.  Not illustrated in this
   figure are more complex NAT scenarios -- for example, a cascading NAT
   scenario where there are two NATs between a subnet and the Internet.

   This document talks about overlay architectures for hooking these
   peers together into a P2P SIP system.


3.  Assumptions

   This section presents and discusses our assumptions about the P2P SIP
   system, about the NATs the system must traverse, and about the
   interaction between the system and these NATs.

   The first assumption deals with the size range of P2P SIP systems.
   We assume that there can be many different P2P SIP systems, ranging
   from very small systems to very large systems, and nodes can be
   scattered anywhere around the world.  This assumption is not directly
   related to NATs, but influences the other assumptions.

      Assumption 1: There may be many different P2P SIP systems, with
      sizes ranging from two nodes to millions of nodes, with the nodes
      scattered across one to millions of subnets.

   The next assumption deals with the question of whether a P2P SIP
   system will always have a certain proportion of nodes with public IP
   addresses.  This question is important because nodes with public IP
   addresses make things easier, and if there is a large proportion of
   them, then nodes behind NATs can be treated as leaf nodes (that hang
   off of nodes with public IP addresses).  Most P2P systems (e.g., file
   sharing systems) assume a certain proportion of nodes with public IP
   addresses.

   However, this assumption seems less tenable for P2P SIP systems,
   especially where the system is used in an enterprise
   and/or is primarily composed of hard phones (rather than general-
   purpose computers).  Thus we make the following assumption:

      Assumption 2: There can be P2P SIP systems where every peer has a
      NAT box between it and the open Internet.

   In corporate environments, we expect this situation to be common.

   The next set of assumptions deals with the behavior of the various
   NATs.  At this point, readers should be familiar with references [1],
   [2], [3].  In this document, we use the terminology of the IETF
   BEHAVE working group.

   The two key behaviors of a NAT are its mapping behavior and its
   filtering behavior.

   Consider the various possible mapping behaviors first (cf. section
   4.1 of [1]).  If a NAT has a behavior other than "Endpoint
   Independent mapping", then peers behind the NAT cannot use "UDP hole-
   punching" (see [3]).  The only way to support these peers is by
   treating them as leaf nodes hanging off a "relay peer".  This relay
   peer must have either a public IP address or be located behind a NAT
   with a filtering behavior of "Endpoint Independent filtering".  Since
   (a) acting as a relay is very bandwidth- and processor-intensive
   (which some peers may not be able to handle) and since (b) a given
   P2P network may not have a node that has the required address
   properties to act as a relay, many P2P SIP networks may not be able
   to support peers behind NATs which do not provide "Endpoint
   Independent mapping".

   For these reasons, we limit our architectural investigation to NATs
   with "Endpoint Independent mapping".  (A later version of this
   document may describe the necessary extensions to support NATs that
   do not satisfy this assumption).

      Assumption 3: All NATs must have a mapping behavior of "Endpoint
      Independent mapping".

   Note that various investigations (see, for example, sections 6.2 and
   6.4 of [3]) have suggested that about 85% of all NATs have a mapping
   behavior of "Endpoint Independent mapping".
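
   The reason "Endpoint Independent mapping" matters for hole-punching
   can be illustrated with a small sketch.  The ToyNat class below is
   purely hypothetical (its name, the starting port 40000, and the
   example addresses are assumptions made for illustration, not taken
   from any specification).  It only models how a public port is chosen
   for an outgoing packet.  With an endpoint-independent mapping, the
   public port learned by one remote party is the same port that any
   other peer must target; with an endpoint-dependent mapping it is
   not, so the learned port is of no use to a third party.

```python
from itertools import count


class ToyNat:
    """Toy NAT that allocates public ports under a chosen mapping behavior."""

    def __init__(self, endpoint_independent):
        self.endpoint_independent = endpoint_independent
        self._next_port = count(40000)  # arbitrary starting port (assumption)
        self._mappings = {}

    def map_outbound(self, internal, destination):
        """Return the public port used for a packet from internal to destination."""
        # Endpoint-independent mapping keys only on the internal address and
        # port, so every destination observes the same public port.
        if self.endpoint_independent:
            key = internal
        else:
            key = (internal, destination)
        if key not in self._mappings:
            self._mappings[key] = next(self._next_port)
        return self._mappings[key]


peer = ("10.0.0.5", 5060)            # private address of a peer (example value)
helper = ("192.0.2.1", 3478)         # some publicly reachable node (example value)
other_peer = ("198.51.100.7", 6000)  # mapped address of another peer (example value)

eim_nat = ToyNat(endpoint_independent=True)
eim_ports = (eim_nat.map_outbound(peer, helper),
             eim_nat.map_outbound(peer, other_peer))

dependent_nat = ToyNat(endpoint_independent=False)
dependent_ports = (dependent_nat.map_outbound(peer, helper),
                   dependent_nat.map_outbound(peer, other_peer))

print("endpoint-independent:", eim_ports)
print("endpoint-dependent:  ", dependent_ports)
```

   In the first case both destinations see the same public port, so a
   port learned via the helper can be handed to the other peer.  In the
   second case the two ports differ, which is why such NATs are
   excluded by Assumption 3.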

   Now consider the various possible filtering behaviors (cf. section
   4.1 of [1]).  It is easier to create a P2P network with nodes behind
   NATs that have a filtering behavior of "Endpoint Independent
   filtering" than with nodes behind NATs with other filtering
   behaviors.  However, other filtering behaviors are seen as more
   secure, and especially in corporate NATs, these other filtering
   behaviors are more common.

   So can we assume that the NATs in the P2P system have a variety of
   filtering behaviors, and that at least a significant percentage of
   them have the more P2P-friendly "Endpoint Independent filtering"
   behavior?  Unfortunately, this seems overly optimistic.  This may be
   true in larger systems with a significant number of residential-based
   peers, but in smaller deployments and/or deployments with a large
   number of enterprise-based peers, this seems unlikely.  Especially in
   P2P SIP systems deployed in enterprise environments, it seems
   likely that many systems will reside exclusively behind NATs with a
   filtering behavior of "Address Dependent filtering" (or worse).

   So it seems best to be very conservative in this regard, and assume
   the worst possible filtering behavior.

      Assumption 4: The P2P SIP system must function when all peers are
      located behind NATs with a filtering behavior of "Address and Port
      Dependent filtering".

   An architecture that works in this situation will also work where
   some NATs have a less-restrictive filtering behavior.
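
   The three filtering behaviors can be contrasted with a minimal
   sketch.  The function name inbound_allowed, the string labels, and
   the example addresses are assumptions introduced here for
   illustration only.  The sketch treats a mapping as existing once the
   internal peer has sent at least one packet, and then decides whether
   an inbound packet from a given source would be forwarded.

```python
def inbound_allowed(filtering, contacted, source):
    """Decide whether a NAT would forward an inbound packet from `source`.

    contacted: set of (ip, port) endpoints the internal peer has already
               sent packets to through this mapping.
    source:    (ip, port) of the inbound packet.
    """
    if not contacted:
        return False  # no mapping exists yet, so the packet is unsolicited
    src_ip, _ = source
    if filtering == "endpoint-independent":
        return True  # any source may use an existing mapping
    if filtering == "address-dependent":
        # the source IP must have been contacted before; its port is ignored
        return any(ip == src_ip for ip, _ in contacted)
    if filtering == "address-and-port-dependent":
        # the exact source IP and port must have been contacted before
        return source in contacted
    raise ValueError("unknown filtering behavior: " + filtering)


contacted = {("192.0.2.1", 3478)}
same_host_other_port = ("192.0.2.1", 9999)
stranger = ("198.51.100.7", 6000)

for behavior in ("endpoint-independent",
                 "address-dependent",
                 "address-and-port-dependent"):
    print(behavior,
          inbound_allowed(behavior, contacted, same_host_other_port),
          inbound_allowed(behavior, contacted, stranger))
```

   Only the first behavior admits a packet from a peer that has never
   been contacted, which is why Assumption 4 forces every new
   connection to be arranged through peers that are already connected.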

   The BEHAVE group has specified a number of other NAT UDP requirements
   [1].  The appendix discusses our assumptions relative to this
   document in detail.  For now, there is no similar table for TCP since
   the work on TCP in the BEHAVE working group has just started.
   However, many of the requirements for UDP apply to TCP as well.


   In addition to the BEHAVE approach, there are some other approaches
   to NAT traversal that warrant discussion: UPnP, ALGs, SBCs, and
   manual configuration.

   Universal Plug and Play (UPnP) is an approach developed by Microsoft.
   In this approach, the P2P application talks directly with the NAT and
   asks the NAT to open up pinholes for it.  Many consumer-grade NATs
   support the UPnP protocol, and this approach is a viable option for
   P2P applications targeted only at the consumer market.  However, most
   corporate-grade NATs do not support UPnP.  In addition, ISPs that NAT
   their entire network (a practice that is becoming more common in
   certain environments) typically do not allow their customers to
   configure that NAT using UPnP.

   Many NATs contain one or more Application Level Gateways (ALGs).  An
   ALG is special code within the NAT that recognizes packets of a
   particular application-level protocol and treats the packets
   specially.  ALG support for the File Transfer Protocol (FTP) is
   almost universal in NATs, and ALG support for SIP is becoming
   more common.  However, ALG support requires that the application
   protocol not be encrypted, and encryption of both SIP and P2P
   messages is likely to be desirable for security reasons.  Also, ALG
   support for whatever P2P protocol we pick is very unlikely, at least
   in the short term.

      Assumption 5: The traversal of a given NAT must not depend on that
      NAT supporting either UPnP or any ALG (except for FTP).

   Session Border Controllers (SBCs) are boxes that are deployed in the
   network, sometimes by the customer but more commonly by the SIP
   service provider, to enable NAT traversal for standard client-server
   SIP.  SBCs are becoming more common, but are typically restricted to
   working only with the SIP proxy servers of the SIP service provider
   that deploys the SBC.  Furthermore, they are unlikely to support
   whatever P2P protocol we pick.  Thus they are not a NAT traversal
   option for P2P SIP networks.

      Assumption 6: The P2P NAT traversal strategy must not depend on
      the presence of SBCs in the network.


   NAT traversal is often much easier if the user can manually configure
   the NAT.  The user can open up pinholes in the NAT and/or modify the
   NAT's behavior.  However, this requires that the user have the
   knowledge and interest to do the configuration (non-technical users
   often do not), have a NAT which is configurable (some low-end NATs
   are not configurable), and have permission to configure the NAT
   (problematic in corporate environments or when the ISP NATs the
   entire access network).

   Furthermore, history has shown that systems which are "plug-and-play"
   tend to get much better acceptance by users.  We would like users to
   be able to deploy P2P SIP peers without even knowing what a NAT is.
   Though we may not be "plug-and-play" in all cases, our NAT traversal
   strategy will be a failure if this is not true in the vast majority
   of cases.

      Assumption 7: The NAT traversal strategy must be "plug-and-play"
      in the vast majority of cases.


   Finally, there is the question of how many mapping and filtering
   entries ("pinholes") a NAT can support.  Low-end NAT boxes found in
   homes and small enterprises may support only a very small number of
   mapping and filtering entries.  NAT boxes deployed in larger
   enterprise environments usually support more entries since there are
   more devices (computers, IP phones, etc.) behind them.  However, a
   general rule seems to be that NAT vendors expect a given node to use
   only a few entries at a time.  The exact number is not known to
   the authors at this time, but it is clearly small.  Thus a NAT
   traversal strategy that has one or more peers opening up a large
   number of pinholes to communicate with other peers is not acceptable,
   partly because it uses up what may be a very limited resource, and
   partly because of the refresh traffic required (especially if UDP is
   used).

      Assumption 8: The NAT traversal strategy must limit the number of
      mapping and filtering entries opened up on a given NAT box to a
      fairly small number (exact value is TBD).
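
   The refresh-traffic concern can be made concrete with simple
   arithmetic.  The sketch below assumes one keepalive packet per
   pinhole per refresh interval.  The 30-second interval is a
   hypothetical value chosen for illustration, since actual NAT
   timeouts vary and are not specified in this document.

```python
def keepalives_per_minute(pinholes, refresh_interval_s):
    """Keepalive packets per minute needed to hold `pinholes` entries open.

    Assumes one keepalive per pinhole per refresh interval (an assumption).
    """
    return pinholes * 60.0 / refresh_interval_s


REFRESH_INTERVAL_S = 30  # hypothetical refresh interval, not from the draft

for pinholes in (10, 100, 1000):
    rate = keepalives_per_minute(pinholes, REFRESH_INTERVAL_S)
    print(pinholes, "pinholes ->", rate, "keepalives per minute")
```

   Under these assumptions the refresh traffic grows linearly with the
   number of pinholes, so a strategy needing hundreds of pinholes per
   peer pays a correspondingly higher background cost.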


   Here is a summary of the assumptions listed above:

      Assumption 1: There may be many different P2P SIP systems, with
      sizes ranging from two nodes to millions of nodes, with the nodes
      scattered across one to millions of subnets.

      Assumption 2: There can be P2P SIP systems where every peer has a
      NAT box between it and the open Internet.


      Assumption 3: All NATs must have a mapping behavior of "Endpoint
      Independent mapping".

      Assumption 4: The P2P SIP system must function when all peers are
      located behind NATs with a filtering behavior of "Address and Port
      Dependent filtering".

      Assumption 5: The traversal of a given NAT must not depend on that
      NAT supporting either UPnP or any ALG (except for FTP).

      Assumption 6: The P2P NAT traversal strategy must not depend on
      the presence of SBCs in the network.

      Assumption 7: The NAT traversal strategy must be "plug-and-play"
      in the vast majority of cases.

      Assumption 8: The NAT traversal strategy must limit the number of
      mapping and filtering entries opened up on a given NAT box to a
      fairly small number (i.e., 10s of pinholes, not 100s of pinholes).


4.  Architectural Options

   This section discusses various architectural options in light of the
   above assumptions.  The goal of this section is to explore the design
   space fairly thoroughly and to discuss the pros and cons of the
   various approaches.

   First of all, it is important to note the distinction between NAT
   traversal for signaling messages and NAT traversal for media
   messages.  The latter problem (media) is solved in a peer-to-peer
   fashion using the ICE mechanism [5].  If two peers can exchange
   signaling messages in some way (perhaps indirectly through other
   peers), then ICE can be used to set up a direct peer-to-peer
   connection through intervening NATs for the exchange of media
   messages.  Furthermore, the ICE mechanism is consistent with the
   assumptions listed above.  Thus the problem we need to solve can be
   reduced to finding a way for peers to exchange signaling messages.

4.1.  Types of Networks

   So let's consider an overlay network of peers where all peers are
   behind NATs with the most restrictive filtering policy, and consider
   ways for the peers to exchange signaling messages.  Several different
   approaches can be used to accomplish this:

      Relay -- All peers exchange SIP messages via a centralized "Relay
      Server" (with a public IP address).  This scheme minimizes the
      load on the peers and their associated NATs but requires a central
      server.  SIP messages flow relatively quickly between the peers,
      provided the central server is always available and not
      constrained by processing power or network bandwidth.

      Rendezvous -- Peers use a "Rendezvous Server" (with a public IP
      address) as an intermediary to initiate "NAT hole-punching" ([3])
      every time they wish to begin communicating.  Once NAT pinholes
      have been established, SIP messages are then exchanged directly.
      This scheme is still highly dependent on a central server, but
      reduces the load on it somewhat.  Initial SIP messages are
      slightly delayed by the retrieval of SIP addresses from the
      "Rendezvous Server" and by the "NAT hole-punching" technique.  The
      "Rendezvous Server" must maintain knowledge of and links to every
      active peer.

      Mesh -- Once connected into the peer network, nodes exchange
      messages with selected other peers periodically to keep NAT
      pinholes open.  SIP messages are either sent directly to the
      destination peer, or are sent indirectly via intermediate peers.
      No central server is required.  The load on the peers and their
      local NATs is proportional to the number of NAT pinholes that must
      be maintained and the number of messages that must be sent within
      the mesh.  (Methods for a peer to create or join such a peer-to-
      peer network are discussed in Section 5.2.)

   Graphically, the communication flows in these networks would appear
   as shown in Figure 2.  In the diagram, only signaling connections are
   shown; media (RTP) connections are not shown.


              P                  P                    P
          P   |   P          P   |.  P            P---|---P
           \  |  /           .\  | ./            /    | /  \
            \ | /            . \ | / .          |     /     |
        P-----S-----P      P-.---S-----P        P-----------P
            / | \            . / | \              \ / |    /
           /  |  \           ./  |  \              /\ |   |
          P   |   P          P   |   P            P  \|   P
              P                  P                    P

            Relay           Rendezvous              Mesh


         Legend:
         P     - Peers
         S     - Central Server
         / \ | - Permanent connections
         .     - Temporary connections


   Figure 2: Overlay Network Connectivity

   The networks in the figure above can be considered as discrete points
   in a spectrum that ranges from "fully centralized" on the left to
   "fully distributed" on the right.  In general, the effort required to
   establish and maintain NAT pinholes increases as we move to the
   right, as does the amount of effort required to deliver a SIP message
   between two arbitrary nodes.  However, the reliance on centralized
   equipment decreases and the overall scalability increases as we move
   to the right, and the network becomes more peer-to-peer.  Further
   discussion of each topology is given below.

   The Relay Network appears similar to a Client-Server configuration.
   It operates in a straightforward manner.  A peer that wishes to call
   another creates a request and delivers it to the "Relay Server".  The
   server forwards the request on to the target.  The performance and
   scalability characteristics of this network are quite suitable for
   small- and medium-scale deployments.  As the system grows into large
   scale deployments however, keeping the NAT pinholes open between the
   clients and the server places a heavy load on the server's resources.
   This load increases (at least) linearly with the size of the network.
   Even on a smaller scale, the "Relay Server" requires a sizable
   expenditure of resources (both initial and operational).  For very
   small systems, this cost may be impractical.  From a network
   availability standpoint, the "Relay Server" is also a liability.  It
   represents a single point of failure upon which all nodes are totally
   dependent.  Finally, the centralization of the administration of the
   network may be undesirable or impractical in some deployments.

   The Rendezvous Network reduces the load on a central server by
   eliminating it from the messaging path once communication between
   the two endpoints has been established.  One way this could work
   would be to have the originating node send the "Rendezvous Server" an
   'INITIATE_NAT_HOLE' request that specifies the target peer (perhaps
   via node-id, or SIP URI), as well as its own IP address(es).  In
   processing this request, the "Rendezvous Server" replies with the
   mapped IP address and port of the target peer and forwards the
   request to the target peer, perhaps also appending the mapped IP
   address and port of the originating peer.  Upon reception of the
   'INITIATE_NAT_HOLE' request, the target peer begins NAT hole-punching
   procedures to establish a link to the originator.  This effort may
   include an ICE-like trial of various IP addresses, to avoid the
   problems associated with double-NAT topologies.  Once the NAT
   pinholes are established, the two peers can begin regular SIP
   communications.
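
   The server-side handling described above can be sketched as follows.
   This is only one possible reading of the exchange.  The
   RendezvousServer class, the dictionary message layout, the reply
   type name 'INITIATE_NAT_HOLE_REPLY', and the outbox list that stands
   in for real network sends are all hypothetical.  Only the
   'INITIATE_NAT_HOLE' request name comes from the text.

```python
from dataclasses import dataclass, field


@dataclass
class RendezvousServer:
    # peer id -> (mapped ip, mapped port) as observed by the server
    mapped: dict = field(default_factory=dict)
    # (destination peer id, message) pairs the server would send
    outbox: list = field(default_factory=list)

    def register(self, peer_id, mapped_addr):
        """Record the mapped address seen when a peer contacts the server."""
        self.mapped[peer_id] = mapped_addr

    def initiate_nat_hole(self, origin_id, target_id, origin_local_addrs):
        """Handle an INITIATE_NAT_HOLE request from origin_id for target_id."""
        if origin_id not in self.mapped or target_id not in self.mapped:
            raise KeyError("both peers must be registered with the server")
        # Forward the request to the target, appending the originator's
        # mapped address so the target can start hole-punching towards it.
        forwarded = {
            "type": "INITIATE_NAT_HOLE",
            "from": origin_id,
            "local_addrs": list(origin_local_addrs),
            "mapped_addr": self.mapped[origin_id],
        }
        self.outbox.append((target_id, forwarded))
        # Reply to the originator with the target's mapped address.
        return {
            "type": "INITIATE_NAT_HOLE_REPLY",  # hypothetical name
            "target": target_id,
            "mapped_addr": self.mapped[target_id],
        }


server = RendezvousServer()
server.register("alice", ("203.0.113.10", 40001))
server.register("bob", ("198.51.100.20", 50002))
reply = server.initiate_nat_hole("alice", "bob", [("10.0.0.5", 5060)])
print(reply)
print(server.outbox)
```

   After this exchange each peer knows the other's mapped address, and
   both can send packets towards it to open the pinholes.  The server
   must hold a current mapped address for every active peer, which is
   the state burden noted in the description of the Rendezvous scheme.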

   Overall load on the "Rendezvous Server" is somewhat reduced, since it
   is only party to a portion of the session signaling.  These savings
   may not be substantial, though, since the reduction in SIP message
   traffic will require an increase in traffic to keep NAT pinholes
   alive.  The availability and administration characteristics are the
   same as with the Relay Network.

   The Mesh Network eliminates the use of a centralized server (except
   perhaps for bootstrapping, see Section 5.2).  A node in this
   type of overlay establishes connections to some of the other peers.
   SIP messages are then routed via these connections.

4.2.  More on Mesh Networks

   Of the topologies described above, the Mesh Network is the most peer-
   to-peer, the most scalable, and the most plug-and-play.  Thus it
   seems to line up the best with our assumptions.  However, even
   within the general Mesh paradigm, several variations are still
   possible.
   The actual number of NAT pinhole connections is a key consideration.
   Consider the variations shown in Figure 3:


                P                   P                    P
             /     \             /  |  \              / /|\ \
           P         P         P----|----P          P----|----P
          /           \       /|    |    |\        /|/ \ | / \|\
         P             P     P-------------P      P-------------P
          \           /       \|    |    |/        \|/ / | \ \|/
           P         P         P----|----P          P----|----P
             \     /             \  |  /              \ \|/ /
                P                   P                    P

               Ring            Partial Mesh          Full Mesh


   Figure 3: Mesh Network Connectivity

   A Mesh Network in which every node is connected only to two
   neighbours can be termed a "Ring Network".  This topology expends
   very little effort to maintain NAT pinholes but results in extremely
   high hop counts as the number of nodes increases.  As a result, the
   overall scalability of this topology is very poor.

   On the other hand, in small peer-to-peer overlay networks it is
   possible to maintain NAT pinhole connections between all pairs of
   peers (a "Full Mesh Network").  However, as the number of peers and
   distinct NATs increases, the number of pinholes (and traffic required
   to maintain them) quickly becomes impractical.  In this topology,
   overall scalability is also poor.

   In between these two extremes, the "Partial Mesh Network" seeks to
   strike a balance between the minimum and maximum sustainable numbers
   of NAT pinholes.  This seems to be the only viable approach.  The
   "ideal" number of pinholes is the one that results in the lowest hop
   counts whilst also keeping pinhole maintenance traffic manageable.
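
   The trade-off between pinhole count and hop count can be explored
   with a small experiment.  The sketch below builds a ring, a full
   mesh, and one arbitrary partial mesh (ring links plus links at
   power-of-two offsets, chosen here purely for illustration and not
   proposed by this document) and reports the largest number of
   connections per peer and the largest hop count from peer 0.  The
   network size of 64 is likewise an arbitrary example value.

```python
from collections import deque


def ring(n):
    """Each peer connects to its two neighbours."""
    return {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}


def full_mesh(n):
    """Each peer connects to every other peer."""
    return {i: set(range(n)) - {i} for i in range(n)}


def partial_mesh(n):
    """Ring plus links at power-of-two offsets (an illustrative choice)."""
    graph = {i: set() for i in range(n)}
    offset = 1
    while offset < n:
        for i in range(n):
            j = (i + offset) % n
            if j != i:
                graph[i].add(j)
                graph[j].add(i)
        offset *= 2
    return graph


def max_hops(graph, start=0):
    """Largest breadth-first distance from `start` to any other peer."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return max(dist.values())


def connections_per_peer(graph):
    """Largest number of connections any single peer must keep open."""
    return max(len(neighbours) for neighbours in graph.values())


N = 64  # example network size (assumption)
for name, build in (("ring", ring),
                    ("partial mesh", partial_mesh),
                    ("full mesh", full_mesh)):
    g = build(N)
    print(name, connections_per_peer(g), max_hops(g))
```

   In this experiment the ring keeps only two connections per peer but
   needs many hops, the full mesh needs one hop but N-1 connections per
   peer, and the partial mesh sits between the two on both measures.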

4.3.  Static vs. Dynamic Connections

   Given the selection of a partial-mesh network, the next question is
   whether the connection topology should be relatively static, or
   should evolve dynamically as calls are made.  Note that we are
   talking about signaling connections here -- as with classical client-
   server SIP, the volume of media messages means that it always makes
   sense to set up a dedicated connection between the call endpoints for
   the media whenever that is possible.

   Say peer P wants to set up a connection to peer Q. In keeping with
   assumption 4, we assume peer Q is behind a NAT with a restrictive
   filtering behavior.  Thus P cannot send a connection request directly
   to Q, but must send it via existing connections in the overlay.  Only
   once the connection request is delivered to Q can P and Q use UDP (or
   TCP) hole-punching to initiate a connection, and then do any
   connection handshaking required (e.g., for TCP).

   So setting up a connection requires a number of messages to be
   exchanged between P and Q. If P and Q just need to exchange a very
   small number of messages, then it is probably more efficient for P
   and Q to use the existing mesh of connections rather than
   establishing a new connection.  Though it is not the goal of this
   document to discuss lookup and signaling mechanisms for P2P SIP, it
   seems likely that most transactions between two peers will be short
   and consist of only a small number of messages.  Thus a static
   connection pattern (perhaps with some additional connections
   established dynamically) is likely to be appropriate.
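
   The efficiency argument can be illustrated with a rough cost model
   that counts packet transmissions.  Every parameter here is an
   assumption made for illustration: the overlay path length in hops,
   the number of setup messages that must travel through the overlay
   (two by default, a request and a response), and the number of
   hole-punching and handshake packets sent directly (four by default).
   None of these values is taken from a specification.

```python
def overlay_cost(messages, hops):
    """Transmissions needed to send `messages` over a path of `hops` links."""
    return messages * hops


def direct_cost(messages, hops, setup_msgs=2, punch_packets=4):
    """Transmissions needed to set up a direct connection and then use it.

    setup_msgs travel through the overlay (hops transmissions each);
    punch_packets and the messages themselves travel directly.
    """
    return setup_msgs * hops + punch_packets + messages


HOPS = 3  # hypothetical overlay path length between P and Q

for messages in (2, 5, 20):
    print(messages,
          overlay_cost(messages, HOPS),
          direct_cost(messages, HOPS))
```

   Under these assumptions a short transaction of two messages costs
   fewer transmissions over the existing mesh, while a long exchange of
   twenty messages would justify a dedicated connection.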

4.4.  Message Routing and Structured vs. Unstructured Meshes

   Assuming a fairly static pattern of connections, the next logical
   question is: What should the pattern of connections be?  There are
   many different patterns or schemes that can be used -- how can we
   classify and evaluate these choices?

   We believe that an important property of an overlay is the ability to
   route messages from one peer to an arbitrary second peer in the
   overlay.  We believe that this property is essential at times to
   allow a peer to place a call to another node, to publish the status
   of a peer or user (for example, to a peer acting as a distributed
   registrar), or when a peer wants to create a connection to another
   peer in the overlay (when creating the partial mesh).

   With this in mind, we can classify connection patterns (or schemes)
   into two main groups:

      Structured -- In a structured scheme, the connection pattern between
      peers is exploited when routing messages between peers.

      Unstructured -- In an unstructured scheme, the connection pattern
      is more or less random, and properties of the connection scheme
      are NOT exploited when routing messages.

   In the next few subsections, we consider the various properties of
   structured and unstructured partial meshes.

4.4.1.  Unstructured Schemes

   Some examples of unstructured schemes are:


   o  Purely Random -- a peer randomly selects a number of other peers
      to connect to.

   o  Longest Lived -- a peer prefers connections to peers who have been
      part of the overlay for a longer time.

   o  Nearby Neighbors -- a peer prefers connections to peers who are
      closer (e.g., smaller round-trip times).

   There are a number of ways messages might be routed in an
   unstructured scheme.  The simplest way is to flood the message
   through the overlay.  Though not particularly efficient, this way may
   be practical in smaller overlays or when the volume of messages is
   low.  Another way is to use a graph searching algorithm to locate the
   message target, for example depth-first search or breadth-first
   search.  A graph search algorithm will generally take longer than
   flooding to get the message to the peer, but may use fewer messages.
   Remembering a route, once found, and then using source routing for
   subsequent messages can be used with either of these two methods to
   improve performance, but suffers from the problem that topology
   changes (caused, for example, by a peer leaving the overlay) can
   invalidate the route unexpectedly.
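
   Flooding can be sketched in a few lines.  In the sketch below, each
   peer forwards the first copy of a message to all of its neighbours
   and silently drops later duplicates.  The five-peer graph and the
   function name flood are hypothetical.  The function returns the
   number of hops at which the target first receives the message and
   the total number of transmissions made in the overlay.

```python
from collections import deque


def flood(graph, source, target):
    """Flood a message from source; return (hops to target, transmissions)."""
    seen = {source}
    frontier = deque([(source, 0)])
    transmissions = 0
    hops_to_target = None
    while frontier:
        node, hops = frontier.popleft()
        for neighbour in graph[node]:
            transmissions += 1  # every forwarded copy costs one transmission
            if neighbour in seen:
                continue  # duplicate copy is dropped by the receiver
            seen.add(neighbour)
            if neighbour == target:
                hops_to_target = hops + 1
            frontier.append((neighbour, hops + 1))
    return hops_to_target, transmissions


# Small example overlay (hypothetical): six connections among five peers.
overlay = {
    "A": ["B", "C"],
    "B": ["A", "C", "D"],
    "C": ["A", "B", "E"],
    "D": ["B", "E"],
    "E": ["C", "D"],
}

print(flood(overlay, "A", "E"))
```

   The target is reached over a shortest path, but every peer forwards
   the message on every one of its connections, so the total number of
   transmissions is twice the number of connections in the overlay.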

   Another approach is to run a routing protocol, which is the approach
   used in the Internet.  In this case, each peer acts as both a host
   and a router.  Let's consider the impact of choosing one of the
   standard IETF routing protocols.

   o  RIP -- RIP is an example of a Distance Vector (DV) protocol.  DV
      protocols require only small amounts of CPU and memory, and work
      well in networks with only a small number of loops, but tend to
      perform poorly in networks with lots of loops.  Since the number
      of loops in a partially meshed network increases rapidly as the
      number of connections per peer increases, DV protocols are likely
      to be a poor choice.

   o  BGP -- BGP is an example of a Path Vector protocol.  Path Vector
      protocols perform better (than DV protocols) in networks with lots
      of loops, but require significantly more storage and bandwidth,
      and can (at least in the case of BGP) converge slowly.

   o  OSPF, IS-IS -- OSPF and IS-IS are Link State protocols.  Link
      State protocols perform very well in meshed networks, but are not
      considered suitable for networks larger than hundreds of routers.

   As can be seen, no single IETF routing protocol works well in meshed
   networks of the scale we are interested in.  The Internet solves this
   problem by dividing the network up into regions (Autonomous Systems
   or ASes), each AS containing up to a few hundred routers, then
   running both a link state protocol (either OSPF or IS-IS) and a
   version of BGP called iBGP inside each AS, and running another
   version of BGP called eBGP between ASes.  However, all this requires
   considerable configuration and monitoring on the part of an army of
   operational personnel.

   All this suggests that unstructured schemes may not represent a good
   choice for P2P SIP.

4.4.2.  Structured Schemes

   The idea of a structured scheme is to create a connection pattern
   that can be exploited in routing.

   Consider, for example, the following connection scheme based on a few
   of the ideas of Chord.  As in Chord, some unique peer identifier is
   hashed and the result used to place peers on a ring.  Each peer then
   maintains connections to peers located at various locations going
   clockwise around the ring.  In this scheme, a message to peer Q can
   be addressed to Q's location in the ring, and an intermediate peer R
   can forward the message by forwarding it to the peer S in R's
   connection table that is closest to Q without overshooting Q.
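
   The forwarding rule just described ("closest to Q without
   overshooting Q") can be sketched as below.  This is an illustrative
   sketch under simplifying assumptions: ring positions are small
   integers rather than hash values, and the function names and the
   example connection table are hypothetical.

```python
def clockwise_distance(a, b, ring_size):
    """Distance from position a to position b going clockwise."""
    return (b - a) % ring_size


def next_hop(current, target, connections, ring_size):
    """Pick the connected peer closest to target without overshooting it.

    Returns None when no connection makes clockwise progress toward
    the target (for example, when current is already the target).
    """
    remaining = clockwise_distance(current, target, ring_size)
    best, best_left = None, remaining
    for peer in connections:
        step = clockwise_distance(current, peer, ring_size)
        if 0 < step <= remaining:  # a larger step would overshoot
            left = remaining - step
            if left < best_left:
                best, best_left = peer, left
    return best


# Hypothetical peer R at position 10 on a 64-position ring, with
# connections at clockwise distances 1, 2, 4, 8, 16 and 32.
RING_SIZE = 64
R = 10
R_CONNECTIONS = [11, 12, 14, 18, 26, 42]
```

   For a message addressed to position 30, peer R would forward to the
   peer at position 26: the connection at position 42 is skipped
   because it lies beyond the target.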

   If the NAT can support 160 different connections per peer, then the
   targets of the connections radiating out from each peer can be
   located at exponentially increasing distances from that peer.  This
   allows a peer to reach any other peer in O(log N) hops using this
   scheme.  However, if 160 different connections per peer proves
   excessive (see assumption 8), then hop counts may be larger.
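
   The trade-off between connections per peer and hop count can be
   illustrated with a small calculation.  The sketch below is not from
   the original analysis; it assumes a ring of 2**10 positions in which
   every position is occupied, so that a connection at each chosen
   distance always exists.  The figure of 160 above presumably
   corresponds to one connection per bit of a 160-bit hash; the much
   smaller ring here is used only to keep the example quick to run.
   All names are hypothetical.

```python
def hops(distance, steps):
    """Greedy hop count to cover `distance` using the given step sizes.

    Each hop takes the largest step that does not overshoot, which is
    the forwarding rule of the ring scheme.  `steps` must include 1.
    """
    count = 0
    for step in sorted(steps, reverse=True):
        count += distance // step
        distance %= step
    return count


def worst_case_hops(bits, steps):
    """Largest hop count over all distances on a ring of 2**bits positions."""
    return max(hops(d, steps) for d in range(2 ** bits))


BITS = 10
FULL = [2 ** i for i in range(BITS)]          # 10 connections per peer
SPARSE = [2 ** i for i in range(0, BITS, 2)]  # 5 connections per peer
```

   Under these assumptions, halving the number of connections per peer
   (FULL versus SPARSE) raises the worst-case hop count from 10 to 15.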

   Many other structured connection schemes exist.  For example,
   structured connection schemes can be created using the ideas
   contained in any one of a number of DHT schemes.  (See, however, the
   comments in Section 6.)

4.4.3.  Symmetric Interest

   When evaluating connection schemes, there is a property we have
   dubbed "symmetric interest".  A connection scheme exhibits "symmetric
   interest" if, when peer P desires a connection to peer Q, then peer Q
   also desires a connection to peer P. "Symmetric interest" seems a
   desirable property of connection schemes since connections through
   NATs, by their nature, are bi-directional and because both peers
   incur the overhead of sending keep-alives to establish and maintain
   the connection.

   A connection scheme based on peers randomly selecting peers to
   establish connections to does NOT exhibit symmetric interest because
   peer P can select peer Q without peer Q selecting peer P. The
   connection scheme based on the ideas of Chord that was mentioned in
   the previous section also does NOT exhibit symmetric interest because
   a given peer P in the ring desires connections to peers in the
   clockwise half-circle but not in the counter-clockwise half-circle.

   One scheme that does exhibit symmetric interest has each peer
   maintain connections to peers located at exponentially increasing
   distances going both clockwise AND counter-clockwise around the ring.
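
   The difference between the clockwise-only and the bidirectional
   ring schemes can be checked mechanically.  The following is a
   minimal sketch, assuming a small ring in which every position is
   occupied; the function names are hypothetical.

```python
def desired_peers(position, bits, bidirectional):
    """Ring positions a peer wants connections to, at distances 1, 2, 4, ..."""
    size = 2 ** bits
    wanted = set()
    for i in range(bits):
        wanted.add((position + 2 ** i) % size)
        if bidirectional:
            wanted.add((position - 2 ** i) % size)
    wanted.discard(position)
    return wanted


def has_symmetric_interest(bits, bidirectional):
    """True if, whenever peer P desires peer Q, Q also desires P."""
    size = 2 ** bits
    wants = {p: desired_peers(p, bits, bidirectional) for p in range(size)}
    return all(p in wants[q] for p in wants for q in wants[p])
```

   On a 16-position ring, has_symmetric_interest(4, False) is False
   (the peer at position 0 desires the peer at position 1, but not the
   other way around), while has_symmetric_interest(4, True) is True.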

   The authors have not yet had a chance to do a thorough analysis of
   various structured schemes.  Nevertheless, the idea of a structured
   scheme (perhaps exhibiting "symmetric interest") seems a lot more
   promising than unstructured schemes.


5.  A Few Additional Points

   This section discusses a few additional points about P2P SIP
   architecture.

5.1.  Superpeers

   Orthogonal to these connectivity approaches is the idea of
   superpeers.  A group of peers that are all behind the same NAT can
   elect one or more of their number to act on their behalf in the
   larger P2P overlay.  These elected peers are called superpeers.

   The overlay architecture can then create two types of connections:
   connections between superpeers that traverse NATs, and connections
   between a superpeer and its local peers that do not traverse NATs.
   In this way, the number of NAT pinholes can be reduced compared with
   an architecture that has each peer connect to peers behind other
   NATs.
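
   The reduction in NAT pinholes can be illustrated with a simple
   count.  This is a back-of-the-envelope sketch, assuming that every
   peer taking part in the wider overlay keeps the same number of
   NAT-traversing connections; the function name and the numbers in
   the usage note are hypothetical.

```python
def nat_pinholes(local_peers, external_per_peer, superpeers=None):
    """Pinholes needed in one NAT for the peers behind it.

    Without superpeers, every local peer opens its own external
    connections.  With superpeers, only the elected peers do; the
    remaining peers use local connections that do not cross the NAT.
    """
    if superpeers is None:
        return local_peers * external_per_peer
    if not 1 <= superpeers <= local_peers:
        raise ValueError("superpeers must be between 1 and local_peers")
    return superpeers * external_per_peer
```

   For example, 20 peers behind one NAT with 8 external connections
   each need 160 pinholes; electing 2 superpeers reduces this to 16.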

5.2.  Joining the Network

   How can a node X, which is not currently a part of a particular P2P
   network, join that network?

   The first thing to note is that if node X can contact just one peer P
   in the P2P overlay network, then it can learn about other peers
   through peer P and so join the network.

   So the question can be reworded as: how can node X locate and contact
   at least one peer in the P2P overlay network that it wishes to join?

   One approach is to use multicast.  Node X could send out a "Hello, is
   anyone there?" multicast message, and any peer currently in the P2P
   network can reply.  Alternatively, peers that are currently in a P2P
   network can periodically send out multicast messages advertising the
   existence of the network.

   This approach works well when there are a number of peers on the same
   subnet.  It also works well when there are a number of peers on
   subnets linked by multicast-enabled routers.  However, many low-end
   routers do not support multicast, and multicast support on high-end
   routers needs to be configured, so using multicast between subnets
   likely works only in more sophisticated deployments.

   A second approach can be used if node X was previously part of the
   P2P network and then disconnected for a while.  Node X can remember
   the IP addresses and ports of some peers when it disconnects, and
   then try to contact those peers when it wants to rejoin the network.
   If at least one of the other peers (a) can be contacted and (b) is
   still a member of the P2P overlay network, then node X can rejoin the
   network.

   This approach will not work if all the other peers are behind NATs
   with a filtering policy of "Address Restricted filtering" (or worse)
   and node X disconnects for more than the lifetime of a filtering
   entry in a NAT (typically 2 - 5 minutes).  However, it will work if
   some peers are behind NATs with "Endpoint-Independent filtering".
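
   The conditions above can be sketched as a filter over the peers
   that node X remembered.  This is a simplified model rather than a
   definitive procedure: the filtering labels, the 120-second entry
   lifetime (the low end of the range quoted above) and all names are
   assumptions, and the addresses are documentation addresses.  A peer
   that passes the filter may still have left the overlay.

```python
from dataclasses import dataclass

# Assumed lifetime of a NAT filtering entry, in seconds.
FILTER_ENTRY_LIFETIME = 120


@dataclass
class RememberedPeer:
    address: str
    port: int
    # "none" (public address), "endpoint-independent",
    # or "address-restricted" (standing in for that or anything stricter)
    filtering: str


def may_be_contactable(peer, seconds_offline, lifetime=FILTER_ENTRY_LIFETIME):
    """Could a packet from the returning node X still reach this peer?"""
    if peer.filtering in ("none", "endpoint-independent"):
        return True
    # Stricter filtering: only while the old filtering entry survives.
    return seconds_offline < lifetime


def rejoin_candidates(remembered, seconds_offline):
    """Remembered peers worth trying, in their original order."""
    return [p for p in remembered if may_be_contactable(p, seconds_offline)]


remembered = [
    RememberedPeer("192.0.2.10", 5060, "none"),
    RememberedPeer("198.51.100.7", 40312, "endpoint-independent"),
    RememberedPeer("203.0.113.99", 51000, "address-restricted"),
]
```

   After one minute offline all three remembered peers are worth
   trying; after ten minutes only the first two are.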

   A third approach is to configure node X with the "mapped IP address
   and port" of some peer P.  Here the "mapped IP address and port" is
   the public IP address and port that the NAT (if any) assigns to the
   peer; this is typically learned through a protocol such as STUN
   (which requires a STUN server).  If peer P is behind a NAT with a
   filtering behavior of "Address Restricted filtering" (or worse), then
   peer P must also be configured with the mapped IP address and port of
   node X.

   Given the manual configuration required, this approach must be
   considered a last-ditch approach.

   A fourth, and most general, approach is to use an Introduction
   Server.  This is a node with a public IP address and a DNS entry
   which is not part of the P2P network but is used only for
   bootstrapping purposes.  In the minimal usage scenario, the P2P
   network elects a single peer P to maintain a connection to the
   Introduction Server.  When node X contacts the Introduction Server,
   node X is given the mapped IP address and port of peer P, and the
   Introduction Server forwards node X's mapped address and port to P.

   The disadvantage of this approach is that it requires a stable helper
   node with a public IP address.  But otherwise it is the most
   generally applicable of all the approaches.
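
   The minimal usage scenario can be sketched as an in-memory model.
   This is only an illustration of the exchange described above, with
   no networking and no wire format; the class, its method names and
   the example addresses are hypothetical.

```python
class IntroductionServer:
    """Bootstrap helper; it is not itself a member of the overlay."""

    def __init__(self):
        self.contact_peer = None  # mapped (address, port) of elected peer P
        self.forwarded = []       # mapped addresses of joiners passed on to P

    def register_contact(self, mapped_addr):
        """Called over the connection that the elected peer P maintains."""
        self.contact_peer = mapped_addr

    def introduce(self, joiner_mapped_addr):
        """Handle a request from a joining node X.

        Gives X the mapped address of P and forwards X's mapped
        address to P, so that each side learns where to send.
        """
        if self.contact_peer is None:
            return None
        self.forwarded.append(joiner_mapped_addr)
        return self.contact_peer


server = IntroductionServer()
server.register_contact(("198.51.100.7", 40312))   # peer P
learned = server.introduce(("203.0.113.99", 51000))  # node X
```

   Here node X learns the mapped address of peer P, and P is told the
   mapped address of X.  Presumably this lets both sides send toward
   each other, which matters when either NAT filters on source address.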

   +---------------------+-----------+-------+----------+--------------+
   |                     | Multicast | Buddy |  Manual  | Introduction |
   |                     |           |  List |  Config  |    Server    |
   +---------------------+-----------+-------+----------+--------------+
   | Plug and Play       |     Y     |   Y   |     N    |       Y      |
   |                     |           |       |          |              |
   | Works when node X   |     N     |   Y   |     Y    |       Y      |
   | is anywhere         |           |       |          |              |
   |                     |           |       |          |              |
   | Can be used for     |     Y     |   N   |     Y    |       Y      |
   | first connection    |           |       |          |              |
   |                     |           |       |          |              |
   | Does not require an |     Y     |   Y   |     Y    |       N      |
   | external node       |           |       |          |              |
   +---------------------+-----------+-------+----------+--------------+

                 Table 1: Comparison of Discovery Methods


6.  Comments on Existing P2P Overlays

   Many existing P2P overlays have ignored the presence of NATs in the
   network.  Their assumption is that all participating nodes are fully
   reachable by all other nodes.  In practice, this turns out not to be
   true.  The "Endpoint-dependent filtering" NAT behavior specified in
   [1] will impair the ability of many DHT algorithms to provide the
   guarantees they strive for.  Some popular file-sharing networks
   require manual configuration of the user's local NAT in order to
   join.  Incorrect configuration makes it impossible to participate in
   the overlay.  Other P2P systems deal with NATs by assigning "helpers"
   to nodes behind NATs.  These "helpers" have publicly available
   addresses and act as relay points for the NAT-ed nodes.  This is a
   relatively effective approach, but requires the nodes with publicly
   available addresses to carry more than their share of the load.  The
   load will quickly become overwhelming in a network with a small
   proportion of public nodes.


7.  Conclusions

   Given the analysis done so far, it seems that the best P2P overlay
   architecture will have the following properties:

   o  Partial mesh,

   o  Mostly static connections,

   o  Structured,

   o  Exhibits symmetric interest, and

   o  Uses superpeers.


Appendix A.  Detailed NAT UDP Assumptions

   +-----------+-------+-----------------+-----------+-----------------+
   | Criterion | BEHAV | Brief           | Our       | Justification   |
   |           | E  #  | Description     | Requireme |                 |
   |           |       |                 | nt        |                 |
   +-----------+-------+-----------------+-----------+-----------------+
   | Mapping   | REQ-1 | MUST be         | Must      | Peers behind a  |
   |           |       | "Endpoint-      | comply    | NAT which does  |
   |           |       | Independent"    |           | not comply      |
   |           |       |                 |           | require a       |
   |           |       |                 |           | "surrogate" to  |
   |           |       |                 |           | act on their    |
   |           |       |                 |           | behalf in the   |
   |           |       |                 |           | P2P network and |
   |           |       |                 |           | to relay        |
   |           |       |                 |           | traffic to      |
   |           |       |                 |           | them.  This     |
   |           |       |                 |           | surrogate must  |
   |           |       |                 |           | have either a   |
   |           |       |                 |           | public IP       |
   |           |       |                 |           | address or be   |
   |           |       |                 |           | behind a NAT    |
   |           |       |                 |           | with a          |
   |           |       |                 |           | Filtering rule  |
   |           |       |                 |           | of "Endpoint-   |
   |           |       |                 |           | Independent"    |
   |           |       |                 |           | (REQ-8).  It is |
   |           |       |                 |           | likely that     |
   |           |       |                 |           | some systems    |
   |           |       |                 |           | will not have   |
   |           |       |                 |           | peers that can  |
   |           |       |                 |           | act as          |
   |           |       |                 |           | surrogates.     |
   |           |       |                 |           | Furthermore,    |
   |           |       |                 |           | acting as a     |
   |           |       |                 |           | surrogate is    |
   |           |       |                 |           | very bandwidth- |
   |           |       |                 |           | and processor-  |
   |           |       |                 |           | intensive.      |
   |           |       |                 |           |                 |
   | IP        | REQ-2 | RECOMMENDED to  | Don't     | Since we        |
   | Address   |       | be "Paired"     | care      | control both    |
   | Pooling   |       |                 |           | endpoints, it   |
   |           |       |                 |           | is easy for us  |
   |           |       |                 |           | to handle other |
   |           |       |                 |           | behaviors       |
   |           |       |                 |           |                 |
   | Port      | REQ-3 | MUST NOT be     | Must      | "Port           |
   | Assign-   |       | "Port           | comply    | Overloading"    |
   | ment      |       | Overloading"    |           | can often cause |
   |           |       |                 |           | seemingly       |
   |           |       |                 |           | random and      |
   |           |       |                 |           | inexplicable    |
   |           |       |                 |           | failures, as    |
   |           |       |                 |           | well as making  |
   |           |       |                 |           | testing much    |
   |           |       |                 |           | harder.         |
   |           |       |                 |           |                 |
   | Port      | REQ-3 | RECOMMENDED     | Don't     | Since we        |
   | Range     | a     | that the range  | care      | control both    |
   |           |       | classification  |           | endpoints, it   |
   |           |       | of the source   |           | is easy for us  |
   |           |       | port be         |           | to handle other |
   |           |       | preserved.      |           | behaviors.      |
   |           |       |                 |           |                 |
   | Port      | REQ-4 | RECOMMENDED     | Don't     | Since we        |
   | Parity    |       | that the NAT    | care      | control both    |
   |           |       | exhibit "Port   |           | endpoints, it   |
   |           |       | parity          |           | is easy for us  |
   |           |       | preservation"   |           | to handle other |
   |           |       |                 |           | behaviors.      |
   |           |       |                 |           |                 |
   | Mapping   | REQ-5 | MUST NOT be     | (TBD)     | (TBD)           |
   | Refresh   |       | less than 2     |           |                 |
   | Interval  |       | minutes         |           |                 |
   |           |       |                 |           |                 |
   |           | REQ-5 | Value MAY be    | (TBD)     | (TBD)           |
   |           | a     | configurable    |           |                 |
   |           |       |                 |           |                 |
   |           | REQ-5 | Default         | Don't     |                 |
   |           | b     | RECOMMENDED to  | care      |                 |
   |           |       | be 5 minutes    |           |                 |
   |           |       |                 |           |                 |
   | Mapping   | REQ-6 | MUST have "NAT  | Must      | Are there any   |
   | Refresh   |       | Outbound        | comply    | NATs that do    |
   | Direction |       | refresh         |           | not comply with |
   |           |       | behavior" of    |           | this?           |
   |           |       | "True".         |           |                 |
   |           |       |                 |           |                 |
   |           | REQ-6 | MAY have "NAT   | Don't     | Many NATs       |
   |           | a     | Inbound refresh | care      | refresh only on |
   |           |       | behavior" of    |           | outbound        |
   |           |       | "True"          |           | traffic, so it  |
   |           |       |                 |           | is simplest to  |
   |           |       |                 |           | assume this is  |
   |           |       |                 |           | false.          |
   |           |       |                 |           |                 |
   | Conflict- | REQ-7 | MUST either     | Should    | Conflicting     |
   | ing       |       | ensure no       | comply    | addresses are   |
   | Address   |       | conflict or     |           | not common, but |
   | Spaces    |       | behave sensibly |           | do occur.  NATs |
   |           |       | when a conflict |           | that do not     |
   |           |       | occurs          |           | comply will     |
   |           |       |                 |           | cause problems  |
   |           |       |                 |           | for the peers   |
   |           |       |                 |           | behind them.    |
   |           |       |                 |           |                 |
   | Filtering | REQ-8 | RECOMMENDED to  | Should    | (see discussion |
   |           |       | be either       | comply    | in section XXX) |
   |           |       | "Endpoint       |           |                 |
   |           |       | independent" or |           |                 |
   |           |       | "Address        |           |                 |
   |           |       | dependent"      |           |                 |
   |           |       |                 |           |                 |
   |           | REQ-8 | Filtering       | Don't     | Best to assume  |
   |           | a     | behavior MAY be | care      | it is NOT       |
   |           |       | configurable    |           | configurable    |
   |           |       |                 |           |                 |
   | Hairpin-  | REQ-9 | MUST support    | Should    | This issue      |
   | ning      |       | "hairpinning"   | comply    | becomes crucial |
   |           |       |                 |           | when the NAT in |
   |           |       |                 |           | question is the |
   |           |       |                 |           | NAT closest to  |
   |           |       |                 |           | the public      |
   |           |       |                 |           | internet in a   |
   |           |       |                 |           | multi-NAT       |
   |           |       |                 |           | environment.    |
   |           |       |                 |           | In this         |
   |           |       |                 |           | scenario, a     |
   |           |       |                 |           | failure to      |
   |           |       |                 |           | support         |
   |           |       |                 |           | "hairpinning"   |
   |           |       |                 |           | will hinder     |
   |           |       |                 |           | (possibly       |
   |           |       |                 |           | prevent)        |
   |           |       |                 |           | bootstrapping   |
   |           |       |                 |           | attempts.       |
   |           |       |                 |           |                 |
   |           | REQ-9 | Hairpinning     | Must      | (TBD)           |
   |           | a     | behavior MUST   | comply    |                 |
   |           |       | be "External    | (if NAT   |                 |
   |           |       | source IP       | does      |                 |
   |           |       | address and     | hair-     |                 |
   |           |       | port"           | pinning)  |                 |
   |           |       |                 |           |                 |
   | ALGs      | REQ-1 | RECOMMENDED     | Should    | (TBD)           |
   |           | 0     | that ALGs be    | comply    |                 |
   |           |       | disabled by     |           |                 |
   |           |       | default         |           |                 |
   |           |       |                 |           |                 |
   |           | REQ-1 | RECOMMENDED     | Should    | (TBD)           |
   |           | 0  a  | that each ALG   | comply    |                 |
   |           |       | can be enabled  |           |                 |
   |           |       | or disabled     |           |                 |
   |           |       | separately      |           |                 |
   |           |       |                 |           |                 |
   | Determin- | REQ-1 | MUST have       | Must      | (TBD)           |
   | ism       | 1     | deterministic   | comply    |                 |
   |           |       | behavior        |           |                 |
   |           |       |                 |           |                 |
   | ICMP      | REQ-1 | Receipt of ICMP | Must      | (TBD)           |
   | support   | 2     | message MUST    | comply    |                 |
   |           |       | NOT destroy NAT |           |                 |
   |           |       | mapping         |           |                 |
   |           |       |                 |           |                 |
   |           | REQ-1 | SHOULD NOT      | Don't     | (TBD)           |
   |           | 2  a  | filter ICMP     | care      |                 |
   |           |       | messages based  |           |                 |
   |           |       | on source IP    |           |                 |
   |           |       | address.        |           |                 |
   |           |       |                 |           |                 |
   |           | REQ-1 | RECOMMENDED     | Don't     | (TBD)           |
   |           | 2  b  | that the NAT    | care      |                 |
   |           |       | support ICMP    |           |                 |
   |           |       | Destination     |           |                 |
   |           |       | Unreachable     |           |                 |
   |           |       | messages.       |           |                 |
   |           |       |                 |           |                 |
   | Fragmen-  | REQ-1 | MUST support    | Should    | (TBD)           |
   | tation    | 3     | fragmentation   | comply    |                 |
   | when      |       | of packets      |           |                 |
   | sending   |       | larger than     |           |                 |
   |           |       | link MTU        |           |                 |
   |           |       |                 |           |                 |
   | Fragmen-  | REQ-1 | MUST support    | Should    | (TBD)           |
   | tation    | 4     | "Receive        | comply    |                 |
   | when      |       | Fragment Out of |           |                 |
   | receiving |       | Order" behavior |           |                 |
   +-----------+-------+-----------------+-----------+-----------------+

                       Table 2: NAT UDP Assumptions


8.  References

   [1]  Audet, F. and C. Jennings, "NAT Behavioral Requirements for
        Unicast UDP", draft-ietf-behave-nat-udp-04 (work in progress).

   [2]  Guha, S. and P. Francis, "NAT Behavioral Requirements for
        Unicast TCP", draft-hoffman-behave-tcp-03 (work in progress).

   [3]  Ford, B. and P. Srisuresh, "Peer-to-Peer Communication Across
        Network Address Translators", article available at
        http://www.brynosaurus.com/pub/net/p2pnat/.

   [4]  Stoica, I., Morris, R., Liben-Nowell, D., Karger, D., Kaashoek,
        M., Dabek, F., and H. Balakrishnan, "Chord: A Scalable Peer-to-
        peer Lookup Service for Internet Applications",
        article available at http://pdos.csail.mit.edu/chord/.

   [5]  Rosenberg, J., "Interactive Connectivity Establishment (ICE): A
        Methodology for Network Address Translator (NAT) Traversal for
        Offer/Answer Protocols", draft-ietf-mmusic-ice-06 (work in
        progress).

   [6]  Network World, "P2P Traffic Still Dominates the Net",
        article available at
        http://www.toptechnews.com/story.xhtml?story_id=38121.


Authors' Addresses

   Eric Cooper
   Avaya
   100 Innovation Drive
   Ottawa, Ontario  K2K 3G7
   Canada

   Phone: +1 613 592 4343 x228
   Email: ecooper@avaya.com


   Philip Matthews
   Avaya
   100 Innovation Drive
   Ottawa, Ontario  K2K 3G7
   Canada

   Phone: +1 613 592 4343 x224
   Email: philip_matthews@magma.ca


Intellectual Property Statement

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights.  Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.


Disclaimer of Validity

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
   ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
   INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
   INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.


Copyright Statement

   Copyright (C) The Internet Society (2006).  This document is subject
   to the rights, licenses and restrictions contained in BCP 78, and
   except as set forth therein, the authors retain all their rights.


Acknowledgment

   Funding for the RFC Editor function is currently provided by the
   Internet Society.