IDR working group                                             Juan Alcaide
Internet Draft                                                       Cisco
Intended status: Standards Track                         Fernando Calabria
Expires: February 2012                                               Cisco

                                                         September 2, 2011

                               BGP prefix priority


   Status of this Memo

      This Internet-Draft is submitted to IETF in full conformance with
      the provisions of BCP 78  and BCP 79.

      Internet-Drafts are working documents of the Internet Engineering
      Task Force (IETF), its areas, and its working groups.  Note that
      other groups may also distribute working documents as Internet-
      Drafts.

      Internet-Drafts are draft documents valid for a maximum of six
      months and may be updated, replaced, or obsoleted by other documents
      at any time.  It is inappropriate to use Internet-Drafts as
      reference material or to cite them other than as "work in progress."

      The list of current Internet-Drafts can be accessed at

      The list of Internet-Draft Shadow Directories can be accessed at

   Copyright Notice

        Copyright (c) 2011 IETF Trust and the persons identified as the
        document authors.  All rights reserved.

        This document is subject to BCP 78 and the IETF Trust's Legal
        Provisions Relating to IETF Documents
        (http://trustee.ietf.org/license-info) in effect on the date of
        publication of this document.  Please review these documents
        carefully, as they describe your rights and restrictions with
        respect to this document.  Code Components extracted from this
        document must include Simplified BSD License text as described in
        Section 4.e of the Trust Legal Provisions and are provided without
        warranty as described in the Simplified BSD License.

Calabria - Alcaide      Expires February 2 , 2012                  [Page 1]

Internet-Draft    draft-idr-bgp-prefix-priorization-00.txt     September 2011


      This document defines a set of extended communities to carry
      priority information. This information provides a mechanism for
      assigning a processing preference to the routes that carry it. It
      also provides a scheme for processing routes in strict priority
      order during update reception, best-path computation, and update
      transmission.

   Table of Contents

   1. Introduction......................................................3
   2. Conventions Used in this Document.................................4
   3. Definitions of Commonly Used Terms................................4
   4. Scope.............................................................6
   5. Solution Specification............................................7
   5.1. Network Wide Prefix Priority ...................................7
   5.2. Network Wide Prefix Priority in a "Trusted" Environment.........8
   5.3. Network Wide Prefix Priority in a "Non-Trusted" Environment.....8
   5.4. Prioritizing Reception of Routes................................8
   5.5. Prioritizing Local and Outbound Processing of Routes...........10
   5.6. Change of priority.............................................11
   5.7. Interaction with Neighbors not Supporting Route Prioritization.12
   6. Rationale behind network wide priorities.........................13
   7. Security Considerations..........................................14
   8. IANA Considerations..............................................14
   9. References.......................................................15
   10. Authors' Addresses..............................................15

   1. Introduction

      BGP scale has been growing in recent years, in terms of both
      neighbors and routes. This impacts convergence times after, for
      example, a BGP re-initialization event. One solution is a continuous
      upgrade of the hardware used by BGP speakers, adding faster CPUs and
      additional memory. This approach, however, is expensive and cannot
      reduce convergence times indefinitely. It is desirable to have a
      software-based solution in which a BGP speaker can prioritize some
      selected routes. In other words, there is a need for a QoS-like
      mechanism in the BGP control plane.

      Processing of routes with a given priority SHOULD be performed
      before that of any lower priority ones. This processing SHOULD be
      performed in a preemptive manner. Thus, the convergence times
      obtained for high priority routes would be the same as if there were
      no lower priority routes at all. Implementations are not expected to
      reach this theoretical limit, but to approach it closely.

      Priority information is signaled by adding to the route an extended
      community hereby named PEC (Priority Extended Community). A PEC is
      meant to have network wide significance and to be transparent to
      speakers that do not understand it. It MAY be set at the origination
      of the route and propagated across the network, thus greatly
      reducing management burden, but it can also be set by a policy if
      required.

      Route processing during reception of routes is based on the priority
      assigned to the received path, while the remaining tasks are based
      on the priority of the computed best-path. This document also covers
      provisions to prevent a change in the priorities associated with a
      path from resulting in misordered routes.

      The design of how a given priority marking is honored is twofold: a
      given speaker SHOULD process the reception of a path with the
      priority that the received path has; and it should process any local
      or transmission task with the priority associated with the best-path
      of the net. Thus, the design supports different paths being
      originated with different priority markings; it deals with the
      conflict by aggregating these markings during best-path computation
      and propagating them downstream. As a result, the aggregated marking
      is honored as close to the source of this aggregation as possible.

   2. Conventions Used in this Document

      The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
      "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
      document are to be interpreted as described in BCP 14, RFC 2119
      [RFC2119].  RFC 2119 defines the use of these key words to help make
      the intent of standards track documents as clear as possible.  While
      this document uses these keywords, this document is not a standards
      track document.

   3. Definitions of Commonly Used Terms

      The set of definitions below is used throughout this document. Some
      terms are well-known, some terms are defined to avoid confusion, and
      some (those marked with a "*") are defined for the purpose of this
      implementation (and thus referenced by other sections throughout the
      entire document).

             BGP process: internal implementation of a BGP speaker. The
             router may implement the BGP process as one or more OS
             processes or threads.

             net: BGP prefix, including all the paths received from all
             the neighbors.

             path: BGP prefix received from a particular neighbor.
             Multiple paths can be associated to a given net.

             BGP table: database where all the BGP routes are kept. It's a
             set of nets, each of them with their associated paths.

             RIB (Routing Information Base): database where all the
             forwarding information is kept. It's a set of nets with their
             associated forwarding paths (more than one if it's a
             multipath net). Nets can be learned from different routing
             protocols; in particular, they can have a corresponding entry
             in the BGP table, in which case the forwarding path used will
             be the BGP best-path for that net (plus additional ones if
             it's a multipath net).

             upstream/downstream directions: When routes flow in a given
             direction, a BGP speaker receives routes from upstream and
             advertises them downstream.

             receiving peer/sending peer: When routes flow in a given
             direction between two speakers, the BGP speaker that sends
             the routes is the sending peer and the BGP speaker that
             receives them is the receiving peer.

             PEC* (Priority Extended Community): extended-community
             associated with a BGP path that indicates the path-priority
             for that path. PEC=priority denotes that a given PEC
             indicates that priority. PEC=NULL indicates that no PEC is
             actually sent in an update message.

             strict priority: method of servicing the process of several
             tasks. Tasks with a given priority are processed before any
             other task with lower priority. In the context of this
             document, they SHOULD also preempt the processing of any
             lower priority task.

             route priority*: integer from 0 to 7 associated to a route.
             It indicates the priority or urgency with which this route is
             processed. Priority=0 indicates the lowest urgency, and
             priority=7 indicates the highest urgency. It is a generic
             term that can actually have a different value based on the
             specific task a BGP process is performing:

             in-message-priority*: priority associated to a received BGP
             message as it is received from the TCP session. It's derived
             by calculating the maximum of all path-priorities in a given
             update message. It determines the priority for message
             processing during reception.

             path-priority*: priority associated to a BGP path. It's
             calculated by looking at the PEC associated to the path. It
             determines the priority of a path during reception, after it
             has been parsed from a message. It is also used to calculate
             the rest of the priorities.

             max-path-priority*: priority associated to a BGP net. It's
             derived by calculating the maximum path-priority for all the
             paths of a given net. It determines the processing priority
             for best-path computation.

             net-priority*: priority associated to a BGP net. It's derived
             from the path-priority of the best-path. It determines the
             processing priority for any further local processing (after
             best-path computation) and advertisement of the net.
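
      The relationship between the starred priorities above can be
      illustrated with a minimal Python sketch. The class and function
      names are hypothetical and only serve this illustration; treating a
      net without a best-path as priority 0 is likewise an assumption, not
      something this document specifies.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Path:
    neighbor: str
    path_priority: int  # 0..7, derived from the PEC (0 when PEC=NULL)


@dataclass
class Net:
    prefix: str
    paths: List[Path] = field(default_factory=list)
    best: Optional[Path] = None  # filled in by best-path computation


def in_message_priority(path_priorities: List[int]) -> int:
    """Maximum path-priority among the prefixes of one update message."""
    return max(path_priorities, default=0)


def max_path_priority(net: Net) -> int:
    """Maximum path-priority over all paths of the net."""
    return max((p.path_priority for p in net.paths), default=0)


def net_priority(net: Net) -> int:
    """Path-priority of the best-path (assumed 0 if none is selected)."""
    return net.best.path_priority if net.best else 0


# Example: the best-path need not be the path with the highest priority.
net = Net("192.0.2.0/24", [Path("A", 2), Path("B", 6)])
net.best = net.paths[0]
```

      In this example best-path computation for the net would be scheduled
      with max-path-priority 6, while later local and outbound tasks would
      use net-priority 2.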

   4.  Scope

   As mentioned before, this document focuses on the following:

      - A scheme that assigns and signals priority values on a prefix

      - Proposing a solution for processing prioritized routes during
      update reception.

      - Proposing a solution for processing prioritized routes during
      best-path computation, and update transmission.

      - Proposing a solution for managing prefixes whose priority is
      changed by an administrative task.

      - Guidelines to "interact" with speakers that do not (fully or
      partially) support prefix prioritization.

   5. Solution Specification

         5.1.Network Wide Prefix Priority

      Priority for a prefix is set by the assignment of a BGP extended
      community attribute, in order to indicate preference of processing.
      This community is hereby named PEC (Priority Extended Community),
      and MUST contain priority values from 0 to 7. PECs are defined as a
      new transitive extended-community of experimental use as defined by
      [RFC4360] and [RFC3692].

      The extended community type is: 0x80FE whose value is encoded as a
      sequence of 5 zero bytes and the priority value set by the 3 most
      significant bits of the last byte, resulting in:

      Highest priority (7) : 0x80FE:0000000000E0

      Lowest priority (0) : 0x80FE:000000000000

      and all the pertinent values in between.
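
      As an illustration of the encoding above, the following Python
      sketch builds and parses the 8-octet community. The function names
      are hypothetical, and ignoring the 5 least significant bits of the
      last octet on decode is an assumption of this sketch.

```python
from typing import Optional

PEC_TYPE = bytes([0x80, 0xFE])  # type value 0x80FE as given in the text


def encode_pec(priority: int) -> bytes:
    """Return the 8-octet extended community for a priority in 0..7."""
    if not 0 <= priority <= 7:
        raise ValueError("priority must be in 0..7")
    # Five zero octets, then the priority in the 3 most significant bits
    # of the last octet.
    return PEC_TYPE + bytes(5) + bytes([priority << 5])


def decode_pec(community: bytes) -> Optional[int]:
    """Return the priority, or None if the community is not a PEC."""
    if len(community) != 8 or community[:2] != PEC_TYPE:
        return None
    return community[7] >> 5


print(encode_pec(7).hex())  # 80fe0000000000e0
print(encode_pec(0).hex())  # 80fe000000000000
```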

      In a trusted environment, PEC is set by the speaker originating the
      route and has network wide significance. This approach greatly
      reduces the management burden of mapping routes to priorities. If
      PECs are not trusted, they MAY be changed by any other speaker
      downstream based on its policy.

      PECs are propagated on a per path basis. The correlation between
      paths and nets for a given priority is as follows:

         - Path-priority is associated to a BGP path upon receiving it,
           typically based on PECs.

         - Net-priority is assigned to the net, and corresponds to the
           path-priority of the best-path for that prefix.

         - Net-priority is signaled when the route is advertised,
           typically by PECs.

         5.2.Network Wide Prefix Priority in a "Trusted" Environment

      In a trusted environment, priority signaling is based on the
      advertisement of one single PEC by the originator of the route. In
      particular:

         - Path-priority for a path is based on the PEC received with that
           BGP path.

         - If multiple PECs are received for the same prefix, the speaker
           SHOULD use the PEC that indicates the highest priority.

         - If no PEC is received (PEC=NULL), the speaker SHOULD explicitly
           set path-priority=0.

         - When advertising updates, all PECs are removed and one single
           PEC is advertised, corresponding to the net-priority of the
           advertised net. In particular, if net-priority=0 an explicit
           PEC=0 SHOULD be sent.

         5.3.Network Wide Prefix Priority in a "Non-Trusted" Environment

      In a non-trusted environment, it's possible to change the above
      procedures by local configuration. In particular:

         - Path-priority can be overwritten when receiving a route.

         - PECs transmitted can be overwritten when advertising a route.
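
      The receive and advertise rules of the trusted and non-trusted
      environments can be summarized in a short Python sketch. PECs are
      represented here simply as integers 0..7, and the "override"
      parameters stand in for local configuration; both are assumptions
      made for illustration.

```python
from typing import List, Optional


def receive_path_priority(pecs: List[int],
                          override: Optional[int] = None) -> int:
    """Path-priority assigned to a received path."""
    if override is not None:
        return override          # non-trusted: local configuration wins
    # Trusted: the highest PEC wins; no PEC at all (PEC=NULL) maps to 0.
    return max(pecs, default=0)


def advertised_pecs(net_priority: int,
                    override: Optional[int] = None) -> List[int]:
    """PECs attached to an advertised route.

    All received PECs are dropped and exactly one is sent, even when the
    priority is 0 (explicit PEC=0).
    """
    return [net_priority if override is None else override]
```

      For example, a path received with PECs 2 and 5 gets path-priority 5
      in a trusted environment, while a path received without any PEC gets
      path-priority 0.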

         5.4.Prioritizing Reception of Routes

      Processing routes during reception involves tasks like reading
      update messages, parsing the prefixes inside those messages, and
      installing them in the BGP table as a path belonging to the neighbor
      associated to the session the message was received from.

      These tasks SHOULD be performed in strict priority order based on
      the path-priority set by a speaker or by local configuration.

      Using path-priority to select the priority for inbound processing
      poses some challenges, since path-priority is unknown until inbound
      processing itself is performed. The following solutions to this
      challenge are presented:

         - After reading an update message from the TCP session, inspect
           the message and calculate an in-message-priority, which
           corresponds to the highest path-priority of all the prefixes
           present in the message. Any further processing of the message,
           such as detailed parsing, is performed in strict priority
           order based on in-message-priority.

         - Calculating in-message-priority itself is not a task that can
           be prioritized, and therefore it should be a light-weight task.
           For the most common case, where path-priority is determined
           based on PEC, this consideration does not apply. Assigning
           statically a path-priority to a given session is a task that
           requires no processing at all. On the other side of the
           spectrum, if path-priority is determined by the prefix itself
           (i.e. prefixes in the same update can have different
           path-priority), the task becomes non-trivial. Furthermore, some
           prefixes may get a preferential treatment (if their
           in-message-priority is higher than their path-priority).

         - After path-priority is computed for a route, any further
           inbound processing of the route can be performed based on
           path-priority. This may involve tasks like installing the route
           into the BGP table.
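
      The inbound scheme above can be sketched with a strict priority
      queue keyed by in-message-priority. This is a minimal Python
      illustration with hypothetical names; it orders pending messages but
      does not model preemption of a task that is already running.

```python
import heapq
import itertools
from typing import List, Tuple


class StrictPriorityQueue:
    """Serve the highest priority first; FIFO among equal priorities."""

    def __init__(self) -> None:
        self._heap = []
        self._seq = itertools.count()

    def push(self, priority: int, item) -> None:
        # heapq is a min-heap, so the priority is negated; the sequence
        # number keeps arrival order within the same priority.
        heapq.heappush(self._heap, (-priority, next(self._seq), item))

    def pop(self):
        _, _, item = heapq.heappop(self._heap)
        return item

    def __len__(self) -> int:
        return len(self._heap)


def enqueue_update(queue: StrictPriorityQueue,
                   update: List[Tuple[str, int]]) -> None:
    """Queue one update, given as (prefix, path_priority) pairs."""
    # in-message-priority: highest path-priority present in the message.
    prio = max((p for _, p in update), default=0)
    queue.push(prio, update)
```

      Note that a prefix with path-priority 0 that shares a message with a
      prefix of path-priority 7 is dequeued with priority 7, which is the
      preferential treatment mentioned above.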

      A path MUST be discarded (and not installed in the BGP table) if it
      has been received before a path for the same prefix and TCP session
      that already exists in the BGP table. This non-FIFO scenario is
      possible when receiving the same prefix with different priorities.
      If the second prefix received has a higher in-message-priority or
      path-priority, the first prefix could be a candidate to be installed
      in the BGP table after the second has actually already been
      installed. Note that with these modifications, the sequence of
      routes installed in the BGP table could be different than it would
      be without the use of priorities. This change of behavior is
      acceptable under BGP protocol rules ([RFC4271]).
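
      One possible way to implement the discard rule is to stamp every
      path with an arrival sequence number when its message is read from
      the TCP session, and to refuse installing a path that is older than
      the one already installed. The Python sketch below assumes such a
      per-session counter ("arrival_seq"); the counter and all names are
      hypothetical, and withdrawals are not modeled.

```python
class BgpTable:
    """Toy BGP table keyed by (session, prefix)."""

    def __init__(self) -> None:
        self._installed = {}  # (session, prefix) -> (arrival_seq, attrs)

    def install(self, session, prefix, arrival_seq, attrs) -> bool:
        """Install a path unless a later arrival is already installed."""
        key = (session, prefix)
        current = self._installed.get(key)
        if current is not None and current[0] > arrival_seq:
            return False  # stale path: MUST be discarded
        self._installed[key] = (arrival_seq, attrs)
        return True

    def lookup(self, session, prefix):
        entry = self._installed.get((session, prefix))
        return None if entry is None else entry[1]
```

      If a low priority path arrives first (sequence 1) but is processed
      after a high priority path for the same prefix and session (sequence
      2), the second call to install() returns False and the table keeps
      the newer path.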

      Any received BGP messages that are not update messages SHOULD be
      processed in strict priority order, based on a higher priority than
      the maximum in-message-priority.

         5.5.Prioritizing Local and Outbound Processing of Routes

      After a path has been installed in the BGP table, the processing
      priority of all the tasks that correspond to the associated prefix
      no longer depends on the priority of the path itself
      (path-priority), but on that of the net it belongs to, namely
      net-priority. However, net-priority cannot be known until the
      best-path is resolved, so max-path-priority is used to prioritize
      the task that resolves the best-path itself. Max-path-priority is
      defined as the maximum path-priority of all the paths associated
      with a given net, including the path-priority of any new path that
      triggered the best-path computation.

      Calculating max-path-priority itself is a task that SHOULD be
      processed in strict priority order, based on the path-priority of
      the path that triggers best-path computation.

      Best-path processing is a local task that SHOULD be processed in
      strict priority order, based on max-path-priority.

      Further local processing of routes includes tasks like installation
      of the net in the routing table. Outbound processing includes tasks
      like formatting nets into update messages and transmitting them
      through the TCP session. All these tasks SHOULD be performed in
      strict priority order based on net-priority.

      Note that the rules above force all the prefixes in a given message
      to share the same net-priority (if the transmission of update
      messages is to be prioritized based on the common net-priority).
      This is already a constraint if PECs are used to signal priorities
      to downstream peers.

      Any transmitted BGP messages that are not update messages SHOULD be
      processed in strict priority order, based on a higher priority than
      the maximum net-priority.
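
      The constraint that all prefixes in an update message share the same
      net-priority can be illustrated with the following Python sketch. It
      groups nets by net-priority only; a real implementation would also
      have to group by the remaining path attributes, which is left out
      here. The function name is hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def pack_updates(nets: List[Tuple[str, int]]) -> List[Tuple[int, List[str]]]:
    """Group (prefix, net_priority) pairs into one update per priority.

    The result is ordered with the highest net-priority first, which is
    the order in which the updates would be transmitted.
    """
    by_priority: Dict[int, List[str]] = defaultdict(list)
    for prefix, priority in nets:
        by_priority[priority].append(prefix)
    return [(p, by_priority[p]) for p in sorted(by_priority, reverse=True)]


updates = pack_updates([
    ("10.0.0.0/8", 0),
    ("192.0.2.0/24", 7),
    ("198.51.100.0/24", 0),
])
```

      Here the net with net-priority 7 is sent in its own update ahead of
      the single update that carries both nets with net-priority 0.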

         5.6.Change of priority

      As previously described, the advertisement of routes is done with a
      priority based on net-priority (assigned to a given prefix). There
      are no conflicts as long as, over time, net-priority remains the
      same for a given prefix. However, net-priority derives from path-
      priority, and therefore it may change. Without any further
      mechanisms, the order in which routes are advertised would be
      incorrect, and inconsistencies across the BGP tables of the sending
      and receiving peers would appear.

      This non-FIFO scenario is possible when advertising the same prefix
      with different priorities. If a second advertisement of the prefix
      to a given neighbor has a higher net-priority than a first one
      already scheduled for transmission, the second one could actually be
      transmitted before the first one.

      When sent through the BGP session, advertisements for a given prefix
      MUST keep, in all cases, the same order as they would have without
      route prioritization (i.e., FIFO-like processing); alternatively,
      only the last advertisement is performed. In other words, a route
      computed as best-path MUST NOT be transmitted over a BGP session
      before a route that was computed previously as best-path. Note that
      the offending scenarios are only possible when net-priority
      increases. If net-priority decreases, the problem does not happen.
      How an implementation deals with this situation is outside the scope
      of this document. However, these two general approaches are
      discussed:

         - One obvious option is making sure that any previous low-
           priority route is not actually advertised (and thus it's
           discarded). This option has the drawback of complexity (updates
           already scheduled for transmission may have to be reformatted).
           Note also that the sequence of routes transmitted could be
           different than it would be without the use of priorities. This
           change of behavior is acceptable under BGP protocol rules
           ([RFC4271]).

         - A second option is that, whenever net-priority needs to
           increase, the BGP speaker simply waits for all the routes with
           lower net-priority to be transmitted across all sessions.
           After they are transmitted, net-priority can be safely
           increased.  While net-priority has not transitioned, any task
           depending on net-priority for that route is processed as usual,
           considering the old net-priority. Note that this may imply
           sending two updates upon a transition, if attributes
           transmitted (like PEC) depend on net-priority. The drawback of
           this approach is that it introduces a delay in how priority
           information is propagated across the network (indefinitely in a
           worst case scenario, if a prefix is constantly flapping at a
           high rate).

      The same considerations apply to any other local processing tasks
      if the implementation of these tasks makes them susceptible to
      misordered execution.
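
      The first approach (advertising only the last route) can be sketched
      as a per-session outbound queue that keeps at most one live entry
      per prefix. The Python sketch below uses lazy deletion of superseded
      entries and assumes routes are queued before being formatted into
      update messages; the names are hypothetical and this is not the only
      way to satisfy the ordering requirement.

```python
import heapq
import itertools


class OutboundQueue:
    """Strict priority queue with one live entry per prefix."""

    def __init__(self) -> None:
        self._heap = []
        self._seq = itertools.count()
        self._latest = {}  # prefix -> sequence number of the live entry

    def schedule(self, prefix, net_priority, route) -> None:
        seq = next(self._seq)
        self._latest[prefix] = seq  # supersedes any older pending entry
        heapq.heappush(self._heap, (-net_priority, seq, prefix, route))

    def next_to_send(self):
        """Return the next (prefix, route), or None if nothing is pending."""
        while self._heap:
            _, seq, prefix, route = heapq.heappop(self._heap)
            if self._latest.get(prefix) == seq:
                del self._latest[prefix]
                return prefix, route
            # Otherwise the entry was superseded and is silently dropped.
        return None
```

      When the net-priority of a prefix increases while an older
      advertisement is still pending, the new route is sent first and the
      older one is never transmitted, so the receiving peer cannot end up
      with the stale route.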

         5.7.Interaction with Neighbors not Supporting Route Prioritization

      When not all the BGP speakers involved in the propagation of a
      network event support route prioritization, priority routes will not
      be treated with the preference they would have otherwise. It is
      possible, however, to minimize the effects of this scenario based on
      the following considerations:

      - Priority management is transparent across speakers and domains not
      supporting route prioritization. This is because PEC is defined as a
      transitive extended-community.

      - If priority of received paths is not marked with a PEC, the same
      effect can be achieved by local configuration.

      - Reception of routes from a neighbor not supporting route priority
      does not change. The routes are received with the preference that
      in-message-priority indicates.

      - Advertisement of nets towards a neighbor not supporting route
      priority does not change. The routes are advertised with the
      preference that net-priority indicates.

      Note that if routes are advertised in the order determined by their
      net-priority to a downstream speaker not supporting route
      prioritization, there is a high probability that this speaker will
      process those routes in the same (or approximately the same) order
      in which it received them, since it will most likely treat them in
      a FIFO or quasi-FIFO fashion. Thus, introducing a single speaker
      supporting route prioritization upstream in the network can
      significantly improve the overall prioritization across the entire
      route propagation path.

   6. Rationale behind network wide priorities

      This proposal develops a comprehensive use of a network wide
      priority as a method to give preferential treatment to some routes.
      Out of all the possible design alternatives, the choices were based
      on flexibility, performance and stability. Among them, the
      following ones can be pointed out:

     -  PECs can be used to signal path-priorities for unreachable NLRIs
        (aka withdrawals). In an implementation without priorities, any
        attributes are meaningless when associated with unreachable NLRIs,
        but there is nothing in the BGP protocol rules ([RFC4271]) to
        prevent their use. Note that implementations could use other
        attributes (besides PECs) associated with unreachable NLRIs.

     -  An implementation SHOULD send one and only one PEC, but it SHOULD
        also accept multiple PECs or no PECs at all. With only
        "well-behaved" implementations and configurations, this precaution
        is not necessary; but the proposal's design provides for it under
        the philosophy "be liberal with what you receive, be conservative
        with what you send".

     -  When a net with a net-priority=0 is sent, the options are to set
        PEC explicitly (PEC=0) or implicitly (PEC=NULL). Both options are
        equally valid and there is no chance for confusion. Consider,
        however, the case of nets coming from two speakers, one
        supporting route priority and one not supporting it. They traverse
        a transparent speaker (i.e. one that just forwards nets with the
        PECs it received). In this case, confusion is possible: a router
        downstream using route prioritization won't be able to distinguish
        the two sets of routes (and it's possible that its requirements
        dictate differentiating both cases). The drawbacks of using an
        explicit PEC=0 are that some extra bytes need to be added to the
        update messages of the lowest net-priority routes, and that more
        update messages might be transmitted (consider the case above,
        where a transparent speaker sends routes with both PEC=0 and
        PEC=NULL: these routes cannot be packed in the same message).

     -  It's desirable for a given prefix to have the same priority across
        the network. Propagating the priority of the best-path maximizes
        the chances of this happening. There is no absolute guarantee,
        however, since not all the speakers have to select the same best-
        path, according to BGP propagation and best-path selection rules
        ([RFC4271]).

     -  When path-priorities are different for a given net, a different
        approach could have been chosen to determine net-priority (other
        than using the path-priority of the best-path). An alternative
        method, however, could potentially create a chicken-and-egg
        situation. Consider, for instance, a proposal that chooses as
        net-priority the highest path-priority of all the paths. Consider
        also the case of two speakers back to back, mutually advertising
        routes for a given prefix between them, neither of them using the
        other's route as best-path. The mutually advertised routes could
        have a higher priority than the best-paths. This would be a
        self-sustained state that would remain no matter what other PECs
        are received from other peers.

   7. Security Considerations

      This document introduces no new security concerns to BGP or other
      specifications referenced in this document.

   8. IANA Considerations


   9. References

   [RFC4271] Rekhter, Y., and T. Li, "A Border Gateway Protocol 4
      (BGP-4)", RFC 4271, January 2006.

   [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
      Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC4360] Sangli, S., Tappan, D., and Y. Rekhter, "BGP Extended
      Communities Attribute", RFC 4360, February 2006.

   [RFC3692] Narten, T., "Assigning Experimental and Testing Numbers
      Considered Useful", BCP 82, RFC 3692, January 2004.

   [RFC2234] Crocker, D., Ed., and P. Overell, "Augmented BNF for
      Syntax Specifications: ABNF", RFC 2234, November 1997.

   Copyright (c) 2011 IETF Trust and the persons identified as authors of
      the code. All rights reserved.

      Redistribution and use in source and binary forms, with or without
      modification, is permitted pursuant to, and subject to the license
      terms contained in, the Simplified BSD License set forth in Section
      4.c of the IETF Trust's Legal Provisions Relating to IETF Documents

   10. Authors' Addresses

      Juan Alcaide
      7025 Kit Creek Rd RTP-NC 27709

      Fernando Calabria
      7025 Kit Creek Rd RTP-NC 27709
