

Border Gateway Protocol (BGP) cheat sheet


RFC 1771 BGPv4
RFC 1772 an application of BGP in the Internet
RFC 1773 Experience with BGP-4
RFC 1774 BGP-4 protocol analysis
RFC 1863 BGP/IDRP
RFC 1930 guidelines for creation, selection, and registration of an AS
RFC 1965 BGP4 confederations
RFC 1997 BGP communities attribute
RFC 1998 an application of the BGP community attribute in multi-home Routing
RFC 2042 registering new BGP attribute types
RFC 1745 OSPF interactions
RFC 2439 route flap dampening
RFC 2796 route reflection
RFC 2842 Capabilities Advertisement
RFC 2918 Route Refresh Capability
RFC 1269 Managed Objects for BGP
RFC 2385 BGP Session Protection via TCP MD5
RFC 2283 Multiprotocol Extensions for BGP-4

TCP port 179

~> path vector protocol used to carry routing info between Autonomous Systems (AS)
AS: set of routers under a single technical administration
creates loop-free interdomain routing between ASs

AS: 16 bit number: 1-65535 (private AS: 64512-65535)

2 BGP speakers form a TCP connection and exchange messages to open and confirm the connection parameters.

Any 2 routers that have formed a TCP connection in order to exchange BGP routing information are called peers or neighbors

BGP routers exchange network reachability information, called path vectors, made up of path attributes
This info is mainly an indication of the full paths (BGP AS numbers) that a route should take in order to reach the dest net.
This info helps in constructing a graph of ASs that is loop-free and where routing policies can be applied in order to enforce some restrictions on the routing behavior

BGP runs over TCP, so it assumes that its communication is reliable; it doesn't implement any retransmission or error recovery mechanism of its own.

BGP peers


When the connection is made, BGP peers initially exchange their full BGP routing tables.
After this exchange, BGP routers need only send incremental updates as the routing table changes

Periodic routing updates are not required on a reliable link, so triggered updates are used; to check that the peer is still alive, BGP sends keepalive msgs

UPDATE msg:
Contains (among other things): list of <length, prefix> tuples that indicate the dsts that can be reached via the BGP speaker; also contains path attributes.
In the event that a route becomes unreachable: the BGP speaker informs its neighbors by withdrawing the invalid route <- withdrawn routes are part of the UPDATE msg.


BGP keeps a version of the BGP table, which should be the same for all of its peers. The version changes whenever BGP updates the table due to routing info changes.


BGP routers exchange network reachability information  (path vector) made up of path attributes, including the list of the full path (of BGP AS numbers) that a route should take to reach a dest net.
This path info is used in constructing a graph of ASs that is loop-free.
BGP router will not accept a routing update that already includes its AS number in the path list (this would mean that the update has already passed through its AS <- accepting it again would result in a routing loop).
Routing policies can also be applied to the path of BGP AS numbers to enforce some restrictions on the routing behavior.
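
The loop check described above can be sketched in a few lines. This is a minimal illustration, not router code; the function name and the AS numbers are made up for the example.

```python
from typing import List


def accepts_update(local_as: int, as_path: List[int]) -> bool:
    """Return False when the local AS already appears in the received AS path."""
    return local_as not in as_path


# Hypothetical private AS numbers, used only for illustration.
clean_path = [65002, 65003]
looped_path = [65002, 65001, 65003]

print(accepts_update(65001, clean_path))   # path never crossed AS 65001
print(accepts_update(65001, looped_path))  # path already crossed AS 65001
```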



When to use BGP:
1. the AS has multiple connections to other ASs
2. the AS allows packets to transit through it to reach other ASs
3. flow of traffic entering and leaving the AS must be manipulated

When not to use BGP:
1. a single connection to the Internet or another AS
2. no concern for routing policy and route selection
3. lack of memory or processor power on the router
4. limited understanding of route filtering and the BGP path selection process
5. low bandwidth between ASs


History

BGP history

BGP Messages Types

1. OPEN - starts the BGP peering session
2. UPDATE - transfers all necessary routing info between peers
3. NOTIFICATION - stops the BGP peering session
4. KEEPALIVE - keeps the session active so that the established peers know that each other is still alive

~ max msg 4096 bytes, min msg 19 bytes
~ fixed header (19 bytes)


1. OPEN

After the transport connection has been established, an OPEN msg is sent.

OPEN msg:
       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
       +-+-+-+-+-+-+-+-+
       |    Version    |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |     My Autonomous System      |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |           Hold Time           |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                        BGP Identifier                         |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       | Opt Parm Len  |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                                                               |
       |                      Optional Parameters                      |
       |                                                               |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+


> Version
   - 1-octet unsigned integer
   - ver 4

> AS
   - 2-octet unsigned integer

> Hold Time
   - number of seconds that the sender proposes for the value of the holdtimer
     ~> each peer uses the smaller of its configured holdtimer and the holdtimer received in the OPEN msg
   - at least 6 sec (in JUNOS)

> BGP Identifier
   - 4-octet unsigned integer
   - BGP identifier of the sender, typically router's loopback IP address

> Optional Parameters Length
   - 1-octet unsigned integer
   - total length of the Optional Parameters field in octets

> Optional Parameters
   - each parameter is encoded as a <Parameter Type, Parameter Length, Parameter Value> triplet
   - Parameter Type: 1-octet unsigned integer, unambiguously identifies individual parameters
   - Parameter Length: 1-octet unsigned integer, contains the length of Parameter Value field in octets
   - Parameter Value: variable length,  is interpreted according to the value of the Parameter Type field

+ Capabilities (RFC 2842): uses Optional Parameter Type 2 to introduce new features into BGPv4.
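
As a rough sketch of the field layout above, the OPEN body (the part after the 19-octet header) can be unpacked with Python's struct module. The function name, the returned dict keys and the sample values (AS 65001, hold time 90, ID 10.0.0.1) are assumptions made for this example only.

```python
import socket
import struct


def parse_open(body: bytes) -> dict:
    """Parse an OPEN msg body (without the 19-octet msg header)."""
    if len(body) < 10:
        raise ValueError("OPEN body needs at least 10 octets")
    # Version (1), My AS (2), Hold Time (2), BGP Identifier (4), Opt Parm Len (1)
    version, my_as, hold_time, bgp_id, opt_len = struct.unpack("!BHH4sB", body[:10])
    rest = body[10:10 + opt_len]
    params = []
    while rest:
        if len(rest) < 2:
            raise ValueError("truncated optional parameter")
        p_type, p_len = rest[0], rest[1]
        params.append((p_type, rest[2:2 + p_len]))  # <Type, Length, Value> triplet
        rest = rest[2 + p_len:]
    return {
        "version": version,
        "as": my_as,
        "hold_time": hold_time,
        "bgp_id": socket.inet_ntoa(bgp_id),
        "params": params,
    }


# Sample body: one optional parameter of type 2 (Capabilities) with a 6-octet value.
sample_open = struct.pack("!BHH4sB", 4, 65001, 90, socket.inet_aton("10.0.0.1"), 8)
sample_open += bytes([2, 6, 1, 4, 0, 1, 0, 1])
parsed_open = parse_open(sample_open)
print(parsed_open)
```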


2. UPDATE

Responsible for disseminating all the necessary routing info between BGP speakers
Also responsible for withdrawing routing info
Includes info about BGP metrics <~ path attributes

Contains the active prefixes and their related attributes

Can contain feasible routes, withdrawn routes, path attributes and the appropriate Network Layer Reachability Information (NLRI)

UPDATE msg:
      +-----------------------------------------------------+
      |   Unfeasible Routes Length (2 octets)               |
      +-----------------------------------------------------+
      |  Withdrawn Routes (variable)                        |
      +-----------------------------------------------------+
      |   Total Path Attribute Length (2 octets)            |
      +-----------------------------------------------------+
      |    Path Attributes (variable)                       |
      +-----------------------------------------------------+
      |   Network Layer Reachability Information (variable) |
      +-----------------------------------------------------+


> Unfeasible Routes Length
   - 2-octet unsigned integer
   - indicates the total length of the Withdrawn Routes field in octets
   - 0: no routes are being withdrawn

> Withdrawn Routes
   - variable length
   - list of IP prefixes for the routes that are being withdrawn
   - each route is encoded as a (Length, Prefix) tuple, where the Length is the number of bits in the subnet mask and the Prefix is the IPv4 NLRI

> Total Path Attribute Length
   - 2-octet unsigned integer
   - indicates the total length of the Path Attributes field in octets
   - 0: no Network Layer Reachability Information

> Path Attributes
   - variable length
   - present in every UPDATE msg that advertises NLRI
   - each Path Attribute is encoded as a <Attribute Type, Attribute Length, Attribute Value> triplet

> Network Layer Reachability Information
   - variable length
   - lists the routes advertised to the remote peer
   - each route is encoded as a (Length, Prefix) tuple, where the Length is the number of bits in the subnet mask and the Prefix is the IPv4 NLRI
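
The (Length, Prefix) encoding used by both Withdrawn Routes and NLRI can be decoded roughly as follows. This is a sketch for IPv4 only; it assumes the prefix occupies just enough octets to hold Length bits, and the function name and sample bytes are made up for illustration.

```python
import ipaddress
from typing import List


def parse_prefixes(data: bytes) -> List[ipaddress.IPv4Network]:
    """Decode a run of (Length, Prefix) tuples into IPv4 networks."""
    prefixes = []
    i = 0
    while i < len(data):
        bits = data[i]
        if bits > 32:
            raise ValueError("prefix length above 32 bits")
        nbytes = (bits + 7) // 8            # octets actually present on the wire
        raw = data[i + 1:i + 1 + nbytes]
        if len(raw) < nbytes:
            raise ValueError("truncated prefix")
        addr = ipaddress.IPv4Address(raw + bytes(4 - nbytes))  # pad to 4 octets
        prefixes.append(ipaddress.ip_network(f"{addr}/{bits}", strict=False))
        i += 1 + nbytes
    return prefixes


# 10.0.0.0/8 takes 1 prefix octet, 192.168.4.0/22 takes 3.
sample_nlri = bytes([8, 10, 22, 192, 168, 4])
print([str(p) for p in parse_prefixes(sample_nlri)])
```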


BGP UPDATE msg: includes a variable-length seq of path attributes describing the routes

3. NOTIFICATION

Notification msg is sent when a BGP router encounters an error that is fatal enough to warrant termination of the peering session
-> sent to the remote peer, after which both the BGP & TCP sessions are closed immediately

One common reason: intervention by user: manually clearing a BGP connection <~ generates a Cease

NOTIFICATION msg:
        0                   1                   2                   3
        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       | Error code    | Error subcode |           Data                |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                               +
       |                                                               |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+



> Error Code
   - 1-octet unsigned integer
   - type of Notification

> Error Subcode
   - 1-octet unsigned integer
   - more specific info, each error code might have one or more error subcodes associated with it

> Data/ description
   - variable length
   - reason for the Notification


Error Code   Error Subcode   Description
1                            Msg Header Error
             1               Connection Not Synchronized
             2               Bad Msg Length
             3               Bad Msg Type
2                            Open Msg Error
             1               Unsupported Version Number
             2               Bad Peer AS
             3               Bad BGP ID
             4               Unsupported Optional Parameter
             5               Authentication Failure
             6               Unacceptable Holdtime
             7               Unsupported Capability
3                            Update Msg Error
             1               Malformed Attribute List
             2               Unrecognized Well-known Attribute
             3               Missing Well-known Attribute
             4               Attribute Flag Error
             5               Attribute Length Error
             6               Invalid ORIGIN Attribute
             7               AS Routing Loop
             8               Invalid NEXT_HOP Attribute
             9               Optional Attribute Error
             10              Invalid Network Field
             11              Malformed AS_PATH
4                            Holdtime Expired
5                            Finite State Machine Error
6                            Cease
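
A small decoder for the NOTIFICATION body, as a sketch: it maps the six error codes listed above to names and returns the subcode and data untouched. The function name and the sample bytes are assumptions for this example.

```python
ERROR_CODES = {
    1: "Msg Header Error",
    2: "Open Msg Error",
    3: "Update Msg Error",
    4: "Holdtime Expired",
    5: "Finite State Machine Error",
    6: "Cease",
}


def parse_notification(body: bytes):
    """Split a NOTIFICATION body into (error name, subcode, data)."""
    if len(body) < 2:
        raise ValueError("NOTIFICATION body needs at least 2 octets")
    code, subcode = body[0], body[1]
    return ERROR_CODES.get(code, f"Unknown ({code})"), subcode, body[2:]


# Code 2 / subcode 2 is "Bad Peer AS"; the two data octets are illustrative.
sample_notification = bytes([2, 2, 0xFD, 0xE9])
print(parse_notification(sample_notification))
```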


4. KEEPALIVE

Keeps the BGP peering session alive

- contains only the 19-octet msg header and no other data
- exchanged at one-third of the negotiated hold-time value
- an UPDATE msg sent within the keepalive period resets the timer to 0
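
The timer arithmetic can be sketched as below: the hold time is the smaller of the two proposed values and keepalives go out at one-third of it. The function names are made up; treating a hold time of 0 as "no keepalives" is an assumption taken from RFC 1771 rather than from these notes.

```python
def negotiate_hold_time(local: int, remote: int) -> int:
    """Both peers settle on the smaller of the two proposed hold times."""
    return min(local, remote)


def keepalive_interval(hold_time: int) -> float:
    """Keepalives are sent at one-third of the negotiated hold time."""
    if hold_time == 0:
        return 0.0  # assumption: hold time 0 disables keepalives
    return hold_time / 3


hold = negotiate_hold_time(90, 180)
print(hold, keepalive_interval(hold))
```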


Fixed msg header (common to all msg types):

> Marker
   - 16-octet field
   - contains a value that the receiver of the msg can predict
   - can be used to detect loss of synchronization between a pair of BGP peers, and to authenticate incoming BGP msgs

> Length
   - 2-octet unsigned integer
   - indicates the total length of the msg, including header, in octets

> Type
   - 1-octet unsigned integer
   - indicates the type code of the msg


Type Code   Msg Type
1           OPEN
2           UPDATE
3           NOTIFICATION
4           KEEPALIVE
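
A minimal sketch of reading the 19-octet header (Marker, Length, Type) with struct. The function name is made up, and the all-ones marker in the sample follows RFC 1771 for msgs without authentication, which is an assumption beyond these notes.

```python
import struct

HEADER_LEN = 19
MAX_MSG_LEN = 4096
MSG_TYPES = {1: "OPEN", 2: "UPDATE", 3: "NOTIFICATION", 4: "KEEPALIVE"}


def parse_header(data: bytes):
    """Return (marker, total length, msg type name) from the fixed header."""
    if len(data) < HEADER_LEN:
        raise ValueError("need at least 19 octets")
    # 16-octet marker, 2-octet length, 1-octet type, network byte order
    marker, length, msg_type = struct.unpack("!16sHB", data[:HEADER_LEN])
    if not HEADER_LEN <= length <= MAX_MSG_LEN:
        raise ValueError("bad msg length")
    if msg_type not in MSG_TYPES:
        raise ValueError("bad msg type")
    return marker, length, MSG_TYPES[msg_type]


# A KEEPALIVE is just the header: marker of all ones, length 19, type 4.
keepalive = b"\xff" * 16 + struct.pack("!HB", 19, 4)
print(parse_header(keepalive))
```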



BGP Attributes

Attributes are a set of parameters that describe the various characteristics of a path to a dest IP prefix.  They are used extensively in the route selection process to choose the best of multiple routes, and to build routing policies by matching and setting attributes.

BGP sends UPDATE msgs about dst nets -> these include info about BGP metrics <~ path attributes

There are four categories of attributes:

* well-known:
  - all BGP implementations must recognize
  - propagated to BGP neighbors

       1. well-known mandatory:
           - must appear in the description of a route
       ~> must be present in an UPDATE msg and must be implemented by all BGP speakers

       2. well-known discretionary:
           - does not need to appear in a route description
       ~> may be present in an UPDATE msg and must be implemented by all BGP speakers


* optional:
  - need not be supported by all BGP implementations
  - may be propagated to BGP neighbors

       3. optional transitive:
           - an attribute that is not implemented in a router should be passed to other BGP routers untouched
       ~> may be present in an UPDATE msg and may be implemented by a BGP speaker; passed on to other BGP speakers, even if it is not understood by the local BGP speaker
           - only optional transitive attributes may be marked as partial

       4. optional nontransitive:
           - attribute must be deleted by a router that has not implemented the attribute
       ~> may be present in an UPDATE msg and may be implemented by a BGP speaker; ignored and not passed on to other BGP speakers if it is not understood by the local BGP speaker


    * Well-known Mandatory Attributes:
          o Next Hop
          o AS Path
          o Origin
    * Well-known Discretionary Attributes:
          o Local Preference (32 bits)
          o Atomic Aggregate
    * Optional Transitive Attributes:
          o Aggregator
          o Community
          o Extended Communities
    * Optional Non-transitive Attributes:
          o MED - 'Multi Exit Discriminator' or 'Metric' (32 bits)
          o ORIGINATOR_ID
          o CLUSTER_LIST
          o MP_REACH_NLRI
          o MP_UNREACH_NLRI

BGP attributes


Attribute Format:

BGP updates msg include a variable length seq of path attributes describing the routes
consist of 3 fields:
1. attribute type: 1-byte attribute flags fields & 1-byte attribute code field
2. attribute length
3. attribute value

0                                                           1
0     1     2     3     4     5     6     7     8     9     0     1     2     3     4     5
+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+
| W/O | N/T | C/P | EL  |    unused (4 bits)    |             Attribute Type Code               |
+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+
|     Attribute Length Code (variable)         |         Attribute Value (variable)             |
+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+

bit 0: Well-known (0) or Optional (1)
bit 1: Non-transitive (0) or Transitive (1)
bit 2: Complete (0) or Partial (1)
bit 3: Extended Length, defines if the attribute length is 1 byte (0) or 2 bytes (1)
bits 4-7: unused & set to 0
Attribute Type Code: type code of the attribute
Attribute Length: length of the attribute value

- Complete: attribute was passed along the entire path
- Partial: a router in the path did not implement an attribute, routing information may be lost (unlikely)
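
The flag bits can be pulled apart with simple masks. This sketch assumes bit 0 is the high-order bit of the flags octet (as in RFC 1771); the function name and dict keys are made up for the example.

```python
def decode_attribute(data: bytes) -> dict:
    """Decode one path attribute: flags, type code, length and value."""
    flags, type_code = data[0], data[1]
    extended = bool(flags & 0x10)        # bit 3: length field is 2 octets
    if extended:
        length = int.from_bytes(data[2:4], "big")
        start = 4
    else:
        length = data[2]
        start = 3
    return {
        "optional": bool(flags & 0x80),    # bit 0
        "transitive": bool(flags & 0x40),  # bit 1
        "partial": bool(flags & 0x20),     # bit 2
        "extended_length": extended,
        "type_code": type_code,
        "value": data[start:start + length],
    }


# ORIGIN (type code 1): well-known, transitive, 1-octet value 0 (IGP).
origin_attr = decode_attribute(bytes([0x40, 1, 1, 0]))
print(origin_attr)
```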
 

Type Code  Attribute               Type                      Preference
1          ORIGIN                  well-known mandatory      lowest origin (IGP < EGP < incomplete)
2          AS_PATH                 well-known mandatory      shortest path
3          NEXT_HOP                well-known mandatory      shortest path or IGP metric
4          MED                     optional nontransitive    lowest value
5          LOCAL_PREF              well-known discretionary  highest value
6          ATOMIC_AGGREGATE        well-known discretionary  info not used in path selection
7          AGGREGATOR              optional transitive       info not used in path selection
8          COMMUNITY               optional transitive       info not used in path selection
9          ORIGINATOR_ID           optional nontransitive    info not used in path selection
10         CLUSTER_LIST            optional nontransitive    info not used in path selection
11         DPA
12         ADVERTISER
13         RCID_PATH / CLUSTER_ID
14         MP_REACH_NLRI           optional nontransitive
15         MP_UNREACH_NLRI         optional nontransitive
16         EXTENDED COMMUNITIES    optional transitive
17         WEIGHT                  cisco-defined             highest value (1st thing to check in Cisco routers)
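
The preferences in the table can be combined into a single comparison key. This is a simplified sketch: the order of the tie-breakers (weight, LOCAL_PREF, AS_PATH length, ORIGIN, MED) follows the commonly documented Cisco order and is an assumption here, real implementations apply more steps, and the Path class and its defaults are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Path:
    name: str
    weight: int = 0          # Cisco-only, highest wins
    local_pref: int = 100    # highest wins
    as_path: List[int] = field(default_factory=list)  # shortest wins
    origin: int = 0          # 0=IGP, 1=EGP, 2=incomplete; lowest wins
    med: int = 0             # lowest wins


def preference_key(p: Path):
    # Negate the "highest wins" fields so that min() picks the best path.
    return (-p.weight, -p.local_pref, len(p.as_path), p.origin, p.med)


def best_path(paths: List[Path]) -> Path:
    return min(paths, key=preference_key)


short = Path("short", as_path=[65002, 65010])
preferred = Path("preferred", local_pref=200, as_path=[65003, 65004, 65010])
print(best_path([short, preferred]).name)
```

Here the longer path wins because LOCAL_PREF is compared before AS_PATH length.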


AS_PATH attribute (well-known mandatory)
- Type Code: 2
- contains a list of the ASs the prefix has traversed
- also provides info for loop detection by looking for local system’s AS in the path
- a BGP speaker’s own AS is prepended to the AS path when advertised with EBGP, but not with IBGP
- the AS that most recently advertised the prefix is always at the left side of the AS path, transit ASs are in the middle, and the originating AS is at the right end of the AS path
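
The prepend rule can be sketched as follows; the function name and AS numbers are made up for illustration.

```python
from typing import List


def advertised_as_path(as_path: List[int], local_as: int, ebgp: bool) -> List[int]:
    """Prepend the local AS on EBGP sessions; leave the path alone on IBGP."""
    return [local_as] + list(as_path) if ebgp else list(as_path)


received = [65002, 65010]  # 65010 originated the prefix (rightmost)
print(advertised_as_path(received, 65001, ebgp=True))
print(advertised_as_path(received, 65001, ebgp=False))
```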


NEXT_HOP attribute (well-known mandatory)
- Type Code: 3
- IP address of the router which should be used as the BGP next hop to the dest
- router makes a recursive lookup to find the BGP next hop in the routing table
- usually the IGP has a route to the BGP next hop, or it could be a static route or directly connected interface
- this next hop must be reachable
> rules:
1. EBGP sets the next hop addr to the IP addr of the peer that advertises the prefix
2. IBGP sets the next hop addr to the IP addr of the peer that advertises the prefix for routes that originate internally
3. IBGP passes the next hop unaltered for prefixes that are learned with EBGP
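
The three next-hop rules can be written as one small function. A sketch only: the function name, parameter names and addresses are hypothetical, and `received_next_hop=None` is used here to mean "route originated inside the AS".

```python
from typing import Optional


def advertised_next_hop(session: str, advertiser_ip: str,
                        received_next_hop: Optional[str] = None) -> str:
    """Pick the NEXT_HOP to put in an advertisement.

    session: "ebgp" or "ibgp"
    received_next_hop: next hop of an EBGP-learned route, None if internal
    """
    if session == "ebgp":
        return advertiser_ip              # rule 1
    if received_next_hop is None:
        return advertiser_ip              # rule 2: internally originated
    return received_next_hop              # rule 3: EBGP-learned, unaltered


print(advertised_next_hop("ebgp", "192.0.2.1"))
print(advertised_next_hop("ibgp", "10.0.0.1"))
print(advertised_next_hop("ibgp", "10.0.0.1", received_next_hop="192.0.2.1"))
```

Rule 3 is why the EBGP next hop must be reachable inside the AS, usually via the IGP.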


LOCAL_PREFERENCE attribute (well-known discretionary)
- Type Code 5
- provides an indication to routers in the AS about which path to use to exit the AS
- path with a higher local preference is preferred
- attribute that is configured on a router and exchanged only among routers within the same AS (cisco default value 100)


MED attribute (optional nontransitive)
- Type Code 4
- Multi-exit-discriminator, was known as the inter-AS attribute in BGP-3
- an indication to external neighbors about the preferred path into an AS
- dynamic way for an AS to influence another AS on which way it should choose to reach a certain route, if there are multiple entry points into an AS
- a lower metric is preferred
- unlike LOCAL_PREFERENCE, MED is exchanged between ASs: it is carried into the neighboring AS but not passed on to the next AS


ORIGIN attribute (well-known mandatory)
- Type Code 1
-  defines the origin of the path info
> it can have the following values:
       0: IGP: the prefix was learned through an IGP
       1: EGP: the prefix was learned through EGP
       2: Incomplete: the origin of the route is unknown, usually the prefix was learned through redistribution or as an aggregate


COMMUNITIES attribute (optional transitive)
- Type Code 8
- provides a way to logically classify a prefix for use in policies by attaching an ID that is significant within a network
- allows routers to tag routes with an ID and allows other routers to make decisions based upon that tag
- tag can be set on: incoming updates, outgoing updates or route redistribution
- one way to filter incoming or outgoing routes

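WEIGHT attribute (Cisco only)
- local to the router on which it is set; not sent to other routers
- path with the highest weight is preferred (1st thing checked in Cisco routers)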


ATOMIC_AGGREGATE attribute (well-known discretionary)
- Type Code 6
- is set if a router advertises an aggregate that causes path attribute information to be lost
- if ATOMIC_AGGREGATE is set, then the prefix should not be disaggregated


AGGREGATOR attribute (optional transitive)
- Type Code 7
- specifies the router ID and AS of the router that originated an aggregate prefix
- useful for debugging


EXTENDED COMMUNITIES attribute (optional transitive)
- Type Code 16
- same concept as the COMMUNITIES attribute, but allows a wider range of values and has a more defined structure
- 8 bytes in length, compared with 4 bytes for COMMUNITIES







Tuning BGP

IP connectivity must be achieved via a protocol different from BGP -> otherwise the session will be in a race condition


Physical VS Logical connection:

Generally: for an EBGP session, a route through a directly connected interface establishes IP reachability.

Indirectly connected EBGP neighbors require extra config: multihop EBGP

Internal neighbours do not have this restriction: IBGP peers may be physically connected or separated by multiple IP hops


BGP Continuity Inside an AS:

Aside from the special case of route reflection, in order to avoid routing information loops inside an AS, BGP does not readvertise to IBGP peers routes that are learned from other IBGP peers.

==> it is important to maintain a full IBGP mesh within the AS
==> every BGP router in the AS has to establish a BGP session with all other BGP routers inside the AS
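
The cost of the full mesh grows quickly: n routers need n(n-1)/2 IBGP sessions. A quick sketch (function name made up):

```python
def ibgp_full_mesh_sessions(routers: int) -> int:
    """Number of IBGP sessions needed so every router peers with every other."""
    return routers * (routers - 1) // 2


for n in (2, 5, 10, 50):
    print(n, ibgp_full_mesh_sessions(n))
```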


Synchronization Within an AS

Default behaviour: BGP must be synchronized with the IGP before it may advertise transit routes to external ASs.
~> whenever a router receives an update about a dst from an IBGP peer, the router tries to verify reachability of that dst before advertising it to EBGP peers, by checking:
1. whether a route to the next-hop router exists
2. whether the dst prefix exists in the IGP

Synchronization: a BGP router should not advertise to external neighbours dsts learned from IBGP neighbors unless those dsts are also known via the IGP.
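
The synchronization check can be sketched as a predicate. The function and parameter names are hypothetical, and the IGP table is modelled as a plain set of prefix strings.

```python
from typing import Set


def may_advertise_to_ebgp(prefix: str, next_hop_reachable: bool,
                          igp_prefixes: Set[str],
                          synchronization: bool = True) -> bool:
    """Decide whether an IBGP-learned prefix may be advertised to EBGP peers."""
    if not next_hop_reachable:
        return False                  # check 1: route to the next hop exists
    if not synchronization:
        return True                   # synchronization turned off
    return prefix in igp_prefixes     # check 2: prefix also known via IGP


igp = {"10.1.0.0/16"}
print(may_advertise_to_ebgp("10.1.0.0/16", True, igp))
print(may_advertise_to_ebgp("10.2.0.0/16", True, igp))
print(may_advertise_to_ebgp("10.2.0.0/16", True, igp, synchronization=False))
```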

...but synchronization can be turned off (no synchronization) to override this requirement.
In most situations synchronization can safely be turned off on the border routers, assuming that all transit routers in the AS are running fully meshed IBGP.

By far the most common config is to disable BGP synchronization and rely on a full mesh of IBGP routers



Sources of Routing Update:

1.    Dynamically injected routes: come and go from the BGP routing table <~ depending on the status of the networks they identify.
2.    Statically injected routes: constantly maintained by the BGP routing tables <~ regardless of the status of the networks they identify.


Injecting Info Dynamically into BGP
1.    Purely dynamic: all the IGP routes are redistributed into BGP: redistribute subcmd
2.    Semidynamic: only certain IGP routes are to be injected into BGP: network subcmd

This distinction reflects both the level of user intervention and the level of ctrl in defining the routes to be advertised

Redistributing the whole IGP into BGP could result in some unwanted info being leaked into BGP (ex: private / illegal addrs)

Mutual redistribution: when redistribution occurs in both directions
-> info that was being injected from the outside into the AS could be sent back to the Internet as having originated from the AS.
Remedy: special filter should be put on the border routers to specify what particular networks should be injected from the IGP into BGP.

In the Cisco implementation: ext OSPF routes are automatically blocked from being redistributed into BGP (this can be overridden). For protocols (such as RIP or IGRP) that do not distinguish between int and ext routes, special route tagging should be performed


Unstable routes:
Injecting the IGP routes into BGP dynamically or semidynamically -> route fluctuation within AS -> ripple effect through other networks attached to the Internet

Route dampening penalizes and ultimately discontinues advertisement of fluctuating routes, depending on their degree of instability
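
A sketch of how the penalty might evolve, assuming the exponential-decay model of RFC 2439. All four constants (1000 per flap, suppress at 2000, reuse at 750, 15-minute half-life) are assumed typical defaults, not values from these notes, and the function names are made up.

```python
import math
from typing import List

HALF_LIFE = 900.0        # seconds (assumed)
FLAP_PENALTY = 1000.0    # added per flap (assumed)
SUPPRESS_LIMIT = 2000.0  # stop advertising above this (assumed)
REUSE_LIMIT = 750.0      # advertise again below this (assumed)


def decay(penalty: float, elapsed: float, half_life: float = HALF_LIFE) -> float:
    """Penalty halves every half_life seconds."""
    return penalty * 0.5 ** (elapsed / half_life)


def penalty_after_flaps(flap_times: List[float]) -> float:
    """Penalty right after the last flap in a sorted list of timestamps."""
    penalty, last = 0.0, None
    for t in flap_times:
        if last is not None:
            penalty = decay(penalty, t - last)
        penalty += FLAP_PENALTY
        last = t
    return penalty


def seconds_until_reuse(penalty: float) -> float:
    """Time for the penalty to decay down to the reuse limit."""
    if penalty <= REUSE_LIMIT:
        return 0.0
    return HALF_LIFE * math.log2(penalty / REUSE_LIMIT)


p = penalty_after_flaps([0, 60, 120])  # three flaps, one minute apart
print(p > SUPPRESS_LIMIT, round(seconds_until_reuse(p)))
```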

One way to minimize route instability is through aggregation

Aggregation could be done on the customer boundary or the provider boundary
If done on the customer boundary, it relieves the provider of the customer's route fluctuations.
If done on the provider boundary, customer fluctuation would leak to provider but would not be propagated to the Internet.

Another way of ctrling route instability is to decouple route advertisement from the existence of the route itself : static injection of routes


Injecting Info Statically into BGP

Injecting info statically into BGP has proven to be the most effective method of ensuring route stability.

To statically inject info into BGP: IGP routes (or aggregates) that need to be advertised are manually defined as static routes <~ ensuring that these routes will never disappear

If a route is advertised to the Internet from a single point:
advertising a route that is actually down is not a big issue.

If a route is advertised to the Internet from multiple points:
advertising the route statically at all times might end up black-holing the traffic.

Advertisement can be done by redistributing all the static routes via the redistribute cmd or a subset of the static routes via the network cmd.


ORIGIN of Routes

BGP considers networks advertised via the network cmd or via aggregation as being internal to the AS, and will set the ORIGIN attribute of each such route to IGP (i)

Whenever a route is injected into BGP via redistribution (whether statically or dynamically) the ORIGIN of the route will be INCOMPLETE, because the redistributed routes could have come from anywhere.

If the route was learned via EGP, the ORIGIN value will be EGP.

Aggregate routes will assume the worst ORIGIN value of all the component routes


Routes                                                 ORIGIN value
Advertised via network cmd                             IGP (i)
Injected via redistribution (statically/dynamically)   INCOMPLETE
Learned via EGP                                        EGP
Aggregate routes                                       worst ORIGIN value of the component routes
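
Since the ORIGIN codes are ordered IGP (0) < EGP (1) < INCOMPLETE (2), "worst" is simply the highest code among the components. A sketch with made-up names:

```python
from typing import List

ORIGIN_NAMES = {0: "IGP", 1: "EGP", 2: "INCOMPLETE"}


def aggregate_origin(component_origins: List[int]) -> int:
    """An aggregate takes the worst (highest) ORIGIN code of its components."""
    return max(component_origins)


components = [0, 2, 0]  # two network-cmd routes and one redistributed route
print(ORIGIN_NAMES[aggregate_origin(components)])
```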




Overlapping Protocols: Backdoors

Backdoor links offer an alt IGP path that can be used instead of the ext BGP path.
IGP routes that can be reached over the backdoor links are called backdoor routes.

The lower a routing protocol's administrative distance -> the higher the preference
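
A sketch of why a backdoor needs special handling: with the usual Cisco default distances (assumed here: EBGP 20, OSPF 110, RIP 120, IBGP 200) the EBGP route beats the IGP route over the backdoor link. Marking the network as a backdoor is modelled as raising the EBGP route's distance to 200, which is also an assumption about the Cisco behaviour; the function name is made up.

```python
from typing import List

# Assumed Cisco default administrative distances.
ADMIN_DISTANCE = {"connected": 0, "static": 1, "ebgp": 20,
                  "ospf": 110, "rip": 120, "ibgp": 200}


def preferred_source(sources: List[str], backdoor: bool = False) -> str:
    """Pick the route source with the lowest administrative distance."""
    def distance(src: str) -> int:
        if backdoor and src == "ebgp":
            return 200  # assumption: backdoor treats the EBGP route like a local BGP route
        return ADMIN_DISTANCE[src]
    return min(sources, key=distance)


print(preferred_source(["ebgp", "ospf"]))                 # EBGP wins by default
print(preferred_source(["ebgp", "ospf"], backdoor=True))  # IGP path over the backdoor wins
```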


Routing Process


Pool of routes that the router receives from its peers
Input Policy Engine: filter the routes or manipulate their attributes
Decision process: decide which routes the router itself will use
Pool of routes that the router itself uses
Output Policy Engine: filter the routes or manipulate their attributes
Pool of routes that the router advertises to other peers



Adj-RIB-In (Adjacent Routing Information Base, inbound)
-    logically associated with each individual peer of the BGP speaker
-    it stores routing info that has been learned from the peer via inbound UPDATE msgs
-    input to the BGP decision process after being manipulated/filtered by the Input Policy Engine

Loc-RIB (Local Routing Information Base)
-    contains only preferred routes that have been selected as the best path to each available dst
-    the result of the BGP decision process after incoming local policies have been applied by the Input Policy Engine


Adj-RIB-Out (Adjacent Routing Information Base, outbound)
-    logically associated with each individual peer of the BGP speaker
-    it stores routing info that the BGP speaker has selected for advertisement to the peer after the Output Policy Engine has been applied
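
The flow Adj-RIB-In -> input policy -> decision -> Loc-RIB -> output policy -> Adj-RIB-Out can be sketched with plain dicts. Everything here is a simplification for illustration: routes are dicts with hypothetical keys, the decision step only compares LOCAL_PREF and AS path length, and a single Adj-RIB-Out is built instead of one per peer.

```python
def run_pipeline(adj_rib_in, input_policy, output_policy):
    """Return (loc_rib, adj_rib_out) built from the per-peer Adj-RIB-In."""
    candidates = {}
    for peer, routes in adj_rib_in.items():
        for route in routes:
            route = input_policy(peer, route)       # may modify or drop (None)
            if route is not None:
                candidates.setdefault(route["prefix"], []).append(route)

    # Decision process: highest local_pref, then shortest AS path.
    loc_rib = {
        prefix: max(rs, key=lambda r: (r["local_pref"], -len(r["as_path"])))
        for prefix, rs in candidates.items()
    }

    adj_rib_out = {}
    for prefix, route in loc_rib.items():
        out = output_policy(route)                  # may modify or drop (None)
        if out is not None:
            adj_rib_out[prefix] = out
    return loc_rib, adj_rib_out


adj_rib_in = {
    "peerA": [{"prefix": "10.0.0.0/8", "local_pref": 100, "as_path": [65002, 65010]}],
    "peerB": [{"prefix": "10.0.0.0/8", "local_pref": 100, "as_path": [65003]},
              {"prefix": "192.168.0.0/16", "local_pref": 100, "as_path": [65003]}],
}
accept_all = lambda peer, route: route
no_private = lambda route: None if route["prefix"].startswith("192.168.") else route

loc_rib, adj_rib_out = run_pipeline(adj_rib_in, accept_all, no_private)
print(sorted(loc_rib), sorted(adj_rib_out))
```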






to be continued...







$Date: Wed Dec 29 17:30:14 CET 2004 $ © 2003-2004 Omar Gani