BGP Load Balancing Notes

Jon Spindler here.


I wanted to share some notes I put together while exploring the load-balancing capabilities of BGP.

Let's get started:


What if you want to do “EQUAL” Multi-Path?


If BGP best-path selection has reached the "metric to next hop" step and the paths are still tied, those paths can be considered for BGP load balancing.


Equal-cost load balancing can be performed across the paths by enabling multipath under the BGP process:


(config-router)#maximum-paths 2


This command instructs the router to install both paths into the RIB and FIB. The paths should show an equal traffic share count in the “show ip route x.x.x.x” and “show ip cef x.x.x.x internal” outputs.
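As a minimal sketch of the above, assuming two eBGP neighbors in the same remote AS, the configuration and checks might look like the following. The neighbor addresses and AS 65001 are placeholders for illustration and do not come from my lab.

```
! Hypothetical addresses and AS numbers, for illustration only
router bgp 123
 neighbor 192.0.2.1 remote-as 65001
 neighbor 198.51.100.1 remote-as 65001
 ! Allow up to two equal eBGP paths to be installed
 maximum-paths 2
!
! Verification from exec mode (the commands named above):
!   show ip route x.x.x.x
!   show ip cef x.x.x.x internal
```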


What if you want to do “UN-EQUAL” Multi-Path?


The solution for this is the DMZ Link Bandwidth Feature.


Basically, this feature allows proportional load balancing based on the bandwidth of the eBGP peering link. It is enabled with one BGP process-level command and one per-neighbor command, like so:


router bgp 123

bgp dmzlink-bw

neighbor X.X.X.X dmzlink-bw


If you now look at the BGP table for a specific prefix learned via that eBGP peer, you will see new information about the bandwidth associated with that prefix advertisement.


It behaves much like a new community value attached to the prefix. The value is proportional to the administrative bandwidth of the eBGP peering interface, so it can be adjusted by manually changing the bandwidth setting on that interface.


Here is an example of the new “show ip bgp x.x.x.x” output, with some lines omitted:


RTR#show ip bgp x.x.x.x


BGP routing table entry for x.x.x.x/32, version 51

Origin IGP, metric 0, localpref 100, valid, internal, multipath DMZ-Link Bw 12500 kbytes


This feature, paired with eBGP multipath enabled for the desired number of paths, allows unequal load balancing across eBGP peers.
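Putting the pieces together, a rough sketch might look like this. It assumes two directly connected eBGP peers reached over interfaces with different configured bandwidths. The interface names, addresses, AS 65001, and bandwidth values are all hypothetical.

```
! Hypothetical interfaces, addresses, and AS numbers
interface GigabitEthernet0/0
 ! Administrative bandwidth in kbit/s (100 Mbit/s)
 bandwidth 100000
 ip address 192.0.2.2 255.255.255.252
!
interface GigabitEthernet0/1
 ! 50 Mbit/s, so this link should carry roughly half the share of Gi0/0
 bandwidth 50000
 ip address 198.51.100.2 255.255.255.252
!
router bgp 123
 ! Process-level enable
 bgp dmzlink-bw
 neighbor 192.0.2.1 remote-as 65001
 neighbor 192.0.2.1 dmzlink-bw
 neighbor 198.51.100.1 remote-as 65001
 neighbor 198.51.100.1 dmzlink-bw
 ! Multipath is still required for both paths to be installed
 maximum-paths 2
```

With these values you would expect a traffic share close to 2:1 in “show ip cef x.x.x.x internal”. That assumes the DMZ Link BW value is taken directly from the interface bandwidth (100000 kbit/s divided by 8 would match the 12500 kbytes in the output above); verify on your own platform.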


What if I want to Load Balance across UN-EQUAL paths being advertised to me from eBGP peers and other iBGP peers within my AS?


Earlier I mentioned that the DMZ Link BW value is attached to the route like a community. Because of that, it can actually be sent between iBGP peers for exactly this purpose! There are a couple of tweaks and considerations to make this possible.


1. You must remove your current multipath command, “maximum-paths X”, which only enables multipath for prefixes received from single-hop eBGP peers. To install a path learned from an iBGP peer (along with its DMZ Link BW information) in the RIB, change the multipath command to cover both eBGP and iBGP via “maximum-paths eibgp X”. Make sure that “X” is large enough to account for all paths you wish to install in the RIB for load balancing.


2. You must exchange extended communities with your iBGP peer. This is done with the “neighbor x.x.x.x send-community extended” command, which should be configured on both neighbors. This allows the iBGP peer to send you the DMZ Link BW information based on its eBGP peering toward the prefix.


3. The iBGP peer you wish to route through to provide an additional path must have the required DMZ Link BW commands configured for its own eBGP peering.
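The three steps above could be sketched across two routers as follows. All addresses and AS 65001 are hypothetical. The “maximum-paths eibgp” line follows the step as I wrote it; whether that form is accepted depends on platform, software release, and address family, so treat it as an assumption to check in your environment.

```
! Local router (10.0.0.1): own eBGP peer plus an iBGP peer
router bgp 123
 bgp dmzlink-bw
 neighbor 192.0.2.1 remote-as 65001
 neighbor 192.0.2.1 dmzlink-bw
 neighbor 10.0.0.2 remote-as 123
 ! Step 2: exchange extended communities with the iBGP peer
 neighbor 10.0.0.2 send-community extended
 ! Step 1: replaces "maximum-paths 2"
 maximum-paths eibgp 2
!
! iBGP peer (10.0.0.2): provides the additional path
router bgp 123
 ! Step 3: DMZ Link BW enabled for its own eBGP peering
 bgp dmzlink-bw
 neighbor 198.51.100.1 remote-as 65001
 neighbor 198.51.100.1 dmzlink-bw
 neighbor 10.0.0.1 remote-as 123
 ! Step 2: send the link bandwidth value toward the local router
 neighbor 10.0.0.1 send-community extended
```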


Once all of this is in place, you should now see your iBGP peer listed in the BGP table for the given prefix like so:


x.x.x.x from neighbor (bgp router id)

Origin IGP, metric 0, localpref 100, valid, internal, multipath DMZ-Link Bw 12500 kbytes -----(this is based on the iBGP peer’s eBGP peering link bandwidth)


Notes from some experts:


A number of points: the Link BW community is an extended non-transitive community. Since it is non-transitive, it should not be passed to an eBGP peer (there were good reasons to limit it). I’m planning to work with the draft authors to address the limitation. Most implementations, however, don’t adhere to the draft, so it is important to address. The DC case with an eBGP underlay would be the most important case. Using a weight attribute attached to a prefix to adjust hashing is an example of weighted ECMP (WECMP), not UCMP (in other words, unequal-cost load balancing).

- Jeff Tantsura


Head of Networking Strategy @ Apstra

------------------------------------------------------

That's a good write-up! One gotcha to consider is that the AS path has to be the same, not just the same length, for multipath to work by default. I wrote some stuff on this a while ago: https://www.alwaysnetworks.co.uk/blog/ebgp-ecmp-depth/

-Nick Shaw

Network Security Consultant and Director @ Always Networks
