In an SD-WAN network, RR stands for Route Reflector. It is a control plane component that addresses the scalability limitations of the Border Gateway Protocol (BGP), which is used to exchange routing information between network sites. By acting as a central hub for BGP route distribution, the Route Reflector (RR) simplifies the network topology and reduces the number of BGP peerings required. This allows the network to scale far beyond the size that would be possible with a traditional full-mesh BGP design.
The challenge of the full-mesh iBGP design
The core problem the RR solves stems from a fundamental rule of Internal BGP (iBGP), which is used for route exchange within a single Autonomous System (AS): an iBGP router does not re-advertise a route learned from one iBGP peer to another iBGP peer. This rule, often called iBGP split horizon, prevents routing loops, but it means that, without route reflection, every iBGP router must peer directly with every other router in a full mesh.
As the number of routers ($N$) increases, the number of required peerings grows quadratically, following the formula $N(N-1)/2$. For example, 100 routers need 4,950 sessions. This quickly becomes unmanageable for large networks, consuming excessive administrative time, CPU, and memory resources.
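To make the growth concrete, the following minimal Python sketch compares the session count of a full mesh with that of a route-reflector design. The function names and the default of two RRs are illustrative assumptions, as is the choice to have the RRs peer with each other in a small full mesh.

```python
def full_mesh_sessions(n: int) -> int:
    """Number of iBGP sessions when every router peers with every other router."""
    return n * (n - 1) // 2


def route_reflector_sessions(n: int, rr_count: int = 2) -> int:
    """Sessions when each of n clients peers only with rr_count RRs.

    Assumption: the RRs also peer with each other in a small full mesh.
    """
    return n * rr_count + full_mesh_sessions(rr_count)


if __name__ == "__main__":
    # Print: routers, full-mesh sessions, sessions with two RRs
    for n in (10, 100, 1000):
        print(n, full_mesh_sessions(n), route_reflector_sessions(n))
```

With these assumptions, 1,000 routers need 499,500 sessions in a full mesh but only 2,001 when two RRs are used.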
How the Route Reflector simplifies SD-WAN
The RR acts as a central hub to break the full-mesh requirement. When a Customer Premises Equipment (CPE) device joins the SD-WAN, it establishes a BGP peering relationship with one or more designated RRs instead of with every other CPE. This changes the logical control plane topology from a complex full mesh to a much simpler hub-and-spoke model.
The RR's operation in an SD-WAN network follows a straightforward set of steps:
- Registration: A CPE establishes a secure control channel (often DTLS or SSL) with the RR and registers its presence on the network.
- Route advertisement: The CPE uses BGP to advertise its local network information and service routes to the RR. This includes details about its transport network connections (e.g., public IP, encapsulation methods).
- Route reflection: Upon receiving a route from one CPE (a "client"), the RR reflects (or re-advertises) that route to all other client CPEs.
- Data channel formation: After receiving the reflected routes from the RR, a CPE has the necessary information to directly establish a data plane tunnel with another CPE, allowing them to exchange service traffic. The RR orchestrates this process without being in the data path.
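The steps above can be sketched as a small in-memory simulation. This is only an illustration of the reflection logic, not a BGP implementation: the class names, the fields of `Route`, and the example addresses are all hypothetical, and the secure control channel is reduced to a simple `register` call.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Route:
    origin: str           # name of the advertising CPE (hypothetical field)
    prefix: str           # local network behind that CPE
    tunnel_endpoint: str  # transport address other CPEs build tunnels to


@dataclass
class Cpe:
    name: str
    learned: list[Route] = field(default_factory=list)

    def receive(self, route: Route) -> None:
        self.learned.append(route)

    def tunnel_peers(self) -> set[str]:
        # Endpoints this CPE can open data plane tunnels to directly
        return {r.tunnel_endpoint for r in self.learned}


class RouteReflector:
    def __init__(self) -> None:
        self.clients: dict[str, Cpe] = {}
        self.routes: list[Route] = []

    def register(self, cpe: Cpe) -> None:
        # A new client first receives the routes already known to the RR
        for route in self.routes:
            cpe.receive(route)
        self.clients[cpe.name] = cpe

    def advertise(self, route: Route) -> None:
        # Reflect the route to every client except the one that sent it
        self.routes.append(route)
        for name, client in self.clients.items():
            if name != route.origin:
                client.receive(route)


if __name__ == "__main__":
    rr = RouteReflector()
    a, b, c = Cpe("A"), Cpe("B"), Cpe("C")
    for cpe in (a, b, c):
        rr.register(cpe)
    rr.advertise(Route("A", "10.1.0.0/24", "198.51.100.1"))
    rr.advertise(Route("B", "10.2.0.0/24", "198.51.100.2"))
    print(sorted(c.tunnel_peers()))
```

Note that the RR object only passes route information along; the tunnels themselves would be built between the CPEs, which mirrors the point that the RR stays out of the data path.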
Benefits of using RRs in SD-WAN
- Massive scalability: With RRs, the number of BGP sessions grows linearly with the number of sites ($O(n)$) instead of quadratically ($O(n^2)$). This allows the network to grow without the BGP overhead becoming unmanageable.
- Simplified configuration: Centralizing route distribution at the RR simplifies configuration management. Network administrators only need to manage peering relationships with the RRs instead of configuring and troubleshooting peerings for every single site.
- Efficient resource utilization: With fewer BGP peerings, less network bandwidth is consumed by BGP updates. On individual CPEs, less memory and CPU are needed to maintain and process a vast number of BGP sessions and route entries.
- Policy enforcement: Because all route advertisements flow through the RR, network administrators can apply centralized policies to influence routing decisions. For example, the controller can instruct the RR not to reflect routes between certain sites, which prevents those sites from forming tunnels with each other.
- Centralized control: The RR is a key component of the SD-WAN's central control plane, working with a network controller to execute policies and orchestrate the overlay network.
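As an illustration of the policy enforcement point, the sketch below shows one way a reflection policy could be modelled: a set of site pairs between which routes must not be reflected. The function name, the site names, and the pair-based policy format are assumptions made for this example; real SD-WAN controllers express such policies in their own vendor-specific ways.

```python
def reflect_targets(
    origin: str,
    clients: set[str],
    blocked_pairs: set[frozenset[str]],
) -> set[str]:
    """Return the clients that should receive a route originated by `origin`.

    A client is skipped if it is the originator itself or if policy blocks
    route exchange (and therefore tunnel formation) between the two sites.
    """
    return {
        client
        for client in clients
        if client != origin
        and frozenset((origin, client)) not in blocked_pairs
    }


if __name__ == "__main__":
    clients = {"hub", "branch1", "branch2"}
    # Hypothetical policy: the two branches must not build a direct tunnel
    blocked = {frozenset(("branch1", "branch2"))}
    print(sorted(reflect_targets("branch1", clients, blocked)))
    print(sorted(reflect_targets("hub", clients, blocked)))
```

In this sketch a route from `branch1` reaches only `hub`, while a route from `hub` reaches both branches, so branch-to-branch traffic would have to pass through the hub.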
Limitations and design considerations
While powerful, the RR model has important considerations:
- Single point of failure: A single RR creates a central point of failure. If it goes down, its clients lose their BGP sessions and can no longer learn new routes or route changes from one another. Redundancy is critical, and SD-WAN solutions address this by deploying multiple RRs in a cluster.
- Sub-optimal routing: By default, the RR selects and advertises only a single "best path" to a destination, chosen from its own point of view, which can cause sub-optimal routing for clients that would have preferred a different path. The BGP "Add-Path" feature can mitigate this by allowing the RR to advertise multiple valid paths, but it requires careful design.
- Increased convergence time: Failure events can increase convergence time because the change in network state must first propagate to the RR, which must then re-evaluate the best path and reflect a new route to all clients. A full-mesh design, though impractical at scale, converges faster because each router has direct visibility of its peers.
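The sub-optimal routing consideration can be shown with a toy model. The sketch below reduces BGP best-path selection to a single "cost to next hop" metric, which is a deliberate simplification (the real decision process has many more steps), and all names and cost values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    prefix: str
    next_hop: str


def best_path(paths: list[Path], cost_to: dict[str, int]) -> Path:
    """Pick the path whose next hop is cheapest to reach for the chooser."""
    return min(paths, key=lambda p: cost_to[p.next_hop])


if __name__ == "__main__":
    paths = [Path("10.9.0.0/16", "E1"), Path("10.9.0.0/16", "E2")]
    rr_cost = {"E1": 10, "E2": 50}      # the RR sits close to exit E1
    client_cost = {"E1": 80, "E2": 5}   # this client sits close to exit E2

    # Default behaviour: the RR reflects only its own best path
    reflected = [best_path(paths, rr_cost)]
    print("single path:", best_path(reflected, client_cost).next_hop)

    # With an Add-Path-like behaviour: the RR reflects every valid path
    print("add-path:", best_path(paths, client_cost).next_hop)
```

With a single reflected path the client is forced to use `E1`, even though `E2` is much closer to it. When both paths are advertised, the client can make its own choice and selects `E2`.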
RR in practice: An SD-WAN example
In a common SD-WAN architecture using Ethernet VPN (EVPN), the RR plays a specific role in managing EVPN routes.
- Transport Network Information: The RR and CPEs establish a Datagram Transport Layer Security (DTLS) management channel to exchange Transport Network Port (TNP) and Security Association (SA) information.
- BGP EVPN Peering: They then establish a BGP EVPN control channel, forming a BGP peer relationship.
- Tunnel Information Exchange: A CPE advertises a BGP SD-WAN route to the RR containing its TNP and SA parameters. The RR reflects this information to other CPEs.
- Overlay Tunnel Establishment: After receiving the reflected routes, CPEs at different sites can establish the necessary secure overlay tunnels (e.g., IPsec) directly between each other.
This process demonstrates how the RR leverages the BGP EVPN protocol to coordinate the control plane, enabling CPEs to build the data plane tunnels required for site-to-site communication, all while maintaining a simple and scalable network topology.
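The ordering of the four steps in this example can be captured as a simple state machine, shown below as a rough sketch. The phase names and the `CpeSession` class are invented for illustration and do not correspond to any vendor's implementation; the only thing taken from the description above is the order in which the channels and tunnels come up.

```python
from enum import IntEnum


class Phase(IntEnum):
    DISCONNECTED = 0
    DTLS_CHANNEL_UP = 1        # management channel to the RR
    BGP_EVPN_PEERED = 2        # control channel to the RR
    TUNNEL_INFO_EXCHANGED = 3  # TNP and SA information reflected by the RR
    OVERLAY_TUNNEL_UP = 4      # direct CPE-to-CPE tunnel, e.g. IPsec


class CpeSession:
    """Tracks how far one CPE has progressed through the setup sequence."""

    def __init__(self) -> None:
        self.phase = Phase.DISCONNECTED

    def advance(self, target: Phase) -> None:
        # Each phase depends on the previous one, so skipping is rejected
        if target != self.phase + 1:
            raise ValueError(
                f"cannot go from {self.phase.name} to {target.name}"
            )
        self.phase = target


if __name__ == "__main__":
    session = CpeSession()
    for phase in list(Phase)[1:]:
        session.advance(phase)
        print(session.phase.name)
```

Attempting to bring up the overlay tunnel before the tunnel information has been exchanged raises a `ValueError`, which reflects the dependency of the data plane on the control plane described above.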