Hello, fellow VyOS community members, this is Christian again!
As promised in my last post about BGP L2VPN/EVPN support via VXLAN transport, this post is one of the announced follow-ups. Today's post is all about how to build a multi-tenant capable service provider network leveraging only open-source solutions.
To get this lab to work, you will need the latest VyOS 1.4 rolling release, as certain required features were only added as of the 2021-04-11 build.
You might ask yourself: I have read those weird route type numbers over and over again, but why should I care? What exactly is this number, and is higher better?
This all goes back to 2015 and BGP MPLS-Based Ethernet VPN (RFC 7432).
So far we learned about type 2 and type 3 prefixes from the previous blog post.
So the answer to the question from the beginning is "no": higher is not better, it's simply a different kind of information that is carried across the EVPN network itself.
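To make the "different kind of information" point concrete, here is a small lookup table of the route types. This is only a sketch: types 1 to 4 are the ones defined in RFC 7432, and type 5 (the one that shows up in the lab output later in this post) comes from the IP Prefix Advertisement extension, later published as RFC 9136. The function name is made up for illustration.

```python
# EVPN route types 1-4 are defined in RFC 7432; type 5 was added by the
# IP Prefix Advertisement extension (RFC 9136).
EVPN_ROUTE_TYPES = {
    1: "Ethernet Auto-Discovery (A-D) route",
    2: "MAC/IP Advertisement route",
    3: "Inclusive Multicast Ethernet Tag route",
    4: "Ethernet Segment route",
    5: "IP Prefix route",
}


def describe_route_type(number):
    """Return a human-readable name for an EVPN route type number."""
    return EVPN_ROUTE_TYPES.get(number, "unknown or not covered in this sketch")


# The number is only an identifier for the kind of route, not a ranking.
for number in sorted(EVPN_ROUTE_TYPES):
    print(f"type {number}: {describe_route_type(number)}")
```

The table is deliberately a plain dictionary: the number selects a route format, nothing more.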
After writing the initial blog post of the series (BGP L2VPN/EVPN support via VXLAN transport), I asked myself what other scenarios I could build using the new EVPN-based features everyone has been talking about for the last couple of years. Hopefully I am not too late.
All configurations leveraged in this post are already a part of the 1.4 rolling release. At the time of writing, the 1.4 release train (named Sagitta after the Latin word for "arrow") is the development train, so please expect unknown behavior or, to rephrase that, bugs 🐞.
The lab is built on top of EVE-NG, which I use for my networking experiments. It's always good to have an infrastructure you can break on purpose.
The idea behind this post is to replicate a use case from a mid-size telco provider. This telco provides isolated layer 3 connectivity (L3VPN) through its backbone. I got the inspiration for this article when I came across this post from @toprankinrez during a late-night VyOS development session.
I spun up a new lab in EVE-NG, which represents this as the "Foo Bar - Service Provider Inc." that has 3 points of presence (PoP) in random datacenters/sites named PE1, PE2, and PE3. Each PoP aggregates at least two customers.
I named the customers blue, red, and green, which is common practice in VRF (Virtual Routing and Forwarding) documentation scenarios.
A brief excursion into VRFs: this has been one of the longest-standing feature requests for VyOS (dating back to 2016). A VRF can be described as "what a VLAN is for layer 2, a VRF is for layer 3". With VRFs, a router can hold multiple isolated routing tables on the same system. If you wonder how this differs from the multiple tables that people have used for policy-based routing since forever: a VRF also isolates connected routes rather than just static and dynamically learned routes, so NICs in different VRFs can use conflicting network ranges without issues.
VyOS 1.3 added initial support for VRFs (including IPv4/IPv6 static routing) and VyOS 1.4 now enables full dynamic routing protocol support for OSPF, IS-IS, and BGP for individual VRFs.
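The isolation of connected routes can be pictured with a toy model. The sketch below is not how the Linux kernel or VyOS implements VRFs; it only assumes that each VRF owns a private table of connected networks. The `Vrf` class and the overlapping 10.0.0.0/24 range are hypothetical (the lab itself uses non-overlapping ranges).

```python
import ipaddress


class Vrf:
    """A toy VRF: a name, a table id, and its own set of connected routes."""

    def __init__(self, name, table):
        self.name = name
        self.table = table
        self.connected = {}  # network -> interface name

    def add_interface(self, ifname, address):
        # The connected route is installed only in this VRF's table.
        network = ipaddress.ip_interface(address).network
        self.connected[network] = ifname

    def lookup(self, destination):
        # Longest-prefix match restricted to this VRF's own table.
        ip = ipaddress.ip_address(destination)
        matches = [net for net in self.connected if ip in net]
        if not matches:
            return None
        best = max(matches, key=lambda net: net.prefixlen)
        return self.connected[best]


blue = Vrf("blue", 2000)
red = Vrf("red", 3000)
blue.add_interface("eth4", "10.0.0.1/24")
red.add_interface("eth5", "10.0.0.1/24")  # same range, different VRF

print(blue.lookup("10.0.0.50"))
print(red.lookup("10.0.0.50"))
```

Both VRFs hold the same connected network, and each lookup only ever sees the interface that belongs to its own table.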
The lab I built uses a VRF (called mgmt) to provide out-of-band SSH access to the PE (Provider Edge) routers.
set interfaces ethernet eth0 address '192.0.2.59/27'
set interfaces ethernet eth0 address '2001:db8:ffff::59/64'
set interfaces ethernet eth0 description 'out-of-band management'
set interfaces ethernet eth0 vrf 'mgmt'
set service ssh vrf 'mgmt'
set system name-server 192.0.2.254
set system name-server 2001:db8::1
set system ntp vrf 'mgmt'
set system ntp listen-address '192.0.2.59'
set system ntp listen-address '2001:db8:ffff::59'
set system ntp server 192.0.2.251
set system ntp server 2001:1578:200:ffff::2
set vrf name mgmt protocols static route 0.0.0.0/0 next-hop 192.0.2.62
set vrf name mgmt protocols static route6 ::/0 next-hop 2001:db8:ffff::1
set vrf name mgmt table '1000'
With router management separated into a dedicated VRF, there is a smaller chance that the management network is affected by routing issues.
We use the following network topology in this example. The topology, again, runs on EVE-NG. If you like the VyOS router icon, you can get it here.
I chose to run OSPF as the IGP (Interior Gateway Protocol). All required BGP sessions are established via dummy interfaces on the PE routers (a dummy interface is similar to the loopback, but Linux allows only one loopback while there can be many dummy interfaces). In case of a link failure, traffic is diverted in the other direction around this triangle setup and the BGP sessions do not go down. One could even enable BFD (Bidirectional Forwarding Detection) on the links for faster failover and better resilience in the network.
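The failover behaviour in the triangle can be sketched with a tiny path search. This is an illustration only: it assumes every backbone link has the same cost and uses a breadth-first search instead of OSPF's real SPF calculation. The names `LINKS` and `shortest_path` are made up for this example.

```python
from collections import deque

# Backbone links of the lab triangle (PE1-PE2, PE2-PE3, PE1-PE3).
LINKS = {("PE1", "PE2"), ("PE2", "PE3"), ("PE1", "PE3")}


def shortest_path(links, source, target):
    """Breadth-first search; every link has the same cost in this sketch."""
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    queue = deque([[source]])
    visited = {source}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for nxt in sorted(neighbors.get(path[-1], ())):
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None  # target is unreachable


# All links up: PE1 reaches PE2 directly.
print(shortest_path(LINKS, "PE1", "PE2"))
# Link PE1-PE2 down: the dummy address of PE2 stays reachable via PE3.
print(shortest_path(LINKS - {("PE1", "PE2")}, "PE1", "PE2"))
```

Because the BGP sessions are bound to the dummy addresses rather than to a link address, the session endpoint stays reachable as long as any path around the triangle remains.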
Regular VyOS users will notice that the BGP syntax has changed in VyOS 1.4 from even the prior post about this subject. This is due to T1711, where it was finally decided to get rid of the redundant BGP ASN (Autonomous System Number) specification on the CLI and move it to a single leaf node (set protocols bgp local-as).
It's important to note that all your existing configurations will be migrated automatically on image upgrade. Nothing to do on your side.
PE1
set interfaces dummy dum0 address '172.29.255.1/32'
set interfaces ethernet eth1 address '172.29.0.2/31'
set interfaces ethernet eth1 description 'link to pe2'
set interfaces ethernet eth1 mtu '1600'
set interfaces ethernet eth3 address '172.29.0.6/31'
set interfaces ethernet eth3 description 'link to pe3'
set interfaces ethernet eth3 mtu '1600'
set protocols ospf area 0 network '172.29.0.2/31'
set protocols ospf area 0 network '172.29.0.6/31'
set protocols ospf interface eth1 network 'point-to-point'
set protocols ospf interface eth3 network 'point-to-point'
set protocols ospf log-adjacency-changes detail
set protocols ospf parameters abr-type 'cisco'
set protocols ospf parameters router-id '172.29.255.1'
set protocols ospf passive-interface 'default'
set protocols ospf passive-interface-exclude 'eth1'
set protocols ospf passive-interface-exclude 'eth3'
set protocols ospf redistribute connected
set protocols bgp address-family l2vpn-evpn advertise ipv4 unicast
set protocols bgp address-family l2vpn-evpn advertise-all-vni
set protocols bgp local-as '100'
set protocols bgp neighbor 172.29.255.2 peer-group 'ibgp'
set protocols bgp neighbor 172.29.255.3 peer-group 'ibgp'
set protocols bgp parameters default no-ipv4-unicast
set protocols bgp parameters log-neighbor-changes
set protocols bgp parameters router-id '172.29.255.1'
set protocols bgp peer-group ibgp address-family l2vpn-evpn
set protocols bgp peer-group ibgp remote-as '100'
set protocols bgp peer-group ibgp update-source 'dum0'
PE2
set interfaces dummy dum0 address '172.29.255.2/32'
set interfaces ethernet eth1 address '172.29.0.3/31'
set interfaces ethernet eth1 description 'link to pe1'
set interfaces ethernet eth1 mtu '1600'
set interfaces ethernet eth2 address '172.29.0.4/31'
set interfaces ethernet eth2 description 'link to pe3'
set interfaces ethernet eth2 mtu '1600'
set protocols ospf area 0 network '172.29.0.2/31'
set protocols ospf area 0 network '172.29.0.4/31'
set protocols ospf interface eth1 network 'point-to-point'
set protocols ospf interface eth2 network 'point-to-point'
set protocols ospf log-adjacency-changes detail
set protocols ospf parameters abr-type 'cisco'
set protocols ospf parameters router-id '172.29.255.2'
set protocols ospf passive-interface 'default'
set protocols ospf passive-interface-exclude 'eth1'
set protocols ospf passive-interface-exclude 'eth2'
set protocols ospf redistribute connected
set protocols bgp address-family l2vpn-evpn advertise ipv4 unicast
set protocols bgp address-family l2vpn-evpn advertise-all-vni
set protocols bgp local-as '100'
set protocols bgp neighbor 172.29.255.1 peer-group 'ibgp'
set protocols bgp neighbor 172.29.255.3 peer-group 'ibgp'
set protocols bgp parameters default no-ipv4-unicast
set protocols bgp parameters log-neighbor-changes
set protocols bgp parameters router-id '172.29.255.2'
set protocols bgp peer-group ibgp address-family l2vpn-evpn
set protocols bgp peer-group ibgp remote-as '100'
set protocols bgp peer-group ibgp update-source 'dum0'
PE3
set interfaces dummy dum0 address '172.29.255.3/32'
set interfaces ethernet eth2 address '172.29.0.5/31'
set interfaces ethernet eth2 description 'link to pe2'
set interfaces ethernet eth2 mtu '1600'
set interfaces ethernet eth3 address '172.29.0.7/31'
set interfaces ethernet eth3 description 'link to pe1'
set interfaces ethernet eth3 mtu '1600'
set protocols ospf area 0 network '172.29.0.4/31'
set protocols ospf area 0 network '172.29.0.6/31'
set protocols ospf interface eth2 network 'point-to-point'
set protocols ospf interface eth3 network 'point-to-point'
set protocols ospf log-adjacency-changes detail
set protocols ospf parameters abr-type 'cisco'
set protocols ospf parameters router-id '172.29.255.3'
set protocols ospf passive-interface 'default'
set protocols ospf passive-interface-exclude 'eth3'
set protocols ospf passive-interface-exclude 'eth2'
set protocols ospf redistribute connected
set protocols bgp address-family l2vpn-evpn advertise ipv4 unicast
set protocols bgp address-family l2vpn-evpn advertise-all-vni
set protocols bgp local-as '100'
set protocols bgp neighbor 172.29.255.1 peer-group 'ibgp'
set protocols bgp neighbor 172.29.255.2 peer-group 'ibgp'
set protocols bgp parameters default no-ipv4-unicast
set protocols bgp parameters log-neighbor-changes
set protocols bgp parameters router-id '172.29.255.3'
set protocols bgp peer-group ibgp address-family l2vpn-evpn
set protocols bgp peer-group ibgp remote-as '100'
set protocols bgp peer-group ibgp update-source 'dum0'
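With three routers and six backbone interfaces, it is easy to mistype one of the /31 addresses. The following sketch re-checks the addressing from the three configurations above: every /31 network should be shared by exactly two different routers. The helper names are hypothetical, and only the addresses are taken from the lab.

```python
import ipaddress
from collections import defaultdict

# Backbone interface addresses copied from the PE1/PE2/PE3 configuration.
BACKBONE = {
    ("PE1", "eth1"): "172.29.0.2/31",
    ("PE1", "eth3"): "172.29.0.6/31",
    ("PE2", "eth1"): "172.29.0.3/31",
    ("PE2", "eth2"): "172.29.0.4/31",
    ("PE3", "eth2"): "172.29.0.5/31",
    ("PE3", "eth3"): "172.29.0.7/31",
}


def group_by_link(interfaces):
    """Group router interfaces by the /31 network they sit in."""
    links = defaultdict(list)
    for (router, ifname), address in interfaces.items():
        network = ipaddress.ip_interface(address).network
        links[str(network)].append((router, ifname))
    return dict(links)


def check_links(interfaces):
    """Return a list of problems; an empty list means every link has two ends."""
    problems = []
    for network, ends in group_by_link(interfaces).items():
        routers = {router for router, _ in ends}
        if len(ends) != 2 or len(routers) != 2:
            problems.append(f"{network}: expected 2 routers, found {sorted(routers)}")
    return problems


for network, ends in sorted(group_by_link(BACKBONE).items()):
    print(network, ends)
print(check_links(BACKBONE) or "all backbone links look consistent")
```

The grouped networks are also the ones each router lists under `set protocols ospf area 0 network`.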
It's time to test the router-to-router reachability. In this example we test from PE1 to PE2:
cpo@PE1:~$ ping 172.29.255.2 count 1
PING 172.29.255.2 (172.29.255.2) 56(84) bytes of data.
64 bytes from 172.29.255.2: icmp_seq=1 ttl=64 time=11.8 ms
--- 172.29.255.2 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 11.769/11.769/11.769/0.000 ms
... and PE1 to PE3.
cpo@PE1:~$ ping 172.29.255.3 count 1
PING 172.29.255.3 (172.29.255.3) 56(84) bytes of data.
64 bytes from 172.29.255.3: icmp_seq=1 ttl=64 time=1.34 ms
--- 172.29.255.3 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 1.344/1.344/1.344/0.000 ms
Once all routers can be safely managed remotely and the core network is operational, we can set up the tenant networks.
Every tenant is assigned an individual VRF, which would support overlapping address ranges for customers blue, red, and green. In our example, we do not use overlapping ranges, so that the output of the debug commands is easier to follow and you can easily match it to one of the devices/networks below.
Every router that provides access to a customer network needs to have the customer network (VRF + VNI) configured. To make our own lives easier, we utilize the same VRF table id (local routing table number) and VNI (Virtual Network Identifier) per tenant on all our routers.
This configuration must be applied on every router, for every tenant that router serves; the example shows tenant blue on PE1. Please change the VXLAN interface source-address accordingly.
set interfaces vxlan vxlan2000 mtu '1500'
set interfaces vxlan vxlan2000 parameters nolearning
set interfaces vxlan vxlan2000 port '4789'
set interfaces vxlan vxlan2000 source-address '172.29.255.1'
set interfaces vxlan vxlan2000 vni '2000'
set vrf name blue protocols bgp address-family ipv4-unicast redistribute connected
set vrf name blue protocols bgp address-family l2vpn-evpn advertise ipv4 unicast
set vrf name blue protocols bgp local-as '100'
set vrf name blue table '2000'
set vrf name blue vni '2000'
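Since the same handful of commands repeats per tenant and per router, they can be rendered from a small table. The sketch below is just a template around the commands shown above. The blue/2000 pair is taken from that configuration, while red/3000 and green/4000 are inferred from the vxlan3000 and vxlan4000 interface names used in this lab. The function name is made up.

```python
# Tenant name -> number used for both the VRF table id and the VNI.
# blue/2000 is taken from the commands above; red/3000 and green/4000 are
# inferred from the vxlan3000/vxlan4000 interface names used in this lab.
TENANTS = {"blue": 2000, "red": 3000, "green": 4000}


def tenant_commands(tenant, source_address, local_as=100):
    """Render the per-tenant VXLAN and VRF commands for one PE router."""
    number = TENANTS[tenant]
    vxlan = f"vxlan{number}"
    vrf = f"set vrf name {tenant}"
    return [
        f"set interfaces vxlan {vxlan} mtu '1500'",
        f"set interfaces vxlan {vxlan} parameters nolearning",
        f"set interfaces vxlan {vxlan} port '4789'",
        f"set interfaces vxlan {vxlan} source-address '{source_address}'",
        f"set interfaces vxlan {vxlan} vni '{number}'",
        f"{vrf} protocols bgp address-family ipv4-unicast redistribute connected",
        f"{vrf} protocols bgp address-family l2vpn-evpn advertise ipv4 unicast",
        f"{vrf} protocols bgp local-as '{local_as}'",
        f"{vrf} table '{number}'",
        f"{vrf} vni '{number}'",
    ]


# Example: tenant red on PE2, whose dum0 address is 172.29.255.2.
print("\n".join(tenant_commands("red", "172.29.255.2")))
```

Keeping table id and VNI identical per tenant is what makes a single number per tenant sufficient here.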
Given the above configuration, we now have the individual tenant VRF available on our local system. To provide a real-world access port to the tenant, we must also link this to the customer port on the PE routers.
For PE1, assign eth4 to tenant blue, eth5 to tenant red, and eth6 to tenant green.
set interfaces bridge br2000 address '10.1.1.1/24'
set interfaces bridge br2000 description 'customer blue'
set interfaces bridge br2000 member interface eth4
set interfaces bridge br2000 member interface vxlan2000
set interfaces bridge br2000 vrf 'blue'
set interfaces bridge br3000 address '10.2.1.1/24'
set interfaces bridge br3000 description 'customer red'
set interfaces bridge br3000 member interface eth5
set interfaces bridge br3000 member interface vxlan3000
set interfaces bridge br3000 vrf 'red'
set interfaces bridge br4000 address '10.3.1.1/24'
set interfaces bridge br4000 description 'customer green'
set interfaces bridge br4000 member interface eth6
set interfaces bridge br4000 member interface vxlan4000
set interfaces bridge br4000 vrf 'green'
set interfaces ethernet eth4 description 'customer blue'
set interfaces ethernet eth5 description 'customer red'
set interfaces ethernet eth6 description 'customer green'
For PE2, assign eth4 to tenant blue and eth5 to tenant red.
set interfaces bridge br2000 address '10.1.2.1/24'
set interfaces bridge br2000 description 'customer blue'
set interfaces bridge br2000 member interface eth4
set interfaces bridge br2000 member interface vxlan2000
set interfaces bridge br2000 vrf 'blue'
set interfaces bridge br3000 address '10.2.2.1/24'
set interfaces bridge br3000 description 'customer red'
set interfaces bridge br3000 member interface eth5
set interfaces bridge br3000 member interface vxlan3000
set interfaces bridge br3000 vrf 'red'
set interfaces ethernet eth4 description 'customer blue'
set interfaces ethernet eth5 description 'customer red'
For PE3, assign eth4 to tenant blue and eth6 to tenant green.
set interfaces bridge br2000 address '10.1.3.1/24'
set interfaces bridge br2000 description 'customer blue'
set interfaces bridge br2000 member interface eth4
set interfaces bridge br2000 member interface vxlan2000
set interfaces bridge br2000 vrf 'blue'
set interfaces bridge br4000 address '10.3.3.1/24'
set interfaces bridge br4000 description 'customer green'
set interfaces bridge br4000 member interface eth6
set interfaces bridge br4000 member interface vxlan4000
set interfaces bridge br4000 vrf 'green'
set interfaces ethernet eth4 description 'customer blue'
set interfaces ethernet eth6 description 'customer green'
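The access-port configuration follows a regular pattern in this lab: the bridge address is 10.<tenant>.<PE number>.1/24, with blue = 1, red = 2, and green = 3, and each tenant always uses the same customer port. A sketch that renders the bridge commands from that pattern might look like this. The data layout and function name are hypothetical; the values are read off the three configurations above.

```python
# Per-tenant constants as used in the PE1/PE2/PE3 configuration above:
# (bridge, vxlan interface, customer port, second octet of the bridge address)
ACCESS = {
    "blue": ("br2000", "vxlan2000", "eth4", 1),
    "red": ("br3000", "vxlan3000", "eth5", 2),
    "green": ("br4000", "vxlan4000", "eth6", 3),
}

# Which tenants are attached to which PE router in this lab.
PE_TENANTS = {1: ["blue", "red", "green"], 2: ["blue", "red"], 3: ["blue", "green"]}


def bridge_commands(pe):
    """Render the bridge and access-port commands for router PE<pe>."""
    commands = []
    for tenant in PE_TENANTS[pe]:
        bridge, vxlan, port, octet = ACCESS[tenant]
        commands += [
            f"set interfaces bridge {bridge} address '10.{octet}.{pe}.1/24'",
            f"set interfaces bridge {bridge} description 'customer {tenant}'",
            f"set interfaces bridge {bridge} member interface {port}",
            f"set interfaces bridge {bridge} member interface {vxlan}",
            f"set interfaces bridge {bridge} vrf '{tenant}'",
        ]
    for tenant in PE_TENANTS[pe]:
        port = ACCESS[tenant][2]
        commands.append(f"set interfaces ethernet {port} description 'customer {tenant}'")
    return commands


print("\n".join(bridge_commands(3)))
```

Rendering the commands per router makes it harder to attach a customer port to the wrong bridge or VRF by copy and paste.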
As the above configuration is fairly complex, at least in the beginning, and might be frustrating if something does not work out as planned, this is a good time for a coffee break!
You managed to come this far; now we want to see the network and routing tables in action.
Show routes for all VRFs
cpo@PE1:~$ show ip route vrf all
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
F - PBR, f - OpenFabric,
> - selected route, * - FIB route, q - queued, r - rejected, b - backup
VRF blue:
K>* 0.0.0.0/0 [255/8192] unreachable (ICMP unreachable), 03:04:29
C>* 10.1.1.0/24 is directly connected, br2000, 03:03:46
B>* 10.1.2.0/24 [200/0] via 172.29.255.2, br2000 onlink, weight 1, 03:02:58
B>* 10.1.3.0/24 [200/0] via 172.29.255.3, br2000 onlink, weight 1, 02:09:54
VRF default:
O 172.29.0.2/31 [110/1] is directly connected, eth1, weight 1, 03:03:09
C>* 172.29.0.2/31 is directly connected, eth1, 03:04:01
O>* 172.29.0.4/31 [110/2] via 172.29.0.3, eth1, weight 1, 02:10:06
* via 172.29.0.7, eth3, weight 1, 02:10:06
O 172.29.0.6/31 [110/1] is directly connected, eth3, weight 1, 03:03:09
C>* 172.29.0.6/31 is directly connected, eth3, 03:03:42
C>* 172.29.255.1/32 is directly connected, dum0, 03:04:14
O>* 172.29.255.2/32 [110/20] via 172.29.0.3, eth1, weight 1, 03:02:59
O>* 172.29.255.3/32 [110/20] via 172.29.0.7, eth3, weight 1, 02:10:05
VRF green:
K>* 0.0.0.0/0 [255/8192] unreachable (ICMP unreachable), 02:48:21
C>* 10.3.1.0/24 is directly connected, br4000, 02:44:42
B>* 10.3.3.0/24 [200/0] via 172.29.255.3, br4000 onlink, weight 1, 02:09:47
VRF mgmt:
S>* 0.0.0.0/0 [1/0] via 192.0.2.62, eth0, weight 1, 02:48:21
K * 0.0.0.0/0 [255/8192] unreachable (ICMP unreachable), 03:04:28
C>* 192.0.2.32/27 is directly connected, eth0, 03:03:54
VRF red:
K>* 0.0.0.0/0 [255/8192] unreachable (ICMP unreachable), 03:04:28
C>* 10.2.1.0/24 is directly connected, br3000, 03:03:50
B>* 10.2.2.0/24 [200/0] via 172.29.255.2, br3000 onlink, weight 1, 03:02:57
Information about Ethernet Virtual Private Networks
cpo@PE1:~$ show bgp l2vpn evpn
BGP table version is 1, local router ID is 172.29.255.1
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal
Origin codes: i - IGP, e - EGP, ? - incomplete
EVPN type-1 prefix: [1]:[ESI]:[EthTag]:[IPlen]:[VTEP-IP]
EVPN type-2 prefix: [2]:[EthTag]:[MAClen]:[MAC]:[IPlen]:[IP]
EVPN type-3 prefix: [3]:[EthTag]:[IPlen]:[OrigIP]
EVPN type-4 prefix: [4]:[ESI]:[IPlen]:[OrigIP]
EVPN type-5 prefix: [5]:[EthTag]:[IPlen]:[IP]
Network Next Hop Metric LocPrf Weight Path
Route Distinguisher: 10.1.1.1:3
*> [5]:[0]:[24]:[10.1.1.0]
172.29.255.1 0 32768 ?
ET:8 RT:100:2000 Rmac:50:00:00:01:00:04
Route Distinguisher: 10.1.2.1:3
*>i[5]:[0]:[24]:[10.1.2.0]
172.29.255.2 0 100 0 ?
RT:100:2000 ET:8 Rmac:50:00:00:02:00:04
Route Distinguisher: 10.1.3.1:3
*>i[5]:[0]:[24]:[10.1.3.0]
172.29.255.3 0 100 0 ?
RT:100:2000 ET:8 Rmac:50:00:00:03:00:04
Route Distinguisher: 10.2.1.1:2
*> [5]:[0]:[24]:[10.2.1.0]
172.29.255.1 0 32768 ?
ET:8 RT:100:3000 Rmac:26:40:76:10:4a:e4
Route Distinguisher: 10.2.2.1:2
*>i[5]:[0]:[24]:[10.2.2.0]
172.29.255.2 0 100 0 ?
RT:100:3000 ET:8 Rmac:50:00:00:02:00:05
Route Distinguisher: 10.3.1.1:4
*> [5]:[0]:[24]:[10.3.1.0]
172.29.255.1 0 32768 ?
ET:8 RT:100:4000 Rmac:50:00:00:01:00:06
Route Distinguisher: 10.3.3.1:2
*>i[5]:[0]:[24]:[10.3.3.0]
172.29.255.3 0 100 0 ?
RT:100:4000 ET:8 Rmac:50:00:00:03:00:06
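The legend in the output describes a type-5 prefix as `[5]:[EthTag]:[IPlen]:[IP]`. As a reading aid, the following sketch splits such a string into its fields. It is a hypothetical helper for this post's output format only, not a general parser for FRR output.

```python
import ipaddress
import re

# Matches the "[5]:[EthTag]:[IPlen]:[IP]" notation from the table legend.
TYPE5 = re.compile(r"\[5\]:\[(\d+)\]:\[(\d+)\]:\[([0-9a-fA-F.:]+)\]")


def parse_type5(text):
    """Split an EVPN type-5 prefix string into Ethernet tag and IP prefix."""
    match = TYPE5.search(text)
    if match is None:
        raise ValueError(f"not an EVPN type-5 prefix: {text!r}")
    eth_tag, ip_len, ip = match.groups()
    return {
        "eth_tag": int(eth_tag),
        "prefix": ipaddress.ip_network(f"{ip}/{ip_len}"),
    }


# A line taken from the table above, including the status flags.
route = parse_type5("*>i[5]:[0]:[24]:[10.1.2.0]")
print(route["eth_tag"], route["prefix"])
```

Read this way, each entry in the table is simply one tenant subnet (for example 10.1.2.0/24) announced by the PE listed as next hop.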
If we need to retrieve information about a specific host/network inside the EVPN network, we need to run:
cpo@PE2:~$ show bgp l2vpn evpn 10.3.1.10
BGP routing table entry for 10.3.1.1:4:[5]:[0]:[24]:[10.3.1.0]
Paths: (1 available, best #1)
Not advertised to any peer
Route [5]:[0]:[24]:[10.3.1.0] VNI 4000
Local
172.29.255.1 (metric 20) from 172.29.255.1 (172.29.255.1)
Origin incomplete, metric 0, localpref 100, valid, internal, best (First path received)
Extended Community: RT:100:4000 ET:8 Rmac:50:00:00:01:00:06
Last update: Sat Apr 10 08:19:43 2021
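Note that the query was for the host 10.3.1.10, while the entry returned is the covering prefix 10.3.1.0/24. The sketch below reproduces that matching with the type-5 routes from the table shown earlier. It is not how FRR implements the command; it assumes a plain longest-prefix match and ignores route targets, and the names are made up for illustration.

```python
import ipaddress

# (route distinguisher, prefix, next hop) taken from the EVPN table shown earlier.
TYPE5_ROUTES = [
    ("10.1.1.1:3", "10.1.1.0/24", "172.29.255.1"),
    ("10.1.2.1:3", "10.1.2.0/24", "172.29.255.2"),
    ("10.1.3.1:3", "10.1.3.0/24", "172.29.255.3"),
    ("10.2.1.1:2", "10.2.1.0/24", "172.29.255.1"),
    ("10.2.2.1:2", "10.2.2.0/24", "172.29.255.2"),
    ("10.3.1.1:4", "10.3.1.0/24", "172.29.255.1"),
    ("10.3.3.1:2", "10.3.3.0/24", "172.29.255.3"),
]


def lookup(host, routes=TYPE5_ROUTES):
    """Return (rd, network, next_hop) of the most specific covering prefix."""
    ip = ipaddress.ip_address(host)
    candidates = []
    for rd, prefix, next_hop in routes:
        network = ipaddress.ip_network(prefix)
        if ip in network:
            candidates.append((rd, network, next_hop))
    if not candidates:
        return None
    return max(candidates, key=lambda entry: entry[1].prefixlen)


print(lookup("10.3.1.10"))
```

For 10.3.1.10 this yields the route with distinguisher 10.3.1.1:4 and next hop 172.29.255.1, matching the entry in the command output.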
For reference, there is also a Wireshark trace available.
Given that this is a very "dry" and "theoretical" topic, I really hope that the glimpse of information I provided can help you understand what some service providers out there do to provide IP (L3VPN) or Ethernet (L2VPN) connectivity to their customers.
We now have a standards-compliant layer 3 overlay implementation available on our favorite open source routing platform. For future reference, you can find the entire PE1 configuration as part of our daily build tests here.
I also want to express my thanks to FRRouting for providing such a good control plane and AS12817 GeFoekoM e.V. which hosts my rather huge EVE-NG lab instance!
Stay tuned, I'm already thinking of another topic I want to explore using VyOS!