VXLAN BGP EVPN Multipod and Multisite
Configure VXLAN BGP EVPN Multipod and Multisite
This article builds on the VXLAN BGP EVPN config from my previous VXLAN BGP EVPN article. I did every configuration in this article on the Nexus version nxos.9.2.4.bin.
Multipod
Multipod is used when you want to connect two datacenters and have them share the same control plane. The connection is made from Spine to Spine.
You basically have to tell the Spines not to alter the next hop of the routes they receive from the remote pod.
Spine1 and Spine2
int e1/3 #connection to remote spine
no switchport
mtu 9216
no ip redirects
ip add 100.200.0.1/30
ip ospf network point-to-point
ip router ospf UNDERLAY area 0
ip pim sparse-mode
no shut
route-map MultiPod-VXLAN permit 10
set ip next-hop unchanged #keeps the original leaf of the remote pod as next hop instead of the spine that forwards the route to the leaves of its own pod
router bgp 65100
neighbor 10.0.0.13
remote-as 65101
update-source loopback0
ebgp-multihop 10 #raises the TTL to 10 so the loopback interfaces of directly connected neighbors can peer too
address-family ipv4 unicast
address-family l2vpn evpn
send-community
send-community extended
route-map MultiPod-VXLAN out
Verify BGP EVPN with show bgp l2vpn evpn summary.
The config for Spine1 and Spine2 is the same except for the IPs used for the OSPF/BGP peering.
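Before moving on to the Leaves, it can help to confirm on each Spine that the inter-pod link and the EVPN session are actually up. The following is a minimal set of checks, assuming the interface, route-map name and neighbor address from the config above. These are standard NX-OS show commands; the output is left out because it depends on your lab.

```
! run on Spine1 and Spine2
! the remote spine should show up as FULL neighbor over e1/3
show ip ospf neighbors
! PIM adjacency over the inter-pod link
show ip pim neighbor
! confirms that "set ip next-hop unchanged" is part of the route-map
show route-map MultiPod-VXLAN
! the session to 10.0.0.13 should be up and show a prefix count
show bgp l2vpn evpn summary
```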
The only thing you have to configure on the Leaves is the import of the route-targets from the remote pod.
all Leaves
vrf context TNT1 #for l3vni (intervlan)
vni 10999
address-family ipv4 unicast
route-target import 65101:10999 #imports the route-targets of the remote AS, change to 65100 on the leaves in AS65101
route-target import 65101:10999 evpn
evpn #for l2vni (intravlan)
vni 10100 l2
route-target import 65101:10100 #change to 65100 for AS65101
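For the Leaves in the other pod the same block is mirrored. As a sketch, assuming the remote pod runs AS 65101 and uses the same VRF name and VNIs as above, the imports point at AS 65100 instead:

```
! Leaves in the pod with AS 65101 (mirror of the config above)
vrf context TNT1
  vni 10999
  address-family ipv4 unicast
    route-target import 65100:10999
    route-target import 65100:10999 evpn
evpn
  vni 10100 l2
    route-target import 65100:10100
```

The explicit import is needed because the auto-derived route-targets are built from the local AS number and the VNI, so the two pods end up with different values for the same VNI.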
Now create an endpoint in each pod, ping from one to the other and verify the EVPN routes with the command show bgp l2vpn evpn.
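A few more commands can narrow things down if the ping fails. This is only a suggested checklist, assuming VNI 10100 and VRF TNT1 from the config above. What to look for is that the next hop of the remote routes is the VTEP of the remote leaf and not a Spine, which is the effect of the route-map on the Spines.

```
! run on a Leaf in either pod
! the remote leaf should be listed as NVE peer
show nve peers
! type-2 (MAC/IP) routes of the remote endpoint for the L2VNI
show bgp l2vpn evpn vni-id 10100
! MAC addresses learned through EVPN
show l2route evpn mac all
! host routes of the remote endpoint in the tenant VRF
show ip route vrf TNT1
```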
Multisite
In Multisite each Pod has its own control plane (MP-BGP EVPN) and unicast replication is used between the Pods. This means that instead of using a multicast tree to flood your packet, you encapsulate the multicast packet and forward it as a unicast packet to the destination leaf.
Instead of connecting the Spines together as in Multipod, you use BGW (Border Gateway) Leaves that connect to the DCI (Data Center Interconnect), which could be your ISP. You could connect the BGW Leaves directly to each other, but normally you have a DCI between them and use a separate device as BGW Leaf.
Underlay OSPF/MP-BGP/PIM
BGW1
feature bgp
feature ospf
feature pim
feature nv overlay
nv overlay evpn
feature interface-vlan
feature vn-segment-vlan-based
evpn multisite border-gateway 100 #all bgw on same site need same id
router ospf UNDERLAY
router-id 10.10.1.10
int e1/1 #OSPF peering intrasite
desc OSPF SITE-INTERNAL INTERFACE
no switchport
mtu 9216
no ip redirects
ip address 192.168.10.1/30
ip ospf network point-to-point
ip router ospf UNDERLAY area 0.0.0.0
ip pim sparse-mode
evpn multisite fabric-tracking
no shutdown
int e1/2 #BGP peering intersite
desc BGP
no switchport
no shut
mtu 9216
ip add 192.168.101.1/30 tag 54321
evpn multisite dci-tracking #to monitor interface
int lo0 #ipv4/evpn bgp peering
desc BGP peering
ip add 10.10.1.10/32 tag 54321 #the tag lets a route-map pick up this network for advertisement in your bgp process instead of declaring every single network you want to advertise
ip router ospf UNDERLAY area 0
ip pim sparse-mode #only for intra communication of BUM traffic
router bgp 65100
router-id 10.10.1.10 #lo0
address-family ipv4 unicast
maximum-paths 4
address-family l2vpn evpn
neighbor 192.168.101.2 #peer to DCI e1/1 ipv4
remote-as 65200
update-source e1/2
address-family ipv4 unicast
neighbor 10.0.0.3 #peer to spine1 lo0 evpn
remote-as 65100
update-source lo0
address-family l2vpn evpn
send-community
send-community extended
neighbor 10.10.2.10 #peer to BGW2 lo0 evpn
remote-as 65101
update-source lo0
ebgp-multihop 2
peer-type fabric-external #enables the next hop rewrite for multi-site
address-family l2vpn evpn
send-community
send-community extended
rewrite-evpn-rt-asn #this command changes the incoming route target’s AS number to match the BGP-configured neighbor’s remote AS number. So we don't have the AS of the DCI as RT.
ip pim rp-address 11.11.11.2 group-list 239.0.0.0/24
The BGW2 config on the remote pod is basically the same but with different IPs/IDs.
BGW2
feature bgp
feature ospf
feature pim
feature nv overlay
nv overlay evpn
feature interface-vlan
feature vn-segment-vlan-based
evpn multisite border-gateway 200
router ospf UNDERLAY
router-id 10.10.2.10
int e1/1 #OSPF peering intrasite
desc OSPF SITE-INTERNAL INTERFACE
no switchport
mtu 9216
no ip redirects
ip address 192.168.20.1/30
ip ospf network point-to-point
ip router ospf UNDERLAY area 0
ip pim sparse-mode
evpn multisite fabric-tracking
no shutdown
int e1/2
desc BGP
no switchport
no shut
mtu 9216
ip add 192.168.201.1/30 tag 54321
evpn multisite dci-tracking
int lo0
desc BGP peering
ip add 10.10.2.10/32 tag 54321
ip router ospf UNDERLAY area 0
ip pim sparse-mode
router bgp 65101
router-id 10.10.2.10 #lo0
address-family ipv4 unicast
maximum-paths 4
address-family l2vpn evpn
neighbor 192.168.201.2 #peer to DCI e1/2 ipv4
remote-as 65200
update-source e1/2
address-family ipv4 unicast
neighbor 10.0.0.5 #peer to spine1 lo0 evpn
remote-as 65101
update-source lo0
address-family l2vpn evpn
send-community
send-community extended
neighbor 10.10.1.10 #peer to BGW1 lo0 evpn
remote-as 65100
update-source lo0
ebgp-multihop 2
peer-type fabric-external
address-family l2vpn evpn
send-community
send-community extended
rewrite-evpn-rt-asn
ip pim rp-address 11.11.11.2 group-list 239.0.0.0/24
The Spines just need to peer with the new Leaf.
Spines
int e1/3
description OSPF
no switchport
mtu 9216
ip address 192.168.20.2/30
ip ospf network point-to-point
ip router ospf UNDERLAY area 0.0.0.0
ip pim sparse-mode
no shutdown
router bgp 65101 #65100 for spine1
neighbor 10.10.2.10 #10.10.1.10 for spine1
inherit peer TO_LEAFS
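To check that the BGW joined the fabric like any other Leaf, something along these lines can be run on the Spine. The addresses are the ones used above for the pod with AS 65101; the TO_LEAFS peer template comes from the previous article and is not repeated here.

```
! run on the Spine that connects to BGW2 (use 10.10.1.10 on spine1)
! lo0 of the BGW should be learned through OSPF
show ip route 10.10.2.10/32
! details of the EVPN session that inherits TO_LEAFS
show bgp l2vpn evpn neighbors 10.10.2.10
```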
The DCI is the connector between both Pods and nothing more. The BGW Leaves basically need to see each other as direct peers, as if the DCI were not between them. This means that the DCI needs to keep all route-targets and send them to the remote pod, and it must not advertise itself as next hop in EVPN.
DCI
feature bgp
nv overlay evpn
int e1/1
desc BGP
no switchport
no shut
mtu 9216
ip add 192.168.101.2/30
int e1/2
desc BGP
no switchport
no shut
mtu 9216
ip add 192.168.201.2/30
int lo0
ip add 100.100.100.100/32
route-map UNCHANGED permit 10
set ip next-hop unchanged #without next-hop reachability a BGP-learned route will not be installed, so for BGW1 and BGW2 to see each other's EVPN routes they need to see each other directly as next hop and not the DCI between them
router bgp 65200
address-family ipv4 unicast
network 100.100.100.100/32
maximum-paths 64 #allows to install multiple paths in the RIB for load-balancing.
maximum-paths ibgp 64
address-family l2vpn evpn
retain route-target all #keep all route-targets when forwarding to BGW2
neighbor 192.168.101.1 #BGW1 ipv4 e1/2
remote-as 65100
update-source e1/1
address-family ipv4 unicast
next-hop-self #the DCI advertises itself as next hop for the ipv4 routes so the BGW never installs a next hop that it can't reach
neighbor 192.168.201.1 #BGW2 ipv4 e1/2
remote-as 65101
update-source e1/2
address-family ipv4 unicast
next-hop-self
template peer OVERLAY-PEERING
update-source loopback0
ebgp-multihop 5
address-family l2vpn evpn
send-community both
route-map UNCHANGED out #dont overwrite next hop
neighbor 10.10.1.10 #peering to BGW1 lo0
remote-as 65100
inherit peer OVERLAY-PEERING
address-family l2vpn evpn
rewrite-evpn-rt-asn
neighbor 10.10.2.10 #peering to BGW2 lo0
remote-as 65101
inherit peer OVERLAY-PEERING
address-family l2vpn evpn
rewrite-evpn-rt-asn
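At this point the DCI can be checked on its own. The list below is a sketch using the neighbor addresses and the route-map name from the config above. Note that the EVPN sessions to 10.10.1.10 and 10.10.2.10 can only come up once the BGW loopbacks are advertised into BGP, which happens with the redistribute statement in the overlay section, so at first only the IPv4 sessions on the physical links are expected to be up.

```
! run on the DCI
! IPv4 sessions to 192.168.101.1 and 192.168.201.1
show bgp ipv4 unicast summary
! 100.100.100.100/32 should be among the routes sent to BGW1
show bgp ipv4 unicast neighbors 192.168.101.1 advertised-routes
! confirms that "set ip next-hop unchanged" is part of the route-map
show route-map UNCHANGED
! EVPN sessions to the BGW loopbacks (up after the overlay config)
show bgp l2vpn evpn summary
```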
Overlay VXLAN
Most of the config is now the same as in my VXLAN BGP EVPN article. We have to create a new loopback, lo100, that will be used as the source IP for multisite traffic.
BGW1
evpn storm-control broadcast level 10 #means that max. 10% of the available bandwidth will be used for broadcast traffic, the next two lines do the same for multicast and unknown unicast traffic
evpn storm-control multicast level 10
evpn storm-control unicast level 10
int lo1 #L3VNI src ip intra traffic
desc NVE src ip intra traffic
ip add 10.200.200.21/32 tag 54321
ip router ospf UNDERLAY area 0
ip pim sparse-mode
int lo100 #L3VNI src ip multisite
desc EVPN Multi-Site source interface
ip add 10.111.111.1/32 tag 54321 #tag used together with route-map
ip router ospf UNDERLAY area 0
ip pim sparse-mode
route-map RMAP-REDIST-DIRECT permit 10 #matches the tagged ips so you can declare networks easier in the bgp process
match tag 54321
router bgp 65100
address-family ipv4 unicast
redistribute direct route-map RMAP-REDIST-DIRECT #redistribute the ips with tag to BGP
neighbor 100.100.100.100 #DCI lo0 bgp evpn peering
remote-as 65200
update-source lo0
ebgp-multihop 5 #ttl 5 in case dci or bgw is multiple hops away
peer-type fabric-external
address-family l2vpn evpn
send-community
send-community extended
rewrite-evpn-rt-asn
vlan 100
vn-segment 10100
vlan 999
name L3VNI
vn-segment 10999
vrf context TNT1
vni 10999
rd auto
address-family ipv4 unicast
route-target both auto
route-target both auto evpn
interface Vlan999
no shutdown
mtu 9216
vrf member TNT1
ip forward
no ip redirects #to prevent icmp redirects that are used to advertise a more optimal route
hardware access-list tcam region racl 512
hardware access-list tcam region arp-ether 256 double-wide
interface nve1
no shutdown
host-reachability protocol bgp
source-interface loopback1
multisite border-gateway interface loopback100 #src int for multisite
member vni 10100
suppress-arp
multisite ingress-replication #multicast to unicast, replicating every BUM packet and sending them as a separate unicast to the remote egress devices
mcast-group 239.0.0.10
member vni 10999 associate-vrf
evpn
vni 10100 l2
rd auto
route-target import auto
route-target import 65101:10100
route-target export auto
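Two things in this block are easy to miss, so a short check on BGW1 can be useful. The commands below are a sketch based on the names and addresses used above. The first two confirm that the tagged addresses (lo0, lo1, lo100 and the DCI-facing link) are really advertised to the DCI. The last one shows the TCAM carving: a change to the hardware access-list tcam region commands generally only takes effect after the config is saved and the switch is reloaded, and the arp-ether region is what suppress-arp depends on.

```
! run on BGW1
! route-map that matches tag 54321
show route-map RMAP-REDIST-DIRECT
! the tagged connected networks should be listed here
show bgp ipv4 unicast neighbors 192.168.101.2 advertised-routes
! current TCAM carving, check racl and arp-ether
show hardware access-list tcam region
! save and reload if the TCAM regions were changed
copy running-config startup-config
reload
```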
BGW2
evpn storm-control broadcast level 10
evpn storm-control multicast level 10
evpn storm-control unicast level 10
int lo1 #L3VNI src ip intra traffic
desc NVE src ip intra traffic
ip add 10.200.200.22/32 tag 54321
ip router ospf UNDERLAY area 0
ip pim sparse-mode
int lo100 #L3VNI src ip multisite
desc EVPN Multi-Site source interface
ip add 10.111.111.2/32 tag 54321 #tag used together with route-map
ip router ospf UNDERLAY area 0
ip pim sparse-mode
route-map RMAP-REDIST-DIRECT permit 10 #matches the tagged ips so you can declare networks easier in the bgp process
match tag 54321
router bgp 65101
address-family ipv4 unicast
redistribute direct route-map RMAP-REDIST-DIRECT #redistribute the ips with tag to BGP
neighbor 100.100.100.100 #DCI lo0 bgp evpn peering
remote-as 65200
update-source lo0
ebgp-multihop 5 #ttl 5 in case dci or bgw is multiple hops away
peer-type fabric-external
address-family l2vpn evpn
send-community
send-community extended
rewrite-evpn-rt-asn
vlan 100
vn-segment 10100
vlan 999
name L3VNI
vn-segment 10999
vrf context TNT1
vni 10999
rd auto
address-family ipv4 unicast
route-target both auto
route-target both auto evpn
interface Vlan999
no shutdown
mtu 9216
vrf member TNT1
ip forward
no ip redirects #keeps the router from sending ICMP redirect messages to clients. These are sent when a router knows a more optimal path for a client to take than going through itself
hardware access-list tcam region racl 512
hardware access-list tcam region arp-ether 256 double-wide
interface nve1
no shutdown
host-reachability protocol bgp
source-interface loopback1
multisite border-gateway interface loopback100 #src int for multisite
member vni 10100
suppress-arp
multisite ingress-replication #multicast to unicast, replicating every BUM packet and sending them as a separate unicast to the remote egress devices
mcast-group 239.0.0.10
member vni 10999 associate-vrf
evpn
vni 10100 l2
rd auto
route-target import auto
route-target import 65100:10100
route-target export auto
Now create an endpoint in each pod, ping from one to the other and verify the EVPN routes with the command show bgp l2vpn evpn.
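If the ping does not work, the Multisite specific state on the BGWs is a good place to start. The following checklist is only a suggestion and assumes the nve1 interface and the tracking commands configured above.

```
! run on BGW1 and BGW2
! e1/2 should be listed and up (evpn multisite dci-tracking)
show nve multisite dci-links
! e1/1 should be listed and up (evpn multisite fabric-tracking)
show nve multisite fabric-links
! shows the source interface and the multisite border-gateway interface
show nve interface nve1 detail
! the remote BGW should appear as peer with its lo100 address
show nve peers
! VNI 10100 and 10999 should be up
show nve vni
! sessions to the Spine and to the DCI (100.100.100.100)
show bgp l2vpn evpn summary
```

If one of the tracked link types is down on a BGW, it is expected to stop acting as border gateway for the site, so the two multisite commands are worth checking first.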
Thanks for reading my article. If you have any questions or recommendations you can message me via arvednetblog@gmail.com.