path: root/bgpd/bgp_evpn.h
Age | Commit message | Author
2024-12-09 | bgpd: backpressure - Optimize EVPN L3VNI remote routes processing | Rajasekar Raja
Anytime BGP gets an L3 VNI ADD/DEL from zebra:
- Walking the entire global routing table per L3VNI is very expensive.
- The next read (say, of another VNI ADD/DEL) from the socket does not proceed until this walk is complete.

So for triggers where a bulk of L3VNIs are flapped, the output buffer FIFO in zebra grows hugely and its memory spikes, since bgp is slow/busy processing the first message.

To avoid this, the idea is to hook up the BGP-VRF off the struct bgp_master and maintain a struct bgp FIFO list which is processed later on, where we walk a chunk of BGP-VRFs and do the remote route install/uninstall.

Ticket :#3864372
Signed-off-by: Rajasekar Raja <rajasekarr@nvidia.com>
2024-12-09 | bgpd: backpressure - Optimize EVPN L2VNI remote routes processing | Rajasekar Raja
Anytime BGP gets an L2 VNI ADD from zebra:
- Walking the entire global routing table per L2VNI is very expensive.
- The next read (say, of another VNI ADD) from the socket does not proceed until this walk is complete.

So for triggers where a bulk of L2VNIs are flapped, the output buffer FIFO in zebra grows hugely and its memory spikes, since bgp is slow/busy processing the first message.

To avoid this, the idea is to hook up the VPN off the bgp_master struct and maintain a VPN FIFO list which is processed later on, where we walk a chunk of VPNs and do the remote route install.

Note: So far in the L3 backpressure cases (#15524), we have considered the fact that zebra is slow and the buffer grows in BGP. This case is the reverse: BGP is very busy processing the first ZAPI message from zebra, so the buffer grows huge in zebra and its memory spikes.

Ticket :#3864372
Signed-off-by: Rajasekar Raja <rajasekarr@nvidia.com>
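
The deferred, chunked processing described in the two backpressure commits above can be sketched roughly as follows. This is a minimal illustration only, not the FRR implementation: `struct pending_vni`, `struct vni_fifo` and the three functions are hypothetical names, and the expensive route-table walk is reduced to setting a flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-VNI (or per-VRF) work item. */
struct pending_vni {
	unsigned int vni;
	bool queued;    /* already sitting on the FIFO? */
	bool installed; /* stands in for "remote routes installed" */
	struct pending_vni *next;
};

/* Hypothetical FIFO that would hang off the master structure. */
struct vni_fifo {
	struct pending_vni *head;
	struct pending_vni *tail;
	size_t count;
};

/* Queue a VNI for later processing. O(1): no table walk happens here,
 * so the caller can go straight back to reading the next message. */
void vni_fifo_push(struct vni_fifo *f, struct pending_vni *v)
{
	assert(f && v);
	if (v->queued)
		return; /* a flapping VNI is queued only once */
	v->queued = true;
	v->next = NULL;
	if (f->tail)
		f->tail->next = v;
	else
		f->head = v;
	f->tail = v;
	f->count++;
}

/* Remove and return the oldest entry, or NULL if the FIFO is empty. */
struct pending_vni *vni_fifo_pop(struct vni_fifo *f)
{
	struct pending_vni *v = f->head;

	if (!v)
		return NULL;
	f->head = v->next;
	if (!f->head)
		f->tail = NULL;
	v->next = NULL;
	v->queued = false;
	f->count--;
	return v;
}

/* Process at most `chunk` entries. Returns true if work remains, in
 * which case the caller would reschedule itself for a later pass. */
bool vni_fifo_process_chunk(struct vni_fifo *f, size_t chunk)
{
	while (chunk-- > 0) {
		struct pending_vni *v = vni_fifo_pop(f);

		if (!v)
			break;
		v->installed = true; /* the real code would walk routes here */
	}
	return f->count > 0;
}
```

The point of the chunk limit is that control returns to the event loop between passes, so further messages from zebra can be read while queued VNIs are still pending.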
2024-11-15 | bgpd : backpressure - Fix to pop items off zebra_announce FIFO for few EVPN triggers | Rajasekar Raja
In cases such as 'no advertise-all-vni' and L2 VNI DELETE, we need to pop all the VPN routes present in the bgp_zebra_announce FIFO that are yet to be processed, regardless of whether the VNI is configured or not.

NOTE: There is no need to pop the VPN routes in two cases:
1) In free_vni_entry
   - Called by bgp_free()->bgp_evpn_cleanup().
   - bgp_delete is called before bgp_free, and we pop all the dests pertaining to the bgp instance under delete.
2) In evpn_delete_vni() when the user configures "no vni", since the withdrawal of all routes happens in the normal cycle.

Fixes: a07df6f7548f6bd1b92acbb7a10c3823de33fe5f ("bgpd : backpressure - Handle BGP-Zebra(EPVN) Install evt Creation")
Ticket :#4163611
Signed-off-by: Rajasekar Raja <rajasekarr@nvidia.com>
2024-10-14 | bgpd: fix evpn mh esi flap remove local routes | Chirag Shah
In symmetric routing, when the local ESI is down, the local mac-ip prefix learnt from the MH peer is installed into the tenant vrf (given l3vni). When the ESI is back up and associated to the evi/vni, remove the local synced mac-ip imported routes from the tenant vrf, as the local neigh/arp is present.

Ticket: #3878699

Testing:

peer advertised mac-ip route:
*> [2]:[0]:[48]:[aa:aa:aa:00:00:01]:[32]:[45.0.0.51] RD 27.0.0.4:9
    27.0.0.4 (spine-1) 0 64435 65016 i
    ESI:03:44:38:39:ff:ff:01:00:00:01
    RT:65016:1000 RT:65016:4000 ET:8 Rmac:44:38:39:ff:ff:16

When local ESI is flapped:
torm-11:# ip neigh show 45.0.0.51
45.0.0.51 dev vlan1000 lladdr aa:aa:aa:00:00:01 REACHABLE proto zebra

Before fix: (The imported route remained in tenant-vrf)
torm-11:# ip route show vrf vrf1 45.0.0.51
45.0.0.51 nhid 257 proto bgp metric 20

After fix:
torm-11# ip route show vrf vrf1 45.0.0.51
torm-11#

trace:
2024/10/11 18:19:29 BGP: [JMP3T-178G8] route [2]:[0]:[48]:[00:02:00:00:00:08]:[32]:[21.1.0.5] is matched on local esi 03:00:00:00:77:01:04:00:00:0e, uninstall from VRF tenant1 route table

Signed-off-by: Chirag Shah <chirag@nvidia.com>
2024-06-20 | Merge pull request #16059 from kacpekwasny/kkwasny/CLIC-139-4 | Donatas Abraitis
bgpd: fixed failing to remove VRF if there is a stale l3vni
2024-06-05 | bgpd: store number of labels with 8 bits | Louis Scalbert
8 bits are sufficient to store the number of labels because the current maximum is 2. Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
2024-05-27 | bgpd: fixed failing remove of vrf if there is a stale l3vni | Kacper Kwaśny
Problem statement:
==================
When a vrf is deleted from the kernel before it is removed from the FRR config, zebra gets to delete the vrf and associated state.

It does so by sending a request to delete the l3 vni associated with the vrf, followed by a request to delete the vrf itself.

2023/10/06 06:22:18 ZEBRA: [JAESH-BABB8] Send L3_VNI_DEL 1001 VRF testVRF1001 to bgp
2023/10/06 06:22:18 ZEBRA: [XC3P3-1DG4D] MESSAGE: ZEBRA_VRF_DELETE testVRF1001

The zebra client communication is asynchronous, and in about 1 in 5 cases the bgp client processes the two requests in a different order.

2023/10/06 06:22:18 BGP: [VP18N-HB5R6] VRF testVRF1001(766) is to be deleted.
2023/10/06 06:22:18 BGP: [RH4KQ-X3CYT] VRF testVRF1001(766) is to be disabled.
2023/10/06 06:22:18 BGP: [X8ZE0-9TS5H] VRF disable testVRF1001 id 766
2023/10/06 06:22:18 BGP: [X67AQ-923PR] Deregistering VRF 766
2023/10/06 06:22:18 BGP: [K52W0-YZ4T8] VRF Deletion: testVRF1001(4294967295)

.. and a bit later :

2023/10/06 06:22:18 BGP: [MRXGD-9MHNX] DJERNAES: process L3VNI 1001 DEL
2023/10/06 06:22:18 BGP: [NCEPE-BKB1G][EC 33554467] Cannot process L3VNI 1001 Del - Could not find BGP instance

When the bgp vrf config is removed later, it fails on the sanity check of whether the l3vni has been removed.

    if (bgp->l3vni) {
        vty_out(vty, "%% Please unconfigure l3vni %u\n", bgp->l3vni);
        return CMD_WARNING_CONFIG_FAILED;
    }

Solution:
=========
The solution is to make bgp clean up the l3vni when a bgp instance is going down.

The fix:
========
The fix is to add a function in bgp_evpn.c that is responsible for deleting the local vni, if needed, and to call that function from bgp_instance_down().

Testing:
========
Created a test, which can run in container lab, that removes the vrf on the host before removing the vrf and the bgp config from frr. Running this test in a loop triggered the problem 18 times out of 100 runs. After the fix it did not fail.

To verify the fix, a log message (which is no longer in the code) was emitted whenever we had a stale l3vni and needed to call bgp_evpn_local_l3vni_del() to do the cleanup. This was hit 20 times in 100 test runs.

Signed-off-by: Kacper Kwasny <kkwasny@akamai.com>

bgpd: braces {} are not necessary for single line block

Signed-off-by: Kacper Kwasny <kkwasny@akamai.com>
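
As a rough sketch of the fix described above, assuming a heavily trimmed-down instance structure: the names `struct bgp_sketch`, `sketch_local_l3vni_del()`, `sketch_instance_down()` and `sketch_can_unconfigure()` are hypothetical stand-ins, and the real functions in bgpd do considerably more than this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stand-in for the BGP instance. */
struct bgp_sketch {
	uint32_t l3vni; /* 0 means no L3VNI is attached */
	bool is_up;
};

/* Stand-in for the L3VNI delete handler: forget the local L3VNI. */
void sketch_local_l3vni_del(struct bgp_sketch *bgp)
{
	bgp->l3vni = 0;
}

/* On instance down, drop any L3VNI that is still attached. This covers
 * the case where the L3VNI delete from zebra arrived after the VRF
 * delete and could no longer find the instance. */
void sketch_instance_down(struct bgp_sketch *bgp)
{
	assert(bgp);
	if (bgp->l3vni)
		sketch_local_l3vni_del(bgp);
	bgp->is_up = false;
}

/* Mirrors the sanity check quoted in the commit message: removing the
 * config is refused while an L3VNI is still set. */
bool sketch_can_unconfigure(const struct bgp_sketch *bgp)
{
	return bgp->l3vni == 0;
}
```

With the cleanup done at instance-down time, the later config removal no longer depends on the order in which the two zebra messages were processed.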
2024-04-08 | bgpd : backpressure - Handle BGP-Zebra(EPVN) Install evt Creation | Rajasekar Raja
The current changes deal with installation of EVPN routes to zebra.

In evpn_route_select_install() we invoke evpn_zebra_install/uninstall, which sends zclient_send_message().

This is a continuation of the code changes (similar to ccfe452763d16c432fa81fd20e805bec819b345e), but handles the evpn part of the code.

Ticket: #3390099
Signed-off-by: Rajasekar Raja <rajasekarr@nvidia.com>
2024-03-05 | bgpd:aggr summary-only remove suppressed from evpn | Chirag Shah
Ticket: #3534718 #3720960

Testing Done:

Config:
router bgp 65564 vrf sym_2
 bgp router-id 27.0.0.9
 !
 address-family ipv4 unicast
  redistribute static
 exit-address-family

vrf sym_2
 vni 8889
 ip route 63.2.1.0/24 blackhole
 ip route 63.2.1.2/32 blackhole
 ip route 63.2.1.3/32 blackhole
exit-vrf

tor-1:# vtysh -c "show bgp l2vpn evpn route" | grep -A3 63.2
*> [5]:[0]:[24]:[63.2.1.0] RD 27.0.0.9:19
    27.0.0.9 (tor-1) 0 32768 ?
    ET:8 RT:28:8889 Rmac:44:38:39:ff:ff:29
--
*> [5]:[0]:[32]:[63.2.1.2] RD 27.0.0.9:19
    27.0.0.9 (tor-1) 0 32768 ?
    ET:8 RT:28:8889 Rmac:44:38:39:ff:ff:29
*> [5]:[0]:[32]:[63.2.1.3] RD 27.0.0.9:19
    27.0.0.9 (tor-1) 0 32768 ?
    ET:8 RT:28:8889 Rmac:44:38:39:ff:ff:29

tor-1(config)# router bgp 65564 vrf sym_2
tor-1(config-router)# address-family ipv4 unicast
tor-1(config-router-af)# aggregate-address 63.2.0.0/16 summary-only
tor-1(config-rou-f)# end

tor-1:# vtysh -c "show bgp l2vpn evpn route" | grep -A3 63.2.1
tor-1:# vtysh -c "show bgp l2vpn evpn route" | grep -A3 63.2
*> [5]:[0]:[16]:[63.2.0.0] RD 27.0.0.9:19
    27.0.0.9 (tor-1) 0 32768 ?
    ET:8 RT:28:8889 Rmac:44:38:39:ff:ff:29

Signed-off-by: Chirag Shah <chirag@nvidia.com>
2023-11-29 | bgpd: aggr summary-only suppressed export to evpn | Chirag Shah
When exporting bgp vrf instance unicast routes into EVPN as type-5, check for suppressed ones and do not export them.

Ticket:#3534718

Testing Done:

Config:
router bgp 660000 vrf vrf1
 bgp router-id 144.1.1.2
 no bgp network import-check
 neighbor 144.1.1.1 remote-as external
 !
 address-family ipv4 unicast
  aggregate-address 50.1.0.0/16 summary-only
  redistribute connected
 exit-address-family
 !
 address-family l2vpn evpn
  advertise ipv4 unicast
 exit-address-family
exit

v4 suppressed route: (5 suppressed routes not exported to evpn)
tor1# vtysh -c "show bgp vrf vrf1 ipv4 unicast" | grep "50.1"
*> 50.1.0.0/16 0.0.0.0(bordertor-11)
s> 50.1.1.212/32 6.0.0.30(leaf-11)<
s> 50.1.1.222/32 6.0.0.31(leaf-11)<
s> 50.1.110.0/24 0.0.0.0(bordertor-11)
s> 50.1.210.214/32 6.0.0.30(leaf-11)<
s> 50.1.220.224/32 6.0.0.31(leaf-11)<

tor1# vtysh -c "show bgp l2vpn evpn route" | grep -A3 "*> \[5\].*\[50.1"
*> [5]:[0]:[16]:[50.1.0.0] RD 144.1.1.2:7
    6.0.0.1 (bordertor-11) 0 32768 ?
    ET:8 RT:4640:104001 Rmac:00:02:00:00:00:04

Signed-off-by: Chirag Shah <chirag@nvidia.com>
2023-08-08 | bgpd: bgp_path_info_extra memory optimization | Valerian_He
Even if some of the attributes in bgp_path_info_extra are not used, their memory is still allocated every time, which wastes memory. This commit deletes all unnecessary attributes and changes the optional attributes to pointer storage, so memory is allocated only when they are actually used. After the optimization, extra-info-related memory is reduced by about half (~400B -> ~200B).

Signed-off-by: Valerian_He <1826906282@qq.com>
2023-05-30 | bgpd: add EVPN reimport handler for martian change | Trey Aspelund
Adds a generalized martian reimport function used for triggering a relearn/reimport of EVPN routes that were previously filtered/deleted as a result of a "self" check (either during import or by a martian change handler). The MAC-VRF SoO is the first consumer of this function, but can be expanded for use with Martian Tunnel-IPs, Interface-IPs, Interface-MACs, and RMACs. Signed-off-by: Trey Aspelund <taspelund@nvidia.com>
2023-05-30 | bgpd: generalize EVPN martian nexthop changes | Trey Aspelund
Currently we have a handler function that will walk the global EVPN rib and unimport/remove routes matching a local IP/TIP. This generalizes this function so that it can be re-used for other BGP Martian entry types. Now this can be used to unimport routes when the MAC-VRF SoO is reconfigured. Signed-off-by: Trey Aspelund <taspelund@nvidia.com>
2023-04-06 | bgpd: Treat withdraw variable as a bool | Donald Sharp
Used as a bool, treated as a bool. Make it a bool Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-02-17 | Merge pull request #12780 from opensourcerouting/spdx-license-id | Donald Sharp
*: convert to SPDX License identifiers
2023-02-13 | bgpd: add mpath label stack helper functions for dvni | Stephen Worley
Add some bgp_path_info helper functions for getting the correct l3vni label, getting the vni from the label stack, and determining if the mpath is D-VNI based.

Signed-off-by: Stephen Worley <sworley@nvidia.com>
2023-02-13 | lib,zebra,bgpd,staticd: use label code to store VNI info | Stephen Worley
Use the already existing mpls label code to store VNI info for vxlan. VNI's are defined as labels just like mpls, we should be using the same code for both. This patch is the first part of that. Next we will need to abstract the label code to not be so mpls specific. Currently in this, we are just treating VXLAN as a label type and storing it that way. Signed-off-by: Stephen Worley <sworley@nvidia.com>
2023-02-09 | *: auto-convert to SPDX License IDs | David Lamparter
Done with a combination of regex'ing and banging my head against a wall. Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
2022-02-01 | bgpd: Convert bgp_addpath_encode_[tr]x() to bool from int | Donatas Abraitis
Rename addpath_encode[d] to addpath_capable to be consistent. Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
2021-06-07 | bgpd: Add CLI for overlay index recursive resolution | Ameya Dharkar
The gateway IP overlay index of a remote type-5 route is resolved recursively using a remote type-2 route.

For the purpose of this recursive resolution, for each L2VNI, we build a hash table of the remote IP addresses received via remote type-2 routes.

For topologies where overlay index resolution is not needed, we do not need to build this remote-ip-hash.

Thus, make the recursive resolution of the overlay index conditional on the "enable-resolve-overlay-index" configuration.

router bgp 65001
 bgp router-id 192.168.100.1
 neighbor 10.0.1.2 remote-as 65002
 !
 address-family l2vpn evpn
  neighbor 10.0.1.2 activate
  advertise-all-vni
  enable-resolve-overlay-index----------> New configuration
 exit-address-family

The gateway IP overlay index will be resolved only if this configuration is present.

Signed-off-by: Ameya Dharkar <adharkar@vmware.com>
2021-06-07 | bgpd: EVPN route type-5 to type-2 recursive resolution using gateway IP | Ameya Dharkar
When an EVPN prefix route with a gateway IP overlay index is imported into the IP vrf at the ingress PE, the BGP nexthop of this route is set to the gateway IP. For this vrf route to be valid, the following conditions must be met:
- The gateway IP nexthop of this route should be L3 reachable, i.e., this route should be resolved in the RIB.
- A remote MAC/IP route should be present for the gateway IP address in the EVI (L2VPN table).

To check the first condition, the gateway IP is registered with nht (nexthop tracking) to receive reachability notifications for this IP from the zebra RIB. If the gateway IP is reachable, zebra sends the reachability information (i.e., the nexthop interface) for the gateway IP. This nexthop interface should be the SVI interface.

Now, to find the type-2 route corresponding to the gateway IP, we need to fetch the VNI for the above SVI. To do this VNI lookup efficiently, define a hashtable of struct bgpevpn with svi_ifindex as key.

    struct hash *vni_svi_hash;

An EVI instance is added to vni_svi_hash if its svi_ifindex is nonzero. Using this hash, we obtain the struct bgpevpn corresponding to the gateway IP.

For the gateway IP overlay index recursive lookup, once we find the correct EVI, we have to look up its route table for a MAC/IP prefix. As we would have to iterate over the entire route table for every lookup, this lookup is expensive. We can optimize it by adding all the remote IP addresses to a hash table. The following hash table is defined for this purpose in struct bgpevpn:

    struct hash *remote_ip_hash;

When a MAC/IP route is installed in the EVI table, it is also added to remote_ip_hash. It is possible to have multiple MAC/IP routes with the same IP address because of host move scenarios. Thus, for every address addr in remote_ip_hash, we maintain a list of all the MAC/IP routes having addr as their IP address.

The following structure defines an address in remote_ip_hash:

    struct evpn_remote_ip {
        struct ipaddr addr;
        struct list *macip_path_list;
    };

A Boolean field is added to struct bgp_nexthop_cache to indicate that the nexthop is an EVPN gateway IP overlay index.

    bool is_evpn_gwip_nexthop;

A flag BGP_NEXTHOP_EVPN_INCOMPLETE is added to struct bgp_nexthop_cache. This flag is set when the gateway IP is L3 reachable but not yet resolved by a MAC/IP route.

The following table explains the combination of L3 and L2 reachability w.r.t. the BGP_NEXTHOP_VALID and BGP_NEXTHOP_EVPN_INCOMPLETE flags:

                 | MACIP resolved | MACIP unresolved
 ----------------|----------------|------------------
 L3 reachable    | VALID = 1      | VALID = 0
                 | INCOMPLETE = 0 | INCOMPLETE = 1
 ----------------|----------------|------------------
 L3 unreachable  | VALID = 0      | VALID = 0
                 | INCOMPLETE = 0 | INCOMPLETE = 0

Procedure that we use to check if the gateway IP is resolvable by a MAC/IP route:
- Find the EVI/L2VRF that belongs to the nexthop SVI using vni_svi_hash.
- Check if the gateway IP is present in remote_ip_hash in this EVI.

When the gateway IP is L3 reachable and is also resolved by a MAC/IP route, unset the BGP_NEXTHOP_EVPN_INCOMPLETE flag and set the BGP_NEXTHOP_VALID flag.

Signed-off-by: Ameya Dharkar <adharkar@vmware.com>
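
The flag table in the commit message above reduces to a small decision function. The sketch below is illustrative only: the `SK_` flag names and their bit values are assumptions made for this example, not the definitions used in bgpd, and the two inputs stand in for the nht result and the remote_ip_hash lookup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit values are illustrative assumptions, not FRR's definitions. */
#define SK_NEXTHOP_VALID           (1u << 0)
#define SK_NEXTHOP_EVPN_INCOMPLETE (1u << 1)

static_assert((SK_NEXTHOP_VALID & SK_NEXTHOP_EVPN_INCOMPLETE) == 0,
	      "the two flags must occupy distinct bits");

/* Combine L3 reachability (from nexthop tracking) with L2 resolution
 * (a MAC/IP route exists for the gateway IP) into the nexthop flags,
 * following the table in the commit message. */
uint32_t sk_gwip_flags(bool l3_reachable, bool macip_resolved)
{
	if (!l3_reachable)
		return 0; /* neither VALID nor INCOMPLETE */
	return macip_resolved ? SK_NEXTHOP_VALID
			      : SK_NEXTHOP_EVPN_INCOMPLETE;
}
```

Note that an L3-unreachable gateway IP yields no flags at all, even if a MAC/IP route happens to exist for it; INCOMPLETE is reserved for the case where only the L2 half is missing.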
2021-06-07 | bgpd, zebra: Add svi_interface to zebra VNI and bgp EVPN structures | Ameya Dharkar
The SVI ifindex for an L2VNI is required in BGP to perform EVPN type-5 to type-2 recursive resolution using the gateway IP overlay index.

Program this svi_ifindex in struct zebra_vni_t as well as in struct bgpevpn.

Changes include:
1. Add svi_if field to struct zebra_evpn_t
2. Add svi_ifindex field to struct bgpevpn
3. When an SVI (bridge or VLAN) is bound to a VxLAN interface, store it in the zebra_evpn_t structure.
4. Add this SVI ifindex to ZEBRA_VNI_ADD
5. Store svi_ifindex in struct bgpevpn

Signed-off-by: Ameya Dharkar <adharkar@vmware.com>
2021-06-07 | bgpd: CLI to advertise gateway IP overlay index | Ameya Dharkar
Adds a gateway-ip option to the advertise ipv4/ipv6 unicast CLI.

dev(config-router-af)# advertise <ipv4|ipv6> unicast
  <cr>
  gateway-ip  Specify EVPN Overlay Index
  route-map   route-map for filtering specific routes

When gateway-ip is specified, the gateway IP field of the EVPN RT-5 NLRI is filled with the BGP nexthop of the vrf prefix being advertised.

No support for ESI overlay index yet.

Test cases:
1) advertise ipv4 unicast
2) advertise ipv4 unicast gateway-ip
3) advertise ipv6 unicast
4) advertise ipv6 unicast gateway-ip
5) Modify from no-overlay-index to gateway-ip
6) Modify from gateway-ip to no-overlay-index
7) CLI with route-map and modify route-map

Author: Sri Mohana Singamsetty <srimohans@gmail.com>
Signed-off-by: Sri Mohana Singamsetty <srimohans@gmail.com>
2021-03-25 | bgpd: handle local ES del or transition to LACP bypass | Anuradha Karuppiah
1. When a local ES is deleted or the ES-bond goes into bypass, we treat imported MAC-IP routes with that ES destination as remote routes instead of sync routes. This requires a re-evaluation of the routes as "non-local-dest" and an update to zebra.
2. When an ES is attached to an access port or the ES-bond transitions from bypass to LACP-up, we treat imported MAC-IP routes with that ES destination as sync routes. This requires a re-evaluation of the routes as "local-dest" and an update to zebra.

Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
2020-11-24 | bgpd: use L3NHG while installing EVPN host routes in zebra | Anuradha Karuppiah
Host routes imported into the VRF can have a destination ES (per-VRF) which is set up as a L3NHG for efficient failover. Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
2020-10-16 | bgpd: replace bgp_evpn_route2str with prefix2str | Pat Ruddy
Remove bgp_evpn_route2str and replace calls with prefix2str Signed-off-by: Pat Ruddy <pat@voltanet.io>
2020-08-05 | bgpd: support for Ethernet Segments and Type-1/EAD routes | Anuradha Karuppiah
This is the base patch that brings in support for Type-1 routes. It includes support for:
- Ethernet Segment (ES) management
- EAD route handling
- MAC-IP (Type-2) routes with a non-zero ESI, i.e. Aliasing for active-active multihoming
- Initial infra for consistency checking. Consistency checking is a fundamental feature for active-active solutions like MLAG. We will try to leverage the info in the EAD-ES/EAD-EVI routes to detect inconsistencies in access config across VTEPs attached to the same Ethernet Segment.

Functionality Overview -
========================
1. Ethernet segments are created in zebra and associated with access VLANs. zebra sends that info as ES and ES-EVI objects to BGP.
2. BGP advertises EAD-ES and EAD-EVI routes for the locally attached ethernet segments.
3. Similarly, BGP processes EAD-ES and EAD-EVI routes from peers and translates them into ES-VTEP objects which are then sent to zebra as remote ESs.
4. Each ES in zebra is associated with a list of active VTEPs which is then translated into an L2-NHG (nexthop group). This is the ES "Alias" entry.
5. MAC-IP routes with a non-zero ESI use the alias entry created in (4.) to forward traffic, i.e. a MAC-ECMP is done to these remote-ES destinations.

EAD route management (route table and key) -
============================================
1. Local EAD-ES routes
   a. route-table: per-ES route-table
      key: {RD=ES-RD, ESI, ET=0xffffffff, VTEP-IP}
   b. route-table: per-VNI route-table
      Not added
   c. route-table: global route-table
      key: {RD=ES-RD, ESI, ET=0xffffffff}

2. Remote EAD-ES routes
   a. route-table: per-ES route-table
      Not added
   b. route-table: per-VNI route-table
      key: {RD=ES-RD, ESI, ET=0xffffffff, VTEP-IP}
   c. route-table: global route-table
      key: {RD=ES-RD, ESI, ET=0xffffffff}

3. Local EAD-EVI routes
   a. route-table: per-ES route-table
      Not added
   b. route-table: per-VNI route-table
      key: {RD=0, ESI, ET=0, VTEP-IP}
   c. route-table: global route-table
      key: {RD=L2-VNI-RD, ESI, ET=0}

4. Remote EAD-EVI routes
   a. route-table: per-ES route-table
      Not added
   b. route-table: per-VNI route-table
      key: {RD=0, ESI, ET=0, VTEP-IP}
   c. route-table: global route-table
      key: {RD=L2-VNI-RD, ESI, ET=0}

Please refer to bgp_evpn_mh.h for info on how the data-structures are organized.

Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
2020-08-05 | bgpd: pull the multihoming code out to a separate file | Anuradha Karuppiah
Re-org only; no other code changes. This is being done to make maintenance of MH functionality (which will have more code added to it) easy.

The code moved here was originally committed via -
'commit 50f74cf13105 ("*: support for evpn type-4 route")'

Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
2020-06-23 | bgp: rename bgp_node to bgp_dest | Donald Sharp
This is the bulk part extracted from "bgpd: Convert from `struct bgp_node` to `struct bgp_dest`". It should not result in any functional change. Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com> Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
2020-03-26 | lib, bgpd: Another round of `struct const prefix` cleanup | Donald Sharp
Cleanup another set of functions that need to respect the const'ness of a prefix. Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2020-03-24 | bgpd: Rework code to use `const struct prefix` | Donald Sharp
Future work needs the ability to specify a const struct prefix value. Iterate into bgp a bit to get this started. Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2020-03-22 | bgpd: More `const struct prefix` work | Donald Sharp
Modify more code to use `const struct prefix` throughout bgp. This is all prep work for adding an accessor function for bgp_node to get the prefix and reduce all the places that code needs to be touched when we get that work done. Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-11-22 | bgpd: evpn pip parse vrr mac | Chirag Shah
In the L3VNI add callback, parse the vrr rmac value. For a non-zero vrr mac value, use it as the anycast RMAC and the svi mac as the individual rmac value. If advertise-pip is disabled or the vrr rmac is not present, use the svi mac as the anycast rmac value for all routes.

Ticket:CM-26190
Reviewed By:
Testing Done:

Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
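
The RMAC selection rule in the commit message above can be sketched as a small function. This is an illustration under assumptions: `struct sk_rmacs` and `sk_select_rmacs()` are hypothetical names, and keeping the svi mac as the individual RMAC when PIP is disabled is a choice made for this sketch, since the commit message only specifies the anycast value in that case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SK_MAC_LEN 6

/* Hypothetical result holder for the two router MACs. */
struct sk_rmacs {
	uint8_t anycast[SK_MAC_LEN];
	uint8_t individual[SK_MAC_LEN];
};

/* An all-zero MAC is treated as "vrr rmac not present". */
static bool sk_mac_is_zero(const uint8_t *mac)
{
	static const uint8_t zero[SK_MAC_LEN];

	return memcmp(mac, zero, SK_MAC_LEN) == 0;
}

/* Pick the anycast and individual RMACs from the svi mac and the
 * (possibly all-zero) vrr mac carried in the L3VNI add message. */
void sk_select_rmacs(bool advertise_pip, const uint8_t *svi_mac,
		     const uint8_t *vrr_mac, struct sk_rmacs *out)
{
	assert(svi_mac && vrr_mac && out);

	/* Assumption: the svi mac is always the individual RMAC. */
	memcpy(out->individual, svi_mac, SK_MAC_LEN);

	if (advertise_pip && !sk_mac_is_zero(vrr_mac))
		memcpy(out->anycast, vrr_mac, SK_MAC_LEN);
	else
		memcpy(out->anycast, svi_mac, SK_MAC_LEN);
}
```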
2019-11-22 | bgpd: evpn pip data struct and cli | Chirag Shah
The EVPN Primary IP (PIP) advertisement feature uses an individual system IP and system MAC for prefix (type-5) and self type-2 routes. The PIP knob is enabled by default for a bgp vrf instance.

Adds a configuration CLI to enable/disable the PIP feature knob. The user can configure the PIP system IP and MAC to retain them as permanent values.

For the PIP IP, the default behavior is to use the bgp default instance's router-id. When the default instance's router-id changes, the PIP IP assignment is updated to reflect it.

Type-5 routes are updated to use the system IP and system MAC as nexthop and RMAC values.

Ticket:CM-26190
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
2019-11-15 | bgpd: Add nexthop of received EVPN RT-5 for nexthop tracking | Ameya Dharkar
Problem statement:
When IPv4/IPv6 prefixes are received in BGP, the bgp_update function registers the nexthop of the route with the nexthop tracking module. The BGP route is marked as valid only if the nexthop is resolved. For EVPN RT-5 as well, a route should be marked as valid only if the nexthop is resolvable.

Code changes:
1. Add the nexthop of EVPN RT-5 for nexthop tracking. The route will be marked as valid only if the nexthop is resolved.
2. Only the valid EVPN routes are imported to the vrf.
3. When an nht update is received in BGP, make sure that the EVPN routes are imported/unimported as the route becomes valid/invalid.

Testcases:
1. At rtr-1, advertise EVPN RT-5 with a nexthop 10.100.0.2. 10.100.0.2 is resolved at rtr-2 in the default vrf. At rtr-2, the remote EVPN RT-5 should be marked as valid and should be imported into vrfs.
2. Make the nexthop 10.100.0.2 unreachable at rtr-2. The remote EVPN RT-5 should be marked as invalid and should be unimported from the vrfs. As this code change deals with EVPN type-5 routes only, other EVPN routes should remain valid.
3. At rtr-2, add a static route to make nexthop 10.100.0.2 reachable. The EVPN RT-5 should again become valid and should be imported into the vrfs.

Signed-off-by: Ameya Dharkar <adharkar@vmware.com>
2019-09-27 | bgpd: Adding new bgp evpn cli's for ip-prefix lookup | Lakshman Krishnamoorthy
Implement CLIs for the following, to filter for a prefix within evpn type 5 routes:
1) show bgp l2vpn evpn A.B.C.D
2) show bgp l2vpn evpn A.B.C.D json
3) show bgp l2vpn evpn A.B.C.D/M
4) show bgp l2vpn evpn A.B.C.D/M json
5) show bgp l2vpn evpn X:X::X:X
6) show bgp l2vpn evpn X:X::X:X json
7) show bgp l2vpn evpn X:X::X:X/M
8) show bgp l2vpn evpn X:X::X:X/M json

Sample output provided here: https://github.com/FRRouting/frr/pull/4850

Signed-off-by: Lakshman Krishnamoorthy <lkrishnamoor@vmware.com>
2019-04-20 | bgpd: maintain flood mcast group per-l2-vni | Anuradha Karuppiah
If PIM-SM is used for BUM flooding, the multicast group address can be configured per-vxlan-device. BGP receives this config from zebra via the L2 VNI add/update.

Sample output -
root@TORS1:~# vtysh -c "show bgp l2vpn evpn vni 1000" |grep Mcast
  Mcast group: 239.1.1.100
root@TORS1:~#

Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
2019-03-28 | Merge branch 'master' into evpn-session-vrf | Tuetuopay
2019-03-22 | bgpd, zebra: Redo checks to advertise_all_vni | Tuetuopay
This replaces manual checks of the flag with a wrapper macro to convey the meaning "is evpn enabled on this vrf?" Signed-off-by: Tuetuopay <tuetuopay@me.com> Sponsored-by: Scaleway
2019-03-19 | bgpd: Allow non-default instance to be EVPN one | Tuetuopay
This makes the instance bearing the advertise-all-vni config option register to zebra as the EVPN one, forwarding it the option. Signed-off-by: Tuetuopay <tuetuopay@me.com> Sponsored-by: Scaleway
2019-03-15 | Merge pull request #3892 from vivek-cumulus/evpn_vrf_route_leak | Sri Mohana Singamsetty
Leaking of EVPN-based IPv4 and IPv6 routes between VRFs
2019-03-01 | bgpd: Recursively determine if route's source is EVPN | vivek
With leaking of IPv4 or IPv6 unicast routes whose source is an EVPN type-2 or type-5 route between VRFs, the determination of whether the route's source is EVPN has to be made recursively. This is used during route install to pass along appropriate parameters to zebra.

Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
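
One way to picture the recursive determination described above is as a walk up a chain of parent paths. The sketch below is a simplification under assumptions: `struct sk_path`, `enum sk_safi` and `sk_is_evpn_sourced()` are hypothetical, each path is assumed to carry a single pointer to the path it was imported or leaked from, and the recursion is written as an equivalent loop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical address-family tag for a path. */
enum sk_safi { SK_SAFI_UNICAST, SK_SAFI_EVPN };

/* Hypothetical path record: `parent` is the path this one was
 * imported or leaked from, or NULL for an original route. */
struct sk_path {
	enum sk_safi safi;
	const struct sk_path *parent;
};

/* Return true if any ancestor of `p` is an EVPN route. A route leaked
 * VRF -> VRF may sit several hops away from the EVPN route it started
 * from, so checking only the immediate parent is not enough. */
bool sk_is_evpn_sourced(const struct sk_path *p)
{
	assert(p != NULL);

	for (p = p->parent; p; p = p->parent)
		if (p->safi == SK_SAFI_EVPN)
			return true;
	return false;
}
```

For example, a unicast path leaked from another VRF's unicast path that was itself imported from an EVPN route is reported as EVPN-sourced, while a path with no parent is not.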
2019-02-28 | bgpd: Allow EVPN-sourced routes to be leaked back into EVPN | vivek
Refine check on whether a route can be injected into EVPN to allow EVPN-sourced routes to be injected back into another instance. Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com> Reviewed-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com> Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-02-28 | bgpd: No nexthop tracking for EVPN-imported leaked routes | vivek
IPv4 or IPv6 unicast routes which are imported from EVPN routes (type-2 or type-5) and installed in a BGP instance and then leaked do not need any nexthop tracking, as any tracking should happen in the source instance. Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com> Reviewed-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com> Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-02-27 | zebra, bgpd: Exchange L3 interface for VRF's VNI | vivek
In the case of EVPN symmetric routing, the tenant VRF is associated with a VNI that is used for routing and commonly referred to as the L3 VNI or VRF VNI. Corresponding to this VNI is a VLAN and its associated L3 (IP) interface (SVI). Overlay next hops (i.e., next hops for routes in the tenant VRF) are reachable over this interface. https://tools.ietf.org/html/draft-ietf-bess-evpn-prefix-advertisement section 4.4 provides additional description of the above constructs.

The implementation currently derives this L3 interface for EVPN tenant routes using special code that looks at route flags. This patch exchanges the L3 interface between zebra and bgpd as part of the L3-VNI exchange in order to eliminate some of this special code.

Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-01-25 | bgpd: reinstate current bgp best route on an inactive neigh del | Anuradha Karuppiah
When an inactive-neigh delete is received, bgp will not have a local path to remove (and re-run path selection). Instead, it simply re-installs the current best remote path, if any.

Ticket: CM-23018
Testing Done: evpn-min

Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
2018-10-11 | bgpd: Add '[no] flood <disable|head-end-replication>' | Donald Sharp
Add the '[no] flood <disable|head-end-replication>' command to the l2vpn evpn afi/safi sub commands for bgp. This command when entered as 'flood disable' will turn off type 3 route generation for the transmittal of the type 3 route necessary for BUM replication on the remote VTEP. Additionally it will turn off the BUM handling via the new zebra command, ZEBRA_VXLAN_FLOOD_CONTROL. Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com> Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2018-10-09 | bgpd: Convert `struct bgp_info` to `struct bgp_path_info` | Donald Sharp
Do a straight conversion of `struct bgp_info` to `struct bgp_path_info`. This commit will set up the rename of variables as well.

This is being done because `struct bgp_info` is not descriptive of what this data actually is. It is path information for routes that we keep to build the actual route's nexthops, plus some extra information.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2018-08-20 | bgpd, zebra: EVPN extended mobility support | vivek
Implement procedures similar to what is specified in https://tools.ietf.org/html/draft-malhotra-bess-evpn-irb-extended-mobility in order to support extended mobility scenarios in EVPN. These are scenarios where a host/VM move results in a different (MAC,IP) binding from earlier. For example, a host with an address assignment (IP1, MAC1) moves behind a different PE (VTEP) and has an address assignment of (IP1, MAC2), or a host with an address assignment (IP5, MAC5) has a different assignment of (IP6, MAC5) after the move. Note that while these are described as "move" scenarios, they also cover the situation when a VM is shut down and a new VM is spun up at a different location that reuses the IP address or MAC address of the earlier instance, but not both. Yet another scenario is a MAC change for an attached host/VM, i.e., when the MAC of an attached host changes from MAC1 to MAC2. This is necessary because there may already be a non-zero sequence number associated with MAC2. Also, even though (IP, MAC1) is withdrawn before (IP, MAC2) is advertised, they may propagate through the network differently.

The procedures continue to rely on the MAC mobility extended community specified in RFC 7432 and already supported by the implementation, but augment it with an inheritance mechanism that understands the relationship of the host MACIP (ARP/neighbor table entry) to the underlying MAC (MAC forwarding database entry). In FRR, this relationship is understood by the zebra component, which doubles as the "host mobility manager", so the MAC mobility sequence numbers are determined through interaction between bgpd and zebra.

Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Reviewed-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
2018-05-30 | *: support for evpn type-4 route | mitesh
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>