Donald Sharp [Fri, 8 Mar 2019 15:46:55 +0000 (10:46 -0500)]
zebra: Add some debugs to neighbor entry processing
When we get a neighbor entry in zebra we start processing it.
Let's add some additional debugs to the processing so that when
it bails out and we don't use the data, we know the reason.
This should help in debugging the problems from why bgp does
not appear to have data associated with a neighbor entry
in the kernel.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Donald Sharp [Wed, 6 Mar 2019 15:44:34 +0000 (10:44 -0500)]
configure: Default to 16 way ecmp on compilation
If a person who is compiling FRR does not specify the
multipath number on configure we are defaulting to a ecmp of 1.
Let's change this to 16. In this day and age most everything
supports actual ecmp.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Mark Stapp [Tue, 5 Mar 2019 20:28:26 +0000 (15:28 -0500)]
libs: make privilege escalation thread-safe
Privs escalation is process-wide, and a multi-threaded process
can deadlock. This adds a mutex and a counter to the privs
object, preventing multiple threads from making the privs
escalation system call.
Donald Sharp [Tue, 5 Mar 2019 15:29:35 +0000 (10:29 -0500)]
pimd: Ensure DR election happens when both sides change prio
Suppose we have 2 routers A and B. Both Router A and B have
the same priority of 1000. Router A is the elected DR.
Now suppose B lowers his priority to 1. He still looses the
DR election and we are not sending a hello with the new priority.
Immediately after this A's priority is also lowered to 1, it
looses the election and sends the hello. B receives this hello
and elects A as the DR( since it has the better ip address)
At this point A believes B is the DR, and B believes A is the
DR until such time that the normal hello from B is sent to A,
which if timed correctly can be a significant amount of time).
This code just causes a hello to be sent if the priority is
changed. Now both sides will be able to converge quickly
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Chirag Shah [Thu, 28 Feb 2019 00:36:47 +0000 (16:36 -0800)]
bgpd: router mac same as self supress bgp update
bgp update can contain router mac address same as one of SVIs
mac address, during processing of evpn route in bpg_update()
check for the flag is set and filter the route from installing.
This check is done prior to attribute lookup or storing in database.
Parse check and set is done once during attribute parse
because all the NLRIs containing evpn prefix
(type-2/type-5) will have same exntended community applicable.
Chirag Shah [Wed, 20 Feb 2019 00:02:00 +0000 (16:02 -0800)]
bgpd: parse and comapre rmac attr against self mac
Any evpn bgp update message comes with router mac extended
community, which can potentially contain the madd adddress
same as any of the local SVIs (L3VNI) MAC address.
Set route mac exist and during route processing in
bgp_update() filter the route.
Ticket:CM-23674
Reviewed By:CCR-8336
Testing Done:
Configure L3vni mac on TORS1 which is similar to TORC11
L3vni MAC. When TORC11 received the EVPN update with
Router mac extended community, this check rejected the
BGP update message.
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
Don Slice [Sat, 2 Mar 2019 19:40:17 +0000 (19:40 +0000)]
bpgd: resolve more neighbor peer-group issues
Found in testing that in a certain sequence, a neighbor's peer-group
membership would be lost. This fix resolves that issue. Additionally
found that "no neighbor swp1 remote-as 2" would sometimes leave the
config with "neighbor swp1 remote-as 0" rather than removing from the
config. That one is also resolved.
Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
Donald Sharp [Thu, 28 Feb 2019 14:11:41 +0000 (09:11 -0500)]
zebra: Upon vrf deletion, actually release this data.
When a vrf is deleted we need to tell the zebra_router that we have
finished using the tables we are keeping track of. This will allow
us to properly cleanup the data structures associated with them.
This fixes this valgrind error found:
==8579== Invalid read of size 8
==8579== at 0x430034: zvrf_id (zebra_vrf.h:167)
==8579== by 0x432366: rib_process (zebra_rib.c:1580)
==8579== by 0x432366: process_subq (zebra_rib.c:2092)
==8579== by 0x432366: meta_queue_process (zebra_rib.c:2188)
==8579== by 0x48C99FE: work_queue_run (workqueue.c:291)
==8579== by 0x48C3788: thread_call (thread.c:1607)
==8579== by 0x48A2E9E: frr_run (libfrr.c:1011)
==8579== by 0x41316A: main (main.c:473)
==8579== Address 0x5aeb750 is 0 bytes inside a block of size 4,424 free'd
==8579== at 0x4839A0C: free (vg_replace_malloc.c:540)
==8579== by 0x438914: zebra_vrf_delete (zebra_vrf.c:279)
==8579== by 0x48C4225: vrf_delete (vrf.c:243)
==8579== by 0x48C4225: vrf_delete (vrf.c:217)
==8579== by 0x4151CE: netlink_vrf_change (if_netlink.c:364)
==8579== by 0x416810: netlink_link_change (if_netlink.c:1189)
==8579== by 0x41C1FC: netlink_parse_info (kernel_netlink.c:904)
==8579== by 0x41C2D3: kernel_read (kernel_netlink.c:389)
==8579== by 0x48C3788: thread_call (thread.c:1607)
==8579== by 0x48A2E9E: frr_run (libfrr.c:1011)
==8579== by 0x41316A: main (main.c:473)
==8579== Block was alloc'd at
==8579== at 0x483AB1A: calloc (vg_replace_malloc.c:762)
==8579== by 0x48A6030: qcalloc (memory.c:110)
==8579== by 0x4389EF: zebra_vrf_alloc (zebra_vrf.c:382)
==8579== by 0x438A42: zebra_vrf_new (zebra_vrf.c:93)
==8579== by 0x48C40AD: vrf_get (vrf.c:209)
==8579== by 0x415144: netlink_vrf_change (if_netlink.c:319)
==8579== by 0x415E90: netlink_interface (if_netlink.c:653)
==8579== by 0x41C1FC: netlink_parse_info (kernel_netlink.c:904)
==8579== by 0x4163E8: interface_lookup_netlink (if_netlink.c:760)
==8579== by 0x42BB37: zebra_ns_enable (zebra_ns.c:130)
==8579== by 0x42BC5E: zebra_ns_init (zebra_ns.c:208)
==8579== by 0x4130F4: main (main.c:401)
This can be found by: `ip link del <VRF DEVICE NAME>` then `ip link add <NAME> type vrf table X` again and
then attempting to use the vrf.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Donald Sharp [Fri, 1 Mar 2019 18:56:12 +0000 (13:56 -0500)]
zebra: When installing a new route always use REPLACE
When we install a new route into the kernel always use
REPLACE. Else if the route is already there it can
be translated into an append with the flags we are
using.
This is especially true for the way we handle pbr
routes as that we are re-installing the same route
entry from pbr at the moment.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
bgpd: Code to remove the bottleneck in aggregation.
The code that causes the bottleneck has been written generically to
handle the below two cases:
a) When a new aggregate-address is configured.
b) When new routes, that can be aggregated under an existing
aggregate-address, are received.
This change optimizes the code that handles case-(b).
bgpd: Code to handle BGP aggregate's l-communities.
With this commit:
1) The code to manage the large-communities attribute of the routes that are
aggregatable under a configured aggregate-address is introduced.
2) The code to compute the aggregate-route's large-communities attribute is
introduced.
bgpd: Code to handle BGP aggregate's e-communities.
With this commit:
1) The code to manage the extended-communities attribute of the routes that are
aggregatable under a configured aggregate-address is introduced.
2) The code to compute the aggregate-route's extended-communities attribute is
introduced.
With this commit:
1) The code to manage the communities attribute of the routes that are
aggregatable under a configured aggregate-address is introduced.
2) The code to compute the aggregate-route's communities attribute is
introduced.
With this commit:
1) 'struct bgp_aggregate' is moved to bgp_route.h from bgp_route.c
2) Hashes to accommodate the as-path, communities, extended-communities and
large-communities attributes of all the routes aggregated by an
aggregate route is introduced in 'struct bgp_aggregate'.
3) Place-holders for the aggregate route's as-path, communities,
extended-communities and large-communities attributes are introduced in
'struct bgp_aggregate'.
4) The code to manage the as-path of the routes that are aggregatable under
a configured aggregate-address is introduced.
5) The code to compute the aggregate-route's as-path is introduced.
vivek [Wed, 27 Feb 2019 12:54:24 +0000 (12:54 +0000)]
*: Explicitly mark nexthop of EVPN-sourced routes as onlink
In the case of EVPN symmetric routing, the tenant VRF is associated with
a VNI that is used for routing and commonly referred to as the L3 VNI or
VRF VNI. Corresponding to this VNI is a VLAN and its associated L3 (IP)
interface (SVI). Overlay next hops (i.e., next hops for routes in the
tenant VRF) are reachable over this interface. Howver, in the model that
is supported in the implementation and commonly deployed, there is no
explicit Overlay IP address associated with the next hop in the tenant
VRF; the underlay IP is used if (since) the forwarding plane requires
a next hop IP. Therefore, the next hop has to be explicit flagged as
onlink to cause any next hop reachability checks in the forwarding plane
to be skipped.
https://tools.ietf.org/html/draft-ietf-bess-evpn-prefix-advertisement
section 4.4 provides additional description of the above constructs.
Use existing mechanism to specify the nexthops as onlink when installing
these routes from bgpd to zebra and get rid of a special flag that was
introduced for EVPN-sourced routes. Also, use the onlink flag during next
hop validation in zebra and eliminate other special checks.
vivek [Wed, 27 Feb 2019 12:25:53 +0000 (12:25 +0000)]
zebra, bgpd: Use L3 interface for VRF's VNI in route install
In the case of EVPN symmetric routing, the tenant VRF is associated with
a VNI that is used for routing and commonly referred to as the L3 VNI or
VRF VNI. Corresponding to this VNI is a VLAN and its associated L3 (IP)
interface (SVI). Overlay next hops (i.e., next hops for routes in the
tenant VRF) are reachable over this interface.
https://tools.ietf.org/html/draft-ietf-bess-evpn-prefix-advertisement
section 4.4 provides additional description of the above constructs.
Use the L3 interface exchanged between zebra and bgp in route install.
This patch in conjunction with the earlier one helps to eliminate some
special code in zebra to derive the next hop's interface.
vivek [Wed, 27 Feb 2019 11:52:34 +0000 (11:52 +0000)]
zebra, bgpd: Exchange L3 interface for VRF's VNI
In the case of EVPN symmetric routing, the tenant VRF is associated with
a VNI that is used for routing and commonly referred to as the L3 VNI or
VRF VNI. Corresponding to this VNI is a VLAN and its associated L3 (IP)
interface (SVI). Overlay next hops (i.e., next hops for routes in the
tenant VRF) are reachable over this interface.
https://tools.ietf.org/html/draft-ietf-bess-evpn-prefix-advertisement
section 4.4 provides additional description of the above constructs.
The implementation currently derives this L3 interface for EVPN tenant
routes using special code that looks at route flags. This patch
exchanges the L3 interface between zebra and bgpd as part of the L3-VNI
exchange in order to eliminate some this special code.
vivek [Wed, 27 Feb 2019 08:19:06 +0000 (08:19 +0000)]
bgpd: Fix EVPN advertise route-map application
When a IPv4 or IPv6 route that was formerly allowed by the route-map
to be injected into EVPN gets an updated set of attributes that now
causes it to be filtered, the route needs to be pulled out of EVPN.
Renato Westphal [Tue, 26 Feb 2019 21:22:27 +0000 (18:22 -0300)]
bgpd: add missing checks for vpnv6 nexthop lengths
A few code paths weren't handling the vpnv6 nexthop lenghts as
expected, which was leading to problems like imported vpnv6 routes
not being marked as valid when they should. Fix this.
Don Slice [Mon, 11 Feb 2019 19:17:40 +0000 (14:17 -0500)]
tools: keep exit-vrf to change context correctly between vrfs
Discovered in testing that if a static route in the default table
was entered immediately after a vrf static block, the static route
intended for the default table was put in the vrf instead. This
fix retains the "exit-vrf" statement which causes the following
static routes to appear in the default table correctly.
Ticket: CM-23985 Signed-off-by: Don Slice <dslice@cumulusnetwork.com>
Don Slice [Fri, 25 Jan 2019 18:37:03 +0000 (13:37 -0500)]
tools: fix blackhole static changes in frr-reload.py
Problem caused when nclu is used to create "ip route 1.1.1.0/24
blackhole" because frr-reload.py changed the line to Null0 instead
of blackhole. If nclu tries to delete it using the same line as
entered, the commit fails since it doesn't match.
Ticket: CM-23986 Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
Donatas Abraitis [Mon, 25 Feb 2019 19:16:02 +0000 (21:16 +0200)]
bgpd: Add peer action for PEER_FLAG_IFPEER_V6ONLY flag
peer_flag_modify() will always return BGP_ERR_INVALID_FLAG because
the action was not defined for PEER_FLAG_IFPEER_V6ONLY flag.
```
global PEER_FLAG_IFPEER_V6ONLY = 16384;
global BGP_ERR_INVALID_FLAG = -2;
probe process("/usr/lib/frr/bgpd").statement("peer_flag_modify@/root/frr/bgpd/bgpd.c:3975")
{
if ($flag == PEER_FLAG_IFPEER_V6ONLY && $action->type == 0)
printf("action not found for the flag PEER_FLAG_IFPEER_V6ONLY\n");
}
probe process("/usr/lib/frr/bgpd").function("peer_flag_modify").return
{
if ($return == BGP_ERR_INVALID_FLAG)
printf("return BGP_ERR_INVALID_FLAG\n");
}
```
produces:
action not found for the flag PEER_FLAG_IFPEER_V6ONLY
return BGP_ERR_INVALID_FLAG
bgpd: 'show bgp [ipv4|ipv6] neighbors' displays all address family neighbors
Display only ipv4 neighbors when 'show bgp ipv4 neighbors' command is issued.
Display only ipv6 neighbors when 'show bgp ipv6 neighbors' command is issued.
Take the address family of the peer address into account, while displaying the neighbors.
==17102== Invalid read of size 8
==17102== at 0x44D84C: rib_dest_from_rnode (rib.h:375)
==17102== by 0x4546ED: rib_process_result (zebra_rib.c:1904)
==17102== by 0x45436D: rib_process_dplane_results (zebra_rib.c:3295)
==17102== by 0x4D0902B: thread_call (thread.c:1607)
==17102== by 0x4CC3983: frr_run (libfrr.c:1011)
==17102== by 0x4266F6: main (main.c:473)
==17102== Address 0x83bd468 is 88 bytes inside a block of size 96 free'd
==17102== at 0x4A35F54: free (vg_replace_malloc.c:530)
==17102== by 0x4CCAC00: qfree (memory.c:129)
==17102== by 0x4D03DC6: route_node_destroy (table.c:501)
==17102== by 0x4D039EE: route_node_free (table.c:90)
==17102== by 0x4D03971: route_node_delete (table.c:382)
==17102== by 0x44D82A: route_unlock_node (table.h:256)
==17102== by 0x454617: rib_process_result (zebra_rib.c:1882)
==17102== by 0x45436D: rib_process_dplane_results (zebra_rib.c:3295)
==17102== by 0x4D0902B: thread_call (thread.c:1607)
==17102== by 0x4CC3983: frr_run (libfrr.c:1011)
==17102== by 0x4266F6: main (main.c:473)
==17102== Block was alloc'd at
==17102== at 0x4A36FF6: calloc (vg_replace_malloc.c:752)
==17102== by 0x4CCAA2D: qcalloc (memory.c:110)
==17102== by 0x4D03D88: route_node_create (table.c:489)
==17102== by 0x4D0360F: route_node_new (table.c:65)
==17102== by 0x4D034F8: route_node_set (table.c:74)
==17102== by 0x4D03486: route_node_get (table.c:327)
==17102== by 0x4CFB700: srcdest_rnode_get (srcdest_table.c:243)
==17102== by 0x4545C1: rib_process_result (zebra_rib.c:1872)
==17102== by 0x45436D: rib_process_dplane_results (zebra_rib.c:3295)
==17102== by 0x4D0902B: thread_call (thread.c:1607)
==17102== by 0x4CC3983: frr_run (libfrr.c:1011)
==17102== by 0x4266F6: main (main.c:473)
==17102==
This is happening because of this order of events:
1) Route is deleted in the main thread and scheduled for rib processing.
2) Rib garbage collection is run and we remove the route node since it
is no longer needed.
3) Data plane returns from the deletion in the kernel and we call
the srcdest_rnode_get function to get the prefix that was deleted.
This recreates a new route node. This creates a route_node with
a lock count of 1, which we freed via the route_unlock_node call.
Then we continued to use the rn pointer. Which leaves us with use
after frees.
The solution is, of course, to just move the unlock the node at the
end of the function if we have a route_node.
Fixes: #3854 Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Donald Sharp [Sun, 24 Feb 2019 00:27:09 +0000 (19:27 -0500)]
bgpd: Cleanup cli for [l]community_delete functions
The community_delete and lcommunity_delete functionality was
creating a special string that needed to be specially parsed.
Remove all this string creation and just pass the pertinent
data into the appropriate functions.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Donald Sharp [Sat, 23 Feb 2019 00:19:18 +0000 (19:19 -0500)]
zebra: Prevent crash in dad auto recovery
Commit: 6005fe55bce1c9cd54f4f7773fc2b0e15a99008f
Introduced a crash with zebra looking up either the
nbr structure or the mac structure. This is because
the zvni used is NULL and we eventually call a hash_lookup
call that would cause a NULL dereference. Partially
revert this commit to original behavior.
Problems found via clang Static Analyzer.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>