Donald Sharp [Wed, 16 Sep 2020 19:04:16 +0000 (15:04 -0400)]
bgpd: allow bestpath to handle mutliple locally-originated paths
Current code in bgp bestpath selection would accept the newest
locally originated path as the best path. Making the selection
non-deterministic. Modify the code to always come to the
same bestpath conclusion when you have multiple locally originated
paths in bestpath selection.
Before:
eva# conf
eva(config)# router bgp 323
eva(config-router)# address-family ipv4 uni
eva(config-router-af)# redistribute connected
eva(config-router-af)# network 192.168.161.0/24
eva(config-router-af)# do show bgp ipv4 uni 192.168.161.0
BGP routing table entry for 192.168.161.0/24
Paths: (2 available, best #1, table default)
Not advertised to any peer
Local
0.0.0.0(eva) from 0.0.0.0 (192.168.161.245)
Origin IGP, metric 0, weight 32768, valid, sourced, local, bestpath-from-AS Local, best (Origin)
Last update: Wed Sep 16 15:03:03 2020
Local
0.0.0.0(eva) from 0.0.0.0 (192.168.161.245)
Origin incomplete, metric 0, weight 32768, valid, sourced
Last update: Wed Sep 16 15:02:52 2020
eva(config-router-af)# no redistribute connected
eva(config-router-af)# do show bgp ipv4 uni 192.168.161.0
BGP routing table entry for 192.168.161.0/24
Paths: (1 available, best #1, table default)
Not advertised to any peer
Local
0.0.0.0(eva) from 0.0.0.0 (192.168.161.245)
Origin IGP, metric 0, weight 32768, valid, sourced, local, bestpath-from-AS Local, best (First path received)
Last update: Wed Sep 16 15:03:03 2020
eva(config-router-af)# redistribute connected
eva(config-router-af)# do show bgp ipv4 uni 192.168.161.0
BGP routing table entry for 192.168.161.0/24
Paths: (2 available, best #2, table default)
Not advertised to any peer
Local
0.0.0.0(eva) from 0.0.0.0 (192.168.161.245)
Origin incomplete, metric 0, weight 32768, valid, sourced
Last update: Wed Sep 16 15:03:32 2020
Local
0.0.0.0(eva) from 0.0.0.0 (192.168.161.245)
Origin IGP, metric 0, weight 32768, valid, sourced, local, bestpath-from-AS Local, best (Origin)
Last update: Wed Sep 16 15:03:03 2020
eva(config-router-af)#
Notice the route choosen depends on order received
Fixed behavior:
eva# conf
eva(config)# router bgp 323
eva(config-router)# address-family ipv4 uni
eva(config-router-af)# redistribute connected
eva(config-router-af)# network 192.168.161.0/24
eva(config-router-af)# do show bgp ipv4 uni 192.168.161.0
BGP routing table entry for 192.168.161.0/24
Paths: (2 available, best #1, table default)
Not advertised to any peer
Local
0.0.0.0(eva) from 0.0.0.0 (192.168.161.245)
Origin IGP, metric 0, weight 32768, valid, sourced, local, bestpath-from-AS Local, best (Origin)
Last update: Wed Sep 16 15:03:03 2020
Local
0.0.0.0(eva) from 0.0.0.0 (192.168.161.245)
Origin incomplete, metric 0, weight 32768, valid, sourced
Last update: Wed Sep 16 15:02:52 2020
eva(config-router-af)# no redistribute connected
eva(config-router-af)# do show bgp ipv4 uni 192.168.161.0
BGP routing table entry for 192.168.161.0/24
Paths: (1 available, best #1, table default)
Not advertised to any peer
Local
0.0.0.0(eva) from 0.0.0.0 (192.168.161.245)
Origin IGP, metric 0, weight 32768, valid, sourced, local, bestpath-from-AS Local, best (First path received)
Last update: Wed Sep 16 15:03:03 2020
eva(config-router-af)# redistribute connected
eva(config-router-af)# do show bgp ipv4 uni 192.168.161.0
BGP routing table entry for 192.168.161.0/24
Paths: (2 available, best #2, table default)
Not advertised to any peer
Local
0.0.0.0(eva) from 0.0.0.0 (192.168.161.245)
Origin incomplete, metric 0, weight 32768, valid, sourced
Last update: Wed Sep 16 15:03:32 2020
Local
0.0.0.0(eva) from 0.0.0.0 (192.168.161.245)
Origin IGP, metric 0, weight 32768, valid, sourced, local, bestpath-from-AS Local, best (Origin)
Last update: Wed Sep 16 15:03:03 2020
eva(config-router-af)#
Ticket: CM-31490 Found-by: Trey Aspelund <taspelund@nvidia.com> Signed-off-by: Donald Sharp <sharpd@nvidia.com>
* If the MSDP peer receives the SA from a non-RPF peer towards the
originating RP, it will drop the message.
* SA messages are forwarded away from the RP address only.
* SA messages are not forwarded within the mesh group.
* Preventing the MSDP connection from being dropped due to RPF check
failure (RFC3618, section 13 "MSDP Error Handling")
Signed-off-by: Adriano Marto Reis <adrianomarto@gmail.com> Signed-off-by: Adriano Reis <areis@barrukka.local>
Chirag Shah [Thu, 17 Sep 2020 21:20:44 +0000 (14:20 -0700)]
bgpd: no router bgp check candidate config
For `no router bgp` without ASN check candidate
config for default bgp instance presence to avoid
failure from checking backend db where bgp instance
may not be created.
This situation can be seen in transactional cli mode
with following config.
bharat(config)# router bgp 101
bharat(config-router)# exit
bharat(config)# no router bgp
% No BGP process is configured
bharat(config)# no router bgp
% No BGP process is configured
bharat(config)#
Chirag Shah [Sun, 20 Sep 2020 15:13:53 +0000 (08:13 -0700)]
bgpd: move router bgp nb callback
move `router bgp` nb callback at `bgp` node level
to have access to bgp context at neighbor and peer-group
level and align create/destroy callbacks call during
no router bgp.
Earlier `no router bgp` is performed first global destroy
callback is called which essentially removes `bgp context`
then it calls to remove (parallel nodes) neighbor and peer-group
which does not have access to bgp context.
Moving router bgp at bgp solves this destroy callback ordering issue.
Chirag Shah [Wed, 16 Sep 2020 17:28:11 +0000 (10:28 -0700)]
bgpd: correct bgp global context
Move bgp (router bgp) context at "bgp" node
level from (instead of) "global" level.
This change allows access of bgp context at neighbor
and peer-group node levels.
Stephen Worley [Fri, 2 Oct 2020 21:25:36 +0000 (17:25 -0400)]
lib: remove nexthop_same_firsthop() api
Remove the nexthop_same_firsthop() api and just call nexthop_same().
Not entirely sure why we were using this function in the first place,
but now we are just marking dupes with it so lets just call a
common function and avoid issues.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Donald Sharp [Thu, 1 Oct 2020 18:58:37 +0000 (14:58 -0400)]
zebra: Make connected routes their own entry on the meta_q
During quick ifdown / ifup events from the linux kernel there
exists a situation where a prefix that has both a kernel route
and a static route can queued up on the meta-q. If the static
route happens to point at a connected route for nexthop resolution
and we receive a series of quick up/down events *after* the
static route and kernel route are queued up for rib reprocessing.
Since the static route and kernel route are queued on meta-q 1
and the connected route is also on meta-q 1 there exists a situation
where the connected route will be resolved after the static route
fails to resolve, leaving the static route in a unresolved state.
Add a new queue level and put connected routes on their own level,
since they are the fundamental building blocks of pretty much
all the other routes.
Donald Sharp [Wed, 30 Sep 2020 21:55:44 +0000 (17:55 -0400)]
zebra: When processing route_entries ignore unusable routes
When zebra is processing routes to determine what to send
to the rib, suppose we have two routes (a) a route processed
earlier that none of it's nexthops were active and (b)
a route that has good nexthops but has a worse admin distance.
rib_process, would not relook at (a)'s nexthops because
the ROUTE_ENTRY_CHANGED flag was not true and it would
win when compared to (b) because it's admin distance
was better, leaving us with a state where we would
attempt and fail to install route (a) because it
was not valid.
Modify the code to consider the number of nexthops
we have as a determiner if we can use the route.
Donald Sharp [Wed, 30 Sep 2020 21:26:02 +0000 (17:26 -0400)]
zebra: Prevent uninstall attempts when new entry is not happy
In rib_process_update_fib, the function is sent two route entries
the old ( previously installed ) and new ( the one to install )
When the function detects that the new is unusable because
the number of nexthops that are usable for that route is 0,
then we uninstall the old route. The problem here is that
we should not attempt to uninstall any route that is
not owned by FRR. Modify the code to not attempt
this behavior
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Donald Sharp [Tue, 29 Sep 2020 11:54:35 +0000 (07:54 -0400)]
zebra: Make nexthop_active check use the same debug
When debugging why a route was not successfully installed into the
rib, it would be preferable that the end user only have to turn
on `debug zebra rib detail` as that is what we have been telling
people to do for the last couple of years. Consolidate *back*
to this.
Chirag Shah [Sun, 27 Sep 2020 21:09:43 +0000 (14:09 -0700)]
zebra: avoid duplication node in l3vni l2vni-list
With l2vni flap leading to duplicate entry creation
in l3vni's l2vni-list.
Use list sorted add with no duplicates.
root@TORC11:mgmt:~# show evpn vni 4001
VNI: 4001
Type: L3
Tenant VRF: vrf1
State: Up
...
L2 VNIs: 1000 1000 1000 0 0 1002
root@TORC11:mgmt:~# ip link set down vx-1002
root@TORC11:mgmt:~# ip link set up vx-1002
root@TORC11:mgmt:~# show evpn vni 4001
VNI: 4001
Type: L3
Tenant VRF: vrf1
State: Up
...
L2 VNIs: 1000 1000 1000 0 0 1002 1002
Ticket:CM-31545
Reviewed By:
Testing Done:
With Fix:
Multiple time flaps vni counts remained the same.
root@TORC11:mgmt:~# ip link set down vx-1002
root@TORC11:mgmt:~# ip link set up vx-1002
root@TORC11:mgmt:~# ip link set down vx-1002
root@TORC11:mgmt:~# ip link set up vx-1002
root@TORC11:mgmt:~# net show evpn vni 4001
VNI: 4001
Type: L3
Tenant VRF: vrf1
State: Up
...
L2 VNIs: 1000 1002
bfdd: Make new multihop peer if local-address is unique
Previously if there were two multihop peers created that had the same
peer address but different local addresses then the second peer to be
created would be merged with the first one and niether would be able to
be deleted. This was due to an issue in the function bfd_key_lookup().
When the second peer was created its key would be sent into the lookup
function and would reach the last section, even though it shouldn't
have. A check has been placed around the section so that it will not be
entered if a peer is multihop.
Stephen Worley [Mon, 28 Sep 2020 16:39:22 +0000 (12:39 -0400)]
zebra: set NHG/backup NHG pointers on success zapi read
Only set the NHG/backup NHG pointers of the caller if the read
of the nexthops was successfull. Otherwise, we might free when not
neccessary or double free.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Stephen Worley [Fri, 25 Sep 2020 17:48:21 +0000 (13:48 -0400)]
lib,zebra,sharpd: add code for backup proto-NHs but disabled
Add the zapi code for encoding/decoding of backup nexthops for when
we are ready for it, but disable it for now so that we revert
to the old way with them.
When zebra gets a proto-NHG with a backup in it, we early fail and
tell the upper level proto. In this case sharpd. Sharpd then reverts
to the old way of installation with the route.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Stephen Worley [Thu, 3 Sep 2020 17:04:10 +0000 (13:04 -0400)]
zebra: limit no re-install to NHG PROTO using routes
Limit the not re-installation of routes with the same NHG ID
to routes that are using the new NHG PROTO API. This would
only include sharpd and EVPN-MH for now.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Stephen Worley [Tue, 1 Sep 2020 20:02:12 +0000 (16:02 -0400)]
lib: add doc to clear-up hash_iterate multi deletion
Add some header documentation to make it clear that you
cannot delete more than one item during each iteration.
Doing so could cause memory corruption for next pointer
if its also deleted from the table.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Stephen Worley [Tue, 1 Sep 2020 18:53:09 +0000 (14:53 -0400)]
zebra: use list to mark for removal when scoring
In scoring our NHEs during shutdown there is a chance we could release mutliple
NHEs at the same time during one iteration. This can cause memory corruption
if the two being released are directly next to each other in the hash table.
hash_iterate accounts for releasing one during the iteration but not
two by setting hbnext before release but if hbnext is also freed,
we obviously can have a problem.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Stephen Worley [Mon, 3 Aug 2020 18:34:52 +0000 (14:34 -0400)]
zebra: reject proto NHGs of blackhole/interface
Reject proto NHGs of type blackhole/interface for now.
We need to think a bit more about how to resolve these
given the linux kernel needs to know the Address Family
of the routes that will use them and install it with them.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Stephen Worley [Wed, 22 Jul 2020 17:45:47 +0000 (13:45 -0400)]
zebra: add flag track released state of proto NHGS
Add a flag to track the released state of a proto-based NHG.
This flag is used to know whether the upper level proto has called
the *_del API. Typically, the NHG would just get removed and uninstalled
at this point but there is a chance we are being sent it while routes
are still being owned or we were sent it multiple times. This flag
and associated code handles that.
Ticket: CM-30369
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Stephen Worley [Thu, 11 Jun 2020 17:49:25 +0000 (13:49 -0400)]
sharpd: implement NHG notification handling
Implement handling of NHG notifications in sharpd so that
the routes don't attempt to use an NHG ID that did not
successfully get created. If it does not get installed, we
fall back to traditional zapi messaging.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Stephen Worley [Thu, 11 Jun 2020 17:46:48 +0000 (13:46 -0400)]
zebra: reply fail on NHG add if not ifindex/onlink
We currently don't support ADD/DEL/REPLACE with proto-based
NHGs that are not already fully resolved and ifindex/onlink
based. If we are handed one that doesn't have ifindex set
i.e. recursive, gracefully fail and with a notification.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>