Donald Sharp [Wed, 8 Jul 2020 01:48:34 +0000 (21:48 -0400)]
ospfd: allow interfaces to come up in rare situation
On startup of both zebra and ospfd. If ospfd has not
received a valid router-id *but* has received interface
data, interfaces will not be turned on in the state
machine. When ospf finally receives a valid router-id
it would never actually kick the state machine into
action for those interfaces it has been configured for.
Modify ospf on router id changes, *if* the old
router id was INADDR_ANY *and* the interface is
operative *and* the oi->state is ISM_Down, give
it the old kick in the patooeys
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Rafael Zalamena [Mon, 6 Jul 2020 14:39:27 +0000 (11:39 -0300)]
lib: fix route map description memory leak
Route map entries are not getting a chance to call `description` string
deallocation on shutdown or when the parent entry is destroyed, so lets
add a code to handle this in the `route_map_index_delete` function.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
Kuldeep Kashyap [Thu, 18 Jun 2020 11:33:31 +0000 (11:33 +0000)]
tests: Add bgp_recursive_route_ebgp_multi_hop test suite
1. Added 7 test cases to verify bgp recursive nexthop and ebgp multi-hop functionality
2. Added framework support to automate these test cases
3. Total execution time is ~5 mins
the code in isis_spf_add2tent was asserting in case the vertex
we were trying to add was already present in the path or tent
trees. This however CAN happen if the user accidentally configures
the system Id of the area to the same value of an estabished
neighbor. Handle this more gracefully by logging and returning,
to prevent crashes.
Signed-off-by: Emanuele Di Pascale <emanuele@voltanet.io>
Mark Stapp [Tue, 30 Jun 2020 16:47:46 +0000 (12:47 -0400)]
zebra: check LSP flags when deleting an LSP
Check the LSP INSTALLED flag in delete apis, to ensure we
enqueue a delete operation for the lfib. Some apis were only
checking the nexthop/nhlfe INSTALLED flags, and those could be
unset if there's an in-flight dataplane update.
GalaxyGorilla [Tue, 19 May 2020 11:52:04 +0000 (11:52 +0000)]
isisd: Fast RIB recovery from BFD recognized link failures
Unfortunately as the topotests show a fast recovery after failure
detection due to BFD is currently not possible because of the following
issue:
There are multiple scheduling mechanisms within isisd to prevent
overload situations. Regarding our problem these two are important:
* scheduler for regenerating ISIS Link State PDUs scheduler for managing
* consecutive SPF calculations
In fact both schedulers are coupled, the first one triggers the second
one, which again is triggered by isis_adj_state_change (which again is
triggered by a BFD 'down' message). The re-calculation of SPF paths
finally triggers updates in zebra for the RIB.
Both schedulers work as a throttle, e.g. they allow the regeneration of
Link State PDUs or a re-calculation for SPF paths only once within a
certain time interval which is configurable (and by default different!).
This means that a request can go through the first scheduler but might
still be 'stuck' at the second one for a while. Or a request can be
'stuck' at the first scheduler even though the second one is ready. This
also explains the 'random' behaviour one can observe testing since a
'fast' recovery is only possible if both schedulers are ready to process
this request.
Note that the solution in this commit is 'thread safe' in the sense that
both schedulers use the same thread master such that the introduced
flags are only used exactly one time (and one after another) for a
'fast' execution.
Further there are some irritating comments and logs which I partially
removed. They seems to be not valid anymore due to changes in thread
management (or they were never valid in the first place).
GalaxyGorilla [Tue, 12 May 2020 11:49:58 +0000 (11:49 +0000)]
tests: Introduce BFD IS-IS topotests
The tests work with the default settings of BFD meaning that bfdd is
able to recognize a 'down' link after ~900ms so a route recovery should
be visible in the RIB after 1 second.
In the current state only IPv4 is used (when using IPv6
autoconfiguration) within BFD, even though the recovery also affects
IPv6 routes. This is different to the current state of ospfd/ospf6d in
combination with BFD since both IPv4 and IPv6 sessions are used there.
Route recovery is tested on RT1. The focus here lies on the two
different routes to RT5. Link failures are generated by taking
down interfaces via the mininet Python interface on RT2 and RT3.
Hence routes are supposed to be adjusted to use RT3 when a link
failure happens on RT2 or vice versa.
Note that only failure recognition and recovery is "fast". BFD
does not monitor a link becoming available again.
Pat Ruddy [Thu, 2 Jul 2020 16:33:37 +0000 (17:33 +0100)]
bgpd: detect change of RT for L3VPN routes
If the RT changes on a L3VPN route then any leak of this route into
a VRF should be withdrawn.
Extend existing EVPN check for RT change to cover L3VPN routes.
Rafael Zalamena [Thu, 2 Jul 2020 17:47:28 +0000 (14:47 -0300)]
topotests: remove duplicated code
Handle the duplicated code with a simple conditional: if called from
specialized API use provided daemons configuration, otherwise fallback
to old `Router` own daemon settings.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
Donald Sharp [Tue, 30 Jun 2020 15:16:36 +0000 (11:16 -0400)]
pbrd: Be a bit more lenient with `set nexthop A.B.C.D <intf>`
When specifying an interface in a pbr-map `set nexthop ..` command
be a bit more lenient about the interface.
a) If the interface does not exist bail on the command
(this is the same)
b) If the interface exists but is in a different vrf
than specified use the vrf it is actually in.
(this is new behavior)
Ticket: CM-30187 Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
if we shutdown an interface isisd will delete the adjacencies
on the corresponding circuit, but it will not log the change.
Fix it to make sure that each change is logged. Also specify
the level of the adjacency in the log message, while we are at it.
Signed-off-by: Emanuele Di Pascale <emanuele@voltanet.io>
Donald Sharp [Wed, 1 Jul 2020 13:00:59 +0000 (09:00 -0400)]
bgpd: peer_af_flag_modify_vty assumes 1 flag at a time
We have a bunch of code in bgp_vty.c that was passing
to peer_af_flag_modify_vty more than 1 flag at a time.
This was causing the underlying routines to get the
flags wrong. In order to prevent this convert all the
places where we send multiple flags down to this function
to individual flag changes.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Donald Sharp [Tue, 30 Jun 2020 19:10:20 +0000 (15:10 -0400)]
tests: pbr is not working properly on arm 4.9 kernels
Just disable pbr tests on anything less than 4.10.
This has to do with the fact that the arm platform
is not allowing us to install a route into a
non default table using a interface associated
with a vrf.
ip route add default 4.5.6.7 via swp39 table 10000
When swp39 is in a vrf other than default
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
attempted to use sorted master lists to do faster lookups
by using a RB Tree. Unfortunately the original code
was creating a list->cmp function *but* never using it.
If you look at the commit, it clearly shows that the
function listnode_add is used to insert but when you
look at that function it is a tail push.
Fixes: #6573
Namely now this ordering is preserved:
bgp as-path access-list originate-only permit ^$
bgp as-path access-list originate-only deny .*
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Donald Sharp [Fri, 26 Jun 2020 11:10:08 +0000 (07:10 -0400)]
tests: Add some more data gathering
From last addition we can tell that the nexthop-group C is
installed but pbr does not think it is. This failure
has been consistent the last 4-5 runs in master. Lets
add a bit more data gathering to figure out what is going on.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Mark Stapp [Thu, 18 Jun 2020 17:26:47 +0000 (13:26 -0400)]
zebra: improve route_entry comparison logic
Improve and centralize some logic used to a) compare two
route_entries, and b) to locate a route_entry that matches
a dplane context object that contains the results of a
fib update. We were not rigorous enough in checking routes'
properties, especially when examining connected routes where
we allow multiple route_entries.
Donald Sharp [Thu, 25 Jun 2020 00:15:12 +0000 (20:15 -0400)]
bgpd: Have bgp ignore SIGHUP at the moment
SIGHUP is ostensibly supposed to reload configuration
from a fresh slate. This is currently horribly broken
so much so that bgp just crashes. I see no point
in trying to make this work considering the yang
work coming down the pike.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Quentin Young [Wed, 24 Jun 2020 20:33:18 +0000 (16:33 -0400)]
alpine: enable multi-arch builds
Now that amd64 dependencies have been removed we can use the correct
architecture specifier for Alpine packaging metadata in order to build
packages for all supported platforms.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Donald Sharp [Wed, 24 Jun 2020 18:30:49 +0000 (14:30 -0400)]
tools: Fix reload with 'ipv6 address...' in interface
When you have this configuration:
int foo
ipv6 address fd01:0:0:1::1/64
And issue a reload statement, FRR-reload
is reducing the code to a
`no ipv6 address fd01:0:0:1::/64`
and then issuing a:
`ipv6 address fd01:0:0:1::/64`
The end result is of course that the foo
interface now has two v6 addresses on it.
The brilliance of this is of course if you
happen to have two systems that are connected
over an interface, and you issue a reload command.
They both get fd01:0:0:1::/64 as an ipv6 address
and DAD detection kicks in and stomps on your stuff.
Put a special hey don't munch the v6 address line
in a reload situation.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Mark Stapp [Wed, 24 Jun 2020 13:27:05 +0000 (09:27 -0400)]
tests: fix interface and debug config diff in topojson framework
Include vrf name with interface name when topojson framework
generates interface configuration. This matches the output of
'show runn', and makes config reset less disruptive. Also
stop removing configured debugs and log output when re-generating
config.
Olivier Dugeon [Tue, 23 Jun 2020 14:31:55 +0000 (16:31 +0200)]
isisd: Segment Routing improve subTLVs parser
For Segment Routing, isis_tlvs.c may failed if incorrect or maformed TLVs
are sent to the FRR router. This patch improve detection of such subTLVs error
and skip them, in particular for SRGB, SRLB and MSD subTLVs.
Olivier Dugeon [Thu, 4 Jun 2020 16:03:04 +0000 (18:03 +0200)]
isisd: Start Label Manager safer
Initial attempt to connect to the Label Manager used an infinite loop with
a sleep statement which block isisd until Label Manager connection fire up.
This commit changes the way Label Manager connection is established and uses
a `thread_add_timer()` call to re-attempt to establish the connection in case
of failure (zebra or label manager not ready).
New variables are added to the SRDB in order to control the request of SRGB
and SRLB to the Label Manager to start Segment Routing in a safe way.
Olivier Dugeon [Wed, 20 May 2020 09:18:31 +0000 (11:18 +0200)]
isisd: Add Segment Routing Local Block support
Segment Routing Local Block (SRLB) is part of RFC8667. This change introduces
the possibility for isisd to advertize SRLB in LSP. Base and Range of SRLB
could be configured through CLI or Yang.
Adjacency-SID are now using this SRLB for label allocation. SRLB could also
be used for SID-Binding (e.g. LDP to SR).