when gre information could not be retrieved because GRE interface has
been deleted, a GRE_UPDATE message may be sent to NHRP. In that case,
the gre values are reset. There was a missing tunnel destination value,
which has been omitted.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Donald Sharp [Tue, 12 Oct 2021 17:23:40 +0000 (13:23 -0400)]
zebra: modify rib_update to be a bit smarter about malloc
rib_update() was mallocing memory then attempting to schedule
and if the schedule failed( it was already going to be run )
FRR would then free the memory. Fix this memory usage pattern
Donald Sharp [Wed, 20 Oct 2021 12:02:10 +0000 (08:02 -0400)]
tests: When heavily loaded do not send SIGBUS so fast
Our topotests send SIGBUS 2 seconds after a SIGTERM is
initiated. This is bad because under a heavily loaded
topotest system we may have a case where the system has
not had a chance to properly shut down the daemon.
Extend the time greatly before topotests send SIGBUS.
Donatas Abraitis [Tue, 19 Oct 2021 12:14:19 +0000 (15:14 +0300)]
bgpd: Add autocomplete for community/large/extcommunity stuff
```
exit1-debian-9(config)# route-map test1 permit 10
exit1-debian-9(config-route-map)# match community ?
(1-99) Community-list number (standard)
(100-500) Community-list number (expanded)
COMMUNITY_LIST_NAME Community-list name
testas
exit1-debian-9(config-route-map)# match large-community ?
(1-99) Large Community-list number (standard)
(100-500) Large Community-list number (expanded)
LCOMMUNITY_LIST_NAME Large Community-list name
LCL-ORIGINATED-ALL
exit1-debian-9(config-route-map)#
```
David Lamparter [Tue, 12 Oct 2021 22:01:43 +0000 (00:01 +0200)]
tools: remove Linux kernel bits from checkpatch
These aren't appropriate for use in FRR. Among other things, this
enables running checkpatch by calling it in a git working tree with
`tools/checkpatch.pl -g origin/master..`
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
David Lamparter [Wed, 29 Sep 2021 20:15:11 +0000 (22:15 +0200)]
lib: use sentinel for single-linked lists
Using a non-NULL sentinel allows distinguishing between "end of list"
and "item not on any list". It's a compare either way, just the value
is different.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
David Lamparter [Sat, 27 Mar 2021 21:05:07 +0000 (22:05 +0100)]
lib: typesafe *_member()
This provides a "is this item on this list" check, which may or may not
be faster than using *_find() for the same purpose. (If the container
has no faster way of doing it, it falls back to using *_find().)
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
David Lamparter [Wed, 29 Sep 2021 18:45:34 +0000 (20:45 +0200)]
lib: null out deleted pointers in typesafe containers
Some of the typesafe containers didn't null out their innards of items
after an item was deleted or popped off the container. This is both a
bit unsafe as well as hinders the upcoming _member() from working
efficiently.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
David Lamparter [Wed, 29 Sep 2021 20:13:01 +0000 (22:13 +0200)]
tests: fix leak in test code
Even if it doesn't matter for an unit test in general, it hides actual
leaks in the code being tested. Fix so any leaks will be actual bugs.
(Currently there aren't any, yay.)
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Donatas Abraitis [Tue, 19 Oct 2021 12:31:39 +0000 (15:31 +0300)]
bgpd: Add autocomplete for as-path filters
```
exit1-debian-9# show bgp as-path-access-list
<cr>
AS_PATH_FILTER_NAME AS path access list name
acl1 acl2
json JavaScript Object Notation
exit1-debian-9(config)# route-map testas permit 10
exit1-debian-9(config-route-map)# match as-path ?
AS_PATH_FILTER_NAME AS path access-list name
acl1 acl2
```
Igor Ryzhov [Wed, 13 Oct 2021 12:06:38 +0000 (15:06 +0300)]
lib: allow to create interfaces in non-existing VRFs
It allows FRR to read the interface config even when the necessary VRFs
are not yet created and interfaces are in "wrong" VRFs. Currently, such
config is rejected.
For VRF-lite backend, we don't care at all about the VRF of the inactive
interface. When the interface is created in the OS and becomes active,
we always use its actual VRF instead of the configured one. So there's
no need to reject the config.
For netns backend, we may have multiple interfaces with the same name in
different VRFs. So we care about the VRF of inactive interfaces. And we
must allow to preconfigure the interface in a VRF even before it is
moved to the corresponding netns. From now on, we allow to create
multiple configs for the same interface name in different VRFs and
the necessary config is applied once the OS interface is moved to the
corresponding netns.
Donald Sharp [Fri, 15 Oct 2021 15:43:44 +0000 (11:43 -0400)]
tests: Fix ospf_asbr_summary_topo1.py
This script is failing occassionally in our upstream topotests.
Where it was changing route-maps and attempting to see if
summarization was working correctly. The problem was that
the code appeared to be attempting to add route-maps to
redistribution in ospf then modifying the route-maps behavior
to affect summarization as well as the metric type of that
summarization.
The problem is of course that ospf does not appear to modify
the summary routes metric-type when the components
of that summary change it's metric-type. So the test
is testing nothing. In addition the test had messed
up the usage of the route-map generation code and all
the generated config was in different sequence numbers
but route-map processing would never get to those
new sequence numbers because of how route-maps are processed.
Let's just remove this part of the test instead of trying
to unwind it into anything meaningfull
Donald Sharp [Fri, 15 Oct 2021 15:42:06 +0000 (11:42 -0400)]
lib: Add `metric-type` to possible set operations
Several tests used the route_map_create functionality
with `metric-type` but never bothered to add the
backend code to ensure it works correctly.
Add it in so it can be used.
David Lamparter [Thu, 14 Oct 2021 17:11:02 +0000 (19:11 +0200)]
doc/developer: fix warnings in topotests doc
Sphinx warns about a few nits here, just fix. (Note :option:`-E` can't
be used without a "option:: -E" definition, it's intended as a cross
reference.)
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Donald Sharp [Wed, 13 Oct 2021 18:34:08 +0000 (14:34 -0400)]
babeld: Prevent compiler warning about uninited value for n
the variable n, when used must have been set via the find_route_slot
but the compiler in question is probably getting confused with the
multiple levels of indention. Just get around the whole problem
by setting n = 0 and being done with it.
Donald Sharp [Wed, 13 Oct 2021 18:12:51 +0000 (14:12 -0400)]
tests: BFD timing tests under system load need more leeway
We have this pattern in this test:
# Let's kill the interface on rt2 and see what happens with the RIB and BFD on rt1
tgen.gears["rt2"].link_enable("eth-rt1", enabled=False)
# By default BFD provides a recovery time of 900ms plus jitter, so let's wait
# initial 2 seconds to let the CI not suffer.
topotest.sleep(2, 'Wait for BFD down notification')
Under a heavy CI load, interface down events and then reacting to them may not actually
happen within 2 seconds. Allow some more grace time in the test to ensure that we
react to it in an appropriate manner.
Donald Sharp [Wed, 13 Oct 2021 16:46:22 +0000 (12:46 -0400)]
tests: Convert over to using converged to test for ospf being converged
OSPF when it is deciding on whom it should elect for DR and backup
has a process that prioritizes network stabilty over the exact
same results of who is the DR / Backups.
Essentially if we have r1 ----- r2
Let's say r1 has a higher priority, but r2 comes up first, starts
sending hello packets and then decides that it is the DR. At some
point in time in the future, r1 comes up and then connects to r2
at that point it sees that r2 has elected itself DR and it keeps
it that way.
This is by design of the system. With our tight ospf timers as
well as high load being experienced on our test systems. There
exists a bunch of ospf tests that we cannot guarantee that a
consistent DR will be elected for the test. As such let's not
even pretend that we care a bunch and just look for `Full`.
If we care about `ordering` we need to spend more time getting
the tests to actually start routers, ensure that htey are up and
running in the right order so that priority can take place.
Donald Sharp [Wed, 13 Oct 2021 16:40:35 +0000 (12:40 -0400)]
ospfd: Add `converged` and `role` json output for neighbor command
The `show ip ospf neighbor json` command was displaying
state:`Full\/DR`
Where state was both the role and whether or not the neigbhor
was converged. While from a OSPF perspective this is the state.
This state is a combination of two things.
This creates a problem in testing because we have no guarantee
that a particular ospf router will actually have a particular role
given how loaded our topotest systems are. So add a bit of json
output to display both the converged status as well as the
role this router is playing on this neighbor/interface.
The above becomes:
state:`Full\/DR`
converged:`Full`
role:`DR`
Tests can now be modified to look for `Full` and allow it to
continue. Most of the tests do not actually care if this
router is the DR or Backup.
Donald Sharp [Wed, 13 Oct 2021 13:03:27 +0000 (09:03 -0400)]
tests: Fix `Invalid escape sequence` warnings in test runs
Test runs are creating these warnings:
bgp_l3vpn_to_bgp_vrf/test_bgp_l3vpn_to_bgp_vrf.py::test_check_linux_mpls
<string>:7: DeprecationWarning: invalid escape sequence \d
Donald Sharp [Wed, 13 Oct 2021 11:58:37 +0000 (07:58 -0400)]
lib: Add missing enum values in switch statement for if_link_type_str
The switch statement over `enum zebra_link_type` had a default
and FRR was missing a few of the pre-defined types we cared about.
Remove the default statement and add the missing values.
Renato Westphal [Tue, 12 Oct 2021 19:08:23 +0000 (16:08 -0300)]
ospfd: fix another DR election issue during graceful restart
Commit 3551ee9e90304 introduced a regression that causes GR to fail
under certain circumstances. In short, while ISM events should
be ignored while acting as a helper for a restarting router, the
DR/BDR fields of the neighbor structure should still be updated
while processing a Hello packet. If that isn't done, it can cause
the helper to elect the wrong DR while exiting from the helper mode,
leading to a situation where there are two DRs for the same network
segment (and a failed GR by consequence). Fix this.
Renato Westphal [Sat, 9 Oct 2021 23:02:16 +0000 (20:02 -0300)]
ospfd: introduce additional opaque capability check in the GR code
Before starting the graceful restart procedures, ospf_gr_prepare()
verifies for each configured OSPF instance whether it has the opaque
capability enabled (a pre-requisite for GR). If not, a warning is
emitted and GR isn't performed on that instance.
This PR introduces an additional opaque capability check that will
return a CLI error when the opaque capability isn't enabled. The
idea is to make it easier for the user to identify when the GR
activation has failed, instead of requiring him or her to check
the logs for errors.
The original opaque capability check from ospf_gr_prepare() was
retaining as it's possible that that function might be called from
other contexts in the future.
Renato Westphal [Fri, 8 Oct 2021 12:05:28 +0000 (09:05 -0300)]
ospfd: fix flushing of Grace-LSAs on broadcast interfaces
The ospfd opaque LSA infrastruture has an issue where it can't store
different versions of the same Type-9 LSA for different interfaces.
When flushing the self-originated Grace-LSAs upon exiting from the GR
mode, the code was looking up the single self-originated Grace-LSA
from the LSDB, setting its age to MaxAge and sending it out on all
interfaces.
The problem is that Grace-LSAs sent on broadcast interfaces have
their own unique "IP interface address" TLV that is used to identify
the restarting router. That way, just reusing the same Grace-LSA for
all interfaces doesn't work.
Fix this by generating a new Grace-LSA with its age manually set
to MaxAge whenever one needs to be flushed. This will allow the "IP
interface address" TLV to be set correctly and make GR work even in
the presence of multiple broadcast interfaces.
In the long term, the opaque LSA infrastructure should be updated
to support Type-9 link-local LSAs correctly so that we don't need to
resort to hacks like this.