Igor Ryzhov [Mon, 21 Sep 2020 12:35:56 +0000 (15:35 +0300)]
lib: fix regcomp error processing
* use actual error code instead of "false"
* add missing new line
Before:
```
nfware# show interface | include (a]
% Regex compilation error: Success% Bad regexp '(a]'
% Unknown command: show interface | include (a]
```
After:
```
nfware# show interface | include (a]
% Regex compilation error: Unmatched ( or \(
% Bad regexp '(a]'
% Unknown command: show interface | include (a]
```
would end up leaving the 4.4.4.4/32 address on the interface under
freebsd.
This all boils down to the fact that the interface is not
considered connected yet we have a destination. If the
destination is the same and we are not connected ignore
it on freebsd.
I am sure there are other fun scenarios that someone
will have to squirrel out.
bgpd: Implement BGP-wide configuration for graceful shutdown
Add support for a BGP-wide setting to enter and exit graceful shutdown.
This will apply to all BGP peers across all BGP instances. Per-instance
configuration is disallowed if the BGP-wide setting is in effect.
1. Ospf dead-interval will be set as 4 times of hello-interval, incase
if it is not set by using "ip ospf dead-interval <dead-val>".
2. On resetting hello-interval using "no ip ospf hello-interval" the
dead interval and hello due will be changed accordingly.
Donald Sharp [Fri, 18 Sep 2020 11:14:55 +0000 (07:14 -0400)]
lib: Remove debug associated with vrf_get
The vrf_get function is called throughout the code base
so much so that when you turn on vrf debugging it eclipses
everything else to a degree that is completely unreasonable.
Quentin Young [Thu, 17 Sep 2020 19:46:55 +0000 (15:46 -0400)]
tools: fix vtysh failure error handling
Based on the current code, I think the intent was to gracefully handle
vtysh failures and print a useful error message. Barriers in the way of
that:
- Despite reading the results of subprocess.communicate(), there won't
be anything there, because we aren't passing subprocess.PIPE as stdin
and stderr when calling subprocess.Popen()
- Despite catching subprocess.TimeoutExpired, if we were to actually hit
this case frr-reload.py would just crash because it's calling
.communicate() on an unbound process variable, probably a copy-paste
error
- Aside from that, building a kwargs dict to pass to a function that
contains something if something else is not None and nothing if it is,
is pointless when we could just pass the thing itself
Net result is that if vtysh fails to read an frr.conf due to syntax
errors, instead of crashing with a traceback, we actually handle the
error condition, log the problem and vtysh's output, and exit. Actually
we were printing the failed line just by chance because stderr wasn't
captured from the subprocess and I guess showed up as part of systemd's
error capturing or something, but the traceback did a good job of
obscuring that with useless noise.
Old:
frrinit.sh[32183]: * Started watchfrr
frrinit.sh[32183]: line 20: % Unknown command: eee
frrinit.sh[32183]: Traceback (most recent call last):
frrinit.sh[32183]: File "/usr/lib/frr/frr-reload.py", line 1316, in <module>
frrinit.sh[32183]: newconf.load_from_file(args.filename)
frrinit.sh[32183]: File "/usr/lib/frr/frr-reload.py", line 231, in load_from_file
frrinit.sh[32183]: file_output = self.vtysh.mark_file(filename)
frrinit.sh[32183]: File "/usr/lib/frr/frr-reload.py", line 146, in mark_file
frrinit.sh[32183]: % (child.returncode, stderr))
frrinit.sh[32183]: __main__.VtyshException: vtysh (mark file) exited with status 2:
frrinit.sh[32183]: None
New:
frrinit.sh[30090]: * Started watchfrr
frrinit.sh[30090]: vtysh failed to process new configuration: vtysh (mark file) exited with status 2:
frrinit.sh[30090]: line 20: % Unknown command: eee
Quentin Young [Thu, 17 Sep 2020 16:38:12 +0000 (12:38 -0400)]
bgpd: rename bgp_fsm_event_update
This function is poorly named; it's really used to allow the FSM to
decide the next valid state based on whether a peer has valid /
reachable nexthops as determined by NHT or BFD.
Donald Sharp [Wed, 16 Sep 2020 21:48:15 +0000 (17:48 -0400)]
bgpd: Avoid memset when tip hash is empty
The tip hash is only used when we are dealing with
evpn. In bgp_nexthop_self we are doing a memset
irrelevant of whether we will ever find data. Yes
hash_lookup will return pretty quickly.
Modify the code to avoid doing a memset in the case
where the tip hash is empty as that we know we'll
never find anything. With full BGP feeds this
small memset does take some time.
zebra: re-name some mh functions to make the code more readable
As a part of the re-factoring some of the evpn_vni_es apis got re-named
as evpn_evpn_es. Changed them to evpn_es_evi to make it common to
vxlan and mpls.
lib: simplify handling of the sysrepo startup configuration
In the new Sysrepo, all SR_EV_ENABLED notifications are followed by
SR_EV_DONE notifications (assuming no errors occur), so there's no
need to special case the SR_EV_ENABLED event anymore (e.g. do full
transactions in one step).
While here, add a few more guarded debug messages to facilitate
troubleshooting.
lib: fix handling of deleted nodes in the sysrepo plugin
Make the sysrepo plugin ignore the deletion of configuration
nodes that don't exist anymore instead of logging an error and
rejecting the changes. This is necessary because Sysrepo delivers
delete notifications for all nodes of a deleted data tree instead
of delivering a single delete notification of the top-level subtree
node (which would suffice for the northbound layer).
From Sysrepo's documentation:
"Note: do not use fork() after creating a connection. Sysrepo
internally stores PID of every created connection and this way a
mismatch of PID and connection is created".
Introduce a new "frr_very_late_init" hook in libfrr that is only
called after the daemon is forked (when the '-d' option is used)
and after the configuration is read. This way we can initialize
the sysrepo plugin correctly even when the daemon is daemonized,
and after the Sysrepo CLI commands are processed (only "debug
northbound client sysrepo" for now).
suppress route-event logs that are uninformative and add more info to
the ones that matter, i.e. hints on what changed in a route update. The
suppressed logs can be enabled by defining EXTREME_DEBUG to 1, similarly
to what is done elsewhere in isisd (e.g. in isis_spf.c)
Signed-off-by: Emanuele Di Pascale <emanuele@voltanet.io>
Currently, when the is-type of an area is changed and its circuits resign,
we are not resetting the DIS flag. Consequently, if the area type is reverted
we are not running the DR election and not regenerating the pseudonode LSP.
Also adding event debug logs for circuit commence/resign.
Signed-off-by: Emanuele Di Pascale <emanuele@voltanet.io>
When the ASBR stops announcing a prefix into the NSSA area, the LSA
type 7 is removed from the area. However the ABR is refreshing the
type 5 in its LSDB while removing the Type 7 LSA. Routers outside
the area do not get an update.
With this change the LSA type 5 is flushed from the LSDB and the
change is announced to routers outside the area
Don Slice [Thu, 10 Sep 2020 12:40:28 +0000 (12:40 +0000)]
bgpd: correct community-list replace logic
Problem rerported that if you enter an existing community list
sequence number with new community information, the entire community
list would be deleted. This commit fixes the replace logic to do
the right thing.
Ticket: CM-30555 Signed-off-by: Don Slice <dslice@nvidia.com>
Donald Sharp [Wed, 9 Sep 2020 03:04:42 +0000 (23:04 -0400)]
tests: Speed up topotests by being more aggressive
We have a bunch of tests that wait *then* check a command for success/failure.
Modify the tests to check *first* then to wait. This reduces test
run times on my system by ~1400 seconds for a full run.
Donald Sharp [Fri, 11 Sep 2020 12:27:28 +0000 (08:27 -0400)]
pimd: Warn when we try to build MAXVIFS > 256
We use the pim mroute socket for kernel notifications of events.
Currently this is limited to 8 bits of data. There are patches
coming down the pike in kernel land to allow this to expand.
Rather than fix this and all the other places we assume MAXVIFS < 256
in the pim code right now. Leave a land mine for the developer
doing this work to point them in the right direction.
Donald Sharp [Fri, 11 Sep 2020 17:05:55 +0000 (13:05 -0400)]
pbrd: Ensure rule is installed on interface up
If we are experiencing an interface that is bouncing
very fast and the last operation that we experienced
was a ifdown we will send rule deletions associated
with that interface. If we have not received notification
that hte rule was removed *but* we immiedately get another
ifup notification when we go to install the rule we
are deciding that it's not ready to send down again,
as that we still think it is installed.
Force the rule installation when we have a interface up
event.
Ticket: CM-31042 Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Donald Sharp [Thu, 10 Sep 2020 15:31:39 +0000 (11:31 -0400)]
bgpd, lib, pbrd, zebra: Pass by ifname
When installing rules pass by the interface name across
zapi.
This is being changed because we have a situation where
if you quickly create/destroy ephermeal interfaces under
linux the upper level protocol may be trying to add
a rule for a interface that does not quite exist
at the moment. Since ip rules actually want the
interface name ( to handle just this sort of situation )
convert over to passing the interface name and storing
it and using it in zebra.
Ticket: CM-31042 Signed-off-by: Stephen Worley <sworley@nvidia.com> Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Change the way the YANG schema node iteration functions work so that
the northbound layer won't have issues with more complex YANG modules
that contain multiple levels of YANG augmentations or modules that
augment themselves indirectly (by augmenting groupings).
Summary of the changes:
* Change the yang_snodes_iterate_subtree() function to always follow
augmentations and add an optional "module" parameter to narrow down
the iteration to nodes of a single module (which is necessary in
some cases). Also, remove the YANG_ITER_ALLOW_AUGMENTATIONS flag
as it's no longer necessary.
* Change yang_snodes_iterate_all() to do a DFS iteration on the resolved
YANG data hierarchy instead of iterating over each module and their
augmentations sequentially.
Reported-by: Rafael Zalamena <rzalamena@opensourcerouting.org> Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
staticd: fix display of the "nexthop-vrf" parameter of static routes
When the static route VRF and its nexthop VRF are inactive in the
kernel, both VRFs will have the same ID (VRF_UNKNOWN) even though
they might not be the same. This can cause "sh run" to not display
the "nexthop-vrf" parameter correctly when necessary. Change the
code to compare VRFs by their names to fix this problem.
staticd: remove checks that are no longer necessary
All call sites of static_route_leak() are passing a non-null pointer
to the 'vty' parameter, hence remove the 'vty' null checks that
are no longer necessary.