Igor Ryzhov [Wed, 28 Apr 2021 22:59:56 +0000 (01:59 +0300)]
isisd: fix ldp-sync configuration
YANG model and CLI commands allow user to configure LDP-sync per area.
But the actual implementation is incorrect - all commands are changing
the config for the whole VRF instead of a single area. This commit fixes
this issue by actually implementing per area configuration.
Igor Ryzhov [Tue, 27 Apr 2021 22:57:21 +0000 (01:57 +0300)]
isisd: allow arbitrary order of area/interface configuration
Currently we don't allow to configure the interface before the area is
configured. This approach has the following issues:
1. The area config can be deleted even when we have an interface config
relying on it. The code is not ready for that - we'll have a whole
bunch of stale pointers if user does that.
2. The code doesn't correctly process the event of changing the VRF for
an interface. There is no mechanism to ensure that the area exists
in the new VRF so currently the circuit still stays in the old VRF.
This commit allows an arbitrary order of area/interface configuration.
There is no more need to configure the area before configuring the
interface.
Igor Ryzhov [Tue, 27 Apr 2021 12:06:37 +0000 (15:06 +0300)]
isisd: update link params after circuit is up
Call from isis_circuit_create works only if we enable isis on an already
existing interface. If we configure isis on a pseudo interface and then
actually create it - this call doesn't work.
Igor Ryzhov [Tue, 27 Apr 2021 14:28:44 +0000 (17:28 +0300)]
isisd: fix incorrect snmp-id gen/free
Necessary structures for snmp-id generation are currently stored in
`struct isis`. When we generate the new circuit ID, we always use the
instance from the default VRF. When we free the circuit ID, we use the
instance from the circuit VRF. This causes the following problems:
1. If there is no instance in the default VRF, this code doesn't work.
2. When circuit in non-default VRF is deleted, the ID is not actually
freed.
This is fixed by using global structures instead. The code itself is
moved to isis_snmp.c and linked to the main code using hooks. We should
not call SNMP-related code when the SNMP module is not loaded at all.
More than that, we don't allow to activate the circuit if we failed to
generate the SNMP ID. Even if SNMP support is completely disabled! This
check is removed.
Igor Ryzhov [Fri, 23 Apr 2021 18:36:12 +0000 (21:36 +0300)]
bgpd: fix bgp_get_vty return values
There are multiple problems:
- commit ef7c53e2 introduced a new return value 2 which broke things,
because a lot of code treats non-zero return as an error,
- there is an incorrect error returned when AS number mismatches.
Fredi Raspall [Thu, 18 Feb 2021 22:45:08 +0000 (23:45 +0100)]
ldpd: defer register for info until configured
Instead of registering to receive default-VRF information and routes
when first connected to zebra, defer the registration until some ldp
configuration is entered.
This avoids redistributing IPv4/IPv6 routes to ldpd when not needed.
Signed-off-by: Fredi Raspall <fredi@voltanet.io> Signed-off-by: Emanuele Di Pascale <emanuele@voltanet.io>
Donald Sharp [Sat, 24 Apr 2021 03:50:31 +0000 (23:50 -0400)]
bgpd: Prevent race condition loss of config
If we have a situation where BGP is partially reading in a config
file for a neighbor, *and* the neighbor is coming up *and* we
have a doppelganger. There exists a race condition when we transfer
the config from the doppelganger to the config peer that we will
overwrite later config because we are copying the config data
from the doppelganger peer( which was captured at the start of initiation
of the peering ).
From what I can tell the peer->af_flags variable is to hold configuration
flags for the local peer. The doppelganger should never overwrite this.
Igor Ryzhov [Fri, 23 Apr 2021 22:32:53 +0000 (01:32 +0300)]
tests: fix bfd-bgp-cbit-topo3 test
This test is completely incorrect on test_bfd_loss_intermediate step.
It shuts down the interface and then "waiting" for the BGP session to
fail. But instead of the actual wait it compares the output of "show bfd
peers" with the "up" state. As it does this comparison right after the
interface shutdown, the BFD session has not yet failed and the comparison
is always successful except very rare cases when the command takes a lot
of time to execute (due to the heavy load on CI system I suppose).
David Lamparter [Fri, 23 Apr 2021 13:17:07 +0000 (15:17 +0200)]
ldpd: set `frr_is_after_fork` in lde/ldpe
These subprocesses don't use frr_config_fork(), so frr_is_after_fork is
never set. While the frr_pthread stuff isn't currently used there, set
the flag anyway to avoid future headaches.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Donald Sharp [Thu, 22 Apr 2021 19:04:15 +0000 (15:04 -0400)]
isisd: Remove warnings and add some data to debugs for isis_csm.c
When running isis and not running isis on all interfaces results
in a bunch of warn messages to the log about circuit state
changes. These warn messages also didn't bother to inform
the end user what interface was causing the fun. Since
the end operator cannot do anything with these warn messages
and nor should they in the vast array of normal operations
modify the code to use event debugging and turn the warns
to debugs.
Additionally add some information to clue the operator
in on to what actual interface we are talking about.
the code processing an NHT update was only resetting the BGP_NEXTHOP_VALID
flag, so labeled nexthops were considered valid even if there was no
nexthop. Reset the flag in response to the update, and also make the
isvalid_nexthop functions a little more robust by checking the number
of nexthops.
Signed-off-by: Emanuele Di Pascale <emanuele@voltanet.io>
Stephen Worley [Thu, 22 Apr 2021 21:21:12 +0000 (17:21 -0400)]
zebra: handle gracefulRS/retain with proto NHGs
Properly handle refcounting of Proto-owned NHGs when
zebra is operating under graceful restart and retain
conditions.
We have an extra refcnt of 1 we keep for proto-owned NHGs to
indicate the upper level proto has created and owns it.
When we are reading these in from the kernel, we need to set them
to 1 as appropriate. Without this, we fail in the assert() during
zebra_nhg_proto_add() after the owning daemons resends the NHG
and the refcnts are off by one.
Also add in the same logic we use for routes when sweeping with
respect to uptimes.
Signed-off-by: Stephen Worley <sworley@nvidia.com>
Donald Sharp [Thu, 22 Apr 2021 19:47:37 +0000 (15:47 -0400)]
tests: Remove kill_mininet_router_process
This function kills all processes that happen to have the same
name to frr processes and it was only ever used in the setup.
Setup should not be used to kill old runs. That should be a
separate process.
Igor Ryzhov [Thu, 22 Apr 2021 12:24:49 +0000 (15:24 +0300)]
lib: remove enabled flag for bfd sessions
Currently this flag is only helpful in an extremely rare situation when
the BFD session registration was unsuccessful and after that zebra is
restarted. Let's remove this flag to simplify the API. If we ever want
to solve the problem of unsuccessful registration/deregistration, this
can be done using internal flags, without API modification.
Also add the error log to help user understand why the BFD session is
not working.
David Lamparter [Thu, 22 Apr 2021 10:10:27 +0000 (12:10 +0200)]
lib: hard-fail creating threads before fork()
Creating any threads before we fork() into the background (if `-d` is
given) is an extremely dangerous footgun; the threads are created in
the parent and terminated when that exits.
This is extra dangerous because while testing, you'd often run the
daemon in foreground without `-d`, and everything works as expected.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
David Lamparter [Thu, 22 Apr 2021 11:18:19 +0000 (13:18 +0200)]
lib: add frr_config_pre hook
... for any initialization that needs to run after forking, but that
would be racy if it were just scheduled on the thread_master (since the
config load is also just a thread callback, ordering would be undefined
for another scheduled thread callback.)
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Mark Stapp [Thu, 1 Apr 2021 15:56:30 +0000 (11:56 -0400)]
zebra: include inner labels with recursive backups
When capturing backup nexthops with recursive resolution,
ensure that inner labels from the recursive nexthop are
included in each backup (as they are with the resolving
primary nexthops).
David Lamparter [Thu, 8 Apr 2021 11:35:09 +0000 (13:35 +0200)]
lib: correctly exit CLI nodes on file config load
The (legacy) code for reading split configs tries to execute config
commands in parent nodes, but doesn't call the node_exit function when
it goes up to a parent node. This breaks BGP RPKI setup (and extended
syslog, which is in the next commit.)
Doing this correctly is a slight bit involved since the node_exit
callbacks should only be called if the command is actually executed on a
parent node.
Signed-off-by: David Lamparter <equinox@diac24.net>
David Lamparter [Sat, 10 Apr 2021 19:02:06 +0000 (21:02 +0200)]
lib: fix possible assert() fail in zlog_fd()
If the last message in a batched logging operation isn't printed due to
priority, this skips the code that flushes prepared messages through
writev() and can trigger the assert() at the end of zlog_fd().
Since any logmsg above info priority triggers a buffer flush, running
into this situation requires a log file target configured for info
priority, at least 1 message of info priority buffered, a debug message
buffered after that, and then a buffer flush (explicit or due to buffer
full).
I haven't seen this chain of events happen in the wild, but it needs
fixing anyway.
Signed-off-by: David Lamparter <equinox@diac24.net>
David Lamparter [Wed, 21 Apr 2021 09:54:48 +0000 (11:54 +0200)]
build: properly split CFLAGS from AC_CFLAGS
`CFLAGS` is a "user variable", not intended to be controlled by
configure itself. Let's put all the "important" stuff in AC_CFLAGS and
only leave debug/optimization controls in CFLAGS.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Donald Sharp [Wed, 21 Apr 2021 12:57:29 +0000 (08:57 -0400)]
tools: Fix warning when running frr-reload.py
When I run frr-reload.py I am seeing this error:
Apr 21 06:23:51 eva frrinit.sh[3776992]: /usr/lib/frr/frr-reload.py:1094: SyntaxWarning: "is not" with a literal. Did you mean "!="?
Apr 21 06:23:51 eva frrinit.sh[3776992]: if line is not "exit-vrf":
ospf6d: Do not delete external table when configure max-path
Issue: When maximum-path is configured in ospf6 view, the
function ospf6_restart_spf deletes the external table as well
which is not required since that stores the redistribute routes.