Christian Hopps [Sun, 18 Jun 2023 20:19:54 +0000 (16:19 -0400)]
mgmtd: KISS the locking code
Move away from things like "lock if not locked" type code, require the
user has locked prior to geting to that point.
For now we warn if we are taking a lock we already had; however, this
should really be a failure point.
New requirements:
SETCFG -
not implicit commit - requires user has locked candidate DS and they
must unlock after
implicit commit - requires user has locked candidate and running DS
both locks will be unlocked on reply to the SETCFG
COMMITCFG -
requires user has locked candidate and running DS and they must unlock
after
rollback - this code now get both locks and then does an unlock and
early return thing on the adapter side. It needs to be un-special
cased in follow up work that would also include tests for this
functionality.
Christian Hopps [Fri, 9 Jun 2023 20:54:54 +0000 (16:54 -0400)]
lib: mgmtd: improvements in logging and commentary
- log names of datastores not numbers
- improve logging for mgmt_msg_read
- Rather than use a bool, instead store the pending const string name of
the command being run that has postponed the CLI. This adds some nice
information to the logging when enabled.
Christian Hopps [Mon, 12 Jun 2023 02:13:48 +0000 (22:13 -0400)]
lib: mgmtd: session create and destroy both short-circuit
For creation this is the first thing done so short-circuit just means inline
sync response. However, for destroy there could be commands in-flight, these
will be discarded when they match no session, and the state cleaned up
immediately when the message short-circuits.
Christian Hopps [Sun, 11 Jun 2023 21:53:10 +0000 (17:53 -0400)]
vtysh: stop reading config file if user `exit`s from root level.
This is required to make sure that we properly send the
XFRR_end_configuration tag to the daemons. Previously if the user had an
`exit` at the root level the parser would just drop out of the config
node and so XFRR_end_configuration, even if sent, would be ignored
isisd: Fix use beyond end of stream of ASLA Sub-TLV parsing
Fixes a crash associated with attempting to read beyond the end of the
stream when parsing ASLA Sub-TLV.
```
Program received signal SIGABRT, Aborted.
__GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
50 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
subtlv_len=13 '\r') at isisd/isis_tlvs.c:1473
at isisd/isis_tlvs.c:3264
context=<optimized out>, mtid=<optimized out>) at isisd/isis_tlvs.c:6078
indent=4) at isisd/isis_tlvs.c:6142
avail_len=<optimized out>, context=<optimized out>) at isisd/isis_tlvs.c:7032
at isisd/isis_tlvs.c:7054
(gdb)
```
Mark Stapp [Tue, 23 May 2023 19:31:31 +0000 (15:31 -0400)]
pbrd, zebra: fix zapi and netlink rule encoding
In pbrd, don't encode a rule without a table. There are cases
where the zapi encoding was incorrect because the 4-octet
table id was missing. In zebra, mask off the ECN bits in the
TOS byte when encoding an iprule to match netlink's
expectation.
Chirag Shah [Fri, 26 May 2023 20:43:50 +0000 (13:43 -0700)]
ospfd: fix interface param type update
interface link update event needs
to be handle properly in ospf interface
cache.
Example:
When vrf (interface) is created its default type
would be set to BROADCAST because ifp->status
is not set to VRF.
Subsequent link event sets ifp->status to vrf,
ospf interface update need to compare current type
to new default type which would be VRF (OSPF_IFTYPE_LOOPBACK).
Since ospf type param was created in first add event,
ifp vrf link event didn't update ospf type param which
leads to treat vrf as non loopback interface.
zmw12306 [Sat, 3 Jun 2023 19:08:34 +0000 (15:08 -0400)]
bfdd: fix version bits check.
The version of bfd pkt is represented by 3 bits in B[0]. Signed-off-by: zmw12306 <zmw12306@gmail.com>
(cherry picked from commit 3f658e8b1cfc82e1644cc36fcbc1554c70f558d0)
Donald Sharp [Thu, 8 Jun 2023 16:03:49 +0000 (12:03 -0400)]
zebra: Prevent crash because nl is NULL on shutdown
When shutting down the main pthread was first closing
the sockets associated with the dplane pthread and
then telling it to shutdown the pthread at a later point
in time. This caused the dplane to crash because the nl
data has been freed already. Change the shutdown order
to stop the dplane pthread *and* then close the sockets.
Christian Hopps [Thu, 8 Jun 2023 08:12:26 +0000 (04:12 -0400)]
tests: convert old pim test to more cleanly use pytest fixture
This is a good way to run a per-test background helper process. Here the
helper object is created before the test function requesting it (through param
name match), and then cleaned up after the test function exits (pass or failed).
A context manager is used to further guarantee the cleanup is done.
Christian Hopps [Thu, 8 Jun 2023 06:42:32 +0000 (02:42 -0400)]
tests: fixing pim6 topotest bugs
- Remove use of bespoke socat
- Use ipv6 support in mcast-tester.py
- do not run processes in the background behind munet/micronet's
back with `&` (ever) -- use popen or the helper class
According RFC 5880, add a simpilfed version handling authentication Signed-off-by: zmw12306 <zmw12306@gmail.com>
(cherry picked from commit 98707b04d425dfcc24670704d268a733bbf0bc3f)
Chirag Shah [Tue, 6 Jun 2023 04:48:12 +0000 (21:48 -0700)]
tools: fix list value remove in frr-reload
There might be a time element(s) from
temporary list are removed more than once
which leads to valueError in certain python3
version.
commit-id 1543f58b5 did not handle valueError
properly. This caused regression where
prefix-list config leads to delete followed
by add.
The new fix should just pass the exception as
value removal from list_to_add or list_to_del
is best effort.
This allows prefix-list config has no change
then removes the lines from lines_to_del and
lines_to_add properly.
Configure prefix-list in frr.conf and perform
multiple frr-reload. After first reload operatoin
subsequent ones should not result in delete followed
by add of the prefix-list but rather no-op operation.
Christian Hopps [Tue, 6 Jun 2023 19:12:58 +0000 (15:12 -0400)]
mgmtd: assert an assertion for coverity
I believe coverity can't tell the length of the return value from strftime based
on the format string (like we can), so it allows `n` to be larger than it could
be which then allows `sz - n` to be negative which is size_t positive and very
large so it thinks an overrun is possible.
Christian Hopps [Sat, 27 May 2023 16:11:48 +0000 (12:11 -0400)]
tests: fix some broken logging
- make sure we close and remove all handlers for named logs on each reuse.
- test module level exec.log no longer truncated to last test case output
- cleanup the log names, and make sure they are present in all exec logs
- keep separate exec logs for each pytest worker when running in distributed mode
- disabled code due to CI infra can't handle it: add per test case exec logs
Donald Sharp [Fri, 2 Jun 2023 19:04:38 +0000 (15:04 -0400)]
bgpd: entry->any is never true
The only places entry->any could ever be set to true was
when str was NULL. Unfortunately with the way our CLI works
str is impossible to be NonNULL. The entry->any value *used*
to work prior to commit e961923c7217b935027107cad30c35c3907c936f
but it was changed back in 2016 and no-one has noticed the changed
ability.
Let's just admit that there are no users of this and remove this
dead code.
Donald Sharp [Fri, 2 Jun 2023 15:02:54 +0000 (11:02 -0400)]
bgpd: Give more data when state machine fails to change state
When a state machine transition fails, bgpd would output
data about what happened, but not necessarily give the
reason why. Add that data to the output.
Donald Sharp [Thu, 1 Jun 2023 13:57:48 +0000 (09:57 -0400)]
tests: new mgmt_startup tests are failing due to insufficient time
The tests are failing due to heavily loaded system and insufficient
time for large configs to be handled. Increasing the time
allows the tests to complete locally for me under heavy load.
Louis Scalbert [Wed, 31 May 2023 14:53:58 +0000 (16:53 +0200)]
isisd: fix wrongly disabled flex-algo
A configured flex-algo algorithm may remain in disabled state after its
definition is advertised on the area.
It happens sometimes that, in isis_sr_flex_algo_topo1 topotest step 4 or
8, flex-algo 203 is disabled. It depends on the following sequence:
1. Flex-algo 203 is configured on a remote router to be re-advertised.
2. A LSP is received on the local router and contains the algo 203
definition.
3. The local router re-builds its own LSP with lsp_build().
4. local router isis_run_spf() recomputes the algo 203 SPF tree.
A 1. 2. 3. 4. sequence results in a working test. The reception of the
remote LSP (2.) does not trigger the built of the local LSP. If for
some reasons, the sequence is 1. 3. 4. 2. 4., isis_run_spf() will not
knows that flex-algo 203 has been re-enabled because
flex_algo_get_state() only returns the state from the local LSP.
Compare in sequence step 4. the flex-algo state from the local LSP with
the actual state. If the state is not the same, request a new local LSP
generation and quits the re-computation of algo SPF tree. The SPF tree
will be recomputed just after the built of the local LSP.
Fixes: 3f55b8c621 ("isisd: fix disabled flex-algo on race condition") Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Donald Sharp [Wed, 31 May 2023 15:40:07 +0000 (11:40 -0400)]
zebra: Unlock the route node when sending route notifications
When using a context to send route notifications to upper
level protocols, the code was using a locking function to
get the route node. There is no need for this to be locked
as such FRR should free it up.