Donald Sharp [Wed, 29 Mar 2023 19:27:09 +0000 (15:27 -0400)]
zebra: Ensure gr events run after Meta Queue has run
BGP signals to zebra that a afi has converged immediately
after it has finished processing all routes for a given
afi/safi. This generates events in zebra in this order
a) Routes received from BGP, placed on early-rib Meta-Q
b) Signal GR for the afi.
Now imagine that zebra reads GR code and immediately
processes routes that are in the actual rib and
removes some routes. This generates a
c) route deletion to the kernel for some number of
routes that may be in the the early-rib Meta-Q
d) Process the Meta-Q, and re-install the routes
This is undesirable behavior in zebra. In that
while we may end up in a correct state, there
will be a blip for some number of routes that
happen to be in the early rib Meta-Q.
Modify the GR code to have it's own processing
entry at the end of the Meta-Q. This will
allow all routes to be processed and ready
for handling by the Graceful Restart code.
Donald Sharp [Tue, 28 Mar 2023 20:33:07 +0000 (16:33 -0400)]
zebra: Allow GR to run per AFI as they are reported
The GR code in FRR used to wait till all AFI's were complete
before cleaning up the routes from the upper level protocol.
This of course can lead to some weird situations where say
ipv4 finishes and then v6 is stuck waiting for a peer to come
up and never finishes. v4 when it finishes signals zebra that
it is done but no action is taken at that moment.
Modify the code to allow the zebra_gr.c code to handle a per
afi removal, instead of doing it all at the end.
Donald Sharp [Fri, 24 Mar 2023 19:58:41 +0000 (15:58 -0400)]
zebra: Rearrange zebra_gr zapi functions
The zebra_gr code had 3 functions when effectively only
1 was needed. Cleans up some code weirdness around
multiple switch statements for the same api->cap
as well as consolidating down to only caring about
SAFI_UNICAST, since that is all we care about at the
moment.
Donald Sharp [Fri, 24 Mar 2023 18:29:26 +0000 (14:29 -0400)]
zebra: GR code could potentially stop running
When GR is running and attempting to clear up a node
if the node that is currently saved and we are coming
back to happens to be deleted during the time zebra
suspends the GR code due to hitting the node limit
then zebra GR code will just completely stop processing
and potentially leave stale nodes around forever.
Let's just remove this hole and process what we can.
Can you imagine trying to debug this after the fact?
If we remove a node then that counts toward the maximum
to process of ZEBRA_MAX_STALE_ROUTE_COUNT. This should
prevent any non-processing with a slightly larger cost
of having to look at a few nodes repeatedly
Donald Sharp [Fri, 24 Mar 2023 18:16:49 +0000 (14:16 -0400)]
zebra: Just set the variable for what is wanted in GR code
The info->do_delete variable was being set to true only when
u.val was 1. The problem with this is that u.val is a union
and the various ways that we can call this event causes
different values to be written to the union value on the thread.
This makes no sense. Just set the variable to what we want it to
be when we need it to be true. Since it was only ever set during
a thread_execute section.
Donald Sharp [Mon, 27 Mar 2023 12:09:42 +0000 (08:09 -0400)]
mgmtd: Remove unnecessary asserts
event_add_XXXX functions have no failure path where
if you pass in a double event pointer that it could
return without setting the pointer. As such these
asserts make no sense and are unnecessary
Donald Sharp [Mon, 27 Mar 2023 12:08:36 +0000 (08:08 -0400)]
lib: Remove unneeded asserts in mgmt code
event_add_XXXX functions have no failure path where
if you pass in a double event pointer that it could
return without setting the pointer. As such these
asserts make no sense and are unnecessary
David Lamparter [Sat, 25 Mar 2023 03:34:35 +0000 (12:34 +0900)]
lib/clippy: bail out on newline inside string
While C compilers will generally process strings across lines, we really
don't want that. I rather treat this as the indication of the typo it
probably is warn about it than support this odd C edge case.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Donatas Abraitis [Thu, 23 Mar 2023 18:55:14 +0000 (20:55 +0200)]
tools: Set correct directory of vtysh for frr-reload.py
Before it was setting SDIR, which is /usr/lib/frr, but the vtysh binary is put
under bindir (which is /usr/local by default). And running `/usr/lib/frr/frr reload`
failed.
Donald Sharp [Tue, 1 Mar 2022 21:18:12 +0000 (16:18 -0500)]
*: Rename `struct thread` to `struct event`
Effectively a massive search and replace of
`struct thread` to `struct event`. Using the
term `thread` gives people the thought that
this event system is a pthread when it is not
Donald Sharp [Mon, 28 Feb 2022 15:40:31 +0000 (10:40 -0500)]
*: Rename thread.[ch] to event.[ch]
This is a first in a series of commits, whose goal is to rename
the thread system in FRR to an event system. There is a continual
problem where people are confusing `struct thread` with a true
pthread. In reality, our entire thread.c is an event system.
In this commit rename the thread.[ch] files to event.[ch].
Manoj Naragund [Thu, 23 Mar 2023 12:59:38 +0000 (05:59 -0700)]
ospfd: Fix for memory leak issue in ospf related to flood_reduction tests.
Problem:
Multiple memory leaks after pr12366
RCA:
ospf_lsa_unlock was not happening for the few of the LSAs in
ospf_lsa_refresh_walker after pr12366 due to which memory
related to lsas was leaking.
Donald Sharp [Wed, 22 Mar 2023 22:24:56 +0000 (18:24 -0400)]
pimd: Fix use after free issue for ifp's moving vrfs
We have this valgrind trace:
==1125== Invalid read of size 4
==1125== at 0x170A7D: pim_if_delete (pim_iface.c:203)
==1125== by 0x170C01: pim_if_terminate (pim_iface.c:80)
==1125== by 0x174F34: pim_instance_terminate (pim_instance.c:68)
==1125== by 0x17535B: pim_vrf_terminate (pim_instance.c:260)
==1125== by 0x1941CF: pim_terminate (pimd.c:161)
==1125== by 0x1B476D: pim_sigint (pim_signals.c:44)
==1125== by 0x4910C22: frr_sigevent_process (sigevent.c:133)
==1125== by 0x49220A4: thread_fetch (thread.c:1777)
==1125== by 0x48DC8E2: frr_run (libfrr.c:1222)
==1125== by 0x15E12A: main (pim_main.c:176)
==1125== Address 0x6274d28 is 1,192 bytes inside a block of size 1,752 free'd
==1125== at 0x48369AB: free (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==1125== by 0x174FF1: pim_vrf_delete (pim_instance.c:181)
==1125== by 0x4925480: vrf_delete (vrf.c:264)
==1125== by 0x4925480: vrf_delete (vrf.c:238)
==1125== by 0x49332C7: zclient_vrf_delete (zclient.c:2187)
==1125== by 0x4934319: zclient_read (zclient.c:4003)
==1125== by 0x492249C: thread_call (thread.c:2008)
==1125== by 0x48DC8D7: frr_run (libfrr.c:1223)
==1125== by 0x15E12A: main (pim_main.c:176)
==1125== Block was alloc'd at
==1125== at 0x4837B65: calloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==1125== by 0x48E80AF: qcalloc (memory.c:116)
==1125== by 0x1750DA: pim_instance_init (pim_instance.c:90)
==1125== by 0x1750DA: pim_vrf_new (pim_instance.c:161)
==1125== by 0x4924FDC: vrf_get (vrf.c:183)
==1125== by 0x493334C: zclient_vrf_add (zclient.c:2157)
==1125== by 0x4934319: zclient_read (zclient.c:4003)
==1125== by 0x492249C: thread_call (thread.c:2008)
==1125== by 0x48DC8D7: frr_run (libfrr.c:1223)
==1125== by 0x15E12A: main (pim_main.c:176)
and you do this series of events:
a) Create a vrf, put an interface in it
b) Turn on pim on that interface and turn on pim in that vrf
c) Delete the vrf
d) Do anything with the interface, in this case shutdown the system
The move of the interface to a new vrf is leaving the pim_ifp->pim pointer pointing
at the old pim instance, which was just deleted, so the instance pointer was freed.
Let's clean up the pim pointer in the interface pointer as well.
Donald Sharp [Wed, 22 Mar 2023 15:35:28 +0000 (11:35 -0400)]
bgpd: Ensure suppress-fib-pending works with network statements
The flag for telling BGP that a route is expected to be installed
first before notifying a peer was always being set upon receipt
of a path that could be accepted as bestpath. This is not correct:
imagine that you have a peer sending you a route and you have a
network statement that covers the same route. Irrelevant if the
network statement would win the flag on the dest was being set
in bgp_update. Thus you could get into a situation where
the network statement path wins but since the flag is set on
the node, it will never be announced to a peer.
Let's just move the setting of the flag into bgp_zebra_announce
and _withdraw. In _announce set the flag to TRUE when suppress-fib
is enabled. In _withdraw just always unset the flag as that a withdrawal
does not need to wait for rib removal before announcing. This will
cover the case when a network statement is added after the route has
been learned from a peer.
Pushpasis Sarkar [Tue, 14 Mar 2023 10:36:06 +0000 (03:36 -0700)]
lib, mgmtd: Add few fixes for commit-check and rollback
This commit contains fixes for the following issues found
- 'mgmt commit check' issued through 'vtysh -f' was actually commtting the changeset.
- On config validation failure backend, mgmtd was not passing the correct error-reason
to frontend.
- 'mgmt rollback ...' was reverting the change on backend, but config on mgmtd daemon
remains intact
Christian Hopps [Tue, 28 Feb 2023 09:08:41 +0000 (04:08 -0500)]
mgmtd: Enroll Staticd as a backend client for MGMTD
This commmit introduces Staticd as a backend client for the MGMTd
framework. All the static commands will be diverted to the MGMT
daemon and will use the transactional model to make changes to the
internal state. Similar mechanism can be used by other daemons to use
the MGMT framework in the future.
This commit includes the following functionalities in the changeset:
1. Diverts all the staticd (config only) commands to MGMTd.
2. Enrolls staticd as a backend client to use the MGMT framework.
3. Modify the staticd NB config handlers so that they can be compiled
into a library and loaded in the MGMTd process context.
Yash Ranjan [Thu, 28 Oct 2021 07:07:11 +0000 (00:07 -0700)]
mgmtd: Add MGMT Transaction Framework
This commit introduces the MGMT Transaction framework that takes
management requests from one (or more) frontend client sessions,
translates them into transactions and drives them to completion
in co-oridination with one (or more) backend client daemons
involved in the request.
This commit includes the following functionalities in the changeset:
1. Introduces the actual Transaction module. Commands added related to
transaction are:
a. show mgmt transaction all
2. Adds support for commit rollback feature which stores upto the 10
commit buffers. Each commit has a commit-id which can be used to
rollback to the exact configuration state.
Commands supported for this feature are:
a. show mgmt commit-history
b. mgmt rollback commit-id COMMIT_ID
3. Add hidden commands to enable record various performance metrics:
a. mgmt performance-measurement
b. mgmt reset-statistic
Christian Hopps [Tue, 28 Feb 2023 15:00:19 +0000 (10:00 -0500)]
mgmtd: Add MGMT Backend Interface Framework
This commit introduces the MGMT Backend Interface which can be used
by back-end management client daemons like BGPd, Staticd, Zebra to
connect with new FRR Management daemon (MGMTd) and utilize the new
FRR Management Framework to let any Frontend clients to retrieve any
operational data or manipulate any configuration data owned by the
individual Backend daemon component.
This commit includes the following functionalities in the changeset:
1. Add new Backend server for Backend daemons connect to.
2. Add a C-based Backend client library which can be used by daemons
to communicate with MGMTd via the Backend interface.
3. Maintain a backend adapter for each connection from an appropriate
Backend client to facilitate client requests and track one or more
transactions initiated from Frontend client sessions that involves
the backend client component.
4. Add the following commands to inspect various Backend client
related information
a. show mgmt backend-adapter all
b. show mgmt backend-yang-xpath-registry
c. show mgmt yang-xpath-subscription
Christian Hopps [Wed, 8 Mar 2023 22:26:38 +0000 (17:26 -0500)]
mgmtd: Add MGMT Frontend Interface Framework
This commit introduces the Frontend Interface which can be used
by front-end management clients like Netconf server, Restconf
Server and CLI to interact with new FRR Management daemon (MGMTd)
to access and sometimes modify FRR management data.
This commit includes the following functionalities in the changeset:
1. Add new Frontend server for clients connect to.
2. Add a C-based Frontend client library which can be used by Frontend
clients to communicate with MGMTd via the Frontend interface.
3. Maintain a frontend adapter for each connection from an appropriate
Frontend client to facilitate client requests and track one or more
client sessions across it.
4. Define the protobuf message format for messages to be exchanged
between MGMTd Frontend module and the Frontend client.
5. This changeset also introduces an instance of MGMT Frontend client
embedded within the lib/vty module that can be leveraged by any FRR
daemon to connect to MGMTd's Frontend interface. The same has been
integrated with and initialized within the MGMTd daemon's process
context to implement a bunch of 'set-config', 'commit-apply',
'get-config' and 'get-data' commands via VTYSH
Christian Hopps [Wed, 8 Mar 2023 22:22:09 +0000 (17:22 -0500)]
mgmtd: Bringup MGMTD daemon and datastore module support
Features added in this commit:
1. Bringup/shutdown new management daemon 'mgmtd' along with FRR.
2. Support for Startup, Candidate and Running DBs.
3. Lock/Unlock DS feature using pthread lock.
4. Load config from a JSON file onto candidate DS.
5. Save config to a JSON file from running/candidate DS.
6. Dump candidate or running DS contents on the terminal or a file in
JSON/XML format.
7. Maintaining commit history (Full rollback support to be added in
future commits).
8. Addition of debug commands.
Donald Sharp [Tue, 21 Mar 2023 19:06:00 +0000 (15:06 -0400)]
bgpd: Add vrf name and peer name to some bgp debugs
Some debugs were especially hard to figure out in bgp_route.c
a) If using a %p pointer to print the bgp_path_info this
is pretty useless. Print where it came from instead
b) Use A.B.C.D/M(VRFNAME) when outputing the prefix
Donald Sharp [Tue, 21 Mar 2023 12:54:21 +0000 (08:54 -0400)]
*: Add a hash_clean_and_free() function
Add a hash_clean_and_free() function as well as convert
the code to use it. This function also takes a double
pointer to the hash to set it NULL. Also it cleanly
does nothing if the pointer is NULL( as a bunch of
code tested for ).
rgirada [Mon, 20 Mar 2023 11:22:26 +0000 (11:22 +0000)]
ospfd: Fixing Summary origination after range configuration
Description:
After area range config, summary lsas are aggerated to configured
route but later it was being flushed instead of the actual summary
lsa. This was seen when prefix-id of the aggregated route is same
as one of the actual summary route.
Here, aggregated summary lsa need to be returned to set the flag
SUMMARY_APPROVE after originating aggregated summary lsa but its not.
Which is being cleaned up as part of unapproved summary cleanup.
Corrected this now.