Pushpasis Sarkar [Tue, 14 Mar 2023 10:36:06 +0000 (03:36 -0700)]
lib, mgmtd: Add few fixes for commit-check and rollback
This commit contains fixes for the following issues found
- 'mgmt commit check' issued through 'vtysh -f' was actually commtting the changeset.
- On config validation failure backend, mgmtd was not passing the correct error-reason
to frontend.
- 'mgmt rollback ...' was reverting the change on backend, but config on mgmtd daemon
remains intact
Christian Hopps [Tue, 28 Feb 2023 09:08:41 +0000 (04:08 -0500)]
mgmtd: Enroll Staticd as a backend client for MGMTD
This commmit introduces Staticd as a backend client for the MGMTd
framework. All the static commands will be diverted to the MGMT
daemon and will use the transactional model to make changes to the
internal state. Similar mechanism can be used by other daemons to use
the MGMT framework in the future.
This commit includes the following functionalities in the changeset:
1. Diverts all the staticd (config only) commands to MGMTd.
2. Enrolls staticd as a backend client to use the MGMT framework.
3. Modify the staticd NB config handlers so that they can be compiled
into a library and loaded in the MGMTd process context.
Yash Ranjan [Thu, 28 Oct 2021 07:07:11 +0000 (00:07 -0700)]
mgmtd: Add MGMT Transaction Framework
This commit introduces the MGMT Transaction framework that takes
management requests from one (or more) frontend client sessions,
translates them into transactions and drives them to completion
in co-oridination with one (or more) backend client daemons
involved in the request.
This commit includes the following functionalities in the changeset:
1. Introduces the actual Transaction module. Commands added related to
transaction are:
a. show mgmt transaction all
2. Adds support for commit rollback feature which stores upto the 10
commit buffers. Each commit has a commit-id which can be used to
rollback to the exact configuration state.
Commands supported for this feature are:
a. show mgmt commit-history
b. mgmt rollback commit-id COMMIT_ID
3. Add hidden commands to enable record various performance metrics:
a. mgmt performance-measurement
b. mgmt reset-statistic
Christian Hopps [Tue, 28 Feb 2023 15:00:19 +0000 (10:00 -0500)]
mgmtd: Add MGMT Backend Interface Framework
This commit introduces the MGMT Backend Interface which can be used
by back-end management client daemons like BGPd, Staticd, Zebra to
connect with new FRR Management daemon (MGMTd) and utilize the new
FRR Management Framework to let any Frontend clients to retrieve any
operational data or manipulate any configuration data owned by the
individual Backend daemon component.
This commit includes the following functionalities in the changeset:
1. Add new Backend server for Backend daemons connect to.
2. Add a C-based Backend client library which can be used by daemons
to communicate with MGMTd via the Backend interface.
3. Maintain a backend adapter for each connection from an appropriate
Backend client to facilitate client requests and track one or more
transactions initiated from Frontend client sessions that involves
the backend client component.
4. Add the following commands to inspect various Backend client
related information
a. show mgmt backend-adapter all
b. show mgmt backend-yang-xpath-registry
c. show mgmt yang-xpath-subscription
Christian Hopps [Wed, 8 Mar 2023 22:26:38 +0000 (17:26 -0500)]
mgmtd: Add MGMT Frontend Interface Framework
This commit introduces the Frontend Interface which can be used
by front-end management clients like Netconf server, Restconf
Server and CLI to interact with new FRR Management daemon (MGMTd)
to access and sometimes modify FRR management data.
This commit includes the following functionalities in the changeset:
1. Add new Frontend server for clients connect to.
2. Add a C-based Frontend client library which can be used by Frontend
clients to communicate with MGMTd via the Frontend interface.
3. Maintain a frontend adapter for each connection from an appropriate
Frontend client to facilitate client requests and track one or more
client sessions across it.
4. Define the protobuf message format for messages to be exchanged
between MGMTd Frontend module and the Frontend client.
5. This changeset also introduces an instance of MGMT Frontend client
embedded within the lib/vty module that can be leveraged by any FRR
daemon to connect to MGMTd's Frontend interface. The same has been
integrated with and initialized within the MGMTd daemon's process
context to implement a bunch of 'set-config', 'commit-apply',
'get-config' and 'get-data' commands via VTYSH
Christian Hopps [Wed, 8 Mar 2023 22:22:09 +0000 (17:22 -0500)]
mgmtd: Bringup MGMTD daemon and datastore module support
Features added in this commit:
1. Bringup/shutdown new management daemon 'mgmtd' along with FRR.
2. Support for Startup, Candidate and Running DBs.
3. Lock/Unlock DS feature using pthread lock.
4. Load config from a JSON file onto candidate DS.
5. Save config to a JSON file from running/candidate DS.
6. Dump candidate or running DS contents on the terminal or a file in
JSON/XML format.
7. Maintaining commit history (Full rollback support to be added in
future commits).
8. Addition of debug commands.
Donald Sharp [Mon, 20 Mar 2023 20:07:20 +0000 (16:07 -0400)]
lib: on bfd peer shutdown actually stop event
When deleting a bfd peer during shutdown, let's ensure
that any scheduled events are actually stopped.
==7759== Invalid read of size 4
==7759== at 0x48BF700: _bfd_sess_valid (bfd.c:419)
==7759== by 0x48BF700: _bfd_sess_send (bfd.c:470)
==7759== by 0x492F79C: thread_call (thread.c:2008)
==7759== by 0x48E9BD7: frr_run (libfrr.c:1223)
==7759== by 0x1C739B: main (bgp_main.c:550)
==7759== Address 0xfb687a4 is 4 bytes inside a block of size 272 free'd
==7759== at 0x48369AB: free (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==7759== by 0x48BFA5A: bfd_sess_free (bfd.c:535)
==7759== by 0x2B7034: bgp_peer_remove_bfd (bgp_bfd.c:339)
==7759== by 0x29FF8A: peer_free (bgpd.c:1160)
==7759== by 0x29FF8A: peer_unlock_with_caller (bgpd.c:1192)
==7759== by 0x2A0506: peer_delete (bgpd.c:2633)
==7759== by 0x208190: bgp_stop (bgp_fsm.c:1639)
==7759== by 0x20C082: bgp_event_update (bgp_fsm.c:2751)
==7759== by 0x492F79C: thread_call (thread.c:2008)
==7759== by 0x48E9BD7: frr_run (libfrr.c:1223)
==7759== by 0x1C739B: main (bgp_main.c:550)
==7759== Block was alloc'd at
==7759== at 0x4837B65: calloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==7759== by 0x48F53AF: qcalloc (memory.c:116)
==7759== by 0x48BF98D: bfd_sess_new (bfd.c:397)
==7759== by 0x2B76DC: bgp_peer_configure_bfd (bgp_bfd.c:298)
==7759== by 0x2B76DC: bgp_peer_configure_bfd (bgp_bfd.c:279)
==7759== by 0x29BA06: peer_group2peer_config_copy (bgpd.c:2803)
==7759== by 0x2A3D96: peer_create_bind_dynamic_neighbor (bgpd.c:4107)
==7759== by 0x2A4195: peer_lookup_dynamic_neighbor (bgpd.c:4239)
==7759== by 0x21AB72: bgp_accept (bgp_network.c:422)
==7759== by 0x492F79C: thread_call (thread.c:2008)
==7759== by 0x48E9BD7: frr_run (libfrr.c:1223)
==7759== by 0x1C739B: main (bgp_main.c:550)
tl;dr -> Effectively, in this test setup we have 300 dynamic bgp
sessions all of which are using bfd. When a peer collision is detected
or we remove the peers, if an event has been scheduled but not actually
executed yet the event event was not actually being stopped, leaving
the bsp pointer on the thread->arg and causing a crash when it is
executed.
Issue:
When a netns is deleted, since zebra doesn’t receive interface down/delete
notifications from kernel, it manually deletes the interface without removing
the association between zebra_l3vni and the interface that is being deleted
(i.e it deletes the interface without setting “zl3vni->vxlan_if” to NULL).
Later, during the deletion of netns, when zl3vni_rmac_uninstall() is called to
uninstall the remote RMAC from the kernel, zebra ends up accessing stale
“zl3vni->vxlan_if” pointer, which now points to freed memory.
This was causing heap use-after-free.
Fix:
Before zebra starts deleting the interfaces when it receives netns delete notification,
appropriate functions() are being called to remove the association between evpn structs
and interface and set “zl3vni->vxlan_if” to NULL. This ensures that when
zl3vni_rmac_uninstall() is called during netns deletion, it will bail because
“zl3vni->vxlan_if” is NULL.
David Lamparter [Sun, 19 Mar 2023 11:38:49 +0000 (12:38 +0100)]
nhrpd: drop peer references on freeing cache entry
When dropping an interface (e.g. at shutdown) while there are still
valid cache entries, the reference held on the cache entries' peer
pointers was leaking.
Fixes: #12505 Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Renato Westphal [Sat, 18 Mar 2023 01:48:59 +0000 (22:48 -0300)]
ospfd: Fix inconsistency in LSDB JSON output
As it can be seen below, the LSDB JSON output varies depending
whether a filter option is specified or not (e.g. "adv-router",
"self-originate"):
> show ip ospf database router json
{
"routerId":"3.3.3.3",
"routerLinkStates":{
"areas":{
"0.0.0.0":[
{
"lsaAge":175,
"options":"*|-|-|-|-|-|E|-",
[snip]
> show ip ospf database router adv-router 2.2.2.2 json
{
"routerId":"3.3.3.3",
"Router Link States":{
"0.0.0.0":{
"2.2.2.2":{
"lsaAge":193,
"options":"*|-|-|-|-|-|E|-",
[snip]
This inconsistency is undesirable since it makes this data harder to
consume programmatically. Also, in the second output, "Router Link
States" is used as a JSON key, which doesn't conform to our JSON
guidelines (JSON keys need to be camelCased).
Make the required changes to ensure the first output structure is used,
regardless if any output filter is used or not.
pbr-map map6 seq 1
match src-ip 2000::200:100:100:0/96
match dst-ip 2000::100:100:100:0/96
set nexthop-group group3
Before:
torc-12(config)# pbr-map map6 seq 1
torc-12(config-pbr-map)# no match src-ip 2000::200:100:100:0/96
Cannot mismatch families within match src/dst
After:
torc-12(config)# pbr-map map6 seq 1
torc-12(config-pbr-map)# no match src-ip 2000::200:100:100:0/96
torc-12(config-pbr-map)#
Donald Sharp [Fri, 17 Mar 2023 19:40:33 +0000 (15:40 -0400)]
bgpd: Prevent Null pointer deref when outputting data
Crash:
(gdb) bt
0 0x00007fee27de15cb in raise () from /lib/x86_64-linux-gnu/libpthread.so.0
1 0x00007fee280ecd9c in core_handler (signo=11, siginfo=0x7ffe56001bb0, context=<optimized out>) at lib/sigevent.c:264
2 <signal handler called>
3 0x0000555e321c41b2 in prefix_rd2str (prd=0x10, buf=buf@entry=0x7ffe56002080 "27.0.0.R\340\373\062\062^U", size=size@entry=28) at bgpd/bgp_rd.c:168
4 0x0000555e321c431a in printfrr_prd (buf=0x7ffe560021a0, ea=<optimized out>, ptr=<optimized out>) at bgpd/bgp_rd.c:224
5 0x00007fee2812069b in vbprintfrr (cb_in=cb_in@entry=0x7ffe56002330, fmt0=fmt0@entry=0x555e3229a3ad " RD: %pRD\n", ap=ap@entry=0x7ffe560023d8) at lib/printf/vfprintf.c:564
6 0x00007fee28122ef7 in vasnprintfrr (mt=mt@entry=0x7fee281cb5e0 <MTYPE_VTY_OUT_BUF>, out=out@entry=0x7ffe560023f0 " RD: : R\n", outsz=outsz@entry=1024, fmt=fmt@entry=0x555e3229a3ad " RD: %pRD\n", ap=ap@entry=0x7ffe560023d8) at lib/printf/glue.c:103
7 0x00007fee28103504 in vty_out (vty=vty@entry=0x555e33f82d10, format=format@entry=0x555e3229a3ad " RD: %pRD\n") at lib/vty.c:190
8 0x0000555e32185156 in bgp_evpn_es_show_entry_detail (vty=0x555e33f82d10, es=0x555e33c38420, json=<optimized out>) at bgpd/bgp_evpn_mh.c:2655
9 0x0000555e32188fe5 in bgp_evpn_es_show (vty=vty@entry=0x555e33f82d10, uj=false, detail=true) at bgpd/bgp_evpn_mh.c:2721
notice prd=0x10 in #3. This is because in bgp_evpn_mh.c we are sending &es->es_base_frag->prd.
There is one spot in the code where during output the es->es_base_frag is checked for non nullness
Let's just make sure it's right in all the places.
Donatas Abraitis [Fri, 17 Mar 2023 12:48:35 +0000 (14:48 +0200)]
lib: Adjust only `any` flag for prefix-list entries if destroying
Before this patch, if we destroy `any` flag for a prefix-list entry, we always
set destination as 0.0.0.0/0 and/or ::/0.
This means that, if we switch from `ip prefix-list r1-2 seq 5 deny any` to
`ip prefix-list r1-2 seq 5 permit 10.10.10.10/32` we will have
`permit any` eventually, which broke ACLs.
David Lamparter [Fri, 17 Mar 2023 12:34:46 +0000 (13:34 +0100)]
pimd: stop t_sg_expire in MLD NOINFO transition
When hitting gm_sg_update from the S,G expiry timer, t_sg_expire will
already be cancelled. But when arriving there from e.g. the MLD packet
getting cleared out, it'll still be running.
Clear out the timer if we arrive with `has_expired == true`.
rgirada [Fri, 17 Mar 2023 09:17:39 +0000 (09:17 +0000)]
ospfd: Ospf ABR doesnt Advertise LSA summary
Description:
OSPF ABR will summarise the networks based on configured range
and re-advtertise the summarised route. But if configured range
prefix id is same as one of the subset of routes prefix id then
as per rcf2328 Appendex-E recommendation, it will prepare the LSID and originate.
While re-advertising, it is using ospf LSDB instead of area specific
LSDB which is making it fail to re-advertise the summary lsa.
Fixed this by passing correct LSDB pointer.
Donald Sharp [Thu, 16 Mar 2023 23:19:04 +0000 (19:19 -0400)]
bgpd: Always restart timer from scratch in OpenConfirm/Established
Imagine this scenario:
A peer has very large hold/keepalive timers of 600/200. This peer is
using the DataCenter default time. As such the open will cause
the t_holdtime to be negotiated to 600 seconds. Now also imagine
that both peers are in update-delay. If we do not restart the
timers and both peers are in Update Delay, we will continously
reset the peer because the hold time will be hit( since the peer
is not sending us any data ).
Donald Sharp [Wed, 15 Mar 2023 17:51:19 +0000 (13:51 -0400)]
lib: Speedup prefix-list readin by a large factor
Reading in prefix-lists is reading in the specified
prefix list and validating that the prefix is unique
2 times. This makes no sense. Relax the requirement
that a prefix list can limit this as well as completely
remove this check. Validation then just becomes
does this prefix-list specified actually make sense
and that is taken care of by the the cli code.
Reading in prefix-lists was looking for duplicate prefixes
2 times instead of doing it just one time. Let's just
not do it at all.
By doing this change, The code changes from never
completing for a 27k long prefix-list to taking
just under 30 seconds, with 4 daemons processing
this data.
Donald Sharp [Thu, 16 Mar 2023 14:24:25 +0000 (10:24 -0400)]
bgpd: Use interface name instead of pointer value
Log message is borked in a manner that makes it unusable:
bgpd[52]: [VX6SM-8YE5W][EC 33554460] 2000:31:0:53::2: nexthop_set failed, resetting connection - intf 0x561eb9005a30
David Lamparter [Thu, 16 Mar 2023 10:00:02 +0000 (11:00 +0100)]
bgpd: fix NULL argument warning
gcc 12.2.0 complains `error: ‘%s’ directive argument is null`, even
though all enum values are covered with a string. Let's just go with a
`???` default.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
anlan_cs [Thu, 9 Mar 2023 01:34:07 +0000 (09:34 +0800)]
yang, bgpd: Fix "aggregator-asn" to support asdot
The following command is not working:
> (routemap) set aggregator as ASNUM A.B.C.D
Since "aggregator-asn" has already supported asdot,
fixed it with new yang type. Extra ASN validation
(leading zeroes for instance) are done in the validate
hook of the yang leaf.
Signed-off-by: anlan_cs <vic.lan@pica8.com> Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Chirag Shah [Wed, 15 Mar 2023 04:32:40 +0000 (21:32 -0700)]
tools: frr-reload fix list value not present
Check for value present in list before removing
as in certain python3 ValueError traceback is observed.
Traceback (most recent call last):
File "/usr/lib/frr/frr-reload.py",
line 2278, in <module>
(lines_to_add, lines_to_del, restart_frr)
= compare_context_objects(newconf, running)
File "/usr/lib/frr/frr-reload.py",
line 1933, in compare_context_objects
lines_to_add, lines_to_del
File "/usr/lib/frr/frr-reload.py",
line 1549, in ignore_delete_re_add_lines
lines_to_del.remove((ctx_keys, line))
ValueError: list.remove(x): x not in list
Donald Sharp [Sun, 12 Mar 2023 00:37:21 +0000 (19:37 -0500)]
pimd: IN_MULTICAST needs host order
New correct behavior:
eva# conf
eva(config)# ip pim rp 192.168.1.224 224.0.0.0/24
No Path to RP address specified: 192.168.1.224
eva(config)# ip pim rp 224.1.2.3 224.0.0.0/24
% Bad RP address specified: 224.1.2.3
eva(config)#
Fixes: #12970 Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Donald Sharp [Sat, 11 Mar 2023 17:05:44 +0000 (12:05 -0500)]
bgpd: Increment version number even when no data is sent
When an update group decides to not send a prefix
announcement because it has not changed, still increment
the version number. Why? To allow for the situation
where you have say 2 peers in 1 peer group and shortly
after they come up a 3rd peer comes up. It will be
placed into a separate update group and could be
coalesced down, when it finishes updating all data
to it. Now imagine that a single prefix changes at
this point in time as well. Then first 2 peers may
decide to not send the data, since nothing has changed.
While the 3rd peer will and since the versions numbers
never match they will never coalesce. So when the decision
is made to skip, update the version number as well.
zebra: add json support when "show zebra mpls" returns nothing
The "show zebra mpls .. json" vty command may return empty information
in case the MPLS database is empty or a given label entry is not
available. When those errors occur, add the braces to return a
valid json format.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>