Igor Ryzhov [Sun, 25 Feb 2024 23:00:17 +0000 (01:00 +0200)]
lib: fix prefix-list entry update
When a prefix-list entry is updated, current NB code calls the
replacement code multiple times, once per each updated field. It means
that when multiple fields of an entry are changed in a single commit,
the replacement is done with an interim state of a prefix-list instead
of a final one. To fix the issue, we should call the replacement code
once, after all fields of an entry are updated.
Igor Ryzhov [Sun, 25 Feb 2024 21:12:14 +0000 (23:12 +0200)]
lib: fix access-list entry update
When an access-list entry is updated, current NB code calls notification
hooks for each updated field. It means that when multiple fields of an
entry are changed in a single commit, the hooks are run with an interim
state of an access-list instead of a final one. To fix the issue, we
should call the hooks once, after all fields of an entry are updated.
adding a tests about:
"no bgp as-path access-list" command.
the folloxing "clear bgp *" command leads to the
crash exhibited above.
a sleep had been added to capture the crash befor the end of scenario.
50 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
[Current thread is 1 (Thread 0x7f5f05cbb9c0 (LWP 1371086))]
(gdb) bt
context=0x7ffcf2c216c0) at lib/sigevent.c:248
acl_list=0x55c976ec03c0) at bgpd/bgp_aspath.c:1688
dummy=0x7ffcf2c22340, object=0x7ffcf2c21e70) at bgpd/bgp_routemap.c:2401
match_object=0x7ffcf2c21e70, set_object=0x7ffcf2c21e70, pref=0x0)
at lib/routemap.c:2687
attr=0x7ffcf2c220b0, afi=AFI_IP, safi=SAFI_UNICAST, rmap_name=0x0, label=0x0,
num_labels=0, dest=0x55c976ebeaf0) at bgpd/bgp_route.c:1807
addpath_id=0, attr=0x7ffcf2c22450, afi=AFI_IP, safi=SAFI_UNICAST, type=10,
sub_type=0, prd=0x0, label=0x0, num_labels=0, soft_reconfig=0, evpn=0x0)
at bgpd/bgp_route.c:4424
packet=0x7ffcf2c22410) at bgpd/bgp_route.c:6266
packet=0x7ffcf2c22410, mp_withdraw=false) at bgpd/bgp_packet.c:341
peer=0x55c976e89ed0, size=43) at bgpd/bgp_packet.c:2414
at bgpd/bgp_packet.c:3899
Igor Ryzhov [Mon, 26 Feb 2024 09:30:57 +0000 (11:30 +0200)]
lib: fix setting temporary log options for libyang
By calling `ly_log_options` with `LY_LOSTORE`, the current code
effectively disables libyang logging and never enables it back. The call
is done to get the current logging options, but we don't really need
that. When looking for a schema node, we don't want neither to log nor
to store the error, so simply set the temporary options to 0.
Igor Ryzhov [Sun, 25 Feb 2024 19:50:41 +0000 (21:50 +0200)]
bgpd, yang: fix missing mandatory/default statements on some leafs
The code expects these leafs to always exist. If they are not set, the
daemon would crash. CLI always sets them, but we should properly mark
them as mandatory/default to prevent them from being missed when using
the API.
Igor Ryzhov [Fri, 23 Feb 2024 22:06:41 +0000 (00:06 +0200)]
lib: fix nb callbacks for containers inside choice case
Containers inside a choice's case must be treated as presence containers
as they can be explicitly created and deleted. They must have `create`
and `destroy` callbacks, otherwise the internal data they represent may
never be deleted.
The issue can be reproduced with the following steps:
- create an access-list with destination-network params
```
# access-list test seq 1 permit ip any 10.10.10.0 0.0.0.255
```
- delete the `destination-network` container
```
# mgmt delete-config /frr-filter:lib/access-list[name='test'][type='ipv4']/entry[sequence='1']/destination-network
# mgmt commit apply
MGMTD: No changes found to be committed!
```
As the `destination-network` container is non-presence, and all its
leafs are mandatory, mgmtd doesn't see any changes to be commited and
simply updates its YANG data tree without passing any updates to backend
daemons.
This commit fixes the issue by requiring `create` and `destroy`
callbacks for containers inside choice's cases.
Igor Ryzhov [Fri, 23 Feb 2024 19:14:26 +0000 (21:14 +0200)]
lib: fix order of northbound operations
When ordering operations, destroys must always come before other
operations, to correctly cover the change of a "case" in a "choice".
The problem can be reproduced with the following commands:
```
access-list test seq 1 permit 10.0.0.0/8
access-list test seq 1 permit host 10.0.0.1
access-list test seq 1 permit 10.0.0.0/8
```
Before this commit, the order of changes would be the following:
- `access-list test seq 1 permit 10.0.0.0/8`
- `modify` for `ipv4-prefix`
- `access-list test seq 1 permit host 10.0.0.1`
- `destroy` for `ipv4-prefix`
- `modify` for `host`
- `access-list test seq 1 permit 10.0.0.0/8`
- `modify` for `ipv4-prefix`
- `destroy` for `host`
As `destroy` for `host` is called last, it rewrites the fields that were
filled by `modify` callback of `ipv4-prefix`. This commit fixes this
problem by always calling `destroy` callbacks first.
Igor Ryzhov [Tue, 20 Feb 2024 20:32:52 +0000 (22:32 +0200)]
lib: fix order of northbound callbacks
When ordering the NB callbacks according to their priorities, if the
operation is "destroy" we should reverse the order, to destroy the
dependants before the dependencies.
This fixes the crash, that can be reproduced with the following steps:
```
frr# conf term file-lock
frr(config)# affinity-map map bit-position 10
frr(config)# interface test
frr(config-if)# link-params
frr(config-link-params)# affinity map
frr(config-link-params)# exit
frr(config-if)# exit
frr(config)# mgmt commit apply
frr(config)# no affinity-map map
frr(config)# interface test
frr(config-if)# link-params
frr(config-link-params)# no affinity map
frr(config-link-params)# exit
frr(config-if)# exit
frr(config)# mgmt commit apply
```
Igor Ryzhov [Mon, 12 Feb 2024 18:34:33 +0000 (20:34 +0200)]
mgmtd: fix be_is_client_interested
Backend "subscribe" API allows daemons to dynamically register xpaths
they are interested in. Such xpaths are not stored in hardcoded
config/oper xpath arrays so this function fails to understand that a
backend daemon is interested in them. Fix by using dynamic xpath maps
instead which store both hardcoded and dynamic xpaths.
Igor Ryzhov [Sat, 10 Feb 2024 00:11:13 +0000 (02:11 +0200)]
lib, mgmtd: rework processing of yang notifications
Currently, YANG notification processing is done using a special type of
callbacks registered in backend clients. In this commit, we start using
regular northbound infrastructure instead, because it already has a
convenient way of registering xpath-specific callbacks without the need
for creating additional structures for each necessary notification. We
also now pass a notification data to the callback, instead of a plain
JSON. This allows to use regular YANG library functions for inspecting
notification fields, instead of manually parsing the JSON.
In `qpb.h` we have a bunch of functions that make use of
`union g_addr`. `union g_addr` is defined in `nexthop.h`, which
actually is NOT included in `qpb.h`.
Igor Ryzhov [Fri, 9 Feb 2024 22:58:49 +0000 (00:58 +0200)]
lib, mgmtd: fix processing of yang notifications
Current code assumes that notification is always sent in stripped JSON
format and therefore notification xpath starts at the third symbol of
notification data. Assuming JSON is more or less fine, because this
representation is internal to FRR, but the assumption about the xpath is
wrong, because it won't work for not top-level notifications. YANG
allows to define notification as a child for some data node deep into
the tree and in this case notification data contains not only the
notification node itself, but also all its parents.
To fix the issue, parse the notification data and get its xpath from its
schema node.
> VRF r1-cust4:
> B 5.1.0.0/24 [20/98] via 99.0.0.1 (vrf r1-cust1) inactive, weight 1, 00:00:08
> B>* 99.0.0.1/32 [20/0] via 192.168.1.2, r1-eth4 (vrf r1-cust1), weight 1, 00:00:08
When announcing the routes to zebra, use the peer of the ultimate bgp
path info instead of the one of the first parent path info to determine
whether the route is recursive.
The result is:
> VRF r1-cust4:
> B> 5.1.0.0/24 [20/98] via 99.0.0.1 (vrf r1-cust1) (recursive), weight 1, 00:00:02
> * via 192.168.1.2, r1-eth4 (vrf r1-cust1), weight 1, 00:00:02
> B>* 99.0.0.1/32 [20/0] via 192.168.1.2, r1-eth4 (vrf r1-cust1), weight 1, 00:00:02
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
ospfd: can not delete "segment-routing node-msd" when SR if off
This fixes the initial implementation of commit 7743f2f8c00 ("OSPFd: Update
Segment Routing PR following review") where it wsa not possible to remove
the "segment-routing node-msd" CLI nodes via vtysh once segment-routing got
disabled.
Closes #14910
Signed-off-by: Christian Breunig <christian@breunig.cc>
pimd: re-evaluated S,G OILs upon RP changes and for empty SG upstream oils
Topology:
TOR11 (FHR) --- LEAF-11---SPINE1 (RP)MSDP SPINE-2(RP)MSDP --- LEAF-12 -- TOR12 (LHR)
| | | | |
| -----------------------------------------------------(ECMP) |
| | | |
-----------------------------------------------------------------------(ECMP)
Issue:
In some triggers, S,G upstream is preserved even with the PP timer expiry, resulting
in S,G with NULL OILS. This could be because we create a dummy S,G upstream and
dummy channel_oif for *,G, where RPF is UNKNOWN. As a result, PIM+VXLAN traffic is never
forwarded downstream to LHR.
Fix:
when the S,G stream is running, Determine if a reevaluation of the outgoing interface
list (OIL) is required. S,G upstream should then inherit the OIL from *,G.
bgpd: fix flushing ipv6 flowspec entries when peering stops
When a BGP flowspec peering stops, the BGP RIB entries for IPv6
flowspec entries are removed, but not the ZEBRA RIB IPv6 entries.
Actually, when calling bgp_zebra_withdraw() function call, only
the AFI_IP parameter is passed to the bgp_pbr_update_entry() function
in charge of the Flowspec add/delete in zebra. Fix this by passing
the AFI parameter to the bgp_zebra_withdraw() function.
Note that using topotest does not show up the problem as the
flowspec driver code is not present and was refused. Without that,
routes are not installed, and can not be uninstalled.
Fixes: 529efa234655 ("bgpd: allow flowspec entries to be announced to zebra") Link: https://github.com/FRRouting/frr/pull/2025 Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Igor Ryzhov [Fri, 2 Feb 2024 23:15:46 +0000 (01:15 +0200)]
vtysh: remove resync workaround when exiting to config node
When exiting from a level below the config node, like `router rip`,
vtysh executes a resync by sending "end" and "conf term [file-lock]"
commands to all the daemons. As statet in the description comment, it's
done "in case one of the daemons is somewhere else". I don't think this
actually ever happens, but even if it is, it is a bug in a daemon that
needs to be fixed. This resync was okay before the introduction of
mgmtd, but now it unlocks and locks back the datastores during the
configuration reading process, which can lead to a failure which is
explained in the previous commit.
Igor Ryzhov [Fri, 2 Feb 2024 22:42:58 +0000 (00:42 +0200)]
mgmtd, vtysh: fix possible conflict when reading the config
When FRR starts, after mgmtd is initialized, backend clients connect to
it and request their config. To supply the config, mgmtd creates a
configuration transaction. At the same time, `vtysh -b` tries to read
the startup config and configure mgmtd, which also creates a
configuration transaction. If these two actions happen at the exact same
time, there's a conflict between them, because only a single
configuration translaction is allowed. Because of that, vtysh fails and
the config is completely ignored.
When starting the config reading, vtysh locks candidate and running
datastores in mgmtd. This commit adds locking of running datastore when
initializing the backend client. It allows to retry locking on the vtysh
side and read the config only when the lock is aquired instead of
failing.
This change also prevents running datastore from being changed during
initialization of backend clients. This could lead to a desynchronized
state between mgmtd and backends.
Donald Sharp [Wed, 7 Feb 2024 15:38:02 +0000 (10:38 -0500)]
lib, ospfclient, tests, vtysh: Allow for a minimum fd poll size
There exists cases where just honoring the FD_LIMIT size
as given to us by the operating system makes no sense.
Let's just make a switch to allow for this for things
like vtysh and ospfclient which will never have 1k files
open at any given time.
Fixes: #15315 Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Louis Scalbert [Wed, 7 Feb 2024 12:16:26 +0000 (13:16 +0100)]
lib: fix affinity map duplicate
Fix duplicate definition of frr_affinity_map_cli_info in libfrr.so.0 and
libmgmt_be_nb.so.0
> =================================================================
> ==3860488==ERROR: AddressSanitizer: odr-violation (0x7f12c98c4d20):
> [1] size=296 'frr_affinity_map_cli_info' lib/affinitymap_cli.c:77:35
> [2] size=296 'frr_affinity_map_cli_info' lib/affinitymap_cli.c:77:35
> These globals were registered at these points:
> [1]:
> #0 0x7f12c9a36f40 in __asan_register_globals ../../../../src/libsanitizer/asan/asan_globals.cpp:341
> #1 0x7f12c9585b7d in _sub_I_00099_1 (/lib/libfrr.so.0+0x185b7d)
> #2 0x7f12ca437fe1 in call_init elf/dl-init.c:72
>
> [2]:
> #0 0x7f12c9a36f40 in __asan_register_globals ../../../../src/libsanitizer/asan/asan_globals.cpp:341
> #1 0x7f12c93824ed in _sub_I_00099_1 (/lib/libmgmt_be_nb.so.0+0x6f4ed)
> #2 0x7f12ca437fe1 in call_init elf/dl-init.c:72
>
> ==3860488==HINT: if you don't care about these errors you may set ASAN_OPTIONS=detect_odr_violation=0
> SUMMARY: AddressSanitizer: odr-violation: global 'frr_affinity_map_cli_info' at lib/affinitymap_cli.c:77:35
> ==3860488==ABORTING
Fixes: dc6ff4c0de ("lib: convert affinity-map to mgmtd") Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
David Schweizer [Wed, 24 Jan 2024 10:16:03 +0000 (11:16 +0100)]
topotests: fix snmptrap log OID parsing
Replace OID string parsing of snmptrap log files based on odd/even line
numbers with regex string search to prevent test failures in cases where
log entries don't match assumed order.
Signed-off-by: David Schweizer <dschweizer@opensourcerouting.org>