Christian Hopps [Wed, 26 Feb 2025 13:34:59 +0000 (13:34 +0000)]
lib: nb: fix bug with oper-state query on list data
The capacity of the xpath string was not guaranteed to be sufficient to hold all
the key predicates and so would truncate. Calculate the required space and
guarantee that it is available.
Gabriel Goller [Tue, 25 Feb 2025 09:24:58 +0000 (10:24 +0100)]
fabricd: add option to treat dummy interfaces as loopback interfaces
Enable dummy-interfaces to be used as router-id interfaces in openfabric
networks. This allows multiple openfabric routers with different
router-ids on a single node when using IP unnumbered setup (interfaces
without IPs configured). Previously we were limited by having a single
loopback interface, allowing only one openfabric router per node.
Signed-off-by: Gabriel Goller <g.goller@proxmox.com>
staticd: Extend `static_zebra_release_srv6_sid` to release SRv6 uA SIDs
When removing an SRv6 uA SID, staticd should ask SRv6 SID Manager to
release the SID.
Currently, `static_zebra_release_srv6_sid` does not support releasing uA
SIDs.
This commit extends `static_zebra_release_srv6_sid` to allow staticd to
release SRv6 uA SIDs.
staticd: Extend `static_zebra_request_srv6_sid` to request SRv6 uA SIDs
In order to configure an SRv6 uA SID in staticd, staticd should request
SRv6 SID Manager to allocate a SID bound to the uA behavior.
Currently, `static_zebra_request_srv6_sid` does not support requesting
SIDs bound to the uA behavior.
This commit extends the `static_zebra_request_srv6_sid` function to
enable staticd to request SIDs bound to the uA behavior.
yang: Extend staticd YANG model to support the SRv6 uA behavior
The SRv6 uA behavior is associated with a L3 adjacency.
This commit extends the staticd YANG model by adding two leafs
`interface` and `next-hop` under the `static-sids` container. This
extension allows us to associate an interface and a nexthop when
configuring an SRv6 uA SID.
The uA behavior is associated with an interface and the IP address of
the nexthop. However, the current SID context data structure only
includes the IP address. It lacks the interface.
This commit extends the SID context data structure by adding the
ifindex. This extension allows daemons to allocate uA SIDs with
the required interface and IP address.
Gabriel Goller [Tue, 25 Feb 2025 09:13:34 +0000 (10:13 +0100)]
zebra: add ZEBRA_IF_DUMMY flag for dummy interfaces
Introduce ZEBRA_IF_DUMMY interface flag to identify Linux dummy interfaces [0].
These interfaces behave similarly to loopback interfaces and can be
specially handled by daemons.
Shbinging [Tue, 25 Feb 2025 08:07:45 +0000 (16:07 +0800)]
ripd: fix no ip rip split-horizon poisoned-reverse command
`no ip rip split-horizon poisoned-reverse` undoes poisoned-reverse and restores the default behavior, which is split-horizon.
By contrast, `no ip rip split-horizon` disables the interface's split-horizon behavior.
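The distinction between the two `no` forms can be sketched as a small lookup table. This is only an illustration of the behavior described above, not ripd's implementation; the enum and function names are hypothetical.

```python
from enum import Enum


class SplitHorizon(Enum):
    # Hypothetical names for the three interface modes described above.
    DISABLED = "disabled"
    SIMPLE = "split-horizon"
    POISONED_REVERSE = "poisoned-reverse"


# Per the commit message, plain split-horizon is the default behavior.
DEFAULT_MODE = SplitHorizon.SIMPLE

_COMMANDS = {
    "ip rip split-horizon": SplitHorizon.SIMPLE,
    "ip rip split-horizon poisoned-reverse": SplitHorizon.POISONED_REVERSE,
    # Removing poisoned-reverse falls back to the default, not to "off".
    "no ip rip split-horizon poisoned-reverse": DEFAULT_MODE,
    # Only the bare "no" form turns split-horizon off entirely.
    "no ip rip split-horizon": SplitHorizon.DISABLED,
}


def apply_command(command):
    """Return the interface mode that results from one CLI line."""
    normalized = " ".join(command.split())
    return _COMMANDS[normalized]
```

With this table, `apply_command("no ip rip split-horizon poisoned-reverse")` yields `SplitHorizon.SIMPLE`, while `apply_command("no ip rip split-horizon")` yields `SplitHorizon.DISABLED`.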
Acee Lindem [Mon, 24 Feb 2025 21:44:32 +0000 (21:44 +0000)]
ospf6d: Fix use after free of router in OSPFv3 ABR route calculation.
This PR fixes FRR issue https://github.com/FRRouting/frr/issues/18040. The
OSPFv3 route is locked during the ABR calculation since there are
scenarios under which it is freed. The OSPFv3 ABR computation is
sub-optimal and this PR doesn't attempt to rework it.
Louis Scalbert [Fri, 14 Feb 2025 10:58:24 +0000 (11:58 +0100)]
tests: check as number in show run
Creates the default VRF instance after the other VRF instances. The
default VRF instance is created in hidden state. Check that AS number
in show run is correctly written.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Louis Scalbert [Fri, 14 Feb 2025 14:03:00 +0000 (15:03 +0100)]
bgpd: fix leaving hidden state
Upon configuration of a VRF instance that references an absent default
VRF with "import vrf default", the default instance is created in hidden
state. However, the default instance is not properly un-hidden when
configured.
Restore the behavior prior to commit below.
Fixes: 9f7177af13 ("bgpd: fix duplicate BGP instance created with unified config")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
'import vrf VRF' could define a hidden bgp instance with
the default AS_UNSPECIFIED (i.e. = 1) value.
When a
router bgp AS vrf VRF
gets configured later on, replace this AS_UNSPECIFIED setting
with a requested value.
Fixes: 9680831518 ("bgpd: fix as_pretty mem leaks when un-hiding")
Signed-off-by: Alexander Skorichenko <askorichenko@netgate.com>
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Upon reconfiguration of the default instance, the prefixes are never set
into a meta queue by mq_add_handler(). They are never processed for
zebra RIB installation and announcements of update/withdraw.
Louis Scalbert [Wed, 12 Feb 2025 11:56:49 +0000 (12:56 +0100)]
bgpd: fix default instance name when un-hiding
When unconfiguring a default BGP instance with VPN SAFI configurations,
the default BGP structure remains but enters a hidden state. Upon
reconfiguration, the instance name incorrectly appears as "VIEW ?"
instead of "VRF default".
The name_pretty pointer is replaced by another one with the incorrect
name. This also leads to a memory leak as the previous pointer is not
properly freed.
Christian Hopps [Mon, 17 Feb 2025 09:43:11 +0000 (09:43 +0000)]
lib: northbound: support pre-built oper state in libyang tree
This also fixes a bug with specific (position specified) queries on keyless
lists. If the `get_next` callback is using the parent entry it will probably
crash as the code is passing the list_entry as both parent and child in the
specific lookup case.
There may currently be no code that uses the parent entry if the child entry is
non-NULL, though.
Donald Sharp [Sun, 23 Feb 2025 16:04:43 +0000 (11:04 -0500)]
zebra: Add operational retrieval of Multipath Number
The multipath number specified is not available through
the yang data and is not retrievable. Make it so.
At this point in time do not allow this to be set from
yang. Perhaps in the future.
(gdb) bt
0 __pthread_kill_implementation (no_tid=0, signo=6, threadid=140400982256064) at ./nptl/pthread_kill.c:44
1 __pthread_kill_internal (signo=6, threadid=140400982256064) at ./nptl/pthread_kill.c:78
2 __GI___pthread_kill (threadid=140400982256064, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
3 0x00007fb1a6442476 in __GI_raise (sig=6) at ../sysdeps/posix/raise.c:26
4 0x00007fb1a6950823 in core_handler (signo=6, siginfo=0x7ffd6d832ff0, context=0x7ffd6d832ec0) at lib/sigevent.c:268
5 <signal handler called>
6 __pthread_kill_implementation (no_tid=0, signo=6, threadid=140400982256064) at ./nptl/pthread_kill.c:44
7 __pthread_kill_internal (signo=6, threadid=140400982256064) at ./nptl/pthread_kill.c:78
8 __GI___pthread_kill (threadid=140400982256064, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
9 0x00007fb1a6442476 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
10 0x00007fb1a64287f3 in __GI_abort () at ./stdlib/abort.c:79
11 0x00007fb1a699a422 in _zlog_assert_failed (xref=0x55f7dfd3dac0 <_xref.117>,
extra=0x55f7dfd30c30 "BUG: NH %pFX registered but not in hashtable") at lib/zlog.c:789
12 0x000055f7dfd1201f in static_zebra_nht_register (nh=0x55f7fd2ecd80, reg=true) at staticd/static_zebra.c:333
13 0x000055f7dfd29c9d in static_install_nexthop (nh=0x55f7fd2ecd80) at staticd/static_routes.c:299
14 0x000055f7dfd2a126 in static_fixup_vrf (vrf=0x55f7fd2333a0, stable=0x55f7fd271030, afi=AFI_IP, safi=SAFI_UNICAST)
at staticd/static_routes.c:441
15 0x000055f7dfd2a2be in static_fixup_vrf_ids (vrf=0x55f7fd2333a0) at staticd/static_routes.c:494
16 0x000055f7dfd15b53 in static_vrf_enable (vrf=0x55f7fd2333a0) at staticd/static_vrf.c:124
17 0x00007fb1a696ffa5 in vrf_enable (vrf=0x55f7fd2333a0) at lib/vrf.c:325
18 0x00007fb1a6991c87 in zclient_vrf_add (cmd=33, zclient=0x55f7fd29f740, length=76, vrf_id=8) at lib/zclient.c:2701
19 0x00007fb1a6996cba in zclient_read (thread=0x7ffd6d834230) at lib/zclient.c:4764
20 0x00007fb1a696bd9b in event_call (thread=0x7ffd6d834230) at lib/event.c:2019
21 0x00007fb1a68e1a3a in frr_run (master=0x55f7fd102e10) at lib/libfrr.c:1246
22 0x000055f7dfd1081e in main (argc=7, argv=0x7ffd6d834478, envp=0x7ffd6d8344b8) at staticd/static_main.c:193
Tracking this down, the crash is because the nh believes that it is already
registered but the lookup fails, causing this assert. Looking at the code,
static_fixup_vrf is changing the vrf_id. I put a zlog_debug right
before the change of the nh vrf_id and noticed that the vrf id was
UNKNOWN. So, the code is attempting to register into zebra the nexthop
with a vrf unknown (which will be ignored).
Modify the code in the registration process to notice that the nh is
still UNKNOWN and as such nothing should be done.
Nathan Bahr [Fri, 21 Feb 2025 17:59:04 +0000 (17:59 +0000)]
pim: Fix autorp group joins
Group joining got broken when moving the autorp socket to open/close
as needed. This fixes it so autorp group joining is properly handled
as part of opening the socket.
Martin Buck [Fri, 21 Feb 2025 07:54:49 +0000 (08:54 +0100)]
pimd: Fix PIM VRF support (send register/register stop in VRF)
In 946195391406269003275850e1a4d550ea8db38b and 8ebcc02328c6b63ecf85e44fdfbf3365be27c127, transmission of PIM register and
register stop messages was changed to use a separate socket. However, that
socket is not bound to a possible VRF, so the messages were sent in the
default VRF instead. Call vrf_bind() once after socket creation and when the
VRF is ready to ensure transmission in the correct VRF. vrf_bind() handles
the non-VRF case (i.e. VRF_DEFAULT) automatically, so it may be called
unconditionally.
Signed-off-by: Martin Buck <mb-tmp-tvguho.pbz@gromit.dyndns.org>
Donald Sharp [Thu, 20 Feb 2025 19:28:15 +0000 (14:28 -0500)]
bgpd: remove dmed check not required in bestpath selection
As part of the upstream master commit (f3575f61c7 bgpd: Sort the
bgp_path_inf) the snippet of the code for the dmed check condition
was left out, which leads to an issue of selecting an incorrect bestpath.
As an example:
During the bestpath selection the local route loses to another path due
to the dmed condition being hit.
The snippet of the logs:
2025/02/20 03:06:20.131441 BGP: [JW7VP-K1YVV] [2]:[0]:[48]:[00:92:00:00:00:10](VRF default): Comparing path 27.0.0.7 flags Valid with path Static announcement flags Selected Valid Attr Changed Unsorted
2025/02/20 03:06:20.131445 BGP: [SYTDR-QV6X9] [2]:[0]:[48]:[00:92:00:00:00:10]: path 27.0.0.7 loses to path Static announcement as ES 03:44:38:39:ff:ff:02:00:00:01 is same and local
2025/02/20 03:06:20.131452 BGP: [JW7VP-K1YVV] [2]:[0]:[48]:[00:92:00:00:00:10](VRF default): Comparing path 27.0.0.8 flags Valid with path Static announcement flags Selected Valid Attr Changed Unsorted
2025/02/20 03:06:20.131456 BGP: [SYTDR-QV6X9] [2]:[0]:[48]:[00:92:00:00:00:10]: path 27.0.0.8 loses to path Static announcement as ES 03:44:38:39:ff:ff:02:00:00:01 is same and local
2025/02/20 03:06:20.131458 BGP: [WEWEC-8SE72] [2]:[0]:[48]:[00:92:00:00:00:10](VRF default): path Static announcement is the bestpath from AS 0 <<<< static is best
2025/02/20 03:06:20.131463 BGP: [Z3A78-GM3G5] bgp_best_selection: [2]:[0]:[48]:[00:92:00:00:00:10](VRF default) pi 27.0.0.7 dmed
2025/02/20 03:06:20.131467 BGP: [Z3A78-GM3G5] bgp_best_selection: [2]:[0]:[48]:[00:92:00:00:00:10](VRF default) pi 27.0.0.8 dmed
2025/02/20 03:06:20.131471 BGP: [N6CTF-2RSKS] [2]:[0]:[48]:[00:92:00:00:00:10](VRF default): After path selection, newbest is path 27.0.0.7 oldbest was Static announce
Donald Sharp [Tue, 18 Feb 2025 15:25:47 +0000 (10:25 -0500)]
bgpd: Fix another crash in orf
I was pointed at yet another crash in the orf code. I think it
stems from basically the same problem as the last one. Let's just
make sure that the orf_plist is handled appropriately.
Shbinging [Mon, 17 Feb 2025 06:45:05 +0000 (14:45 +0800)]
doc: correct `ip rip split-horizon` command in the RIP documentation.
The previous version incorrectly spelled the command as `ip split-horizon`. The correct command is `ip rip split-horizon`, as indicated in the code at line 675 of rip_cli.c.
Additional machine readable information can be printed via the `extra`
argument.
Example:
```python
log.debug("exit context", extra={"line": line, "ctx_keys": ctx_keys})
log.error(f"Failed to execute command {' '.join(cmd)}", extra={"cmd": cmd})
```
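For readers unfamiliar with the `extra` argument: Python's standard `logging` module copies each key of `extra` onto the `LogRecord` as an attribute, so a formatter can pick those attributes up. The sketch below shows one way to render them as JSON. `JsonExtraFormatter`, `make_logger`, and `demo` are hypothetical names used for illustration only, not part of this commit.

```python
import io
import json
import logging

# Attribute names that every LogRecord carries; anything else came from `extra`.
_STANDARD_ATTRS = set(vars(logging.LogRecord("", 0, "", 0, "", (), None)))
_STANDARD_ATTRS |= {"message", "asctime"}


class JsonExtraFormatter(logging.Formatter):
    """Hypothetical formatter: emit the message plus `extra` fields as JSON."""

    def format(self, record):
        payload = {"level": record.levelname, "msg": record.getMessage()}
        for key, value in vars(record).items():
            if key not in _STANDARD_ATTRS:
                payload[key] = value
        # default=str keeps non-JSON-serializable values from raising.
        return json.dumps(payload, sort_keys=True, default=str)


def make_logger(stream):
    """Build a logger that writes one JSON object per line to `stream`."""
    logger = logging.getLogger("extra_demo")
    logger.setLevel(logging.DEBUG)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonExtraFormatter())
    logger.handlers = [handler]
    return logger


def demo():
    """Log one error with an `extra` field and return the parsed JSON line."""
    buf = io.StringIO()
    log = make_logger(buf)
    cmd = ["vtysh", "-c", "show running-config"]
    log.error(f"Failed to execute command {' '.join(cmd)}", extra={"cmd": cmd})
    return json.loads(buf.getvalue())
```

Note that keys in `extra` must not collide with built-in record attributes such as `message` or `asctime`; `logging` raises a `KeyError` in that case.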
Signed-off-by: Giovanni Tataranni <g.tataranni@gmail.com>
tests: Fix intermittent failures in `srv6_encap_src_addr` topotest
The `srv6_encap_src_addr` runs a vtysh command to configure the SRv6
encapsulation source address and then immediately invokes an iproute2
command to verify that zebra has set this address in the kernel. There
is no wait between the two operations and the verification is attempted
only once. If the topotest does not find the expected address it fails
immediately.
The problem is that when topotest is run on a heavily loaded system,
it can take some time for zebra to set the address in the kernel.
In this case, when the topotest checks the kernel address right after
running the vtysh command, it doesn't find the expected address because
zebra hasn't set it yet.
This commit gives zebra some time to configure the address. The test keeps
checking that the address is the expected one for about 1 minute. If after
1 minute the address is not the expected one then the test fails.
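The retry pattern described above can be sketched as a generic polling helper. This is a minimal illustration, not the code from the commit and not the topotest library API: `wait_for` and its parameters are hypothetical, and the defaults (60 attempts, 1 second apart) are an assumption chosen to match "about 1 minute".

```python
import time


def wait_for(check, expected, attempts=60, interval=1.0, sleep=time.sleep):
    """Poll `check()` until it returns `expected` or the attempts run out.

    Returns a (success, last_value) tuple so the caller can report what
    was actually observed when the check never matched.
    """
    last = None
    for attempt in range(attempts):
        last = check()
        if last == expected:
            return True, last
        # Do not sleep after the final attempt; fail right away instead.
        if attempt < attempts - 1:
            sleep(interval)
    return False, last
```

A test would pass a `check` callable that runs the iproute2 query and returns the address it found, then assert on the returned `success` flag. Injecting `sleep` keeps the helper easy to exercise without real delays.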
isisd: Request SRv6 locator after zebra connection
When SRv6 is enabled and an SRv6 locator is specified in the IS-IS
configuration, IS-IS may attempt to request SRv6 locator information from
zebra before the connection is fully established. If this occurs, the
request fails with the following error:
staticd: Failed to register nexthop after networking restart
Problem:
After networking restart, staticd unregistered the nexthop
but failed to register the nexthop again, which caused the
nexthop to remain inactive in zebra for static route.
Fix:
Call to static_zebra_nht_register() from static_install_path() was
removed in 3c05d53bf8defc36acdfe6e78064e068d60c649f. Adding it back
so that staticd can register the nexthop for static routes.
Testing:
After networking restart trigger on h1:
Before fix:
```
h1# show ipv6 route vrf vrf1012
Codes: K - kernel route, C - connected, L - local, S - static,
R - RIPng, O - OSPFv3, I - IS-IS, B - BGP, N - NHRP,
T - Table, A - Babel, D - SHARP, F - PBR, f - OpenFabric,
t - Table-Direct, Z - FRR,
> - selected route, * - FIB route, q - queued, r - rejected, b - backup
t - trapped, o - offload failure
VRF vrf1012:
S ::/0 [1/0] via 2003:7:2::1, swp1.2 inactive, weight 1, 00:00:39
K>* ::/0 [255/8192] unreachable (ICMP unreachable) (vrf default), 00:00:39
L * 2000:9:12::3/128 is directly connected, vrf1012, 00:00:39
C>* 2000:9:12::3/128 is directly connected, vrf1012, 00:00:39
C>* 2003:7:2::/125 is directly connected, swp1.2, 00:00:37
L>* 2003:7:2::3/128 is directly connected, swp1.2, 00:00:37
C>* fe80::/64 is directly connected, swp1.2, 00:00:37
h1#
```
After fix:
```
h1# show ipv6 route vrf vrf1012
Codes: K - kernel route, C - connected, L - local, S - static,
R - RIPng, O - OSPFv3, I - IS-IS, B - BGP, N - NHRP,
T - Table, A - Babel, D - SHARP, F - PBR, f - OpenFabric,
t - Table-Direct, Z - FRR,
> - selected route, * - FIB route, q - queued, r - rejected, b - backup
t - trapped, o - offload failure
VRF vrf1012:
S>* ::/0 [1/0] via 2003:7:2::1, swp1.2, weight 1, 00:00:15
K * ::/0 [255/8192] unreachable (ICMP unreachable) (vrf default), 00:00:17
L * 2000:9:12::3/128 is directly connected, vrf1012, 00:00:17
C>* 2000:9:12::3/128 is directly connected, vrf1012, 00:00:17
C>* 2003:7:2::/125 is directly connected, swp1.2, 00:00:15
L>* 2003:7:2::3/128 is directly connected, swp1.2, 00:00:15
```
Christian Hopps [Tue, 11 Feb 2025 07:12:06 +0000 (07:12 +0000)]
lib: nb: call child destroy CBs when YANG container is deleted
Previously the code was only calling the child destroy callbacks if the target
deleted node was a non-presence container. We now add a flag to the callback
structure to instruct northbound to perform the recursive delete for code that
wishes for this to happen.
- Fix wrong relative path lookup in keychain destroy callback