nhrpd: Add Hop Count Validation Before Forwarding in nhrp_peer_recv()
According to [RFC 2332, Section 5.1], if an NHS receives a packet that it would normally forward and the hop count is zero, it must send an error indication back to the source and drop the packet.
Donald Sharp [Thu, 27 Mar 2025 14:20:07 +0000 (10:20 -0400)]
pimd: Fix memory leak on shutdown
The gm_join_list has a setup where it attempts to only
create the list upon need and deletes it when the list
is empty. On interface shutdown it was calling the
function to empty the list but it was not empty so
the list was being left at the end. Just add a bit
of code to really clean up the list in the shutdown
case.
Direct leak of 40 byte(s) in 1 object(s) allocated from:
0 0x7f84850b83b7 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:77
1 0x7f8484c391c4 in qcalloc lib/memory.c:106
2 0x7f8484c1ad36 in list_new lib/linklist.c:49
3 0x55d982827252 in pim_if_gm_join_add pimd/pim_iface.c:1354
4 0x55d982852b59 in lib_interface_gmp_address_family_join_group_create pimd/pim_nb_config.c:4499
5 0x7f8484c6a5d3 in nb_callback_create lib/northbound.c:1512
6 0x7f8484c6a5d3 in nb_callback_configuration lib/northbound.c:1910
7 0x7f8484c6bb51 in nb_transaction_process lib/northbound.c:2042
8 0x7f8484c6c164 in nb_candidate_commit_apply lib/northbound.c:1381
9 0x7f8484c6c39f in nb_candidate_commit lib/northbound.c:1414
10 0x7f8484c6cf1c in nb_cli_classic_commit lib/northbound_cli.c:57
11 0x7f8484c72f67 in nb_cli_apply_changes_internal lib/northbound_cli.c:195
12 0x7f8484c73a2e in nb_cli_apply_changes lib/northbound_cli.c:251
13 0x55d9828bd30f in interface_ip_igmp_join_magic pimd/pim_cmd.c:5436
14 0x55d9828bd30f in interface_ip_igmp_join pimd/pim_cmd_clippy.c:6366
15 0x7f8484bb5cbd in cmd_execute_command_real lib/command.c:1003
16 0x7f8484bb5fdc in cmd_execute_command lib/command.c:1062
17 0x7f8484bb6508 in cmd_execute lib/command.c:1228
18 0x7f8484cfb6ec in vty_command lib/vty.c:626
19 0x7f8484cfbc3f in vty_execute lib/vty.c:1389
20 0x7f8484cff9f0 in vtysh_read lib/vty.c:2408
21 0x7f8484cec846 in event_call lib/event.c:1984
22 0x7f8484c1a10a in frr_run lib/libfrr.c:1246
23 0x55d9828fc765 in main pimd/pim_main.c:166
24 0x7f848470c249 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:
staticd: Avoid requesting SRv6 sid from zebra when loc and sid block dont match
Currently, when the locator block and sid block differs, staticd would
still go ahead and request zebra to allocate the SID which it does if
there is atleast one match (from any locators).
Only when staticd tries to install the route, it sees that the locator
block and sid block are different and avoids installing the route.
Fix:
Check if the locator block and sid block match before even requesting
Zebra to allocate one.
`"` was accidentally added, and random tests failures happening.
Fixes: a4f61b78dd382c438ff4fec2fda7450ecc890edf ("tests: Check if routes are marked as stale and retained with N-bit for GR") Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
(cherry picked from commit 55d88ee3de422ff6dc206c6ebe5ba96b3ff67967)
Donatas Abraitis [Wed, 26 Mar 2025 12:47:44 +0000 (14:47 +0200)]
tests: Use label 0x800000 instead of 0x000000 for BMP tests
Related-to: 94e2aadf7187d7d695babce21033b5bc8e454f25 ("bgpd: Set the label for MP_UNREACH_NLRI 0x800000 instead of 0x000000") Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
(cherry picked from commit e69459c7144568381c05bbaf962adecd914975d5)
Donatas Abraitis [Wed, 26 Mar 2025 08:30:52 +0000 (10:30 +0200)]
bgpd: Set the label for MP_UNREACH_NLRI 0x800000 instead of 0x000000
RFC8277 says:
The procedures in [RFC3107] for withdrawing the binding of a label
or sequence of labels to a prefix are not specified clearly and correctly.
=> How to Explicitly Withdraw the Binding of a Label to a Prefix
Suppose a BGP speaker has announced, on a given BGP session, the
binding of a given label or sequence of labels to a given prefix.
Suppose it now wishes to withdraw that binding. To do so, it may
send a BGP UPDATE message with an MP_UNREACH_NLRI attribute. The
NLRI field of this attribute is encoded as follows:
Upon transmission, the Compatibility field SHOULD be set to 0x800000.
Upon reception, the value of the Compatibility field MUST be ignored.
[RFC3107] also made it possible to withdraw a binding without
specifying the label explicitly, by setting the Compatibility field
to 0x800000. However, some implementations set it to 0x000000. In
order to ensure backwards compatibility, it is RECOMMENDED by this
document that the Compatibility field be set to 0x800000, but it is
REQUIRED that it be ignored upon reception.
In FRR case where a single label is used per-prefix, we should send 0x800000,
and not 0x000000.
Martin Buck [Tue, 25 Mar 2025 15:53:12 +0000 (16:53 +0100)]
tests: Fix wait times in test_ospf6_gr_topo1 topotest
Increase wait times to at least the minimum wait time accepted by
topotest.run_and_expect(). Also change poll interval to 1s, no point in
doings this more frequently.
Finally, slightly improve the topology diagram to also include area numbers.
Donatas Abraitis [Tue, 25 Mar 2025 15:35:41 +0000 (17:35 +0200)]
tests: Check if routes are marked as stale and retained with N-bit for GR
Related-to: b7c657d4e065f310fcf6454abae1a963c208c3b8 ("bgpd: Retain the routes if we do a clear with N-bit set for Graceful-Restart") Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
(cherry picked from commit a4f61b78dd382c438ff4fec2fda7450ecc890edf)
Modified the bgp_fsm code to dissallow the extension
of the hold time when the system is under extremely
heavy load. This was a attempt to remove the return
code but it was too aggressive and messed up this bit
of code.
staticd: Fix crash that occurs when modifying an SRv6 SID
When the user modifies an SRv6 SID and then removes all SIDs, staticd
crashes:
```
2025/03/23 08:37:22.691860 STATIC: lib/memory.c:74: mt_count_free(): assertion (mt->n_alloc) failed
STATIC: Received signal 6 at 1742715442 (si_addr 0x8200007cf0); aborting...
STATIC: zlog_signal+0x390 fcc704a844b8ffffd7450390 /usr/lib/frr/libfrr.so.0 (mapped at 0xfcc704800000)
STATIC: core_handler+0x1f8 fcc704b79990ffffd7450590 /usr/lib/frr/libfrr.so.0 (mapped at 0xfcc704800000)
STATIC: ---- signal ----
STATIC: ? fcc705c008f8ffffd74507a0 linux-vdso.so.1 (mapped at 0xfcc705c00000)
STATIC: pthread_key_delete+0x1a0 fcc70458f1f0ffffd7451a00 /lib/aarch64-linux-gnu/libc.so.6 (mapped at 0xfcc704510000)
STATIC: raise+0x1c fcc70454a67cffffd7451ad0 /lib/aarch64-linux-gnu/libc.so.6 (mapped at 0xfcc704510000)
STATIC: abort+0xe4 fcc704537130ffffd7451af0 /lib/aarch64-linux-gnu/libc.so.6 (mapped at 0xfcc704510000)
STATIC: _zlog_assert_failed+0x3c4 fcc704c407c8ffffd7451c40 /usr/lib/frr/libfrr.so.0 (mapped at 0xfcc704800000)
STATIC: mt_count_free+0x12c fcc704a93c74ffffd7451dc0 /usr/lib/frr/libfrr.so.0 (mapped at 0xfcc704800000)
STATIC: qfree+0x28 fcc704a93fa0ffffd7451e70 /usr/lib/frr/libfrr.so.0 (mapped at 0xfcc704800000)
STATIC: static_srv6_sid_free+0x1c adc1df8fa544ffffd7451e90 /usr/lib/frr/staticd (mapped at 0xadc1df8a0000)
STATIC: delete_static_srv6_sid+0x14 adc1df8faafcffffd7451eb0 /usr/lib/frr/staticd (mapped at 0xadc1df8a0000)
STATIC: list_delete_all_node+0x104 fcc704a60eecffffd7451ed0 /usr/lib/frr/libfrr.so.0 (mapped at 0xfcc704800000)
STATIC: list_delete+0x8c fcc704a61054ffffd7451f00 /usr/lib/frr/libfrr.so.0 (mapped at 0xfcc704800000)
STATIC: static_srv6_cleanup+0x20 adc1df8fabdcffffd7451f20 /usr/lib/frr/staticd (mapped at 0xadc1df8a0000)
STATIC: sigint+0x40 adc1df8be544ffffd7451f30 /usr/lib/frr/staticd (mapped at 0xadc1df8a0000)
STATIC: frr_sigevent_process+0x148 fcc704b79460ffffd7451f40 /usr/lib/frr/libfrr.so.0 (mapped at 0xfcc704800000)
STATIC: event_fetch+0x1c4 fcc704bc0834ffffd7451f60 /usr/lib/frr/libfrr.so.0 (mapped at 0xfcc704800000)
STATIC: frr_run+0x650 fcc704a5d230ffffd7452080 /usr/lib/frr/libfrr.so.0 (mapped at 0xfcc704800000)
STATIC: main+0x1d0 adc1df8be75cffffd7452270 /usr/lib/frr/staticd (mapped at 0xadc1df8a0000)
STATIC: __libc_init_first+0x7c fcc7045373fcffffd74522b0 /lib/aarch64-linux-gnu/libc.so.6 (mapped at 0xfcc704510000)
STATIC: __libc_start_main+0x98 fcc7045374ccffffd74523c0 /lib/aarch64-linux-gnu/libc.so.6 (mapped at 0xfcc704510000)
STATIC: _start+0x30 adc1df8be0f0ffffd7452420 /usr/lib/frr/staticd (mapped at 0xadc1df8a0000)
```
Tracking this down, the crash occurs because every time we modify a
SID, staticd executes some callbacks to modify the SID and finally it
calls `apply_finish`, which re-adds the SID to the list `srv6_sids`.
This leads to having the same SID multiple times in the `srv6_sids`
list. When we delete all SIDs, staticd attempts to deallocate the same
SID multiple times, which leads to the crash.
This commit fixes the issue by moving the code that adds the SID to the
list from the `apply_finish` callback to the `create` callback.
This ensures that the SID is inserted into the list only once, when it
is created. For all subsequent modifications, the SID is modified but
not added to the list.
Tuetuopay [Mon, 17 Mar 2025 14:08:15 +0000 (15:08 +0100)]
bgpd: fix evpn attributes being dropped on input
All assignments of the EVPN attributes (ESI and Gateway IP) are gated
behind the peer being set up for inbound soft-reconfiguration.
There are no actual reasons for this limitation, so let's perform the
EVPN attribute assignment no matter what when soft reconfiguration is
not enabled.
Fixes: 6e076ba5231 ("bgpd: Fix for ain->attr corruption during path update") Signed-off-by: Tuetuopay <tuetuopay@me.com>
(cherry picked from commit 7320659f78cbe86dd983d5101831120fc14583d7)
Tuetuopay [Fri, 14 Mar 2025 19:21:53 +0000 (20:21 +0100)]
tests: add route-map evpn set gateway-ip topotest
This test does not actually look at the route since the gateway-ip is
not exposed in vtysh output. However, this ensures such a route-map does
not crash bgpd.
Tuetuopay [Fri, 14 Mar 2025 19:21:46 +0000 (20:21 +0100)]
bgpd: fix `set evpn gateway-ip ipv[46]` route-map
The `route_set_evpn_gateway_ip` function copies `gw_ip->ip.addr` in the
route's gateway ip. In a nutshell, this skips the `ipa_type` field,
writing the actual IP in the IP type. This later rightfully trips
asserts about unknown IP types.
The following route-map...
```
route-map test permit 10
set evpn gateway-ip ipv4 1.1.1.1
```
...will make the following gateway IP in the route:
Donatas Abraitis [Tue, 11 Feb 2025 19:22:12 +0000 (21:22 +0200)]
zebra: Do not flush an existing vni configuration trying to remove wrong vni
Before:
```
pc.donatas.net(config)# do sh run | include vni
vni 1
pc.donatas.net(config)# no vni 2
pc.donatas.net(config)# do sh run | include vni
pc.donatas.net(config)#
```
Martin Winter [Wed, 19 Mar 2025 12:40:53 +0000 (13:40 +0100)]
redhat: Make sure zeromq is always disabled
Fix issue where zeromq is getting enabled if build system has the libs
installed. For RPMs, we want it always based on intended config options.
(and currently the zeromq is not part of the packages)
Signed-off-by: Martin Winter <mwinter@opensourcerouting.org>
Martin Winter [Wed, 19 Mar 2025 06:21:37 +0000 (07:21 +0100)]
redhat: Make docs and rpki optional for RPM package build
Adding options to disable docs and rpki during the build. By
default they are always built. RPKI sub-package will not be built
(and not available) if built without the RPKI support.
Signed-off-by: Martin Winter <mwinter@opensourcerouting.org>
Donald Sharp [Wed, 19 Mar 2025 20:50:11 +0000 (16:50 -0400)]
bgpd: Fix leaked memory when showing some bgp routes
The two memory leaks:
==387155== 744 (48 direct, 696 indirect) bytes in 1 blocks are definitely lost in loss record 222 of 262
==387155== at 0x4848899: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==387155== by 0x4C1B982: json_object_new_object (in /usr/lib/x86_64-linux-gnu/libjson-c.so.5.1.0)
==387155== by 0x2E4146: peer_adj_routes (bgp_route.c:15245)
==387155== by 0x2E4F1A: show_ip_bgp_instance_neighbor_advertised_route_magic (bgp_route.c:15549)
==387155== by 0x2B982B: show_ip_bgp_instance_neighbor_advertised_route (bgp_route_clippy.c:722)
==387155== by 0x4915E6F: cmd_execute_command_real (command.c:1003)
==387155== by 0x4915FE8: cmd_execute_command (command.c:1062)
==387155== by 0x4916598: cmd_execute (command.c:1228)
==387155== by 0x49EB858: vty_command (vty.c:626)
==387155== by 0x49ED77C: vty_execute (vty.c:1389)
==387155== by 0x49EFFA7: vtysh_read (vty.c:2408)
==387155== by 0x49E4156: event_call (event.c:2019)
==387155== by 0x4958ABD: frr_run (libfrr.c:1247)
==387155== by 0x206A68: main (bgp_main.c:557)
==387155==
==387155== 2,976 (192 direct, 2,784 indirect) bytes in 4 blocks are definitely lost in loss record 240 of 262
==387155== at 0x4848899: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==387155== by 0x4C1B982: json_object_new_object (in /usr/lib/x86_64-linux-gnu/libjson-c.so.5.1.0)
==387155== by 0x2E45CA: peer_adj_routes (bgp_route.c:15325)
==387155== by 0x2E4F1A: show_ip_bgp_instance_neighbor_advertised_route_magic (bgp_route.c:15549)
==387155== by 0x2B982B: show_ip_bgp_instance_neighbor_advertised_route (bgp_route_clippy.c:722)
==387155== by 0x4915E6F: cmd_execute_command_real (command.c:1003)
==387155== by 0x4915FE8: cmd_execute_command (command.c:1062)
==387155== by 0x4916598: cmd_execute (command.c:1228)
==387155== by 0x49EB858: vty_command (vty.c:626)
==387155== by 0x49ED77C: vty_execute (vty.c:1389)
==387155== by 0x49EFFA7: vtysh_read (vty.c:2408)
==387155== by 0x49E4156: event_call (event.c:2019)
==387155== by 0x4958ABD: frr_run (libfrr.c:1247)
==387155== by 0x206A68: main (bgp_main.c:557)
For the 1st one, if the operator issues a advertised-routes command, the
json_ar variable was never being freed.
For the 2nd one, if the operator issued a command where the
output_count_per_rd is 0, we need to free the json_routes value.
Nathan Bahr [Wed, 19 Mar 2025 16:07:37 +0000 (16:07 +0000)]
lib: Create VRF if needed
When creating a control plane protocol through NB, create the vrf
if needed instead of only looking up and asserting if it doesn't
exist yet.
Fixes 18429.
Acee Lindem [Fri, 14 Mar 2025 16:02:28 +0000 (16:02 +0000)]
ospf6d: Disable and delete OSPFv3 areas that no longer have interfaces or configuration.
This fix will delete an OSPFv3 area when all the interfaces and
configuration (ranges, NSSA ranges, stub area, NSSA area, filter-list,
import-list and export-list) have been removed. The changes provides
a general solution to https://github.com/FRRouting/frr/issues/18324.
Manpreet Kaur [Thu, 13 Mar 2025 11:14:24 +0000 (04:14 -0700)]
bgpd: Fixed crash upon bgp network import-check command
BT:
```
3 <signal handler called>
4 0x00005616837546fc in bgp_static_update (bgp=bgp@entry=0x5616865eac50, p=0x561686639e40,
bgp_static=0x561686639f50, afi=afi@entry=AFI_IP6, safi=safi@entry=SAFI_UNICAST) at ../bgpd/bgp_route.c:7232
5 0x0000561683754ad0 in bgp_static_add (bgp=0x5616865eac50) at ../bgpd/bgp_table.h:413
6 0x0000561683785e2e in no_bgp_network_import_check (self=<optimized out>, vty=0x5616865e04c0,
argc=<optimized out>, argv=<optimized out>) at ../bgpd/bgp_vty.c:4609
7 0x00007fdbcc294820 in cmd_execute_command_real (vline=vline@entry=0x561686663000,
```
The program encountered a SEG FAULT when attempting to access pi->extra->vrfleak->bgp_orig because
pi->extra->vrfleak was NULL.
```
(gdb) p pi->extra->vrfleak
$1 = (struct bgp_path_info_extra_vrfleak *) 0x0
(gdb) p pi->extra->vrfleak->bgp_orig
Cannot access memory at address 0x8
```
Added NOT NULL check on pi->extra->vrfleak before accessing pi->extra->vrfleak->bgp_orig
to prevent the segmentation fault.
Rajasekar Raja [Mon, 10 Mar 2025 22:26:38 +0000 (15:26 -0700)]
zebra: ensure proper return for failure for Sid allocation
The functions alloc_srv6_sid_func_explicit/dynamic expect to return bool
but we have places where we return a -1 or NULL which the caller is
assuming as a True/Valid and ending up allocating Sid
Without Fix:
2025/03/10 21:44:04.295350 ZEBRA: [XWV20-TGK70] alloc_srv6_sid_func_explicit: trying to allocate explicit SID function 65088 from block fcbb:bbbb::/32
2025/03/10 21:44:04.295351 ZEBRA: [MM61M-TQZNP] alloc_srv6_sid_func_explicit: elib s 10000 e 20000 wlib s 1000 ewlib s 30000 e 1000 SID_FUNC 65088
2025/03/10 21:44:04.295352 ZEBRA: [QGHMB-SWNFW] alloc_srv6_sid_func_explicit: function 65088 is outside ELIB [10000/20000] and EWLIB alloc ranges [30000/1000]
2025/03/10 21:44:04.295367 ZEBRA: [H0GZA-NNSWJ] get_srv6_sid_explicit: allocated explicit SRv6 SID fcbb:bbbb:1:fe40:: for context End.X nh6 2001::2
2025/03/10 21:44:04.295368 ZEBRA: [XBBYD-T1Q7P] srv6_manager_get_sid_internal: got new SRv6 SID for ctx End.X nh6 2001::2: sid_value=fcbb:bbbb:1:fe40:: (func=65088) (proto=4, instance=0, sessionId=0), notifying all clients
With Fix:
2025/03/10 22:04:25.052235 ZEBRA: [MM61M-TQZNP] alloc_srv6_sid_func_explicit: elib s 30000 e 31000 wlib s 31000 ewlib s 30000 e 31000 SID_FUNC 65056
2025/03/10 22:04:25.052236 ZEBRA: [YHMRC-EMYNX] alloc_srv6_sid_func_explicit: function 65056 is outside ELIB [30000/31000] and EWLIB alloc ranges [30000/31000]
2025/03/10 22:04:25.052254 ZEBRA: [XSG8X-Q2XJX] get_srv6_sid_explicit: invalid SM request arguments: failed to allocate SID function 65056 from block fcbb:bbbb::/32
2025/03/10 22:04:25.052257 ZEBRA: [YC52T-427SJ] srv6_manager_get_sid_internal: not got SRv6 SID for ctx End.DT6 vrf_id 4, sid_value=fcbb:bbbb:1:fe20::, locator_name=MAIN
root@rajasekarr:/tmp/topotests/static_srv6_sids.test_static_srv6_sids/r1#
- Major highlights:
- Lua 5.4 support
- Fixed CVE-2024-55553
- New match community-count BGP command to limit communities count
- New set metric igp|aigp BGP command to inject IGP metric as MED into BGP
- New bgp ipv6-auto-ra BGP command
- Optimize BGP EVPN L2VNI/L3VIN remote routes processing
- Respect non-transitive BGP extended communities between direct peers
- Drop deprecated bgp network import-check exact command
- Handle BGP ENHE (Extended Next Hop Encoding) capability via dynamic capability
- Implement BGP connect backoff retry
- Implement an ability to import BMP information from a separate BGP instance
- Add support of BGP color extended community color-only types
- Implement SBFD
- Add support for SRv6 static SIDs
- Implement embedded-rp for PIMv6
- Implement AutoRP mapping-agent for PIM
- Implement MSDP peer SA limiting
Donald Sharp [Fri, 7 Mar 2025 23:35:53 +0000 (18:35 -0500)]
tests: Allow mgmtd and zebra to fully come up before other daemons
Currently the topotest infrastructure is starting up daemons
in mgmtd,zebra, staticd then everything else.
The problem that is happening, under heavy load, is that
zebra may not be fully started and when a daemon attempts
to connect to it, it will not be able to connect.
Some of the daemons do not have great retry mechanisms at all.
In addition our normal systemctl startup scripts actually
wait a small amount of time for zebra to be ready before
moving onto the other daemons.
Let's make topotests startup a tiny bit more nuanced
and have mgmtd fully up before starting up zebra.
Martin Buck [Tue, 4 Mar 2025 13:24:33 +0000 (14:24 +0100)]
pimd: Fix PIM6 MLD VRF support (use recvmsg() pktinfo)
When receiving MLD messages, prefer pktinfo over msghdr.msg_name for
determining the source interface. The latter is just the VRF master
interface in case of VRF and we need the true interface the packet was
received on instead.
Soumya Roy [Sat, 15 Feb 2025 02:13:37 +0000 (18:13 -0800)]
zebra: Bring up 514 BGP neighbor sessions
Issue:
When 514 inerfaces/neighbors are configured, it creates socket error,
"Cannot allocate memory", when back to back V6 RA messages are tried
to be sent over the socket. This prevents interface, to know its peer's
link local address. Socket error comes when 1) try to join ICMPv6 all
router multicast group, back to back for all interfaces 2)send back to
back RA for all interfaces
Fix:
1)For ICMPv6 join case, we check if the interface has already joined
all router group, if not try to join. On failure, retry joining after
random amount of time determined 1 ms to ICMPV6_JOIN_TIMER_EXP_MS(100 ms)
2) For RA issue case, batch sending of RA mesages using wheel timer
Testing:
Monitor BGP session running sh bgp summary command
Before fix:
r1# sh bgp summary
IPv4 Unicast Summary:
BGP router identifier 192.168.1.1, local AS number 1001 VRF default vrf-id 0
BGP table version 0
RIB entries 0, using 0 bytes of memory
Peers 515, using 12 MiB of memory
IPv4 Unicast Summary:
BGP router identifier 192.168.1.1, local AS number 1001 VRF default vrf-id 0
BGP table version 0
RIB entries 0, using 0 bytes of memory
Peers 515, using 12 MiB of memory
Christian Hopps [Wed, 26 Feb 2025 13:34:59 +0000 (13:34 +0000)]
lib: nb: fix bug with oper-state query on list data
The capacity of the xpath string was not guaranteed to be sufficient to hold all
the key predicates and so would truncate. Calculate the required space and
guarantee that it is available.
Olivier Dugeon [Mon, 3 Mar 2025 09:08:17 +0000 (10:08 +0100)]
isisd: Correct edge insertion into TED
Edges are not correctly linked to Vertices during LSP processing. In function
lsp_to_edge_cb(), once edge created or updated from the LSP TLVs, the code try
to link the edge to destination vertices. In case the revert edge is not found,
the code try to found a destination vertex to link to. But, the sys_id used
for this operation corresponds to the source vertex. As a result, the edge is
attached as source and destination of the vertex. When Traffic Engineering is
stopped, TED is deleted which result into a double free of the edge attributes.
This cause a crash when attempt to free extended admin groupi the second time.
This patch removed wrong code which link twice the edge to the source vertex.
[...]
segment-routing
srv6
static-sids
sid fcbb:bbbb:1::/48 locator MAIN behavior uN
sid fcbb:bbbb:1:fe00::/64 locator MAIN behavior uDT46
[...]
When the user runs vtysh and executes the `no srv6` command, the
expectation is that staticd will deallocate all SIDs.
However, currently FRR does not behaves as expected. After the user
executes `no srv6`, the SIDs are still present.
The problem is that vtysh does not forward the `no srv6` command to
mgmtd/staticd.
The `no srv6` command is defined using the `DEFUN_YANG_NOSH` macro,
which instructs `xref2vtysh.py` to skip the `no srv6` command during
the generation of `vtysh_cmd.c`. As a result, vtysh is unaware that it
should forward the `no srv6` command to mgmtd/staticd.
This commit fixes the issue by replacing `DEFUN_YANG_NOSH` with
`DEFUN_YANG`. This change ensures that `xref2vtysh.py` includes the
`no srv6` command when generating `vtysh_cmd.c` and makes vtysh forward
the `no srv6` command to mgmtd/staticd.
tools: Fix `frr-reload.py` error related to `static-sids`
```
[...]
segment-routing
srv6
static-sids
sid fcbb:bbbb:1::/48 locator MAIN behavior uN
sid fcbb:bbbb:1:fe10::/64 locator MAIN behavior uDT4 vrf Vrf10
sid fcbb:bbbb:1:fe20::/64 locator MAIN behavior uDT6 vrf Vrf20
sid fcbb:bbbb:1:fe30::/64 locator MAIN behavior uDT46 vrf Vrf30
sid fcbb:bbbb:1:fe40::/64 locator MAIN behavior uA interface sr0 nexthop 2001::2
[...]
```
When the user has a configuration like the one above and runs the
command `frr-reload.py --reload`, the following error occurs:
```
[1129654|mgmtd] sending configuration
line 17: % Unknown command[76]: sid fcbb:bbbb:1::/48 locator MAIN behavior uN
line 23: % Unknown command[76]: sid fcbb:bbbb:1:fe10::/64 locator MAIN behavior uDT4 vrf Vrf10
line 29: % Unknown command[76]: sid fcbb:bbbb:1:fe20::/64 locator MAIN behavior uDT6 vrf Vrf20
line 35: % Unknown command[76]: sid fcbb:bbbb:1:fe30::/64 locator MAIN behavior uDT46 vrf Vrf30
line 41: % Unknown command[76]: sid fcbb:bbbb:1:fe40::/64 locator MAIN behavior uA interface sr0 nexthop 2001::2
```
The problem is that in `frr-reload-py` all commands that start a new
multi-line context must be included in the `ctx_keyword` dictionary.
However, the `static-sids` command is not part of the `ctx_keyword`
dictionary.
This commit fixes the problem by adding `static-sids` to `ctx_keyword`.
Donald Sharp [Wed, 26 Feb 2025 17:34:05 +0000 (12:34 -0500)]
mgmtd: Prevent use after free
ci is picking up this use after free on occasion:
ERROR: AddressSanitizer: attempting to call malloc_usable_size() for pointer which is not owned: 0x6030001d94a0
0 0x7fab994b7f04 in __interceptor_malloc_usable_size ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:119
1 0x7fab994264f6 in __sanitizer::BufferedStackTrace::Unwind(unsigned long, unsigned long, void*, bool, unsigned int) ../../../../src/libsanitizer/sanitizer_common/sanitizer_stacktrace.h:131
2 0x7fab994264f6 in __asan::asan_malloc_usable_size(void const*, unsigned long, unsigned long) ../../../../src/libsanitizer/asan/asan_allocator.cpp:1058
3 0x7fab99039bcf in mt_count_free lib/memory.c:78
4 0x7fab99039bcf in qfree lib/memory.c:130
5 0x7fab98ff971a in hash_clean lib/hash.c:290
6 0x56110cdb0e7f in mgmt_txn_hash_destroy mgmtd/mgmt_txn.c:1881
7 0x56110cdb0e7f in mgmt_txn_destroy mgmtd/mgmt_txn.c:2013
8 0x56110cd8e5de in mgmt_terminate mgmtd/mgmt.c:91
9 0x56110cd8e003 in sigint mgmtd/mgmt_main.c:90
10 0x7fab990bf4b0 in frr_sigevent_process lib/sigevent.c:117
11 0x7fab990ea7a1 in event_fetch lib/event.c:1740
12 0x7fab9901a24e in frr_run lib/libfrr.c:1245
13 0x56110cd8e21f in main mgmtd/mgmt_main.c:290
14 0x7fab98af9249 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
15 0x7fab98af9304 in __libc_start_main_impl ../csu/libc-start.c:360
16 0x56110cd8dd30 in _start (/usr/lib/frr/mgmtd+0x3ad30)
0x6030001d94a0 is located 0 bytes inside of 24-byte region [0x6030001d94a0,0x6030001d94b8)
freed by thread T0 here:
0 0x7fab994b76a8 in __interceptor_free ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:52
1 0x7fab99039bf0 in qfree lib/memory.c:131
2 0x7fab98ff93e1 in hash_release lib/hash.c:227
3 0x56110cdaabdc in mgmt_txn_unlock mgmtd/mgmt_txn.c:1931
4 0x56110cdab049 in mgmt_txn_delete mgmtd/mgmt_txn.c:1841
5 0x56110cdab0ce in mgmt_txn_hash_free mgmtd/mgmt_txn.c:1864
6 0x7fab98ff970b in hash_clean lib/hash.c:288
7 0x56110cdb0e7f in mgmt_txn_hash_destroy mgmtd/mgmt_txn.c:1881
8 0x56110cdb0e7f in mgmt_txn_destroy mgmtd/mgmt_txn.c:2013
9 0x56110cd8e5de in mgmt_terminate mgmtd/mgmt.c:91
10 0x56110cd8e003 in sigint mgmtd/mgmt_main.c:90
11 0x7fab990bf4b0 in frr_sigevent_process lib/sigevent.c:117
12 0x7fab990ea7a1 in event_fetch lib/event.c:1740
13 0x7fab9901a24e in frr_run lib/libfrr.c:1245
14 0x56110cd8e21f in main mgmtd/mgmt_main.c:290
15 0x7fab98af9249 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
previously allocated by thread T0 here:
0 0x7fab994b83b7 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:77
1 0x7fab990392fd in qcalloc lib/memory.c:106
2 0x7fab98ff8b4f in hash_get lib/hash.c:156
3 0x56110cdb13ae in mgmt_txn_create_new mgmtd/mgmt_txn.c:1825
4 0x56110cdb3b4d in mgmt_txn_notify_be_adapter_conn mgmtd/mgmt_txn.c:2212
5 0x56110cd91178 in mgmt_be_adapter_conn_init mgmtd/mgmt_be_adapter.c:842
6 0x7fab990ec6de in event_call lib/event.c:2019
7 0x7fab9901a243 in frr_run lib/libfrr.c:1246
8 0x56110cd8e21f in main mgmtd/mgmt_main.c:290
9 0x7fab98af9249 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
The only time that mgmt_txn_hash_free is called is in hash_clean.
There are other places that mgmt_txn_unlock/delete are called and
hash_release should be called. Let's just notice when mgmtd is
being called from the hash_clean and not call hash_release (since
we know it is being released already)
Louis Scalbert [Fri, 14 Feb 2025 10:58:24 +0000 (11:58 +0100)]
tests: check as number in show run
Creates the default VRF instance after the other VRF instances. The
default VRF instance is created in hidden state. Check that AS number
in show run is correctly written.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
(cherry picked from commit 077a2b0dfc71443b41d5feceb52023c259436956) Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Louis Scalbert [Fri, 14 Feb 2025 14:03:00 +0000 (15:03 +0100)]
bgpd: fix leaving hidden state
Upon configuration of a VRF instance that references an absent default
VRF with "import vrf default", the default instance is created in hidden
state. However, the default instance is not properly un-hidden when
configured.
Restore the behavior prior to commit below.
Fixes: 9f7177af13 ("bgpd: fix duplicate BGP instance created with unified config") Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
(cherry picked from commit 70e07678bfe554dd5be30a605ddf6c0fe3a8a39b) Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Louis Scalbert [Fri, 14 Feb 2025 13:07:40 +0000 (14:07 +0100)]
tests: add bgp_l3vpn_hidden topotest
Test that leaving the hidden BGP instance state is working.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
(cherry picked from commit 118afe4690d5563887c1b2095d18e23cc77a21a2) Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
'import vrf VRF' could define a hidden bgp instance with
the default AS_UNSPECIFIED (i.e. = 1) value.
When a
router bgp AS vrf VRF
gets configured later on, replace this AS_UNSPECIFIED setting
with a requested value.
Fixes: 9680831518 ("bgpd: fix as_pretty mem leaks when un-hiding") Signed-off-by: Alexander Skorichenko <askorichenko@netgate.com> Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
(cherry picked from commit 1515a59202280933936b41c4cb2cb11c7889b279) Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Upon reconfiguration of the default instance, the prefixes are never set
into a meta queue by mq_add_handler(). They are never processed for
zebra RIB installation and announcements of update/withdraw.
Do not delete the BGP process_queue when hiding.
Fixes: 4d0e7a49cf ("bgpd: VRF-Lite fix default bgp delete") Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
(cherry picked from commit 71a3756f2dda272e69727fa416bca12c016d9567) Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Louis Scalbert [Wed, 12 Feb 2025 11:56:49 +0000 (12:56 +0100)]
bgpd: fix default instance name when un-hiding
When unconfiguring a default BGP instance with VPN SAFI configurations,
the default BGP structure remains but enters a hidden state. Upon
reconfiguration, the instance name incorrectly appears as "VIEW ?"
instead of "VRF default". And the name_pretty pointer
The name_pretty pointer is replaced by another one with the incorrect
name. This also leads to a memory leak as the previous pointer is not
properly freed.
Do not rewrite the instance name.
Fixes: 4d0e7a49cf ("bgpd: VRF-Lite fix default bgp delete") Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
(cherry picked from commit d2ff7e8a2117ad4bc38cec0e48c6b3c11dc49c91) Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>