Donald Sharp [Mon, 6 Nov 2023 18:02:01 +0000 (13:02 -0500)]
bgpd: Ensure BGP does not stop monitoring nexthops
In some cases BGP can be monitoring the same prefix
in both the nexthop and import check tables. If this
is the case, when unregistering one bnc from one table
make sure we are not still registered in the other
Example of the problem:
r1(config-router)# address-family ipv4 uni
r1(config-router-af)# no network 192.168.100.41/32
r1(config-router-af)# exit
r1# show bgp import-check-table
Current BGP import check cache:
r1# show bgp nexthop
Current BGP nexthop cache:
192.168.100.41 valid [IGP metric 0], #paths 1, peer 192.168.100.41
if r1-eth0
Last update: Wed Dec 6 11:01:40 2023
BGP now believes it is only watching 192.168.100.41 in the nexthop
cache, but zebra doesn't have anything:
r1# show ip import-check
VRF default:
Resolve via default: on
r1# show ip nht
VRF default:
Resolve via default: on
So if anything happens to the route that is being matched for
192.168.100.41 bgp is no longer going to be notified about this.
The source of this problem is that zebra has dropped the two different
tables into 1 table, while bgp has 2 tables to track this. The solution
to this problem (other than the rewrite that is being done ) is to have
BGP have a bit of smarts about looking in both tables for the bnc and
if found in both don't send the delete of the prefix tracking to zebra.
sharpd: fix deleting nhid when suppressing nexthop from nh group
When no nexthops are in a nexthop group, two successive events are
sent: NHG_DEL and NHG_ADD, but only the NHG_DEL one is necessary.
Fixes this by returning in the nhg_add() function.
Fixes: 82beaf6ae520 ("sharpd: fix deleting nhid when suppressing nexthop from nh group") Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Chirag Shah [Tue, 5 Dec 2023 03:23:32 +0000 (19:23 -0800)]
bgpd: check bgp evpn instance presence in soo
(pi=pi@entry=0x55e86ec1a5a0, evp=evp@entry=0x7fff4edc2160)
at bgpd/bgp_evpn.c:3623
3623 bgpd/bgp_evpn.c: No such file or directory.
(gdb) info locals
bgp_evpn = 0x0
macvrf_soo = <optimized out>
ret = false
__func__ = <optimized out>
(pi=pi@entry=0x55e86ec1a5a0, evp=evp@entry=0x7fff4edc2160)
at bgpd/bgp_evpn.c:3623
(bgp=bgp@entry=0x55e86e9cd010, afi=afi@entry=AFI_L2VPN,
safi=safi@entry=SAFI_EVPN, p=p@entry=0x0,
pi=pi@entry=0x55e86ec1a5a0, import=import@entry=1,
in_vrf_rt=true,
in_vni_rt=true) at bgpd/bgp_evpn.c:4200
(import=1, pi=pi@entry=0x55e86ec1a5a0, p=p@entry=0x0,
safi=safi@entry=SAFI_EVPN, afi=afi@entry=AFI_L2VPN,
bgp=bgp@entry=0x55e86e9cd010) at bgpd/bgp_evpn.c:6266
afi=afi@entry=AFI_L2VPN, safi=safi@entry=SAFI_EVPN,
p=p@entry=0x7fff4edc2160, pi=pi@entry=0x55e86ec1a5a0)
at bgpd/bgp_evpn.c:6266
(peer=peer@entry=0x55e86ea35400, p=p@entry=0x7fff4edc2160,
addpath_id=addpath_id@entry=0, attr=attr@entry=0x7fff4edc4400,
afi=afi@entry=AFI_L2VPN, safi=<optimized out>,
safi@entry=SAFI_EVPN, type=9, sub_type=0, prd=0x7fff4edc2120,
label=0x7fff4edc211c, num_labels=1,
soft_reconfig=0, evpn=0x7fff4edc2130) at bgpd/bgp_route.c:4805
(peer=peer@entry=0x55e86ea35400, afi=afi@entry=AFI_L2VPN,
safi=safi@entry=SAFI_EVPN, attr=attr@entry=0x7fff4edc4400,
pfx=<optimized out>, psize=psize@entry=34,
addpath_id=0) at bgpd/bgp_evpn.c:4922
(peer=0x55e86ea35400, attr=0x7fff4edc4400, packet=<optimized out>,
withdraw=0) at bgpd/bgp_evpn.c:5997
(peer=peer@entry=0x55e86ea35400, attr=attr@entry=0x7fff4edc4400,
packet=packet@entry=0x7fff4edc43d0,
mp_withdraw=mp_withdraw@entry=0) at bgpd/bgp_packet.c:363
(peer=peer@entry=0x55e86ea35400, size=size@entry=161)
at bgpd/bgp_packet.c:2076
(thread=<optimized out>) at bgpd/bgp_packet.c:2931
Louis Scalbert [Thu, 30 Nov 2023 16:29:20 +0000 (17:29 +0100)]
staticd: fix changing to source auto in bfd monitor
When monitoring a static route with BFD multi-hop, the source IP can be
either configured or retrieved from NextHop-Tracking (NHT). After
removing a configured source, the source is supposed to be retrieved
from NHT but it remains to the previous value. This is problematic if
the user desires to fix the configuration of a incorrect source IP.
For example, theses two commands results in the incorrect state:
> ip route 10.0.0.0/24 10.1.0.1 bfd multi-hop source 10.2.2.2
> ip route 10.0.0.0/24 10.1.0.1 bfd multi-hop
When removing the source, BFD is unable to find the source from NHT via
bfd_nht_update() were called.
Force zebra to resend the information to BFD by unregistering and
registering again NHT. The (...)/frr-nexthops/nexthop northbound
apply_finish function will trigger a call to static_install_nexthop()
that does a call to static_zebra_nht_register(nh, true);
Renato Westphal [Thu, 23 Nov 2023 23:23:52 +0000 (20:23 -0300)]
ospfd: fix deferred shutdown handling
The ospfd cleanup code is relatively complicated given the need to
appropriately handle the "max-metric router-lsa on-shutdown (5-100)"
command. When that command is configured and an OSPF instance is
unconfigured, the removal of the instance should be deferred to allow
other routers sufficient time to find alternate paths before the
local Router-LSAs are flushed. When ospfd is killed, however, deferred
shutdown shouldn't take place and all instances should be cleared
immediately.
This commit fixes a problem where ospf_deferred_shutdown_finish()
was prematurely exiting the daemon when no instances were left,
inadvertently preventing ospf_terminate() from clearing the ospfd
global variables. Additionally, the commit includes code refactoring
to enhance readability and maintainability.
Renato Westphal [Thu, 23 Nov 2023 23:21:31 +0000 (20:21 -0300)]
ospfd: improve memory cleanup during shutdown
* On ospf_terminate(), proceed to clear the ospfd global variables even
when no OSPF instance is configured
* Remove double call to route_map_finish()
* Call ospf_opaque_term() to clear the opaque LSA infrastructure
* Clear the `OspfRI.area_info` and `om->ospf` global lists.
Philippe Guibert [Fri, 24 Nov 2023 15:38:31 +0000 (16:38 +0100)]
zebra: fix wrong nexthop id debug message
When allocating big protocol level identifiers, the number range is
big, and when pushing to netlink messages, the first nexthop group
is truncated, whereas the nexthop has been installed on the kernel.
> ubuntu2204(config)# nexthop-group A
> ubuntu2204(config-nh-group)# group 1
> ubuntu2204(config-nh-group)# group 2
> ubuntu2204(config-nh-group)# exi
> ubuntu2204(config)# nexthop-group 1
> ubuntu2204(config-nh-group)# nexthop 192.0.2.130 loop1 enable-proto-nhg-control
> ubuntu2204(config-nh-group)# exi
> ubuntu2204(config)# nexthop-group 2
> ubuntu2204(config-nh-group)# nexthop 192.0.2.131 loop1 enable-proto-nhg-control
> [..]
> 2023/11/24 16:47:40 ZEBRA: [VNMVB-91G3G] _netlink_nexthop_build_group: ID (179687500): group 17968/179687502
Igor Ryzhov [Tue, 14 Nov 2023 19:17:24 +0000 (20:17 +0100)]
mgmtd: validate candidate yang tree before creating a config diff
The candidate yang tree should be validated before `nb_config_diff` is
called. `nb_config_diff` ignores all prohibited operations and can
provide an empty change list because of this. For example, if a user
deletes a mandatory node from the candidate datastore and tries to make
a commit, they'll receive the "No changes found to be committed!" error,
because such a change is ignored by `nb_config_diff`. Instead, mgmtd
should tell the user that their candidate datastore is not valid and
can't be commited.
Keelan10 [Wed, 29 Nov 2023 11:49:58 +0000 (15:49 +0400)]
bgpd: Free Memory for SRv6 Functions and Locator Chunks
Implement proper memory cleanup for SRv6 functions and locator chunks to prevent potential memory leaks.
The list callback deletion functions have been set.
The ASan leak log for reference:
```
***********************************************************************************
Address Sanitizer Error detected in bgp_srv6l3vpn_to_bgp_vrf.test_bgp_srv6l3vpn_to_bgp_vrf/r2.asan.bgpd.4180
Direct leak of 544 byte(s) in 2 object(s) allocated from:
#0 0x7f8d176a0d28 in __interceptor_calloc (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xded28)
#1 0x7f8d1709f238 in qcalloc lib/memory.c:105
#2 0x55d5dba6ee75 in sid_register bgpd/bgp_mplsvpn.c:591
#3 0x55d5dba6ee75 in alloc_new_sid bgpd/bgp_mplsvpn.c:712
#4 0x55d5dba6f3ce in ensure_vrf_tovpn_sid_per_af bgpd/bgp_mplsvpn.c:758
#5 0x55d5dba6fb94 in ensure_vrf_tovpn_sid bgpd/bgp_mplsvpn.c:849
#6 0x55d5dba7f975 in vpn_leak_postchange bgpd/bgp_mplsvpn.h:299
#7 0x55d5dba7f975 in vpn_leak_postchange_all bgpd/bgp_mplsvpn.c:3704
#8 0x55d5dbbb6c66 in bgp_zebra_process_srv6_locator_chunk bgpd/bgp_zebra.c:3164
#9 0x7f8d1716f08a in zclient_read lib/zclient.c:4459
#10 0x7f8d1713f034 in event_call lib/event.c:1974
#11 0x7f8d1708242b in frr_run lib/libfrr.c:1214
#12 0x55d5db99d19d in main bgpd/bgp_main.c:510
#13 0x7f8d160c5c86 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21c86)
Direct leak of 296 byte(s) in 1 object(s) allocated from:
#0 0x7f8d176a0d28 in __interceptor_calloc (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xded28)
#1 0x7f8d1709f238 in qcalloc lib/memory.c:105
#2 0x7f8d170b1d5f in srv6_locator_chunk_alloc lib/srv6.c:135
#3 0x55d5dbbb6a19 in bgp_zebra_process_srv6_locator_chunk bgpd/bgp_zebra.c:3144
#4 0x7f8d1716f08a in zclient_read lib/zclient.c:4459
#5 0x7f8d1713f034 in event_call lib/event.c:1974
#6 0x7f8d1708242b in frr_run lib/libfrr.c:1214
#7 0x55d5db99d19d in main bgpd/bgp_main.c:510
#8 0x7f8d160c5c86 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21c86)
***********************************************************************************
Keelan10 [Wed, 29 Nov 2023 12:33:54 +0000 (16:33 +0400)]
zebra: Set Free Functions for Traffic Control Hash Tables
Configure hash table cleanup with specific free functions for `zrouter.filter_hash`, `zrouter.qdisc_hash`, and `zrouter.class_hash`.
This ensures proper memory cleanup, addressing memory leaks.
The ASan leak log for reference:
```
***********************************************************************************
Address Sanitizer Error detected in tc_basic.test_tc_basic/r1.asan.zebra.15495
Direct leak of 176 byte(s) in 1 object(s) allocated from:
#0 0x7fd5660ffd28 in __interceptor_calloc (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xded28)
#1 0x7fd565afe238 in qcalloc lib/memory.c:105
#2 0x5564521c6c9e in tc_filter_alloc_intern zebra/zebra_tc.c:389
#3 0x7fd565ac49e8 in hash_get lib/hash.c:147
#4 0x5564521c7c74 in zebra_tc_filter_add zebra/zebra_tc.c:409
#5 0x55645210755a in zread_tc_filter zebra/zapi_msg.c:3428
#6 0x5564521127c1 in zserv_handle_commands zebra/zapi_msg.c:4004
#7 0x5564522208b2 in zserv_process_messages zebra/zserv.c:520
#8 0x7fd565b9e034 in event_call lib/event.c:1974
#9 0x7fd565ae142b in frr_run lib/libfrr.c:1214
#10 0x5564520c14b1 in main zebra/main.c:492
#11 0x7fd564ec2c86 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21c86)
Direct leak of 40 byte(s) in 1 object(s) allocated from:
#0 0x7fd5660ffd28 in __interceptor_calloc (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xded28)
#1 0x7fd565afe238 in qcalloc lib/memory.c:105
#2 0x5564521c6c6e in tc_class_alloc_intern zebra/zebra_tc.c:239
#3 0x7fd565ac49e8 in hash_get lib/hash.c:147
#4 0x5564521c784f in zebra_tc_class_add zebra/zebra_tc.c:293
#5 0x556452107ce5 in zread_tc_class zebra/zapi_msg.c:3315
#6 0x5564521127c1 in zserv_handle_commands zebra/zapi_msg.c:4004
#7 0x5564522208b2 in zserv_process_messages zebra/zserv.c:520
#8 0x7fd565b9e034 in event_call lib/event.c:1974
#9 0x7fd565ae142b in frr_run lib/libfrr.c:1214
#10 0x5564520c14b1 in main zebra/main.c:492
#11 0x7fd564ec2c86 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21c86)
Direct leak of 12 byte(s) in 1 object(s) allocated from:
#0 0x7fd5660ffd28 in __interceptor_calloc (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xded28)
#1 0x7fd565afe238 in qcalloc lib/memory.c:105
#2 0x5564521c6c3e in tc_qdisc_alloc_intern zebra/zebra_tc.c:128
#3 0x7fd565ac49e8 in hash_get lib/hash.c:147
#4 0x5564521c753b in zebra_tc_qdisc_install zebra/zebra_tc.c:184
#5 0x556452108203 in zread_tc_qdisc zebra/zapi_msg.c:3286
#6 0x5564521127c1 in zserv_handle_commands zebra/zapi_msg.c:4004
#7 0x5564522208b2 in zserv_process_messages zebra/zserv.c:520
#8 0x7fd565b9e034 in event_call lib/event.c:1974
#9 0x7fd565ae142b in frr_run lib/libfrr.c:1214
#10 0x5564520c14b1 in main zebra/main.c:492
#11 0x7fd564ec2c86 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21c86)
SUMMARY: AddressSanitizer: 228 byte(s) leaked in 3 allocation(s).
***********************************************************************************
```
Keelan10 [Wed, 29 Nov 2023 11:28:54 +0000 (15:28 +0400)]
bgpd: Free Memory for confed_peers in bgp_free
Release memory associated with `bgp->confed_peers` in the `bgp_free`
function to ensure proper cleanup. This fix prevents memory leaks related
to `confed_peers`.
The ASan leak log for reference:
```
***********************************************************************************
Address Sanitizer Error detected in bgp_confederation_astype.test_bgp_confederation_astype/r2.asan.bgpd.15045
Direct leak of 16 byte(s) in 1 object(s) allocated from:
#0 0x7f5666787b40 in __interceptor_malloc (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xdeb40)
#1 0x7f56661867c7 in qrealloc lib/memory.c:112
#2 0x55a3b4736a40 in bgp_confederation_peers_add bgpd/bgpd.c:681
#3 0x55a3b46b3363 in bgp_confederation_peers bgpd/bgp_vty.c:2068
#4 0x7f5666109021 in cmd_execute_command_real lib/command.c:978
#5 0x7f5666109a52 in cmd_execute_command_strict lib/command.c:1087
#6 0x7f5666109ab1 in command_config_read_one_line lib/command.c:1247
#7 0x7f5666109d98 in config_from_file lib/command.c:1300
#8 0x7f566623c6d0 in vty_read_file lib/vty.c:2614
#9 0x7f566623c7fa in vty_read_config lib/vty.c:2860
#10 0x7f56661682e4 in frr_config_read_in lib/libfrr.c:978
#11 0x7f5666226034 in event_call lib/event.c:1974
#12 0x7f566616942b in frr_run lib/libfrr.c:1214
#13 0x55a3b44f319d in main bgpd/bgp_main.c:510
#14 0x7f56651acc86 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21c86)
Indirect leak of 6 byte(s) in 1 object(s) allocated from:
#0 0x7f5666720538 in strdup (/usr/lib/x86_64-linux-gnu/libasan.so.4+0x77538)
#1 0x7f5666186898 in qstrdup lib/memory.c:117
#2 0x55a3b4736adb in bgp_confederation_peers_add bgpd/bgpd.c:687
#3 0x55a3b46b3363 in bgp_confederation_peers bgpd/bgp_vty.c:2068
#4 0x7f5666109021 in cmd_execute_command_real lib/command.c:978
#5 0x7f5666109a52 in cmd_execute_command_strict lib/command.c:1087
#6 0x7f5666109ab1 in command_config_read_one_line lib/command.c:1247
#7 0x7f5666109d98 in config_from_file lib/command.c:1300
#8 0x7f566623c6d0 in vty_read_file lib/vty.c:2614
#9 0x7f566623c7fa in vty_read_config lib/vty.c:2860
#10 0x7f56661682e4 in frr_config_read_in lib/libfrr.c:978
#11 0x7f5666226034 in event_call lib/event.c:1974
#12 0x7f566616942b in frr_run lib/libfrr.c:1214
#13 0x55a3b44f319d in main bgpd/bgp_main.c:510
#14 0x7f56651acc86 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21c86)
***********************************************************************************
```
1. When an OSPF interface is deleted, remove the references in link-local
LSA. Delete the LSA from the LSDB so that the callback has accessibily
to the interface prior to deletion.
2. Fix a double free for the opaque function table data structure.
3. Assure that the opaque per-type information and opaque function table
structures are removed at the same time since they have back pointers
to one another.
4. Add a topotest variation for the link-local opaque LSA crash.
David Lamparter [Thu, 23 Nov 2023 14:40:38 +0000 (15:40 +0100)]
lib, bgp/vnc: add `.auxiliary` zclient option
Avoids calling VRF/interface/... handlers in library code more than
once. It's kinda surprising that this hasn't been blowing up already
for the VNC code, luckily these handlers are (mostly?) idempotent.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
(these fixes are really hard to split off into separate commits as that
would require going back and reapplying the change but with the old list
handling)
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Donatas Abraitis [Mon, 20 Nov 2023 12:50:57 +0000 (14:50 +0200)]
lib: Print debug config in files after we have prefix-lists
Without this if we enter something like `debug bgp updates in x.x.x.x prefix-list y`,
prefix-list can't be lookup up, because when we read the config, debug does not know
anything about this prefix-list.
Donatas Abraitis [Fri, 17 Nov 2023 06:39:33 +0000 (08:39 +0200)]
bgpd: Add an ability to filter UPDATEs using neighbor with prefix-list
Before this patch we didn't have an option to filter debug UPDATE messages
by specifying an arbitrary prefix, prefix-list or so. We had/have only an option
to specify: