bgpd: Set correct TTL for the dynamic neighbor peers
In an EBGP multihop configuration with dynamic neighbors, the TTL configured is not being updated for the socket.
Issue:
Assume the following topology:
Host (Dynamic peer to spine - 192.168.1.100) - Leaf - Spine (192.168.1.1)
When the host establishes a BGP multihop session to the spine,
the connection uses the MAXTTL value instead of the configured TTL (in this case, 2).
This issue is only observed with dynamic peers.
Logs: look at the TTL is still MAXTTL, instead of “2” configured.
Louis Scalbert [Wed, 3 Jan 2024 15:13:02 +0000 (16:13 +0100)]
isisd: fix _isis_spftree_del heap-use-after-free
Fix the following heap-use-after-free
> ==82961==ERROR: AddressSanitizer: heap-use-after-free on address 0x6020001e4750 at pc 0x55a8cc7f63ac bp 0x7ffd6948e340 sp 0x7ffd6948e330
> READ of size 8 at 0x6020001e4750 thread T0
> #0 0x55a8cc7f63ab in isis_route_node_cleanup isisd/isis_route.c:335
> #1 0x7ff25ec617c1 in route_node_free lib/table.c:75
> #2 0x7ff25ec619fc in route_table_free lib/table.c:111
> #3 0x7ff25ec61661 in route_table_finish lib/table.c:46
> #4 0x55a8cc800d83 in _isis_spftree_del isisd/isis_spf.c:397
> #5 0x55a8cc800e45 in isis_spftree_clear isisd/isis_spf.c:414
> #6 0x55a8cc80bd9a in isis_run_spf isisd/isis_spf.c:2020
> #7 0x55a8cc80c370 in isis_run_spf_with_protection isisd/isis_spf.c:2076
> #8 0x55a8cc80cf52 in isis_run_spf_cb isisd/isis_spf.c:2165
> #9 0x7ff25ec7c4dc in event_call lib/event.c:1970
> #10 0x7ff25eb64423 in frr_run lib/libfrr.c:1213
> #11 0x55a8cc7799da in main isisd/isis_main.c:318
> #12 0x7ff25e623d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
> #13 0x7ff25e623e3f in __libc_start_main_impl ../csu/libc-start.c:392
> #14 0x55a8cc778e44 in _start (/usr/lib/frr/isisd+0x109e44)
>
> 0x6020001e4750 is located 0 bytes inside of 16-byte region [0x6020001e4750,0x6020001e4760)
> freed by thread T0 here:
> #0 0x7ff25f000537 in __interceptor_free ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:127
> #1 0x7ff25eb9012e in qfree lib/memory.c:130
> #2 0x55a8cc7f6485 in isis_route_table_info_free isisd/isis_route.c:351
> #3 0x55a8cc800cf4 in _isis_spftree_del isisd/isis_spf.c:395
> #4 0x55a8cc800e45 in isis_spftree_clear isisd/isis_spf.c:414
> #5 0x55a8cc80bd9a in isis_run_spf isisd/isis_spf.c:2020
> #6 0x55a8cc80c370 in isis_run_spf_with_protection isisd/isis_spf.c:2076
> #7 0x55a8cc80cf52 in isis_run_spf_cb isisd/isis_spf.c:2165
> #8 0x7ff25ec7c4dc in event_call lib/event.c:1970
> #9 0x7ff25eb64423 in frr_run lib/libfrr.c:1213
> #10 0x55a8cc7799da in main isisd/isis_main.c:318
> #11 0x7ff25e623d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
>
> previously allocated by thread T0 here:
> #0 0x7ff25f000a57 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154
> #1 0x7ff25eb8ffdc in qcalloc lib/memory.c:105
> #2 0x55a8cc7f63eb in isis_route_table_info_alloc isisd/isis_route.c:343
> #3 0x55a8cc80052a in _isis_spftree_init isisd/isis_spf.c:334
> #4 0x55a8cc800e51 in isis_spftree_clear isisd/isis_spf.c:415
> #5 0x55a8cc80bd9a in isis_run_spf isisd/isis_spf.c:2020
> #6 0x55a8cc80c370 in isis_run_spf_with_protection isisd/isis_spf.c:2076
> #7 0x55a8cc80cf52 in isis_run_spf_cb isisd/isis_spf.c:2165
> #8 0x7ff25ec7c4dc in event_call lib/event.c:1970
> #9 0x7ff25eb64423 in frr_run lib/libfrr.c:1213
> #10 0x55a8cc7799da in main isisd/isis_main.c:318
> #11 0x7ff25e623d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
Fixes: 7153c3cabf ("isisd: update struct isis_route_info has multiple sr info by algorithm") Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
(cherry picked from commit 9fa9a9d865c6e0ed2454abf3961410edabb17533)
Olivier Dugeon [Thu, 14 Dec 2023 17:34:38 +0000 (18:34 +0100)]
tests: Update OSPF TE topotests
The OSPF TE topotest is using switches to interconnect router. During the test,
interfaces are shutdown on some routers to simulate link failure and check that
the TED is correctly updated. However, the switche between router avoid the
detection by the neighbor router that the interface is down i.e. the interface
line remains up as it is conneted to the switch and not to the router.
This patch update the tested topology by removing the switch and connect
directly the router excepted the inter AS link on R3. Interface are also
renamed accordingly.
Olivier Dugeon [Thu, 14 Dec 2023 17:22:32 +0000 (18:22 +0100)]
ospfd: Correct LSA parser which fulfill the TED
Traffic Engineering Database (TED) is fulfill from the various LSA advertised
and received by the router. To remove information on the TED, 2 mechanisms are
used: i) parse TE Opaque LSA when there are flushed and ii) compare the list of
prefixes advertised in the Router LSA with the list of corresponding edges and
subnets contained in the TED. However, this second mechanism assumes that the
Router LSA is unique and contains all prefixes of the advertised router.
But, this is wrong. Prefixes could be advertised with several Router LSA.
This conduct to remove edge and subnet in the TED while it should be maintained.
The result is a faulty test with ospf_sr_te_topo1 topotest when server is heavy
loaded.
This simple patch removed deletion of edges and subnets when parsing the Router
LSA and only removed them when the corresponding TE Opaque LSA is flushed. In
addition, TE Opaque LSA are not flushed when OSPF ajacency goes down. This
patch also correct this second problem.
Donald Sharp [Mon, 11 Dec 2023 15:46:53 +0000 (10:46 -0500)]
bgpd: Make `suppress-fib-pending` clear peering
When a peer has come up and already started installing
routes into the rib and `suppress-fib-pending` is either
turned on or off. BGP is left with some routes that
may need to be withdrawn from peers and routes that
it does not know the status of. Clear the BGP peers
for the interesting parties and let's let us come
up to speed as needed.
Olivier Dugeon [Thu, 7 Dec 2023 13:53:16 +0000 (14:53 +0100)]
ospfd: Correct SID check size
Segment Router Identifier (SID) could be an index (4 bytes) within a range
(SRGB or SRLB) or an MPLS label (3 bytes). Thus, before calling check_size
macro to verify SID TLVs size, it is mandatory to determine the SID type to
avoid wrong assert.
Donald Sharp [Mon, 6 Nov 2023 18:02:01 +0000 (13:02 -0500)]
bgpd: Ensure BGP does not stop monitoring nexthops
In some cases BGP can be monitoring the same prefix
in both the nexthop and import check tables. If this
is the case, when unregistering one bnc from one table
make sure we are not still registered in the other
Example of the problem:
r1(config-router)# address-family ipv4 uni
r1(config-router-af)# no network 192.168.100.41/32
r1(config-router-af)# exit
r1# show bgp import-check-table
Current BGP import check cache:
r1# show bgp nexthop
Current BGP nexthop cache:
192.168.100.41 valid [IGP metric 0], #paths 1, peer 192.168.100.41
if r1-eth0
Last update: Wed Dec 6 11:01:40 2023
BGP now believes it is only watching 192.168.100.41 in the nexthop
cache, but zebra doesn't have anything:
r1# show ip import-check
VRF default:
Resolve via default: on
r1# show ip nht
VRF default:
Resolve via default: on
So if anything happens to the route that is being matched for
192.168.100.41 bgp is no longer going to be notified about this.
The source of this problem is that zebra has dropped the two different
tables into 1 table, while bgp has 2 tables to track this. The solution
to this problem (other than the rewrite that is being done ) is to have
BGP have a bit of smarts about looking in both tables for the bnc and
if found in both don't send the delete of the prefix tracking to zebra.
Donald Sharp [Wed, 6 Dec 2023 13:33:31 +0000 (08:33 -0500)]
zebra: Add connected with noprefixroute
Add ability for the connected routes to know
if they are a prefix route or not.
sharpd@eva:/work/home/sharpd/frr1$ ip addr show dev dummy1
13: dummy1: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
link/ether aa:93:ce:ce:3f:62 brd ff:ff:ff:ff:ff:ff
inet 192.168.55.1/24 scope global noprefixroute dummy1
valid_lft forever preferred_lft forever
inet 192.168.56.1/24 scope global dummy1
valid_lft forever preferred_lft forever
inet6 fe80::a893:ceff:fece:3f62/64 scope link
valid_lft forever preferred_lft forever
sharpd@eva:/work/home/sharpd/frr1$ sudo vtysh -c "show int dummy1"
Interface dummy1 is up, line protocol is up
Link ups: 0 last: (never)
Link downs: 0 last: (never)
vrf: default
index 13 metric 0 mtu 1500 speed 0 txqlen 1000
flags: <UP,BROADCAST,RUNNING,NOARP>
Type: Ethernet
HWaddr: aa:93:ce:ce:3f:62
inet 192.168.55.1/24 noprefixroute
inet 192.168.56.1/24
inet6 fe80::a893:ceff:fece:3f62/64
Interface Type Other
Interface Slave Type None
protodown: off
sharpd@eva:/work/home/sharpd/frr1$ sudo vtysh -c "show ip route"
Codes: K - kernel route, C - connected, L - local, S - static,
R - RIP, O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
F - PBR, f - OpenFabric, t - Table-Direct,
> - selected route, * - FIB route, q - queued, r - rejected, b - backup
t - trapped, o - offload failure
K>* 0.0.0.0/0 [0/100] via 192.168.119.1, enp13s0, 00:00:08
K>* 169.254.0.0/16 [0/1000] is directly connected, virbr2 linkdown, 00:00:08
L>* 192.168.44.1/32 is directly connected, dummy2, 00:00:08
L>* 192.168.55.1/32 is directly connected, dummy1, 00:00:08
C>* 192.168.56.0/24 is directly connected, dummy1, 00:00:08
L>* 192.168.56.1/32 is directly connected, dummy1, 00:00:08
L>* 192.168.119.205/32 is directly connected, enp13s0, 00:00:08
sharpd@eva:/work/home/sharpd/frr1$ ip route show
default via 192.168.119.1 dev enp13s0 proto dhcp metric 100
169.254.0.0/16 dev virbr2 scope link metric 1000 linkdown
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 linkdown
192.168.45.0/24 dev virbr2 proto kernel scope link src 192.168.45.1 linkdown
192.168.56.0/24 dev dummy1 proto kernel scope link src 192.168.56.1
192.168.119.0/24 dev enp13s0 proto kernel scope link src 192.168.119.205 metric 100
192.168.122.0/24 dev virbr0 proto kernel scope link src 192.168.122.1 linkdown
sharpd@eva:/work/home/sharpd/frr1$ ip route show table 255
local 127.0.0.0/8 dev lo proto kernel scope host src 127.0.0.1
local 127.0.0.1 dev lo proto kernel scope host src 127.0.0.1
broadcast 127.255.255.255 dev lo proto kernel scope link src 127.0.0.1
local 172.17.0.1 dev docker0 proto kernel scope host src 172.17.0.1
broadcast 172.17.255.255 dev docker0 proto kernel scope link src 172.17.0.1 linkdown
local 192.168.44.1 dev dummy2 proto kernel scope host src 192.168.44.1
broadcast 192.168.44.255 dev dummy2 proto kernel scope link src 192.168.44.1
local 192.168.45.1 dev virbr2 proto kernel scope host src 192.168.45.1
broadcast 192.168.45.255 dev virbr2 proto kernel scope link src 192.168.45.1 linkdown
local 192.168.55.1 dev dummy1 proto kernel scope host src 192.168.55.1
broadcast 192.168.55.255 dev dummy1 proto kernel scope link src 192.168.55.1
local 192.168.56.1 dev dummy1 proto kernel scope host src 192.168.56.1
broadcast 192.168.56.255 dev dummy1 proto kernel scope link src 192.168.56.1
local 192.168.119.205 dev enp13s0 proto kernel scope host src 192.168.119.205
broadcast 192.168.119.255 dev enp13s0 proto kernel scope link src 192.168.119.205
local 192.168.122.1 dev virbr0 proto kernel scope host src 192.168.122.1
broadcast 192.168.122.255 dev virbr0 proto kernel scope link src 192.168.122.1 linkdown
Donald Sharp [Wed, 6 Dec 2023 13:24:01 +0000 (08:24 -0500)]
zebra: Add ability to note that a address is NOPREFIXROUTE
The linux kernel can send up a flag that tells us that the
connected address is not a PREFIXROUTE. Add the ability
to note this and pass it up from the data plane.
Louis Scalbert [Thu, 30 Nov 2023 16:29:20 +0000 (17:29 +0100)]
staticd: fix changing to source auto in bfd monitor
When monitoring a static route with BFD multi-hop, the source IP can be
either configured or retrieved from NextHop-Tracking (NHT). After
removing a configured source, the source is supposed to be retrieved
from NHT but it remains to the previous value. This is problematic if
the user desires to fix the configuration of a incorrect source IP.
For example, theses two commands results in the incorrect state:
> ip route 10.0.0.0/24 10.1.0.1 bfd multi-hop source 10.2.2.2
> ip route 10.0.0.0/24 10.1.0.1 bfd multi-hop
When removing the source, BFD is unable to find the source from NHT via
bfd_nht_update() were called.
Force zebra to resend the information to BFD by unregistering and
registering again NHT. The (...)/frr-nexthops/nexthop northbound
apply_finish function will trigger a call to static_install_nexthop()
that does a call to static_zebra_nht_register(nh, true);
Fixes: b7ca809d1c ("lib: BFD automatic source selection") Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
(cherry picked from commit 580c605194b3893a1d61a997a7b9d62e2d877427)
Chirag Shah [Tue, 5 Dec 2023 03:23:32 +0000 (19:23 -0800)]
bgpd: check bgp evpn instance presence in soo
(pi=pi@entry=0x55e86ec1a5a0, evp=evp@entry=0x7fff4edc2160)
at bgpd/bgp_evpn.c:3623
3623 bgpd/bgp_evpn.c: No such file or directory.
(gdb) info locals
bgp_evpn = 0x0
macvrf_soo = <optimized out>
ret = false
__func__ = <optimized out>
(pi=pi@entry=0x55e86ec1a5a0, evp=evp@entry=0x7fff4edc2160)
at bgpd/bgp_evpn.c:3623
(bgp=bgp@entry=0x55e86e9cd010, afi=afi@entry=AFI_L2VPN,
safi=safi@entry=SAFI_EVPN, p=p@entry=0x0,
pi=pi@entry=0x55e86ec1a5a0, import=import@entry=1,
in_vrf_rt=true,
in_vni_rt=true) at bgpd/bgp_evpn.c:4200
(import=1, pi=pi@entry=0x55e86ec1a5a0, p=p@entry=0x0,
safi=safi@entry=SAFI_EVPN, afi=afi@entry=AFI_L2VPN,
bgp=bgp@entry=0x55e86e9cd010) at bgpd/bgp_evpn.c:6266
afi=afi@entry=AFI_L2VPN, safi=safi@entry=SAFI_EVPN,
p=p@entry=0x7fff4edc2160, pi=pi@entry=0x55e86ec1a5a0)
at bgpd/bgp_evpn.c:6266
(peer=peer@entry=0x55e86ea35400, p=p@entry=0x7fff4edc2160,
addpath_id=addpath_id@entry=0, attr=attr@entry=0x7fff4edc4400,
afi=afi@entry=AFI_L2VPN, safi=<optimized out>,
safi@entry=SAFI_EVPN, type=9, sub_type=0, prd=0x7fff4edc2120,
label=0x7fff4edc211c, num_labels=1,
soft_reconfig=0, evpn=0x7fff4edc2130) at bgpd/bgp_route.c:4805
(peer=peer@entry=0x55e86ea35400, afi=afi@entry=AFI_L2VPN,
safi=safi@entry=SAFI_EVPN, attr=attr@entry=0x7fff4edc4400,
pfx=<optimized out>, psize=psize@entry=34,
addpath_id=0) at bgpd/bgp_evpn.c:4922
(peer=0x55e86ea35400, attr=0x7fff4edc4400, packet=<optimized out>,
withdraw=0) at bgpd/bgp_evpn.c:5997
(peer=peer@entry=0x55e86ea35400, attr=attr@entry=0x7fff4edc4400,
packet=packet@entry=0x7fff4edc43d0,
mp_withdraw=mp_withdraw@entry=0) at bgpd/bgp_packet.c:363
(peer=peer@entry=0x55e86ea35400, size=size@entry=161)
at bgpd/bgp_packet.c:2076
(thread=<optimized out>) at bgpd/bgp_packet.c:2931
Keelan10 [Wed, 29 Nov 2023 11:49:58 +0000 (15:49 +0400)]
bgpd: Free Memory for SRv6 Functions and Locator Chunks
Implement proper memory cleanup for SRv6 functions and locator chunks to prevent potential memory leaks.
The list callback deletion functions have been set.
The ASan leak log for reference:
```
***********************************************************************************
Address Sanitizer Error detected in bgp_srv6l3vpn_to_bgp_vrf.test_bgp_srv6l3vpn_to_bgp_vrf/r2.asan.bgpd.4180
Direct leak of 544 byte(s) in 2 object(s) allocated from:
#0 0x7f8d176a0d28 in __interceptor_calloc (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xded28)
#1 0x7f8d1709f238 in qcalloc lib/memory.c:105
#2 0x55d5dba6ee75 in sid_register bgpd/bgp_mplsvpn.c:591
#3 0x55d5dba6ee75 in alloc_new_sid bgpd/bgp_mplsvpn.c:712
#4 0x55d5dba6f3ce in ensure_vrf_tovpn_sid_per_af bgpd/bgp_mplsvpn.c:758
#5 0x55d5dba6fb94 in ensure_vrf_tovpn_sid bgpd/bgp_mplsvpn.c:849
#6 0x55d5dba7f975 in vpn_leak_postchange bgpd/bgp_mplsvpn.h:299
#7 0x55d5dba7f975 in vpn_leak_postchange_all bgpd/bgp_mplsvpn.c:3704
#8 0x55d5dbbb6c66 in bgp_zebra_process_srv6_locator_chunk bgpd/bgp_zebra.c:3164
#9 0x7f8d1716f08a in zclient_read lib/zclient.c:4459
#10 0x7f8d1713f034 in event_call lib/event.c:1974
#11 0x7f8d1708242b in frr_run lib/libfrr.c:1214
#12 0x55d5db99d19d in main bgpd/bgp_main.c:510
#13 0x7f8d160c5c86 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21c86)
Direct leak of 296 byte(s) in 1 object(s) allocated from:
#0 0x7f8d176a0d28 in __interceptor_calloc (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xded28)
#1 0x7f8d1709f238 in qcalloc lib/memory.c:105
#2 0x7f8d170b1d5f in srv6_locator_chunk_alloc lib/srv6.c:135
#3 0x55d5dbbb6a19 in bgp_zebra_process_srv6_locator_chunk bgpd/bgp_zebra.c:3144
#4 0x7f8d1716f08a in zclient_read lib/zclient.c:4459
#5 0x7f8d1713f034 in event_call lib/event.c:1974
#6 0x7f8d1708242b in frr_run lib/libfrr.c:1214
#7 0x55d5db99d19d in main bgpd/bgp_main.c:510
#8 0x7f8d160c5c86 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21c86)
***********************************************************************************
- OSPFv2 HMAC-SHA Cryptographic Authentication
- BGP MAC-VRF Site-Of-Origin support
- BGP Dynamic capability support
- IS-IS SRv6 uSID support (RFC 9352)
- Next-hop resolution via the default route
- Add support for VLAN, ECN, DSCP mangling/filtering
- Zebra support for route replace semantics in FPM
- New command for BGP `neighbor x addpath-tx-best-selected`
- New command for BGP `mpls bgp l3vpn-multi-domain-switching`
- A couple more new BGP route-map commands:
- set as-path exclude all
- set as-path exclude as-path-access-list
- set extended-comm-list delete
- set as-path replace <any|ASN> [<ASN>]
- set as-path replace as-path-access-list WORD [<ASN>]
- match community-list X any
* libyang 2.1.80 related breaking changes
prefix-list matching in route-maps is fundamentally broken with libyang 2.1.111.
If you have this version, please downgrade to the most stable version 2.1.80.
Donatas Abraitis [Tue, 21 Nov 2023 08:40:58 +0000 (10:40 +0200)]
bgpd: Flush attrs only if we don't have to announce a conditional route
To avoid USE:
```
==587645==ERROR: AddressSanitizer: heap-use-after-free on address 0x604000074050 at pc 0x55b34337d96c bp 0x7ffda59bb4c0 sp 0x7ffda59bb4b0
READ of size 8 at 0x604000074050 thread T0
0 0x55b34337d96b in bgp_attr_flush bgpd/bgp_attr.c:1289
1 0x55b34368ef85 in bgp_conditional_adv_routes bgpd/bgp_conditional_adv.c:111
2 0x55b34368ff58 in bgp_conditional_adv_timer bgpd/bgp_conditional_adv.c:301
3 0x7f7d41cdf81c in event_call lib/event.c:1980
4 0x7f7d41c1da37 in frr_run lib/libfrr.c:1214
5 0x55b343371e22 in main bgpd/bgp_main.c:510
6 0x7f7d41517082 in __libc_start_main ../csu/libc-start.c:308
7 0x55b3433769fd in _start (/usr/lib/frr/bgpd+0x2e29fd)
0x604000074050 is located 0 bytes inside of 40-byte region [0x604000074050,0x604000074078)
freed by thread T0 here:
#0 0x7f7d4207540f in __interceptor_free ../../../../src/libsanitizer/asan/asan_malloc_linux.cc:122
1 0x55b343396afd in community_free bgpd/bgp_community.c:41
2 0x55b343396afd in community_free bgpd/bgp_community.c:28
3 0x55b343397373 in community_intern bgpd/bgp_community.c:458
4 0x55b34337bed4 in bgp_attr_intern bgpd/bgp_attr.c:967
5 0x55b34368165b in bgp_advertise_attr_intern bgpd/bgp_advertise.c:106
6 0x55b3435277d7 in bgp_adj_out_set_subgroup bgpd/bgp_updgrp_adv.c:587
7 0x55b34368f36b in bgp_conditional_adv_routes bgpd/bgp_conditional_adv.c:125
8 0x55b34368ff58 in bgp_conditional_adv_timer bgpd/bgp_conditional_adv.c:301
9 0x7f7d41cdf81c in event_call lib/event.c:1980
10 0x7f7d41c1da37 in frr_run lib/libfrr.c:1214
11 0x55b343371e22 in main bgpd/bgp_main.c:510
12 0x7f7d41517082 in __libc_start_main ../csu/libc-start.c:308
previously allocated by thread T0 here:
#0 0x7f7d42075a06 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cc:153
1 0x7f7d41c3c28e in qcalloc lib/memory.c:105
2 0x55b3433976e8 in community_dup bgpd/bgp_community.c:514
3 0x55b34350273a in route_set_community bgpd/bgp_routemap.c:2589
4 0x7f7d41c96c06 in route_map_apply_ext lib/routemap.c:2690
5 0x55b34368f2d8 in bgp_conditional_adv_routes bgpd/bgp_conditional_adv.c:107
6 0x55b34368ff58 in bgp_conditional_adv_timer bgpd/bgp_conditional_adv.c:301
7 0x7f7d41cdf81c in event_call lib/event.c:1980
8 0x7f7d41c1da37 in frr_run lib/libfrr.c:1214
9 0x55b343371e22 in main bgpd/bgp_main.c:510
10 0x7f7d41517082 in __libc_start_main ../csu/libc-start.c:308
```
And also a crash:
```
(gdb) bt
0 raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
1 0x00007ff3b7048ce0 in core_handler (signo=6, siginfo=0x7ffc8cf724b0, context=<optimized out>)
at lib/sigevent.c:246
2 <signal handler called>
3 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
4 0x00007ff3b6bb8859 in __GI_abort () at abort.c:79
5 0x00007ff3b6c2326e in __libc_message (action=action@entry=do_abort,
fmt=fmt@entry=0x7ff3b6d4d298 "%s\n") at ../sysdeps/posix/libc_fatal.c:155
6 0x00007ff3b6c2b2fc in malloc_printerr (
str=str@entry=0x7ff3b6d4f628 "double free or corruption (fasttop)") at malloc.c:5347
7 0x00007ff3b6c2cc65 in _int_free (av=0x7ff3b6d82b80 <main_arena>, p=0x555c8fa70a10, have_lock=0)
at malloc.c:4266
8 0x0000555c8da94bd3 in community_free (com=0x7ffc8cf72e70) at bgpd/bgp_community.c:41
9 community_free (com=com@entry=0x7ffc8cf72e70) at bgpd/bgp_community.c:28
10 0x0000555c8da8afc1 in bgp_attr_flush (attr=attr@entry=0x7ffc8cf73040) at bgpd/bgp_attr.c:1290
11 0x0000555c8dbc0760 in bgp_conditional_adv_routes (peer=peer@entry=0x555c8fa627c0,
afi=afi@entry=AFI_IP, safi=SAFI_UNICAST, table=table@entry=0x555c8fa510b0, rmap=0x555c8fa71cb0,
update_type=UPDATE_TYPE_ADVERTISE) at bgpd/bgp_conditional_adv.c:111
12 0x0000555c8dbc0b75 in bgp_conditional_adv_timer (t=<optimized out>)
at bgpd/bgp_conditional_adv.c:301
13 0x00007ff3b705b84c in event_call (thread=thread@entry=0x7ffc8cf73440) at lib/event.c:1980
14 0x00007ff3b700bf98 in frr_run (master=0x555c8f27c090) at lib/libfrr.c:1214
15 0x0000555c8da85f05 in main (argc=<optimized out>, argv=0x7ffc8cf736a8) at bgpd/bgp_main.c:510
```
teletajp [Wed, 1 Nov 2023 09:25:07 +0000 (12:25 +0300)]
ospfd: fix show_ip_ospf_gr_helper
Fix for the command "show ip ospf vrf NAME graceful-restart helper".
FRR did not show information by vrf's name.
If i have router ospf vrf red, vtysh's command
'show ip ospf vrf red graceful-restart helper' will not show anything.
But command 'show ip ospf vrf all graceful-restart helper' will work
normally. This fix fixes the display of information by vrf's name.
Example:
frr1# show ip ospf vrf vrf-1 graceful-restart helper
VRF Name: vrf-1
OSPF Router with ID (192.168.255.81)
Graceful restart helper support enabled.
Strict LSA check is enabled.
Helper supported for Planned and Unplanned Restarts.
Supported Graceful restart interval: 1800(in seconds).
Donald Sharp [Fri, 17 Nov 2023 21:57:20 +0000 (16:57 -0500)]
zebra: Fix fpm multipath encap addition
The fpm code path in building a ecmp route for evpn has
a bug that caused it to not add the encap attribute to
the netlink message. See #f0f7b285b99dbd971400d33feea007232c0bd4a9
for the single path case being fixed.
Keelan10 [Tue, 14 Nov 2023 21:57:04 +0000 (01:57 +0400)]
zebra: Refactor memory allocation in zebra_rnh.c
Fix memory leaks by allocating `json_segs` conditionally on `nexthop->nh_srv6->seg6_segs`.
The previous code allocated memory even when not in use or attached to the JSON tree.
The ASan leak log for reference:
```
Direct leak of 3240 byte(s) in 45 object(s) allocated from:
#0 0x7f6e84a35d28 in __interceptor_calloc (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xded28)
#1 0x7f6e83de9e6f in json_object_new_array (/lib/x86_64-linux-gnu/libjson-c.so.3+0x3e6f)
#2 0x564dcab5c1a6 in vty_show_ip_route zebra/zebra_vty.c:705
#3 0x564dcab5cc71 in do_show_route_helper zebra/zebra_vty.c:955
#4 0x564dcab5d418 in do_show_ip_route zebra/zebra_vty.c:1039
#5 0x564dcab63ee5 in show_route_magic zebra/zebra_vty.c:1878
#6 0x564dcab63ee5 in show_route zebra/zebra_vty_clippy.c:659
#7 0x7f6e843b6fb1 in cmd_execute_command_real lib/command.c:978
#8 0x7f6e843b7475 in cmd_execute_command lib/command.c:1036
#9 0x7f6e843b78f4 in cmd_execute lib/command.c:1203
#10 0x7f6e844dfe3b in vty_command lib/vty.c:594
#11 0x7f6e844e02e6 in vty_execute lib/vty.c:1357
#12 0x7f6e844e8bb7 in vtysh_read lib/vty.c:2365
#13 0x7f6e844d3b7a in event_call lib/event.c:1965
#14 0x7f6e844172b0 in frr_run lib/libfrr.c:1214
#15 0x564dcaa50e81 in main zebra/main.c:488
#16 0x7f6e837f7c86 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21c86)
Indirect leak of 11520 byte(s) in 45 object(s) allocated from:
#0 0x7f6e84a35d28 in __interceptor_calloc (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xded28)
#1 0x7f6e83de88c0 in array_list_new (/lib/x86_64-linux-gnu/libjson-c.so.3+0x28c0)
Indirect leak of 1080 byte(s) in 45 object(s) allocated from:
#0 0x7f6e84a35d28 in __interceptor_calloc (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xded28)
#1 0x7f6e83de8897 in array_list_new (/lib/x86_64-linux-gnu/libjson-c.so.3+0x2897)
```
Donald Sharp [Tue, 14 Nov 2023 16:00:54 +0000 (11:00 -0500)]
lib: Prevent infinite loop in ospf
For some series of calls in FREEBSD setting the SO_RCVBUF size will
always fail under freebsd. This is no bueno since the
setsockopt_so_recvbuf call goes into an infinite loop.
(gdb) bt
0 setsockopt () at setsockopt.S:4
1 0x0000000083065870 in setsockopt_so_recvbuf (sock=15, size=0) at lib/sockopt.c:26
2 0x00000000002bd200 in ospf_ifp_sock_init (ifp=<optimized out>, ifp@entry=0x8d1dd500) at ospfd/ospf_network.c:290
3 0x00000000002ad1e0 in ospf_if_new (ospf=0x8eefc000, ifp=0x8d1dd500, p=0x8eecf1c0) at ospfd/ospf_interface.c:276
4 0x0000000000304ee0 in add_ospf_interface (co=0x8eecbe10, area=0x8d192100) at ospfd/ospfd.c:1115
5 0x00000000003050fc in ospf_network_run_interface (ospf=0x8eefc000, ifp=0x8d1dd500, p=0x80ff63f8, given_area=0x8d192100)
at ospfd/ospfd.c:1460
6 ospf_network_run (p=0x80ff63f8, area=0x8d192100) at ospfd/ospfd.c:1474
7 ospf_network_set (ospf=ospf@entry=0x8eefc000, p=p@entry=0x80ff63f8, area_id=..., df=<optimized out>) at ospfd/ospfd.c:1247
8 0x00000000002e876c in ospf_network_area (self=<optimized out>, vty=0x8eef3180, argc=<optimized out>, argv=<optimized out>)
at ospfd/ospf_vty.c:560
9 0x0000000083006f24 in cmd_execute_command_real (vline=vline@entry=0x8eee9100, vty=vty@entry=0x8eef3180, cmd=<optimized out>,
cmd@entry=0x0, up_level=<optimized out>) at lib/command.c:978
10 0x0000000083006b30 in cmd_execute_command (vline=0x8eee9100, vty=vty@entry=0x8eef3180, cmd=cmd@entry=0x0, vtysh=vtysh@entry=0)
at lib/command.c:1037
11 0x0000000083007044 in cmd_execute (vty=vty@entry=0x8eef3180, cmd=cmd@entry=0x8eefb000 "network 192.168.64.0/24 area 0.0.0.0",
matched=0x0, vtysh=0) at lib/command.c:1203
12 0x000000008307e9cc in vty_command (vty=0x8eef3180, buf=0x8eefb000 "network 192.168.64.0/24 area 0.0.0.0") at lib/vty.c:594
13 vty_execute (vty=vty@entry=0x8eef3180) at lib/vty.c:1357
14 0x000000008307ce40 in vtysh_read (thread=<optimized out>) at lib/vty.c:2365
15 0x0000000083073db0 in event_call (thread=thread@entry=0x80ff88a0) at lib/event.c:1965
16 0x000000008302c604 in frr_run (master=0x8d188140) at lib/libfrr.c:1214
17 0x000000000029c330 in main (argc=6, argv=<optimized out>) at ospfd/ospf_main.c:252
(gdb)
Force the setsockopt function to quit when the value we are passing no
longer makes any sense.
Igor Ryzhov [Sat, 11 Nov 2023 00:06:11 +0000 (02:06 +0200)]
lib: fix possible freeing of libyang data
mgmtd frees all non-NULL change->value variables at the end of every
commit. We shouldn't assign change->value with data returned by libyang
to prevent freeing of library-allocated memory.
Igor Ryzhov [Sun, 12 Nov 2023 00:57:25 +0000 (02:57 +0200)]
bgpd: fix build error
I recieve the following error with GCC 9.4.0:
```
In file included from /usr/include/string.h:495,
from ./lib/zebra.h:23,
from bgpd/bgp_snmp_bgp4v2.c:7:
In function ‘memset’,
inlined from ‘bgp4v2PathAttrLookup’ at bgpd/bgp_snmp_bgp4v2.c:605:3,
inlined from ‘bgp4v2PathAttrTable’ at bgpd/bgp_snmp_bgp4v2.c:747:9:
/usr/include/x86_64-linux-gnu/bits/string_fortified.h:71:10: error: ‘__builtin_memset’ offset [9, 20] from the object at ‘paddr’ is out of the bounds of referenced subobject ‘_v4_addr’ with type ‘struct in_addr’ at offset 4 [-Werror=array-bounds]
71 | return __builtin___memset_chk (__dest, __ch, __len, __bos0 (__dest));
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
```
tests: Check received prefixes before immediately sending dynamic capabilities
If we send capabilities immediately, before receiving an UPDATE message, we end up
with a notification received from the neighbor. Let's wait until we have the fully
converged topology and do the stuff.
Tested locally and can't replicate the failure, let's see how happy is the CI this time.
The walk stopped because the index used in the NlriTable entries is
decrementing, and this is against the snmp specifications. Also, the
computed index is wrong, and does not match the provided
draft-ietf-idr-bgp4-mibv2-1 specification.
Fix this by computing a valid index, and by finding out the next
consecutive prefix.
The resulting changes do not break the walk, and the output is changed:
Donatas Abraitis [Sun, 29 Oct 2023 20:44:45 +0000 (22:44 +0200)]
bgpd: Ignore handling NLRIs if we received MP_UNREACH_NLRI
If we receive MP_UNREACH_NLRI, we should stop handling remaining NLRIs if
no mandatory path attributes received.
In other words, if MP_UNREACH_NLRI received, the remaining NLRIs should be handled
as a new data, but without mandatory attributes, it's a malformed packet.
In normal case, this MUST not happen at all, but to avoid crashing bgpd, we MUST
handle that.
Donatas Abraitis [Fri, 27 Oct 2023 08:56:45 +0000 (11:56 +0300)]
bgpd: Treat EOR as withdrawn to avoid unwanted handling of malformed attrs
Treat-as-withdraw, otherwise if we just ignore it, we will pass it to be
processed as a normal UPDATE without mandatory attributes, that could lead
to harmful behavior. In this case, a crash for route-maps with the configuration
such as:
route-map TEST permit 140
description rule for PFIX_IPV6_7
match ipv6 address prefix-list PFIX_IPV6_7
exit
!
end
torc-11# confi t
torc-11(config)# route-map TEST permit 140
torc-11(config-route-map)# no description rule for PFIX_IPV6_7
% Unknown command: no description rule for PFIX_IPV6_7
torc-11(config-route-map)# no description rule
% There is no matched command.
torc-11(config-route-map)# no description
<cr>
torc-11(config-route-map)# no description
torc-11(config-route-map)#
Using frr-reload failure log:
2023-10-31 00:30:31,972 INFO: Failed to execute route-map TEST permit 140 no description rule for PFIX_IPV6_7 exit
2023-10-31 00:30:31,972 ERROR: "route-map TEST permit 140 -- no description rule for PFIX_IPV6_7 -- exit" we failed to remove this command
2023-10-31 00:30:31,972 ERROR: % Unknown command: no description rule for PFIX_IPV6_7
With fix:
2023-11-02 06:10:30,024 INFO: Executed "route-map TEST permit 140 no description exit"
Donald Sharp [Tue, 31 Oct 2023 17:06:16 +0000 (13:06 -0400)]
pimd: Ensure upstream points at the correct rpf
In the scenario on an intermediate router where a *,G join has
been received and a S,G stream is being sent through that router
on the *,G stream, there exists a situation when the *,G in has been pruned
but the stream is still being received on on incoming interface towards
the RP for the *,G. In this situation PIM will see the S,G stream
initially as a NOCACHE from the dataplane, PIM will then do a RPF
for the S and notice that it is supposed to be coming in on adifferent
interface. In this case PIM the original PIM code would create
a blackhole mroute towards the RPF of the *,G( the interface the
stream is being received on ). The original reason for this is that
if there is a scenario where this particular S1,G stream is sending
at basically line rate, and there also happens to be a different
S2,G stream that is sending at a very low rate. With certain
dataplanes there is no way to really rate limit the S1 -vs- S2
stream and the S1 stream completely overwhelms the S2 stream
for sending up to the control plane for proper pim handling.
The problem then becomes that FRR never properly responds
to the situation where the *,G is rereceived and the S,G
stream switches back over to the SPT for itself and FRR ends
up with a dead mroute that stops everything from working properly.
This code change, installs the blackhole mroute with the RPF
towards the RP for the G and then resets the RPF to the correct
RPF for the Stream but does not modify the mroute. When the
*,G is rereceived and we attempt to transition to the S,G stream
this now works.
As a note: Both David L and myself do not necessarily believe
we fully understand the problem yet. What this does do is fix
all the inconsistent CI issues we are seeing in the topotests
at this time. Internally I am seeing other test failures
in PIM that I don't fully understand and we suspect that
there are other problems in the state machine. We plan to
revisit this problem as we are able to debug the issue better.
In the meantime both David and Myself agree that this gets
the CI working again and Streams end up in the right state.